Additional Readings

  • Zikopoulos, P. and Eaton, C., 2011. Understanding big data: Analytics for enterprise class Hadoop and streaming data. McGraw-Hill Osborne Media.
  • Pentreath, N., 2015. Machine Learning with Spark. Packt Publishing Ltd.
  • Kotsiantis, S.B., Zaharakis, I. and Pintelas, P., 2007. Supervised machine learning: A review of classification techniques. Emerging artificial intelligence applications in computer engineering, 160, pp.3-24.
  • Random Forests
  • Logistic Regression


  • Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.
  • Chen, C.P. and Zhang, C.Y., 2014. Data-intensive applications, challenges, techniques and technologies: A survey on Big Data. Information Sciences, 275, pp.314-347.
  • Chen, H., Chiang, R.H. and Storey, V.C., 2012. Business intelligence and analytics: from big data to big impact. MIS quarterly, pp.1165-1188.
  • Chen, M., Mao, S. and Liu, Y., 2014. Big data: A survey. Mobile Networks and Applications, 19(2), pp.171-209.
  • Cruz, J. A., & Wishart, D. S. (2006). Applications of machine learning in cancer prediction and prognosis. Cancer informatics, 2, 117693510600200030.
  • Dietrich, D., Heller, B. and Yang, B., 2015. Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data.
  • Gandomi, A. and Haider, M., 2015. Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management, 35(2), pp.137-144.
  • Faruk, A., & Cahyono, E. S. (2018). Prediction and Classification of Low Birth Weight Data Using Machine Learning Techniques. Indonesian Journal of Science and Technology, 3(1), 18-28.
  • Fayyad, U.M., 1996. Data mining and knowledge discovery: Making sense out of data. IEEE Expert: Intelligent Systems and Their Applications, 11(5), pp.20-25.
  • Bock, H.-H., 2008. Origins and extensions of the k-means algorithm in cluster analysis. Journal Electronique d’Histoire des Probabilités et de la Statistique / Electronic Journal for History of Probability and Statistics, 4(2).
  • Hota, J., 2013. Adoption of in-memory analytics. CSI Communications, pp.20-22.
  • Khan, N., Yaqoob, I., Hashem, I. A. T., Inayat, Z., Ali, M., Kamaleldin, W., ... & Gani, A. (2014). Big data: survey, technologies, opportunities, and challenges. The Scientific World Journal, 2014.
  • Liao, S.H., Chu, P.H. and Hsiao, P.Y., 2012. Data mining techniques and applications–A decade review from 2000 to 2011. Expert systems with applications, 39(12), pp.11303-11311.
  • Najafabadi, M. M., Villanustre, F., Khoshgoftaar, T. M., Seliya, N., Wald, R., & Muharemagic, E. (2015). Deep learning applications and challenges in big data analytics. Journal of Big Data, 2(1), 1.
  • Prajapati, V., 2013. Big data analytics with R and Hadoop. Packt Publishing Ltd.
  • Samuel, A., 2000. Some studies in machine learning using the game of checkers. IBM Journal of research and development, 44(1.2):206–226.
  • Satish, N., Sundaram, N., Patwary, M.M.A., Seo, J., Park, J., Hassaan, M.A., Sengupta, S., Yin, Z. and Dubey, P., 2014, June. Navigating the maze of graph analytics frameworks using massive graph datasets. In Proceedings of the 2014 ACM SIGMOD international conference on Management of data (pp. 979-990). ACM.
  • Smith, L.I., 2002. A tutorial on principal components analysis.
  • Qiu, J., Wu, Q., Ding, G., Xu, Y., & Feng, S. (2016). A survey of machine learning for big data processing. EURASIP Journal on Advances in Signal Processing, 2016(1), 67.
  • Tutorial Point 2015, Available from:  [Last accessed: March 2018]

Last modified: Thursday, 14 June 2018, 2:16 AM