Be the first to like this
Scalability has been an essential factor for any kind of computational algorithm while considering its performance. In this Big Data era, gathering of large amounts of data is becoming easy. Data analysis on Big Data is not feasible using the existing Machine Learning (ML) algorithms and it perceives them to perform poorly. This is due to the fact that the computational logic for these algorithms is previously designed in sequential way. MapReduce becomes the solution for handling billions of data efficiently. In this report we discuss the basic building block for the computations behind ML algorithms, two different attempts to parallelize machine learning algorithms using MapReduce and a brief description on the overhead in parallelization of ML algorithms.