MapReduce

INTRODUCTION:
Seek time is improving more slowly than transfer rate. If the data access pattern is dominated by seeks, it takes longer to read or write large portions of a dataset than to stream through it, which operates at the transfer rate. This is where MapReduce comes in.

MAPREDUCE:
Some key concepts of MapReduce are laid out below:
- MapReduce is a linearly scalable programming model.
- Updating a small portion of records suits an RDBMS; updating the majority of records suits MapReduce.
- MapReduce is good for ad hoc analysis.
- MapReduce is suited to applications where the data is written once and read many times, whereas an RDBMS is good for datasets that are continually updated.
- MapReduce works well on unstructured or semi-structured data. [Structured data has a defined format, e.g. XML documents. Semi-structured data may have a schema, but it is often ignored and used only as a guide to the structure of the data, e.g. a spreadsheet. Unstructured data has no particular internal structure, e.g. plain text or image data.]
- MapReduce interprets the data at processing time.
- The keys and values in MapReduce are chosen by the person analyzing the data.
- High-level query languages, e.g. Pig and Hive, can be used on top of it.
- If you double the size of the input data, a job runs twice as slow; but if you also double the size of the cluster, the job runs as fast as the original one. This holds for MapReduce, but not for SQL queries.
- "Data locality" is at the heart of MapReduce.
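The programming model above can be illustrated with the classic word-count example: the person analyzing the data chooses the keys and values (here, words and their counts). The following is a minimal pure-Python sketch of the map, shuffle, and reduce phases, not tied to any particular Hadoop API; the function names are illustrative only.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key (here, sum the counts)."""
    return (key, sum(values))

docs = ["the quick brown fox", "the lazy dog", "the fox"]
pairs = [p for d in docs for p in map_phase(d)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
# counts["the"] == 3, counts["fox"] == 2
```

Because each map call sees only one record and each reduce call sees only one key's values, both phases can be spread across a cluster independently, which is what makes the model linearly scalable.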
CONCLUSION:
Here we have tried to point out some of the features of MapReduce that make it a far better option for handling Big Data. The document is still somewhat abstract; further analysis will clarify the concepts.