Glossary of Big Data Terms


Reference table for Hadoop and Big Data. Covers data sizes from bytes through megabytes, terabytes, and petabytes, as well as key Big Data terms such as HDFS, HBase, and Hadoop.

Published in: Technology, Education

  1. Top Big Data Terms

     Term: Definition
     Hadoop: Open-source software framework that supports the running of applications on large clusters of commodity hardware. Hadoop is written in Java.
     HDFS: Stands for Hadoop Distributed File System. HDFS is a distributed file system that stores large files across multiple machines. The system replicates data across multiple machines and understands what data is being processed, when, and by whom.
     MapReduce: A programming model for processing large data sets with a parallel, distributed algorithm on a cluster. Its Map() procedure filters and sorts, and its Reduce() procedure performs summary operations.
     Hive: A data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis.
     HBase: An open-source, non-relational, distributed database that runs on top of HDFS.
     Cassandra: Apache Cassandra is an open-source distributed database management system designed to handle very large amounts of data spread out across many commodity servers.

     Source: Wikipedia (mainly)
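
The MapReduce entry above can be illustrated with a small word count. This is a minimal single-machine sketch in plain Python, not Hadoop's actual API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and the grouping step only imitates what a real framework does between Map and Reduce across many machines.

```python
from collections import defaultdict


def map_phase(documents):
    """Map step: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)


def shuffle(pairs):
    """Group values by key and sort by key (done by the framework in a real cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped):
    """Reduce step: summarize each key's values, here by summing the counts."""
    return {key: sum(values) for key, values in grouped}


def word_count(documents):
    """Run the three stages in sequence over an in-memory list of strings."""
    return reduce_phase(shuffle(map_phase(documents)))


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

On a real cluster, the Map and Reduce steps would run in parallel on different machines over data stored in HDFS; the structure of the computation stays the same.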
  2. Sizes that Matter

     Name: Value. Example
     1 Bit: The smallest unit of data that a computer uses. It can be used to represent two states of information, such as Yes or No.
     1 Byte: 8 bits. A byte can represent 256 states of information. 1 byte could be equal to one character. 10 bytes could be equal to a word. 100 bytes would equal an average sentence.
     1 kilobyte (kB): 1024 bytes. 1 kilobyte would be equal to a paragraph.
     1 megabyte (MB): 1024 kB. 3-1/2 inch floppy disks can hold 1.44 megabytes, or the equivalent of a small book. 600 megabytes is about the amount of data that will fit on a CD-ROM disk.
     1 gigabyte (GB): 1024 MB. 1 GB could hold the contents of about 10 yards of books.
     1 terabyte (TB): 1024 GB. 1 TB could hold 1,000 copies of the Encyclopedia Britannica.
     1 petabyte (PB): 1024 TB. 500 million floppy disks.
     1 exabyte (EB): 1024 PB. 5 exabytes could equal all of the words ever spoken by mankind.
     1 zettabyte (ZB): 1024 EB. ?

     Source:
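
The size ladder above is easy to check with a little arithmetic. The following is a rough sketch, assuming each unit is 1024 of the one before it (the binary convention this table uses); `UNITS`, `to_bytes`, and `humanize` are hypothetical helper names chosen for illustration.

```python
# Units in ascending order; each is assumed to be 1024 of the previous one.
UNITS = ["bytes", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def to_bytes(value, unit):
    """Convert a value in the given unit to bytes, using 1024 as the step."""
    return value * 1024 ** UNITS.index(unit)


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


if __name__ == "__main__":
    print(to_bytes(1, "TB"))         # bytes in one terabyte under this convention
    print(humanize(600 * 1024**2))   # the CD-ROM example from the table
```

Note that storage vendors often use powers of 1000 instead, so figures quoted elsewhere may differ slightly from what this sketch prints.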