Big Data means tapping into diverse data sets to uncover previously unknown relationships and to support data-driven business decisions. The process has three stages: acquiring all available data, organizing and analyzing large volumes of it with massively parallel processing, and making real-time decisions based on the resulting insights. Capturing data accurately is the hard part, however. If capture is inconsistent, later analysis can be meaningless no matter how sophisticated it is: garbage in, garbage out.
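To make the "garbage in, garbage out" risk concrete, here is a minimal sketch of checking records at the point of capture, before they reach any analysis step. The field names (`customer_id`, `amount`, `timestamp`) and the helpers `validate_record` and `partition_records` are hypothetical, chosen only for illustration; a real pipeline would apply rules drawn from its own schema.

```python
from datetime import datetime

# Hypothetical schema: these field names are assumptions for illustration.
REQUIRED_FIELDS = ("customer_id", "amount", "timestamp")


def validate_record(record):
    """Return a list of problems found in one captured record (empty if clean)."""
    problems = []

    # Every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    # If an amount was supplied, it must parse as a number.
    amount = record.get("amount")
    if amount not in (None, ""):
        try:
            float(amount)
        except (TypeError, ValueError):
            problems.append("amount is not numeric")

    # If a timestamp was supplied, it must parse as an ISO 8601 string.
    ts = record.get("timestamp")
    if ts not in (None, ""):
        try:
            datetime.fromisoformat(ts)
        except (TypeError, ValueError):
            problems.append("timestamp is not ISO 8601")

    return problems


def partition_records(records):
    """Split records into clean ones and (record, problems) rejects."""
    clean, rejected = [], []
    for record in records:
        problems = validate_record(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected
```

Keeping the rejected records together with the reasons they failed, rather than silently dropping them, makes it possible to see how consistently data is being captured before any conclusions are drawn from the clean subset.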