NoSQL: an approach to data management and database design that is useful for very large sets of distributed data.
Hadoop: a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment.
MapReduce: a software framework that allows developers to write programs that process massive amounts of unstructured data in parallel across a distributed cluster of processors or stand-alone computers. Map is a function that parcels out work to different nodes in the distributed cluster. Reduce is another function that collates the work and resolves the results into a single value.
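The Map and Reduce roles described above can be illustrated with a small word count. This is only a minimal single-process sketch, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and each text chunk stands in for the work a separate node would receive on a real cluster.

```python
from collections import defaultdict
from functools import reduce


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Group emitted pairs by key so each key can be reduced on its own."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(values):
    """Reduce: collapse all values for one key into a single value (a sum)."""
    return reduce(lambda a, b: a + b, values, 0)


def word_count(chunks):
    # Assumption: each chunk would be handled by a different node;
    # here the chunks are simply processed one after another.
    mapped = [map_phase(chunk) for chunk in chunks]
    grouped = shuffle(mapped)
    return {word: reduce_phase(counts) for word, counts in grouped.items()}


if __name__ == "__main__":
    print(word_count(["big data is big", "data is everywhere"]))
```

The grouping step between the two phases is what lets each key be reduced independently, which is why the reduce work can also be spread across nodes.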
No need to explain. Mention some company names.
The Next Frontier for Innovation, Competition and Productivity
• ‘Big Data’ is similar to ‘small data’, but bigger
• …but having bigger data requires different approaches:
– Techniques, tools and architecture
• …with an aim to solve new problems
• …or old problems in a better way
Why Big Data
• Key enablers of the appearance and growth of Big Data are:
– Increase of storage capacities
– Increase of processing power
– Availability of data
– Every day we create 2.5 quintillion bytes of data; 90% of the data in the world today has been created in the last two years alone
Big Data Analytics
• Examining large amounts of data
• Appropriate information
• Identification of hidden patterns, unknown correlations
• Competitive advantage
• Better business decisions: strategic and operational
• Effective marketing, customer satisfaction, increased revenue
Applications for Big Data Analytics
• Homeland Security
• Finance
• Smarter Healthcare
• Multi-channel sales
• Telecom
• Manufacturing
• Traffic Control
• Trading Analytics
• Fraud and Risk
• Log Analysis
• Search Quality
• Retail: Churn, NBO
Healthcare
• 80% of medical data is unstructured and is clinically relevant
• Data resides in multiple places like individual EMRs, lab and imaging systems, physician notes, medical correspondence, claims, etc.
• Leveraging Big Data
– Build sustainable healthcare systems
– Collaborate to improve care and outcomes
– Increase access to healthcare
Market Size
Source: Wikibon, Taming Big Data
By 2015, 4.4 million IT jobs in Big Data; 1.9 million of them in the US alone
India – Big Data
• Gaining traction
• Huge market opportunities for IT services (82.9% of revenues) and analytics firms (17.1%)
• Current market size is $200 million; by 2015, $1 billion
• The opportunity for Indian service providers lies in offering services around Big Data implementation and analytics for global multinationals
India will require a minimum of 1 lakh (100,000) data scientists in the next couple of years, in addition to data analysts and data managers, to support the Big Data space.
NoSQL: non-relational, or at least non-SQL, database solutions such as HBase (also a part of the Hadoop ecosystem), Cassandra, MongoDB, Riak, CouchDB, and many others.
Hadoop: an ecosystem of software packages, including MapReduce, HDFS, and a whole host of other software packages.