Big Trends in Big Data
2013 AITP Region-5 Technical Conference

-Naresh Chintalcheru
Agenda - Big Data Trends
● Batch to Real Time
● Sql, Sql, Sql …
● Cloud Platform Support
● Apache Hadoop 2.0
○ Improved Performance
○ Improved Scalability
○ Improved Security
● Applications
○ Pattern Discovery Analytics
○ Sophisticated Visualization
● BI & Data Warehouse
● Big Data Vision
Batch to Real Time

Changing image of Big Data from Batch to Real Time
Hadoop + MapReduce = Batch Processing
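To make the "batch" point concrete, here is a minimal in-memory sketch of the MapReduce pattern in plain Python. It is not Hadoop code; the function names (map_phase, shuffle, reduce_phase, run_batch_job) are made up for illustration. The thing to notice is that the result only exists after the whole input has been read.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def run_batch_job(lines):
    """Process the complete input first, then return the full result at once."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling run_batch_job on a list of lines returns a word-count dictionary, but nothing is available until every line has been mapped, shuffled and reduced.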
Batch to Real Time
● Companies need real-time processing of Big Data for
applications such as online fraud detection and
CEP (Complex Event Processing).
● New frameworks, architectures and tools are emerging that
make real-time processing practical.
Big Data Real-Time Computing Systems
● Twitter’s Storm is an open source, distributed, fault-tolerant, real-time computation system.
○ Storm is a stream processing system
○ Unlike Hadoop jobs, Storm jobs never stop; they continue
to process data as it arrives
● Other Real Time systems include Streambase,
HStreaming, Apache S4, Dempsy and Esper.
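As a rough illustration of the stream-processing idea (not Storm's API), the sketch below handles events one at a time and raises an alert the moment a rule fires, instead of waiting for a batch to finish. The fraud rule, the function name detect_bursts and its parameters are assumptions chosen for the example.

```python
from collections import defaultdict, deque


def detect_bursts(events, max_events=3, window_seconds=60):
    """Yield (timestamp, card_id) as soon as a card exceeds max_events
    within a sliding window. `events` is any iterable of
    (timestamp_seconds, card_id) pairs in time order; it may be unbounded."""
    recent = defaultdict(deque)
    for timestamp, card_id in events:
        window = recent[card_id]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while window and timestamp - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_events:
            yield (timestamp, card_id)
```

Because detect_bursts is a generator, it can consume an endless feed and emit alerts as data arrives, which is the behavioural difference from the batch model.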
Big Data Sql Tools
Big Data processing includes ...
● Writing complex Java MapReduce Jobs
● Apache Pig Latin scripting
● Slow Sql processing from Apache Hive
Big Data Sql Tools
Inspired by Google’s Dremel paper, many vendors now
offer faster SQL-based tools
● Google BigQuery
● Cloudera Impala
● IBM BigSql
● Greenplum HAWQ
● Hortonworks Stinger (aims to improve Hive Sql performance by 100x)
● Apache Drill
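The appeal of SQL over hand-written MapReduce jobs is that an aggregation becomes a single declarative query. The sketch below uses Python's built-in sqlite3 module purely as a stand-in engine; it is not one of the tools listed above, and the clicks table and top_pages helper are hypothetical.

```python
import sqlite3


def top_pages(rows, limit=2):
    """Return the most-visited pages from (user_id, page) click rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clicks (user_id TEXT, page TEXT)")
    conn.executemany("INSERT INTO clicks VALUES (?, ?)", rows)
    # One declarative statement replaces a hand-coded map, shuffle and reduce.
    query = """
        SELECT page, COUNT(*) AS hits
        FROM clicks
        GROUP BY page
        ORDER BY hits DESC, page ASC
        LIMIT ?
    """
    result = conn.execute(query, (limit,)).fetchall()
    conn.close()
    return result
```

The same GROUP BY style of query is what the SQL-on-Hadoop tools aim to run quickly over much larger data sets.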
Big Data And Cloud
Big Data needs many computing nodes for data storage
and data processing, and that demand is elastic in nature …
● Cloud VM-based computing is a natural fit for
Big Data infrastructure
● Public cloud megastar Amazon AWS announced
support for Hadoop, which means you can spin up a VM with
Hadoop installed and a basic configuration in about 10 minutes
Hadoop 2.0
New in Hadoop 2.x
● Improved Performance with YARN aka MapReduce 2.0
● Improved Scalability with HDFS Federation
● Support for Microsoft Windows
● Improved Security
● HDFS Snapshots
Hadoop 2.0 - Performance
Improved Performance with YARN aka MapReduce 2.0
● Previously, the MapReduce JobTracker handled both resource
management and the application job life cycle.
● These two functions are now split into separate
components.
● A per-application Application Master negotiates resources
from the global Resource Manager for its job's requests
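The split of responsibilities can be pictured with a toy model. This is only a sketch of the idea, not YARN's API: the class and method names below are invented, and "containers" are reduced to a simple counter.

```python
class ResourceManager:
    """Cluster-wide role: tracks free containers, knows nothing about job logic."""

    def __init__(self, total_containers):
        self.free = total_containers

    def allocate(self, requested):
        """Grant as many containers as are free, up to the request."""
        granted = min(requested, self.free)
        self.free -= granted
        return granted

    def release(self, count):
        """Return containers to the pool once their tasks finish."""
        self.free += count


class ApplicationMaster:
    """Per-application role: owns the life cycle of one job's tasks."""

    def __init__(self, name, tasks):
        self.name = name
        self.pending = tasks
        self.completed = 0

    def run_round(self, rm):
        """Ask for containers, run one task in each, then hand them back."""
        granted = rm.allocate(self.pending)
        # Assumption for the sketch: each granted container completes one task.
        self.pending -= granted
        self.completed += granted
        rm.release(granted)
        return granted
```

With a 4-container ResourceManager and a 10-task ApplicationMaster, three calls to run_round finish the job, and the resource bookkeeping never needs to know what the tasks do.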
Hadoop 2.0 - Scalability
HDFS Federation
● No more single NameNode (NN) and Secondary NameNode (SNN).
● HDFS Federation supports multiple independent
NameNodes and Namespaces.
● Each DataNode (DN) registers with all the NameNodes in
the cluster. A DN sends periodic heartbeats and block
reports to, and handles commands from, all NNs.
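A toy model of the registration pattern described above, assuming each NameNode owns one namespace and each DataNode keeps a separate set of blocks per namespace. The classes and methods are illustrative only and do not mirror the HDFS code base.

```python
class NameNode:
    """Owns one namespace and learns about blocks only through reports."""

    def __init__(self, namespace):
        self.namespace = namespace
        self.datanodes = set()
        self.block_reports = {}  # datanode id -> sorted block ids for this namespace

    def register(self, datanode_id):
        self.datanodes.add(datanode_id)

    def receive_block_report(self, datanode_id, block_ids):
        self.block_reports[datanode_id] = sorted(block_ids)


class DataNode:
    """Stores blocks for every namespace and registers with every NameNode."""

    def __init__(self, datanode_id, namenodes):
        self.datanode_id = datanode_id
        self.namenodes = namenodes
        self.blocks = {nn.namespace: set() for nn in namenodes}
        for nn in namenodes:
            nn.register(datanode_id)

    def store_block(self, namespace, block_id):
        self.blocks[namespace].add(block_id)

    def send_block_reports(self):
        """Report to each NameNode only the blocks that belong to its namespace."""
        for nn in self.namenodes:
            nn.receive_block_report(self.datanode_id, self.blocks[nn.namespace])
```

The NameNodes never talk to each other in this sketch; the DataNodes are the shared storage layer underneath the independent namespaces.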
Hadoop 2.0 - Security
Improved Security
● HDFS file permissions enforced by the NN, with Access
Control Lists (ACLs) for users and groups
● Block Access Tokens for access control to data blocks
● Job Tokens to enforce task authorization
● Network encryption and Kerberos RPC; HDFS file
transfer can now be configured for encryption
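The general idea behind a token such as a Block Access Token can be sketched with an HMAC: one party signs a grant with a shared secret, and another party holding the same secret checks it before serving data. This is a generic illustration, not the actual HDFS token format; the field layout and function names are assumptions.

```python
import hashlib
import hmac


def issue_block_token(secret, user, block_id, mode):
    """Sign a (user, block, mode) grant with a shared secret key."""
    message = f"{user}:{block_id}:{mode}".encode()
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return {"user": user, "signature": signature}


def verify_block_token(secret, token, block_id, mode):
    """Recompute the signature for the requested access and compare."""
    message = f"{token['user']}:{block_id}:{mode}".encode()
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, token["signature"])
```

A token issued for reading one block fails verification when presented for a different block or for write access, because the recomputed signature no longer matches.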
Hadoop 2.0 - HDFS Snapshots
Improved Backup & Disaster Recovery
● HDFS Snapshots are read-only point-in-time copies of
the file system.
● Snapshots can be taken on a subtree or entire file
system.
● Useful for data backup, protection against user errors
and disaster recovery
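A small sketch of what "read-only, point-in-time copy of a subtree" means in practice. It models the file system as a dictionary and simply copies the matching entries, which a real file system would avoid doing for large data; TinyFileSystem and its methods are invented for the example.

```python
from types import MappingProxyType


class TinyFileSystem:
    """Toy file system: a dict of path -> contents, plus named snapshots."""

    def __init__(self):
        self.files = {}
        self.snapshots = {}

    def write(self, path, data):
        self.files[path] = data

    def delete(self, path):
        del self.files[path]

    def create_snapshot(self, name, subtree="/"):
        """Capture the current state of everything under `subtree`."""
        prefix = subtree.rstrip("/") + "/"
        captured = {p: d for p, d in self.files.items() if p.startswith(prefix)}
        # MappingProxyType gives a read-only view, so the snapshot cannot be modified.
        self.snapshots[name] = MappingProxyType(captured)
        return self.snapshots[name]
```

If a user deletes a file after the snapshot is taken, the snapshot still holds the old contents, which is the "protection against user errors" use case.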
Big Data Applications
● Infrastructure layer of Big Data is largely solved (.........
secret Hadoop)
● Now the future innovation is focused on applications and
analytics
Big Data Analytic Applications
Pattern Discovery and Sense-Making based analytic
applications.
● WibiData: Lessons learned and predictive apps
● Recorded Future: Web intelligence for business decisions
● Nutonian: Uncovers relationships hidden within complex
data
● RStudio: Data analysis tool
Big Data - Visualization Applications
Sophisticated Big Data Visualization tools.
● IBM BigSheets
● D3.js
● Fathom
● Processing.org
Big Data & Business Intelligence
BI vendors such as IBM Cognos, SAP Business Objects and
Oracle Hyperion now support connecting directly to Hadoop data
using Apache Hive connectors.
Big Data & Data Warehouse
The challenge: multiple new unstructured data sources such as
clickstreams, social media, mobile, sensors and web logs
require massive processing, and traditional data warehouses
are costly to scale.
The big question: will the data warehouse survive Big Data?
More on this in my next presentation :)
Big Data Vision

Big Data requires a Big Vision
Big Data requires Big Vision
● Unlike Business Intelligence, Big Data is an innovation
originated from the IT side.
● The business departments, which should come up with Big
Data usage requirements, need constant coaching on the
potential of Big Data intelligence and on success
stories.
Thank You
Feedback appreciated
Nash Chintalcheru
Chintal75@gmail.com
309-242-1615
Presentation pdf : www.slideshare.net/chintal75

TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 

Big Trends in Big Data

  • 1. Big Trends in Big Data 2013 AITP Region-5 Technical Conference -Naresh Chintalcheru
  • 2. Agenda - Big Data Trends ● Batch to Real Time ● SQL, SQL, SQL … ● Cloud Platform Support ● Apache Hadoop 2.0 ○ Improved Performance ○ Improved Scalability ○ Improved Security ● Applications ○ Pattern Discovery Analytics ○ Sophisticated Visualization ● BI & Data Warehouse ● Big Data Vision
  • 4. Batch to Real Time Changing image of Big Data from Batch to Real Time Hadoop + MapReduce = Batch Processing
  • 5. Batch to Real Time ● Companies need real-time processing of Big Data for applications such as online fraud detection and CEP (Complex Event Processing). ● Emerging frameworks, architectures and tools are making real-time processing practical.
  • 6. Big Data Real-Time Computing Systems ● Twitter’s Storm is an open source, distributed, fault-tolerant, real-time computation system. ○ Storm is a stream processing system. ○ Unlike Hadoop jobs, Storm jobs never stop; they continue to process data as it arrives. ● Other real-time systems include StreamBase, HStreaming, Apache S4, Dempsy and Esper.
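The contrast between a batch job and a stream job on the slide above can be sketched in a few lines. This is a plain Python illustration, not Storm or Hadoop code; the function names `batch_count` and `stream_count` and the sample events are hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def batch_count(events: List[str]) -> Counter:
    """Batch style: the whole data set must exist before the job runs."""
    return Counter(events)


def stream_count(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Stream style: emit an updated count as each event arrives."""
    counts: Counter = Counter()
    for event in events:
        counts[event] += 1
        yield event, counts[event]


# Hypothetical sample events, used only for illustration.
events = ["login", "purchase", "login"]

batch_result = batch_count(events)                # one answer at the end
stream_result = list(stream_count(iter(events)))  # one answer per event
```

The batch function produces a single result once all input is available, while the generator yields an intermediate result after every event and could keep running on an unbounded source.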
  • 8. Big Data SQL Tools Big Data processing includes ... ● Writing complex Java MapReduce jobs ● Apache Pig Latin scripting ● Slow SQL processing with Apache Hive
  • 9. Big Data SQL Tools Inspired by Google’s Dremel paper, many vendors now offer faster SQL-based tools ● Google BigQuery ● Cloudera Impala ● IBM BigSQL ● Greenplum HAWQ ● Hortonworks Stinger (aims to improve Hive SQL performance by 100x) ● Apache Drill
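To illustrate why SQL access is attractive compared with hand-written MapReduce jobs, here is a minimal sketch that computes the same per-region total twice: once as explicit map, shuffle and reduce steps, and once as a single SQL statement. It assumes Python's built-in `sqlite3` module as a stand-in for a SQL-on-Hadoop engine; the `sales` table and its rows are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, amount)
rows = [("us", 10), ("eu", 5), ("us", 7)]


def map_phase(records):
    """Map: emit one (key, value) pair per input record."""
    for region, amount in records:
        yield region, amount


def reduce_phase(pairs):
    """Shuffle pairs by key, then reduce each group to a sum."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}


mapreduce_totals = reduce_phase(map_phase(rows))

# The same aggregation expressed declaratively in SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
sql_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
conn.close()
```

Both paths give the same totals; the SQL version leaves grouping and execution strategy to the engine.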
  • 11. Big Data and Cloud Big Data needs many computing nodes for data storage and data processing, and that demand is elastic in nature … ● Cloud VM-based computing is a good fit for Big Data infrastructure ● Public cloud leader Amazon AWS announced support for Hadoop, which means spinning up a Hadoop-installed VM with a basic configuration in about 10 minutes
  • 13. Hadoop 2.0 New in Hadoop 2.x ● Improved Performance with YARN aka MapReduce 2.0 ● Improved Scalability with HDFS Federation ● Support for Microsoft Windows ● Improved Security ● HDFS Snapshots
  • 14. Hadoop 2.0 - Performance Improved Performance with YARN aka MapReduce 2.0 ● Previously, the MapReduce JobTracker handled both resource management and the application job life cycle. ● Now the two functions are divided into separate components. ● A per-application Application Master negotiates with the global Resource Manager for the resources its job requests need.
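The split of responsibilities described on the slide above can be sketched as a toy model. The class names mirror the YARN concepts, but this is not the YARN API: the methods, the container counting, and the "wave" scheduling are all simplifying assumptions made for illustration.

```python
class ResourceManager:
    """Toy global scheduler: it only tracks free containers in the cluster."""

    def __init__(self, total_containers: int) -> None:
        self.free = total_containers

    def allocate(self, requested: int) -> int:
        """Grant as many containers as are free, up to the request."""
        granted = min(requested, self.free)
        self.free -= granted
        return granted

    def release(self, count: int) -> None:
        """Return finished containers to the free pool."""
        self.free += count


class ApplicationMaster:
    """Toy per-job master: it owns the life cycle of one job's tasks."""

    def __init__(self, job_name: str, tasks: int, rm: ResourceManager) -> None:
        self.job_name = job_name
        self.pending = tasks
        self.rm = rm

    def run(self) -> int:
        """Run all tasks in waves sized by what the ResourceManager grants."""
        waves = 0
        while self.pending > 0:
            granted = self.rm.allocate(self.pending)
            if granted == 0:
                raise RuntimeError("no containers available")
            self.pending -= granted   # tasks in this wave complete
            self.rm.release(granted)  # containers go back to the pool
            waves += 1
        return waves


# Hypothetical cluster with 4 containers and a job with 10 tasks.
rm = ResourceManager(total_containers=4)
am = ApplicationMaster("wordcount", tasks=10, rm=rm)
waves = am.run()
```

The resource manager knows nothing about jobs or tasks, and the application master knows nothing about other jobs, which is the separation the slide describes.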
  • 15. Hadoop 2.0 - Scalability HDFS Federation ● No more single NameNode (NN) and Secondary NameNode (SNN). ● HDFS Federation supports multiple independent NameNodes and namespaces. ● Each DataNode (DN) registers with all the NameNodes in the cluster. A DN sends periodic heartbeats and block reports to, and handles commands from, all NNs.
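The registration pattern on the slide above, where every DataNode reports to every NameNode while each NameNode keeps its own namespace, can be modeled roughly as follows. This is a toy sketch, not HDFS code; the class and method names, the namespace paths and the integer `tick` are hypothetical.

```python
class NameNode:
    """Toy NameNode: manages one independent namespace."""

    def __init__(self, namespace: str) -> None:
        self.namespace = namespace
        self.datanodes = set()
        self.last_heartbeat = {}

    def register(self, datanode_id: str) -> None:
        self.datanodes.add(datanode_id)

    def heartbeat(self, datanode_id: str, tick: int) -> None:
        self.last_heartbeat[datanode_id] = tick


class DataNode:
    """Toy DataNode: registers with, and reports to, every NameNode."""

    def __init__(self, node_id: str, namenodes) -> None:
        self.node_id = node_id
        self.namenodes = list(namenodes)
        for nn in self.namenodes:
            nn.register(node_id)

    def send_heartbeats(self, tick: int) -> None:
        for nn in self.namenodes:
            nn.heartbeat(self.node_id, tick)


# Hypothetical two-namespace cluster with two DataNodes.
nn_users = NameNode("/users")
nn_logs = NameNode("/logs")
datanodes = [DataNode(name, [nn_users, nn_logs]) for name in ("dn-a", "dn-b")]
for dn in datanodes:
    dn.send_heartbeats(tick=1)
```

The NameNodes never talk to each other in this model; the shared DataNodes are the only common layer.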
  • 16. Hadoop 2.0 - Security Improved Security ● Enforcement of HDFS file permissions by the NN, with Access Control Lists (ACLs) for users and groups ● Block Access Tokens for access control to data blocks ● Job Tokens to enforce task authorization ● Network encryption & Kerberos RPC. HDFS file transfer can now be configured for encryption
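The general idea behind a block access token, where one party issues a signed permission and another verifies it without a round trip, can be sketched with an HMAC. This is a generic illustration using Python's standard library and does not reproduce Hadoop's actual token format; the secret, the field layout and the function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret known to the issuer and the verifier.
SECRET = b"shared-secret-for-illustration-only"


def issue_token(user: str, block_id: str, mode: str) -> str:
    """Issuer side: sign (user, block, mode) so it cannot be altered."""
    message = f"{user}:{block_id}:{mode}".encode()
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return f"{user}:{block_id}:{mode}:{signature}"


def verify_token(token: str, block_id: str, mode: str) -> bool:
    """Verifier side: check the signature and the requested block and mode."""
    try:
        user, token_block, token_mode, signature = token.split(":")
    except ValueError:
        return False  # malformed token
    message = f"{user}:{token_block}:{token_mode}".encode()
    expected = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(signature, expected)
        and token_block == block_id
        and token_mode == mode
    )


token = issue_token("alice", "blk_1", "READ")
read_ok = verify_token(token, "blk_1", "READ")
wrong_block = verify_token(token, "blk_2", "READ")
tampered = verify_token(token.replace("READ", "WRITE"), "blk_1", "WRITE")
```

A token is accepted only for the block and access mode it was issued for, and editing the token invalidates its signature.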
  • 17. Hadoop 2.0 - HDFS Snapshots Improved Backup & Disaster Recovery ● HDFS Snapshots are read-only point-in-time copies of the file system. ● Snapshots can be taken on a subtree or entire file system. ● Useful for data backup, protection against user errors and disaster recovery
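The "read-only point-in-time copy" behavior on the slide above can be shown with a toy in-memory file tree. This sketch copies the tree eagerly for simplicity, which says nothing about how HDFS implements snapshots internally; the `take_snapshot` helper and the file paths are hypothetical.

```python
from types import MappingProxyType

# Hypothetical in-memory "file system": path -> contents.
files = {"/data/a.txt": "v1", "/data/b.txt": "v1"}


def take_snapshot(tree, subtree="/"):
    """Return a read-only copy of every path under `subtree`."""
    selected = {p: c for p, c in tree.items() if p.startswith(subtree)}
    return MappingProxyType(selected)


snapshot = take_snapshot(files, subtree="/data")

# Later changes, including a user error, do not affect the snapshot.
files["/data/a.txt"] = "v2"
del files["/data/b.txt"]

# Recovery: restore the accidentally deleted file from the snapshot.
files["/data/b.txt"] = snapshot["/data/b.txt"]
```

Attempts to assign into `snapshot` raise `TypeError`, which mirrors the read-only property.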
  • 19. Big Data Applications ● Infrastructure layer of Big Data is largely solved (......... secret Hadoop) ● Now the future innovation is focused on applications and analytics
  • 20. Big Data Analytic Applications Pattern discovery and sense-making analytic applications. ● WibiData: Lessons learned and predictive apps ● Recorded Future: Web intelligence for business decisions ● Nutonian: Uncovers relationships hidden within complex data ● RStudio: Data analysis tool
  • 21. Big Data - Visualization Applications Sophisticated Big Data Visualization tools. ● IBM BigSheets ● D3.js ● Fathom ● Processing.org
  • 23. Big Data & Business Intelligence Various BI vendors, including IBM Cognos, SAP BusinessObjects and Oracle Hyperion, support connecting directly to Hadoop data using Apache Hive connectors.
  • 24. Big Data & Data Warehouse New unstructured data sources such as clickstreams, social media, mobile, sensors and web logs require massive processing, and traditional data warehouses are costly to scale. The big question: will the data warehouse survive Big Data? More on this in my next presentation :)
  • 26. Big Data Vision Big Data requires a Big Vision
  • 27. Big Data requires Big Vision ● Unlike Business Intelligence, Big Data is an innovation that originated on the IT side. ● The business departments, which should come up with Big Data usage requirements, need constant coaching on the potential of Big Data intelligence and on success stories.
  • 28. Thank You Feedback appreciated Nash Chintalcheru Chintal75@gmail.com 309-242-1615 Presentation pdf : www.slideshare.net/chintal75