Real-time Analytics in the Big Data
Ecosystem
ACCESS DATA REAL-TIME
What is Big Data?
Big data is a term that describes the large volume of
data – both structured and unstructured – that
inundates a business on a day-to-day basis. But it’s
not the amount of data that’s important. It’s what
organizations do with the data that matters. Big
data can be analyzed for insights that lead to better
decisions and strategic business moves.
Data is Exploding
Today, every 2 minutes we
are generating the same
amount of data that was
created from the
beginning of time until the
year 2000.
Every minute we send over
200 million emails, generate
almost 2 million Facebook
likes, send over 250 thousand
tweets, and upload over
20,000 photos on Facebook.
Over 90% of all the data
in the world was
created in the past 18
months.
Google alone
processes 40
thousand
search queries
per second,
making it about
3.5 billion in a
single day.
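The daily figure follows from multiplying the per-second rate by the number of seconds in a day. A minimal sketch of that arithmetic, assuming a steady rate of exactly 40,000 queries per second (the function name is illustrative); the result is about 3.46 billion, which rounds to the 3.5 billion quoted:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def queries_per_day(queries_per_second: int) -> int:
    """Scale a steady per-second rate up to a full day."""
    return queries_per_second * SECONDS_PER_DAY


daily = queries_per_day(40_000)
print(f"{daily:,} queries per day")  # 3,456,000,000 queries per day
```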
Over 100 hours of video are
uploaded on YouTube every minute
and it would take you around 15
years to watch every video uploaded
by users in one day.
If you burned all the
data created in just
one day onto DVDs,
you could stack
them on each other
and reach the moon
– twice.
The number of bits of
information stored in the
digital universe is thought to
have exceeded the number of
stars in the physical universe
in 2007.
The big data industry is expected to
grow from US $10.2 billion in 2013 to
about US $54.3 billion by 2017.
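The two market-size figures imply a compound annual growth rate. A minimal sketch of that calculation, assuming growth compounds once per year over the four years from 2013 to 2017 (the function name is illustrative, not taken from any forecast):

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values, as a fraction."""
    return (end_value / start_value) ** (1 / years) - 1


# US $10.2 billion in 2013 growing to US $54.3 billion by 2017.
rate = cagr(10.2, 54.3, 2017 - 2013)
print(f"Implied annual growth: {rate:.1%}")  # about 51.9%
```

In other words, the forecast assumes the market grows by roughly half again each year.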
TECHNOLOGIES FOR
REAL-TIME
ANALYTICS SOLUTION
Apache Kafka
Fast, scalable, and durable
Based on a modern,
cluster-centric design
Handles hundreds of
megabytes of reads and
writes per second
Designed to allow a single
cluster to serve
Apache Storm
Free, open-source, distributed,
and real-time computation
system
Simple and can be used with any
programming language
Fast, guaranteed data
processing, easy to set up and
operate
Integrates with queuing and
database technologies
Spark
Open-source, distributed
computing framework
Addresses critical challenges to
advanced analytics in Hadoop
Supports in-memory processing
and is faster than MapReduce
Offers integrated framework for
advanced analytics
Druid
Open-source infrastructure for
real-time exploratory analytics
Druid’s real-time nodes employ
lock-free ingestion for
append-only data sets
Leverages memory mapping
capabilities and uses distributed
architecture
Druid offers multi-dimensional
filtering
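To make "multi-dimensional filtering" over an append-only data set concrete, here is a toy sketch in plain Python. It does not use Druid or its query API; the event fields (`country`, `device`, `page`, `clicks`) and the helper names are hypothetical and only illustrate the idea of filtering on several dimensions and then aggregating a metric.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]

# Append-only store: events are only ever added, never updated in place.
events: List[Event] = []


def ingest(event: Event) -> None:
    """Append one event to the store."""
    events.append(event)


def filter_events(rows: List[Event], **dimensions: Any) -> List[Event]:
    """Keep rows whose value matches on every requested dimension."""
    return [
        row for row in rows
        if all(row.get(name) == value for name, value in dimensions.items())
    ]


# Hypothetical sample data.
ingest({"country": "US", "device": "mobile", "page": "home", "clicks": 3})
ingest({"country": "US", "device": "desktop", "page": "home", "clicks": 1})
ingest({"country": "IN", "device": "mobile", "page": "search", "clicks": 4})
ingest({"country": "US", "device": "mobile", "page": "search", "clicks": 5})

# Filter on two dimensions at once, then aggregate a metric.
matched = filter_events(events, country="US", device="mobile")
total_clicks = sum(row["clicks"] for row in matched)
print(len(matched), total_clicks)  # 2 8
```

A real deployment would rely on columnar storage and indexes rather than a linear scan; the sketch only shows the shape of the query.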
Companies and their Big Data
Solutions
WHAT THE COMPANIES OFFER
Enterprise big data initiatives face a massive
challenge in processing and pulling value out
of volume. But the right big data services can
process huge volumes of data to extract the
kind of actionable insights that can truly drive
a business forward.
Big data analytics accelerators and aggregators
Partnerships and alliances with major big data solutions vendors
Big data maturity roadmaps and reference architecture
Starting point to endpoint implementation assessments
Industry-specific key performance indicator (KPI) toolkits
Innovative industry frameworks tailored for specific industry needs
Big data labs and Centers of Excellence (CoEs) across multiple locations that
focus on product evaluation and performance benchmarking
Employee count: 15,000+
www.mindtree.com
Technology used:
In-house experts use technology, proven frameworks and
tools and domain expertise to turn problems into
successful business outcomes, delivering data
visualization, enterprise data management, business
intelligence and data analytic solutions under one
umbrella.
Central to Cognizant's strategy around
discovering and driving business value in big
data is its innovative suite of solutions. Each
leverages big data technologies to deliver
enhanced insight and analytics to various
industries.
Solution accelerators
Big data lab on demand
Idea to implementation
Data visualization and analytics
Technology evaluation and piloting
Big data strategy and roadmap definition
Employee count: 100,000+
www.cognizant.com
Technology used:
• Big Data Analytics Value Assessment (BAVA) Framework
• iSMART (integrated Social Media Analytics and Reporting Tool)
• SCOREL (stock correlation analytics)
• SmartNode
• Hadoop
Cybage’s expertise covers an array of relevant
tooling, frameworks, and building blocks. Its
pre-verified, gap-addressed core Hadoop
frameworks take the guesswork out of
implementation. Big data insights and cloud
infrastructure have made it imperative for products
and services to create and deliver experiences
through digital channels.
Coordinated infrastructure and workflow frameworks
Quick Analytics
NoSQL databases: MongoDB, Cassandra, HBase, and Neo4j
Distributed log processing: Flume, Scribe, and Chukwa
Hadoop-focused QA: Comprehensive big data verification, cluster
benchmarking, and performance tuning
Specialized test methodology: Purpose-engineered statistical test methodology
for big data solution verification
Focused big data test team: Dedicated QA Architect and big data test team
Employee count: 5,000+
www.cybage.com
Technology used:
Sqoop, Hive, Pentaho, SSRS, Cognos, QlikView, and
Hadoop
To help organizations make sense of their data,
Persistent has developed ShareInsights, a unique
platform that allows organizations to analyze an
overlay of enterprise data with public or cloud
sources to derive meaningful insights. An open
platform, ShareInsights enables users to mine
meaningful insights from the data sources that
matter to them and share them with a wide
audience. Users can quickly and easily on-board
new use cases and summarize large volumes of
unstructured data.
Multi-faceted data that lets users gain interesting insights
Quick Analytics
Seamlessly share insights on Facebook or the ShareInsights Gallery
Library of algorithms and integration with third party datasets, including public
datasets
Built-in visualizations
Drill-down capabilities to find particular behavior
Analyzes unstructured text
Employee count: 8,000+
www.persistent.com
Technology used:
Hadoop, Sqoop, SciDB
Benefits of Big Data
WHAT YOU CAN ACHIEVE WITH BIG DATA
Big Data
Dialogue with Consumers
New Products & Services
Risk Analysis
Faster and Better
Reduced Cost
Customize in Real Time
