SlideShare a Scribd company logo
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Applications of BIG Data in
Social Media Networks
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Introduction to Big Data
Big data is the term for a collection of data sets so
large and complex that it becomes difficult to
process using on-hand database management
tools or traditional data processing applications
Systems / Enterprises generate huge amount of
data from Terabytes to Petabytes
Understanding Big Data can give you innovative
solutions to various queries.
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Hadoop in Social Media Networks
Facebook
Twitter
LinkedIn
Pinterest
Instagram
StumbleUpon
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Facebook
Facebook is one of Hadoop and Big Data's biggest champion and is rumored to operate the
single largest HDFS cluster in the world comprising on 100 petabytes of disk space. The site
stores more than 250 billion photos, with 350 million new ones uploaded every day.
Hive, the data warehousing infrastructure Facebook helped develop, is central to meeting the
company's reporting needs. Facebook uses Hive for faster querying on graph tools.
Hadoop is used in every Facebook product and in a variety of ways. User actions such as a "like"
or a status update are stored in a highly distributed, customized MySQL database but
applications such as Facebook Messaging run on top of HBase, Hadoop's NoSQL database
framework.
All messages sent on desktop or mobile are stored in HBase. Additionally, the company uses
Hadoop and Hive to generate reports for third-party developers and advertisers who need to
track the success of their applications or campaigns.
Get Started with BIG Data & Hadoop
Applications of Hadoop in Social Media Networks
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Get Started with BIG Data & Hadoop
Applications of Hadoop in Social Media Networks
Twitter
Twitter uses Hadoop, especially Pig and Hbase for Data analysis. Twitter has large data storage
and processing requirements, therefore it uses Hadoop to optimize data storage and workflow
solutions.
It uses Hadoop and Pig layers on top of LZO files which are compressed for quick analysis of
data. Due to this, it has almost completely negated infrastructure failures.
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Applications of Hadoop in Social Media Networks
LinkedIn
LinkedIn crunches more than 120 billion relationships per day and uses Hadoop capabilities to
blend large scale data computation with high volume, low latency site serving.
The ‘People You May Know’ feature is the result of scanning through billions of user triggers &
activities via a pipeline of 82 MapReduce jobs each processing 16TB of data. This job uses a
statistical model to predict the probability of two people knowing each other. It uses bloom
filters to speed up large joins, yielding a 10x performance improvement.
LinkedIn also builds an index structure in their Hadoop pipeline - this produces a multi-TB
lookup structure that uses perfect hashing (requiring only 2.5 bits per key). This process results
in faster server - cluster responses; it takes LinkedIn about 90 minutes to build a 900 GB data
store on a 45 node development cluster.
Get Started with BIG Data & Hadoop
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Applications of Hadoop in Social Media Networks
Pinterest
Pinterest’s engineering team has created a self-serving platform — comprised of home grown,
open source and commercial tools — that hooks in with Hadoop to orchestrate and process all
of the company’s enormous amounts of data, consisting of 30 billion pins.
They extensively utilize MapReduce for the processing of stored data which is synced with an
AWS cloud, wherein all their data is currently stored.
Instagram
Instagram operates one of the world’s largest Hadoop clusters and very likely the world’s
largest MySQL implementation. It has developed everything from a PHP-optimized platform to
a NoSQL database to a tool for auto-provisioning and configuring tens of thousands of servers
wherein all the images are stored & processed..
Get Started with BIG Data & Hadoop
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Applications of Hadoop in Social Media Networks
StumbleUpon
StumbleUpon uses Hadoop to make the best recommendations. Every thumb-up, stumble and
share (among other feedback from users) is stored so it can make better decisions about what
pages to show you next.
StumbleUpon uses Hbase extensively because of its cost-effectiveness, fast data retrieval
capabilities and dependability.
Get Started with BIG Data & Hadoop
© 2015 BlueCamphor Technologies (P) Ltd.
How Are Insights Derived from BIG
Data?
Say Hello to Hadoop Development :)
Get Started with BIG Data & Hadoop
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Introduction to Hadoop
Apache Hadoop is a framework that allows the distributed processing of large data sets across clusters of
commodity computers using a simple programming model.
It is an Open-Source Data Management technology with extensive storage and distributed processing.
Hadoop
Characteristics
Flexible
Reliable
Economical
Scalable Get Started with BIG Data & Hadoop
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Why Should You Learn Hadoop?
• Hadoop is a modern BIG Data processing framework which utilizes distributed computing clusters.
• It can process, manage and store unstructured data with absolute ease.
• Due to its scalability & effectiveness, companies are heavily adopting Hadoop.
• It has become the standard for storing, processing and analyzing big data.
• BIG Data & Hadoop professionals are in extremely high demand.
• This is the first step to becoming a data scientist or data architect.
Get Started with BIG Data & Hadoop
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Reports – 2015
Here are some of the other Reports that convey the same message:
• “It is expected that Hadoop and Big Data Analytics will grow to around $13.9 billion from 2012 to 2017.”
(Markets and Markets Research Report)
• “The Data market currently with the fastest growth are Hadoop and NoSQL software and services.”
(Technology Research Organization, Wikibon)
• “The Big Data market will likely hit $23.8 billion with a 31.7% rise per year.” (IDC Report)
• “Almost 90% organizations have embarked on Hadoop related projects and thus Hadoop skills are in huge
demand.” (According to the Big Data Executive Survey 2013)
• “Companies hiring Hadoop professionals Analytics professionals obtain a 250 percent hike in their
salaries.” (According to Analytics Industry Report)
Gartner had predicted, “Hadoop will be in most advanced analytics products by 2015.” By far and large, this
prediction has turned out to be true. Hadoop is “The Technology” to harness Big Data
Get Started with BIG Data & Hadoop
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Predictions for 2015
According to the Forrester Report:
• A new data economy will arise, Hadooponomics. Due to its vast data prowess, enterprise adoption of
Hadoop will become compulsory. All those companies, which were unsure of adopting Hadoop till 2014,
are bound to adopt it in 2015
• The surge in demand for Hadoop job professionals will be fulfilled by Java professionals & other software
engineers who will upgrade their skills
• The creation of SQL-on-Hadoop options will further enhance usability & adoption
• Due to the introduction of Yarn (Version 2.0), the usage of Hadoop will go beyond analytics. It will
transform into an application platform giving birth to new frameworks in the Hadoop ecosystem
• New Hadoop sources will emerge from organizations like Oracle, HP, Tibco and SAP. Vendors such as Red
Hat, VMware and Microsoft will include Hadoop in their operating Systems
Get Started with BIG Data & Hadoop
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Why SkillSpeed?
Course
Curriculum
from Industry
Experts
Instructor Led
Live Virtual
Sessions
Lifetime
access to
Course
Content via
LMS
100%
Placement
Assistance
24x7 Support
Get Started with BIG Data & Hadoop
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Course Topics
Module 1
Introduction to Big
Data and Hadoop
Module 2
HDFS Internals, Hadoop
Configurations and
Data Loading
Module 3
Introduction to Map
Reduce
Module 4
Advanced Map Reduce
Concepts
Module 5
Introduction to Pig
Module 6
Advanced Pig and
Introduction to Hive
Module 7
Advanced Hive
Concepts
Module 8
Extending Hive and
HBase Introduction
Module 9
Advanced HBase and
Oozie Introduction
Module 10
Project Set-up
Discussion
Get Started with BIG Data & Hadoop
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Corporate Partners
Get Started with BIG Data & Hadoop
Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com
Lines open 24/7
To know more about the course, Please contact:
IND +91-90660-20904 USA 1866-607-6547 (Toll Free)
Or reach us at
sales@skillspeed.com
Contact Us
BIG Data & Hadoop Applications in Social Media

More Related Content

What's hot

Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
Jose De La Rosa
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
Philippe Julio
 
Graylog Engineering - Design Your Architecture
Graylog Engineering - Design Your ArchitectureGraylog Engineering - Design Your Architecture
Graylog Engineering - Design Your Architecture
Graylog
 
Hadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and FutureHadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and Future
DataWorks Summit
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
YugabyteDB
 
Thinking Big - Big data: principes et architecture
Thinking Big - Big data: principes et architecture Thinking Big - Big data: principes et architecture
Thinking Big - Big data: principes et architecture
Lilia Sfaxi
 
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Simplilearn
 
Big data analytics in banking sector
Big data analytics in banking sectorBig data analytics in banking sector
Big data analytics in banking sector
Anil Rana
 
Big data analysis using map/reduce
Big data analysis using map/reduceBig data analysis using map/reduce
Big data analysis using map/reduce
RenuSuren
 
Big Data
Big DataBig Data
Big Data
NGDATA
 
1.mysql disk io 모니터링 및 분석사례
1.mysql disk io 모니터링 및 분석사례1.mysql disk io 모니터링 및 분석사례
1.mysql disk io 모니터링 및 분석사례
I Goo Lee
 
Google Big Table
Google Big TableGoogle Big Table
Google Big Table
Omar Al-Sabek
 
Big data
Big dataBig data
Big data
Ami Redwan Haq
 
Big Data Analytics with Spark
Big Data Analytics with SparkBig Data Analytics with Spark
Big Data Analytics with Spark
Mohammed Guller
 
Mapreduce by examples
Mapreduce by examplesMapreduce by examples
Mapreduce by examples
Andrea Iacono
 
Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1
Codership Oy - Creators of Galera Cluster
 
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Edureka!
 
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data AnalyticsApache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
DataWorks Summit
 
MySQL High Availability Solutions
MySQL High Availability SolutionsMySQL High Availability Solutions
MySQL High Availability Solutions
Mydbops
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
Ishan Bhawantha Hewanayake
 

What's hot (20)

Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Graylog Engineering - Design Your Architecture
Graylog Engineering - Design Your ArchitectureGraylog Engineering - Design Your Architecture
Graylog Engineering - Design Your Architecture
 
Hadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and FutureHadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and Future
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
 
Thinking Big - Big data: principes et architecture
Thinking Big - Big data: principes et architecture Thinking Big - Big data: principes et architecture
Thinking Big - Big data: principes et architecture
 
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
 
Big data analytics in banking sector
Big data analytics in banking sectorBig data analytics in banking sector
Big data analytics in banking sector
 
Big data analysis using map/reduce
Big data analysis using map/reduceBig data analysis using map/reduce
Big data analysis using map/reduce
 
Big Data
Big DataBig Data
Big Data
 
1.mysql disk io 모니터링 및 분석사례
1.mysql disk io 모니터링 및 분석사례1.mysql disk io 모니터링 및 분석사례
1.mysql disk io 모니터링 및 분석사례
 
Google Big Table
Google Big TableGoogle Big Table
Google Big Table
 
Big data
Big dataBig data
Big data
 
Big Data Analytics with Spark
Big Data Analytics with SparkBig Data Analytics with Spark
Big Data Analytics with Spark
 
Mapreduce by examples
Mapreduce by examplesMapreduce by examples
Mapreduce by examples
 
Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1
 
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
 
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data AnalyticsApache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
 
MySQL High Availability Solutions
MySQL High Availability SolutionsMySQL High Availability Solutions
MySQL High Availability Solutions
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
 

Viewers also liked

Process Scheduling on Hadoop at Expedia
Process Scheduling on Hadoop at ExpediaProcess Scheduling on Hadoop at Expedia
Process Scheduling on Hadoop at Expedia
huguk
 
Pinterest hadoop summit_talk
Pinterest hadoop summit_talkPinterest hadoop summit_talk
Pinterest hadoop summit_talk
Krishna Gade
 
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
Yahoo Developer Network
 
August 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieAugust 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache Oozie
Yahoo Developer Network
 
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
Yahoo Developer Network
 
Hadoop Platform at Yahoo
Hadoop Platform at YahooHadoop Platform at Yahoo
Hadoop Platform at Yahoo
DataWorks Summit/Hadoop Summit
 

Viewers also liked (6)

Process Scheduling on Hadoop at Expedia
Process Scheduling on Hadoop at ExpediaProcess Scheduling on Hadoop at Expedia
Process Scheduling on Hadoop at Expedia
 
Pinterest hadoop summit_talk
Pinterest hadoop summit_talkPinterest hadoop summit_talk
Pinterest hadoop summit_talk
 
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
 
August 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieAugust 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache Oozie
 
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
 
Hadoop Platform at Yahoo
Hadoop Platform at YahooHadoop Platform at Yahoo
Hadoop Platform at Yahoo
 

Similar to BIG Data & Hadoop Applications in Social Media

BIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in LogisticsBIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in Logistics
Skillspeed
 
IBM Smarter Analytics
IBM Smarter AnalyticsIBM Smarter Analytics
IBM Smarter Analytics
Adrian Turcu
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunities
Bigdata Meetup Kochi
 
BIG Data & Hadoop Applications in E-Commerce
BIG Data & Hadoop Applications in E-CommerceBIG Data & Hadoop Applications in E-Commerce
BIG Data & Hadoop Applications in E-Commerce
Skillspeed
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
POSSCON
 
The Big Picture on Big Data and Cognos
The Big Picture on Big Data and CognosThe Big Picture on Big Data and Cognos
The Big Picture on Big Data and Cognos
Senturus
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
Hortonworks
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - Jaspersoft
Hortonworks
 
Top Trends in Building Data Lakes for Machine Learning and AI
Top Trends in Building Data Lakes for Machine Learning and AI Top Trends in Building Data Lakes for Machine Learning and AI
Top Trends in Building Data Lakes for Machine Learning and AI
Holden Ackerman
 
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive ArchitectureHadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Skillspeed
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
FredReynolds2
 
Big Data & Open Source - Neil Jadhav
Big Data & Open Source - Neil JadhavBig Data & Open Source - Neil Jadhav
Big Data & Open Source - Neil Jadhav
Swapnil (Neil) Jadhav
 
BIG Data & Hadoop Applications in Retail
BIG Data & Hadoop Applications in RetailBIG Data & Hadoop Applications in Retail
BIG Data & Hadoop Applications in Retail
Skillspeed
 
Hadoop for Finance - sample chapter
Hadoop for Finance - sample chapterHadoop for Finance - sample chapter
Hadoop for Finance - sample chapter
Rajiv Tiwari
 
R and Big Data using Revolution R Enterprise with Hadoop
R and Big Data using Revolution R Enterprise with HadoopR and Big Data using Revolution R Enterprise with Hadoop
R and Big Data using Revolution R Enterprise with Hadoop
Revolution Analytics
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
jaxconf
 
Future of Enterprise PaaS (Cloud Foundry Summit 2014)
 Future of Enterprise PaaS (Cloud Foundry Summit 2014) Future of Enterprise PaaS (Cloud Foundry Summit 2014)
Future of Enterprise PaaS (Cloud Foundry Summit 2014)
VMware Tanzu
 
Top 10 renowned big data companies
Top 10 renowned big data companiesTop 10 renowned big data companies
Top 10 renowned big data companies
Robert Smith
 
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges" Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
Dataconomy Media
 
Future of Enterprise PaaS
Future of Enterprise PaaSFuture of Enterprise PaaS
Future of Enterprise PaaS
SAP Technology
 

Similar to BIG Data & Hadoop Applications in Social Media (20)

BIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in LogisticsBIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in Logistics
 
IBM Smarter Analytics
IBM Smarter AnalyticsIBM Smarter Analytics
IBM Smarter Analytics
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunities
 
BIG Data & Hadoop Applications in E-Commerce
BIG Data & Hadoop Applications in E-CommerceBIG Data & Hadoop Applications in E-Commerce
BIG Data & Hadoop Applications in E-Commerce
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
The Big Picture on Big Data and Cognos
The Big Picture on Big Data and CognosThe Big Picture on Big Data and Cognos
The Big Picture on Big Data and Cognos
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - Jaspersoft
 
Top Trends in Building Data Lakes for Machine Learning and AI
Top Trends in Building Data Lakes for Machine Learning and AI Top Trends in Building Data Lakes for Machine Learning and AI
Top Trends in Building Data Lakes for Machine Learning and AI
 
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive ArchitectureHadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
 
Big Data & Open Source - Neil Jadhav
Big Data & Open Source - Neil JadhavBig Data & Open Source - Neil Jadhav
Big Data & Open Source - Neil Jadhav
 
BIG Data & Hadoop Applications in Retail
BIG Data & Hadoop Applications in RetailBIG Data & Hadoop Applications in Retail
BIG Data & Hadoop Applications in Retail
 
Hadoop for Finance - sample chapter
Hadoop for Finance - sample chapterHadoop for Finance - sample chapter
Hadoop for Finance - sample chapter
 
R and Big Data using Revolution R Enterprise with Hadoop
R and Big Data using Revolution R Enterprise with HadoopR and Big Data using Revolution R Enterprise with Hadoop
R and Big Data using Revolution R Enterprise with Hadoop
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
 
Future of Enterprise PaaS (Cloud Foundry Summit 2014)
 Future of Enterprise PaaS (Cloud Foundry Summit 2014) Future of Enterprise PaaS (Cloud Foundry Summit 2014)
Future of Enterprise PaaS (Cloud Foundry Summit 2014)
 
Top 10 renowned big data companies
Top 10 renowned big data companiesTop 10 renowned big data companies
Top 10 renowned big data companies
 
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges" Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
 
Future of Enterprise PaaS
Future of Enterprise PaaSFuture of Enterprise PaaS
Future of Enterprise PaaS
 

More from Skillspeed

Run Your First Hadoop 2.x Program
Run Your First Hadoop 2.x ProgramRun Your First Hadoop 2.x Program
Run Your First Hadoop 2.x Program
Skillspeed
 
Sentiment Analysis via R Programming
Sentiment Analysis via R ProgrammingSentiment Analysis via R Programming
Sentiment Analysis via R Programming
Skillspeed
 
Predicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via HadoopPredicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via Hadoop
Skillspeed
 
Top 5 Tasks Of A Hadoop Developer Webinar
Top 5 Tasks Of A Hadoop Developer WebinarTop 5 Tasks Of A Hadoop Developer Webinar
Top 5 Tasks Of A Hadoop Developer Webinar
Skillspeed
 
Decoding Puppet & Jenkins via DevOps
Decoding Puppet & Jenkins via DevOpsDecoding Puppet & Jenkins via DevOps
Decoding Puppet & Jenkins via DevOps
Skillspeed
 
Skillspeed Affiliate Program
Skillspeed Affiliate ProgramSkillspeed Affiliate Program
Skillspeed Affiliate Program
Skillspeed
 
Python and BIG Data analytics | Python Fundamentals | Python Architecture
Python and BIG Data analytics | Python Fundamentals | Python ArchitecturePython and BIG Data analytics | Python Fundamentals | Python Architecture
Python and BIG Data analytics | Python Fundamentals | Python Architecture
Skillspeed
 
BIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in HealthcareBIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in Healthcare
Skillspeed
 
Hadoop for Business Intelligence Professionals
Hadoop for Business Intelligence ProfessionalsHadoop for Business Intelligence Professionals
Hadoop for Business Intelligence Professionals
Skillspeed
 
BIG Data & Hadoop Applications in Finance
BIG Data & Hadoop Applications in FinanceBIG Data & Hadoop Applications in Finance
BIG Data & Hadoop Applications in Finance
Skillspeed
 
Introduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
Introduction to MapReduce | MapReduce Architecture | MapReduce FundamentalsIntroduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
Introduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
Skillspeed
 
Introduction to Pig | Pig Architecture | Pig Fundamentals
Introduction to Pig | Pig Architecture | Pig FundamentalsIntroduction to Pig | Pig Architecture | Pig Fundamentals
Introduction to Pig | Pig Architecture | Pig Fundamentals
Skillspeed
 
HDFS & MapReduce
HDFS & MapReduceHDFS & MapReduce
HDFS & MapReduce
Skillspeed
 

More from Skillspeed (13)

Run Your First Hadoop 2.x Program
Run Your First Hadoop 2.x ProgramRun Your First Hadoop 2.x Program
Run Your First Hadoop 2.x Program
 
Sentiment Analysis via R Programming
Sentiment Analysis via R ProgrammingSentiment Analysis via R Programming
Sentiment Analysis via R Programming
 
Predicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via HadoopPredicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via Hadoop
 
Top 5 Tasks Of A Hadoop Developer Webinar
Top 5 Tasks Of A Hadoop Developer WebinarTop 5 Tasks Of A Hadoop Developer Webinar
Top 5 Tasks Of A Hadoop Developer Webinar
 
Decoding Puppet & Jenkins via DevOps
Decoding Puppet & Jenkins via DevOpsDecoding Puppet & Jenkins via DevOps
Decoding Puppet & Jenkins via DevOps
 
Skillspeed Affiliate Program
Skillspeed Affiliate ProgramSkillspeed Affiliate Program
Skillspeed Affiliate Program
 
Python and BIG Data analytics | Python Fundamentals | Python Architecture
Python and BIG Data analytics | Python Fundamentals | Python ArchitecturePython and BIG Data analytics | Python Fundamentals | Python Architecture
Python and BIG Data analytics | Python Fundamentals | Python Architecture
 
BIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in HealthcareBIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in Healthcare
 
Hadoop for Business Intelligence Professionals
Hadoop for Business Intelligence ProfessionalsHadoop for Business Intelligence Professionals
Hadoop for Business Intelligence Professionals
 
BIG Data & Hadoop Applications in Finance
BIG Data & Hadoop Applications in FinanceBIG Data & Hadoop Applications in Finance
BIG Data & Hadoop Applications in Finance
 
Introduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
Introduction to MapReduce | MapReduce Architecture | MapReduce FundamentalsIntroduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
Introduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
 
Introduction to Pig | Pig Architecture | Pig Fundamentals
Introduction to Pig | Pig Architecture | Pig FundamentalsIntroduction to Pig | Pig Architecture | Pig Fundamentals
Introduction to Pig | Pig Architecture | Pig Fundamentals
 
HDFS & MapReduce
HDFS & MapReduceHDFS & MapReduce
HDFS & MapReduce
 

Recently uploaded

Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 

Recently uploaded (20)

Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 

BIG Data & Hadoop Applications in Social Media

  • 1. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Applications of BIG Data in Social Media Networks
  • 2. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Introduction to Big Data Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications Systems / Enterprises generate huge amount of data from Terabytes to Petabytes Understanding Big Data can give you innovative solutions to various queries.
  • 3. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Hadoop in Social Media Networks Facebook Twitter LinkedIn Pinterest Instagram StumbleUpon
  • 4. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Facebook Facebook is one of Hadoop and Big Data's biggest champion and is rumored to operate the single largest HDFS cluster in the world comprising on 100 petabytes of disk space. The site stores more than 250 billion photos, with 350 million new ones uploaded every day. Hive, the data warehousing infrastructure Facebook helped develop, is central to meeting the company's reporting needs. Facebook uses Hive for faster querying on graph tools. Hadoop is used in every Facebook product and in a variety of ways. User actions such as a "like" or a status update are stored in a highly distributed, customized MySQL database but applications such as Facebook Messaging run on top of HBase, Hadoop's NoSQL database framework. All messages sent on desktop or mobile are stored in HBase. Additionally, the company uses Hadoop and Hive to generate reports for third-party developers and advertisers who need to track the success of their applications or campaigns. Get Started with BIG Data & Hadoop Applications of Hadoop in Social Media Networks
  • 5. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Get Started with BIG Data & Hadoop Applications of Hadoop in Social Media Networks Twitter Twitter uses Hadoop, especially Pig and Hbase for Data analysis. Twitter has large data storage and processing requirements, therefore it uses Hadoop to optimize data storage and workflow solutions. It uses Hadoop and Pig layers on top of LZO files which are compressed for quick analysis of data. Due to this, it has almost completely negated infrastructure failures.
  • 6. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Applications of Hadoop in Social Media Networks LinkedIn LinkedIn crunches more than 120 billion relationships per day and uses Hadoop capabilities to blend large scale data computation with high volume, low latency site serving. The ‘People You May Know’ feature is the result of scanning through billions of user triggers & activities via a pipeline of 82 MapReduce jobs each processing 16TB of data. This job uses a statistical model to predict the probability of two people knowing each other. It uses bloom filters to speed up large joins, yielding a 10x performance improvement. LinkedIn also builds an index structure in their Hadoop pipeline - this produces a multi-TB lookup structure that uses perfect hashing (requiring only 2.5 bits per key). This process results in faster server - cluster responses; it takes LinkedIn about 90 minutes to build a 900 GB data store on a 45 node development cluster. Get Started with BIG Data & Hadoop
  • 7. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Applications of Hadoop in Social Media Networks Pinterest Pinterest’s engineering team has created a self-serving platform — comprised of home grown, open source and commercial tools — that hooks in with Hadoop to orchestrate and process all of the company’s enormous amounts of data, consisting of 30 billion pins. They extensively utilize MapReduce for the processing of stored data which is synced with an AWS cloud, wherein all their data is currently stored. Instagram Instagram operates one of the world’s largest Hadoop clusters and very likely the world’s largest MySQL implementation. It has developed everything from a PHP-optimized platform to a NoSQL database to a tool for auto-provisioning and configuring tens of thousands of servers wherein all the images are stored & processed.. Get Started with BIG Data & Hadoop
  • 8. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Applications of Hadoop in Social Media Networks StumbleUpon StumbleUpon uses Hadoop to make the best recommendations. Every thumb-up, stumble and share (among other feedback from users) is stored so it can make better decisions about what pages to show you next. StumbleUpon uses Hbase extensively because of its cost-effectiveness, fast data retrieval capabilities and dependability. Get Started with BIG Data & Hadoop
  • 9. © 2015 BlueCamphor Technologies (P) Ltd. How Are Insights Derived from BIG Data? Say Hello to Hadoop Development :) Get Started with BIG Data & Hadoop
  • 10. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Introduction to Hadoop Apache Hadoop is a framework that allows the distributed processing of large data sets across clusters of commodity computers using a simple programming model. It is an Open-Source Data Management technology with extensive storage and distributed processing. Hadoop Characteristics Flexible Reliable Economical Scalable Get Started with BIG Data & Hadoop
  • 11. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Why Should You Learn Hadoop? • Hadoop is a modern BIG Data processing framework which utilizes distributed computing clusters. • It can process, manage and store unstructured data with absolute ease. • Due to its scalability & effectiveness, companies are heavily adopting Hadoop. • It has become the standard for storing, processing and analyzing big data. • BIG Data & Hadoop professionals are in extremely high demand. • This is the first step to becoming a data scientist or data architect. Get Started with BIG Data & Hadoop
  • 12. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Reports – 2015 Here are some of the other Reports that convey the same message: • “It is expected that Hadoop and Big Data Analytics will grow to around $13.9 billion from 2012 to 2017.” (Markets and Markets Research Report) • “The Data market currently with the fastest growth are Hadoop and NoSQL software and services.” (Technology Research Organization, Wikibon) • “The Big Data market will likely hit $23.8 billion with a 31.7% rise per year.” (IDC Report) • “Almost 90% organizations have embarked on Hadoop related projects and thus Hadoop skills are in huge demand.” (According to the Big Data Executive Survey 2013) • “Companies hiring Hadoop professionals Analytics professionals obtain a 250 percent hike in their salaries.” (According to Analytics Industry Report) Gartner had predicted, “Hadoop will be in most advanced analytics products by 2015.” By far and large, this prediction has turned out to be true. Hadoop is “The Technology” to harness Big Data Get Started with BIG Data & Hadoop
  • 13. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Predictions for 2015 According to the Forrester Report: • A new data economy will arise, Hadooponomics. Due to its vast data prowess, enterprise adoption of Hadoop will become compulsory. All those companies, which were unsure of adopting Hadoop till 2014, are bound to adopt it in 2015 • The surge in demand for Hadoop job professionals will be fulfilled by Java professionals & other software engineers who will upgrade their skills • The creation of SQL-on-Hadoop options will further enhance usability & adoption • Due to the introduction of Yarn (Version 2.0), the usage of Hadoop will go beyond analytics. It will transform into an application platform giving birth to new frameworks in the Hadoop ecosystem • New Hadoop sources will emerge from organizations like Oracle, HP, Tibco and SAP. Vendors such as Red Hat, VMware and Microsoft will include Hadoop in their operating Systems Get Started with BIG Data & Hadoop
  • 14. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Why SkillSpeed? Course Curriculum from Industry Experts Instructor Led Live Virtual Sessions Lifetime access to Course Content via LMS 100% Placement Assistance 24x7 Support Get Started with BIG Data & Hadoop
  • 15. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Course Topics Module 1 Introduction to Big Data and Hadoop Module 2 HDFS Internals, Hadoop Configurations and Data Loading Module 3 Introduction to Map Reduce Module 4 Advanced Map Reduce Concepts Module 5 Introduction to Pig Module 6 Advanced Pig and Introduction to Hive Module 7 Advanced Hive Concepts Module 8 Extending Hive and HBase Introduction Module 9 Advanced HBase and Oozie Introduction Module 10 Project Set-up Discussion Get Started with BIG Data & Hadoop
  • 16. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Corporate Partners Get Started with BIG Data & Hadoop
  • 17. Slide ‹#›© 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Lines open 24/7 To know more about the course, Please contact: IND +91-90660-20904 USA 1866-607-6547 (Toll Free) Or reach us at sales@skillspeed.com Contact Us

Editor's Notes

  1. SkillSpeed offer virtual instructor lead courses designed to bridge the time to competency gap experienced by the technology companies. USP of SkillSpeed is the subject matter expert (SME). SMEs are industry experts and has a good understanding and hands-on industry experience of the technology. This industry expert designs, develops, and delivers the course. SkillSpeed provides you: Course Curriculum from Industry Experts Instructor Led Live Virtual Sessions Real life industry case studies  - Live Virtual Interactions Interaction with industry experts  - Lifetime access to all course content via the LMS   - 24*7 support   - 100% placement assistance