SKILLWISE-BIG DATA
Big Data Analytics
What is the aim of the course
Focus is on “Systems” and applications for cloud-based
storage and processing of BIG DATA.
+Big Data - Definition
+Big Data - Analytics
+Big Data - Storage (HDFS)
+Big Data - Computing (Map/Reduce)
+Big Data - Database (HBase)
+Big Data - Graph DB (Titan)
+Big Data - Streaming (Storm)
• Get Convinced about “Big Data”
• Understand why we need a different paradigm.
• Ascertain with confidence the need to look at data computing in a different
way.
• Realize the potential of big data
– All of you are skilled enough to get into it.
• What we will not do
– Research why things have evolved into the current trends.
– We will try to be hands-on, but that is not guaranteed.
Introduction to Big Data
What are we going to understand?
• What is Big Data?
• Why did we land up here?
• To whom does it matter?
• Where is the money?
• Are we ready to handle it?
• What are the concerns?
• Tools and Technologies
– Is Big Data <=> Hadoop?
Simple to start
• What is the maximum file size you have dealt with so far?
– Movies/Files/Streaming video that you have used?
– What have you observed?
• What is the maximum download speed you get?
• Simple computation
– How much time does it take just to transfer it?
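The "simple computation" can be sketched as follows. This is a minimal illustration, not part of the original slides: the file sizes and the 100 Mbit/s link speed are assumed figures, and the function name `transfer_time_seconds` is hypothetical. It ignores protocol overhead, so real transfers would be slower.

```python
def transfer_time_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


# Decimal units; all figures below are illustrative assumptions.
GB = 10**9
TB = 10**12
MBIT_PER_S = 10**6

movie_seconds = transfer_time_seconds(4 * GB, 100 * MBIT_PER_S)
dataset_seconds = transfer_time_seconds(1 * TB, 100 * MBIT_PER_S)

print(f"4 GB at 100 Mbit/s: {movie_seconds / 60:.1f} minutes")
print(f"1 TB at 100 Mbit/s: {dataset_seconds / 3600:.1f} hours")
```

Under these assumptions a single movie takes minutes, while one terabyte takes most of a day just to move, before any processing starts.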
What is big data?
• “Every day, we create 2.5 quintillion bytes of data — so
much that 90% of the data in the world today has been
created in the last two years alone. This data comes
from everywhere: sensors used to gather climate
information, posts to social media sites, digital pictures
and videos, purchase transaction records, and cell
phone GPS signals to name a few.
This data is big data.”
Huge amount of data
• There are huge volumes of data in the world:
+ From the beginning of recorded time until 2003, we created 5 billion gigabytes
(5 exabytes) of data.
+ In 2011, the same amount was created every two days.
+ In 2013, the same amount of data was created every 10 minutes.
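To make these figures comparable, the arithmetic below converts each one into bytes per day. It is a sketch that takes the slide's numbers at face value and assumes decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes); the helper name `bytes_per_day` is hypothetical.

```python
GIGABYTE = 10**9
EXABYTE = 10**18
SECONDS_PER_DAY = 24 * 60 * 60

# "5 billion gigabytes" from the slide, expressed in bytes.
baseline_bytes = 5 * 10**9 * GIGABYTE


def bytes_per_day(total_bytes: float, period_seconds: float) -> float:
    """Average daily rate if total_bytes is produced every period_seconds."""
    return total_bytes * SECONDS_PER_DAY / period_seconds


rate_2011 = bytes_per_day(baseline_bytes, 2 * SECONDS_PER_DAY)  # every two days
rate_2013 = bytes_per_day(baseline_bytes, 10 * 60)              # every 10 minutes

print(f"Baseline: {baseline_bytes / EXABYTE:.1f} EB")
print(f"2011 rate: {rate_2011 / EXABYTE:.1f} EB per day")
print(f"2013 rate: {rate_2013 / EXABYTE:.1f} EB per day")
```

The 2011 rate works out to 2.5 exabytes per day, which matches the "2.5 quintillion bytes" per day quoted on the previous slide, since one quintillion is 10^18.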
Big data spans three dimensions: Volume, Velocity and Variety
• Volume: Enterprises are awash with ever-growing data of all types, easily amassing
terabytes, even petabytes, of information.
– Turn 12 terabytes of Tweets created each day into improved product sentiment
analysis
– Convert 350 billion annual meter readings to better predict power consumption
• Velocity: Sometimes 2 minutes is too late. For time-sensitive processes such as catching
fraud, big data must be used as it streams into your enterprise in order to maximize its
value.
– Scrutinize 5 million trade events created each day to identify potential fraud
– Analyze 500 million daily call detail records in real-time to predict customer churn
faster
– The latest I have heard is that even a 10-nanosecond delay is too much.
• Variety: Big data is any type of data - structured and unstructured data such as text,
sensor data, audio, video, click streams, log files and more. New insights are found
when analyzing these data types together.
– Monitor hundreds of live video feeds from surveillance cameras to target points of
interest
– Exploit the 80% data growth in images, video and documents to improve customer
satisfaction
Finally….
‘Big Data’ is similar to ‘small data’, but bigger
… but bigger data requires different
approaches:
techniques, tools, architecture
… with an aim to solve new problems,
or old problems in a better way
To whom does it matter?
• Research community
• Business community: new tools, new
capabilities, new infrastructure, new business
models, etc.
• Sectors: Financial Services…
What do the revenues look like?
The Social Layer in an Instrumented Interconnected World
• 2+ billion people on the Web by end 2011
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones worldwide
• 100s of millions of GPS-enabled devices sold annually
• 76 million smart meters in 2009… 200M by 2014
• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• ? TBs of data every day
What does Big Data trigger?
BIG DATA is not just HADOOP
• Manage & store huge volume of any data
– Hadoop File System, MapReduce
• Manage streaming data
– Stream Computing
• Analyze unstructured data
– Text Analytics Engine
• Structure and control data
– Data Warehousing
• Integrate and govern all data sources
– Integration, Data Quality, Security, Lifecycle Management, MDM
• Understand and navigate federated big data sources
– Federated Discovery and Navigation
Types of tools typically used in a Big Data scenario
• Where is the processing hosted?
– Distributed server/cloud
• Where is the data stored?
– Distributed storage (e.g., Amazon S3)
• What is the programming model?
– Distributed processing (MapReduce)
• How is the data stored and indexed?
– High-performance, schema-free database
• What operations are performed on the data?
– Analytic/semantic processing (e.g., RDF/OWL)
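The MapReduce programming model mentioned above can be illustrated with a word count. This is a single-process sketch in plain Python and not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


docs = ["big data is big", "data is everywhere"]
counts = word_count(docs)
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both phases can be spread over many nodes, which is what makes the model suit distributed storage and processing.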
When is dealing with Big Data hard?
• When the operations on data are complex:
– E.g., simple counting is not a complex problem.
– Modeling and reasoning with data of different kinds can get
extremely complex
• Good news with big-data:
– Often, because of the vast amount of data, modeling
techniques can get simpler (e.g., smart counting can
replace complex model-based analytics)…
– …as long as we deal with the scale.
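As one possible illustration of "smart counting", the sketch below guesses the next word in a sentence purely from counts of which word followed which in the data. The toy corpus and the function names are assumptions made for this example; the slides do not name a specific technique, and with only four sentences the result is just a demonstration of the idea.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional


def train_bigram_counts(sentences: Iterable[str]) -> Dict[str, Counter]:
    """Count, for every word, which words follow it and how often."""
    following: Dict[str, Counter] = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def most_likely_next(following: Dict[str, Counter], word: str) -> Optional[str]:
    """Return the most frequent follower of word, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Toy corpus, invented for illustration only.
corpus = [
    "big data needs new tools",
    "big data needs scale",
    "big data is everywhere",
    "small data needs care",
]
model = train_bigram_counts(corpus)
print(most_likely_next(model, "data"))
```

No grammar or language model is built here. The prediction comes only from frequencies, and it tends to improve as more data is counted, provided the counting itself can be scaled.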
Time for thinking
• What do you do with the data?
– Let's take an example:
• “From application developers to video streamers, organizations of all
sizes face the challenge of capturing, searching, analyzing, and
leveraging as much as terabytes of data per second—too much for the
constraints of traditional system capabilities and database
management tools.”
Why Big-Data?
• Key enablers for the appearance and growth
of ‘Big-Data’ are:
+Increase in storage capabilities
+Increase in processing power
+Availability of data