WHAT IS BIG DATA?
BY AKHMAD ZAKI ALSAFI
(C-left) 2020
WHY BIG DATA
(FROM BUSINESS AND TECHNICAL PERSPECTIVE)
1. Cost Saving
• Maximizing Big Data Storage
• Using Cluster File System Mechanism
• By Optimising Hadoop
• Using Network Filesystem
2. Time Reductions
• Make Decisions Based on Real-Time Data
• Using Real-Time Streaming and Real-Time Visual Analytics Tools
• Like Kafka, Storm, Splunk, etc.
3. New Product Development
• Knowing What Customers Want from Their Digital Behavior
• Using Machine Learning Tools like Apache Spark, TensorFlow, Torch, GPU Computation, etc.
• Models like Clustering, Logistic Regression, Deep Learning
4. Understand Market Conditions
• What to promote? What strategy to maintain? Which customer loyalty program to give, and to whom?
• Models like Collaborative Filtering, etc.
5. Control Online Reputations
• Using Natural Language Processing to Do Sentiment Analysis and Make Decisions to Increase Customer Engagement.
• Tools like chatbots, etc.
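The sentiment-analysis idea in item 5 can be illustrated with a minimal sketch. It assumes a tiny hand-made word list (POSITIVE and NEGATIVE are hypothetical, not from the slides) and simply counts matching words; a production system would more likely use a trained model.

```python
# Hypothetical word lists, for illustration only.
POSITIVE = {"good", "great", "love", "fast", "helpful"}
NEGATIVE = {"bad", "slow", "hate", "broken", "rude"}


def sentiment_score(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(text: str) -> str:
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical customer comments.
reviews = [
    "Great product, fast delivery!",
    "Support was rude and the app is broken.",
    "It arrived on Tuesday.",
]
labels = [label(r) for r in reviews]
print(labels)
```

Labels like these could then feed a dashboard or a chatbot escalation rule, which is one way the "make decisions" part might be wired up.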
We Do It Statistically and
With Heavyweight Computation!
One More!
We Need: a BIG DATA CONVERGED
INFRASTRUCTURE Platform for
Unifying All Tools and Mechanisms.
And We Have to Present All Those
Tools in That Platform in an End-User-Friendly
Interface, Either as Part of the
Platform, if the Platform Has One, or by
Using Third-Party Products
JUST CHECK GARTNER FOR PRODUCTS
STRATEGIES FOR IMPLEMENTING BIG DATA
FOR END USER / CUSTOMER
1. Identify What You Want!
• Decide whether you want to:
1. increase the efficiency of customer reps
2. improve operational efficiency
3. increase revenues
4. provide better customer experience
5. improve marketing
2. Leverage Proven Big Data Strategies!
• Performance Management
1. Use business intelligence tools
2. Get historical data from databases and store it on Hadoop
3. Consists of grouping, aggregating, counting volumes, and other grouped summaries
• Data Exploration
1. Gather information about customers' behavior
2. Generate new products and revenue streams
• Social Analytics
1. Sentiment analysis
3. Identify Infrastructural Changes!
• Create Infrastructure that makes integration of data easy
4. Establish Talent Pool!
5. Obsess Over Customer Satisfaction!
6. Ensure Usability!
• The output of the Big Data process should be delivered in a format that all staff and
each department's person in charge can understand.
7. Be Agile! (Embrace all circumstances)
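The "Performance Management" step in strategy 2 (grouping, aggregating, counting) can be sketched in a few lines. The records and the summarize helper below are hypothetical; on a real cluster the same logic would more likely run as a query over data stored on Hadoop.

```python
from collections import defaultdict

# Hypothetical historical sales records: (region, amount).
records = [
    ("east", 120.0),
    ("west", 80.0),
    ("east", 30.0),
    ("north", 50.0),
    ("west", 20.0),
]


def summarize(rows):
    """Group rows by region and compute count, total, and mean per group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
        counts[region] += 1
    return {
        region: {
            "count": counts[region],
            "total": totals[region],
            "mean": totals[region] / counts[region],
        }
        for region in totals
    }


summary = summarize(records)
for region, stats in sorted(summary.items()):
    print(region, stats)
```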
WHAT IS BIG DATA
HISTORY OF BIG DATA
HOW IS BIG DATA STORED?
3 MAIN COMPONENTS
KEY TERMS OF BIG DATA
SOURCES OF BIG DATA AND THEIR FORMATS
HADOOP
A GLANCE AT THE LOCAL FILE SYSTEM
Format
Failure
Theft
Deletion
Space Management
HADOOP MECHANISMS
Hadoop FS management
MAP REDUCE
ALGORITHMS
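As a rough illustration of the MapReduce idea named above, here is a single-machine word count split into map, shuffle, and reduce steps. This is only a sketch: the function names are made up for the example, and a real Hadoop job would distribute each phase across the cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


# Hypothetical input lines standing in for blocks of a large file.
lines = ["big data is big", "data is stored in blocks"]
mapped = chain.from_iterable(map_phase(line) for line in lines)
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```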
NATURE OF BIG DATA
DISTRIBUTED DATA
FAULT TOLERANCE
DISTRIBUTED PROCESSING
EXAMPLE OF HADOOP BIG DATA STORAGE
APACHE HIVE
FEATURES OF HIVE
1. HDFS Storage
2. Designed for OLAP
3. SQL Interface
4. Fast, Scalable, Extensible
5. Supported by Big Data Execution Engines, like
Apache Spark, MapReduce, Apache Tez, etc.
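To give a feel for the "SQL Interface" and "Designed for OLAP" features, the sketch below runs an aggregate query of the kind one might write in HiveQL. It uses Python's built-in sqlite3 purely as a local stand-in; the page_views table and its columns are hypothetical, and Hive itself would run a similar statement over files in HDFS through one of the execution engines listed above.

```python
import sqlite3

# In-memory database used only as a local stand-in for a Hive table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (user_id TEXT, page TEXT, seconds INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        ("u1", "home", 10),
        ("u2", "home", 20),
        ("u1", "pricing", 45),
        ("u3", "pricing", 15),
    ],
)

# An OLAP-style aggregate: views and total time spent per page.
query = """
SELECT page, COUNT(*) AS views, SUM(seconds) AS total_seconds
FROM page_views
GROUP BY page
ORDER BY page
"""
rows = conn.execute(query).fetchall()
conn.close()
print(rows)
```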
ARCHITECTURE OF HIVE
ARCHITECTURE EXPLAINED
1. User Interface
2. Metastore (stores schemas, metadata, etc.)
3. HiveQL (SQL-like query language; consults the Metastore for schema information)
4. Big Data Execution Engine (distributed computing tools spread across the
Hadoop cluster)
5. HDFS/HBase, Hadoop's basic storage tools.
EXAMPLE OF HADOOP BIG DATA EXECUTION ENGINE
WHAT IS APACHE SPARK?
• Open Source Software
• Data Processing Engine
• Has tools for streaming, transforming, and preprocessing data
• In-Memory Processing Engine
• Supports programming in Java, Scala, Python, and R
APACHE SPARK ECOSYSTEM
BETTER THAN MAP REDUCE
• MapReduce is an I/O-heavy process that writes intermediate results to disk and reads them back between stages;
Spark is an in-memory processing engine that keeps intermediate data in memory where possible.
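The difference can be mimicked on one machine. In this sketch (plain Python, not Spark or Hadoop code; all names are illustrative), the first pipeline persists its intermediate result to a file between stages, while the second passes it along in memory. Both give the same answer; the disk round trip is the extra cost the slide refers to.

```python
import json
import os
import tempfile


def square(values):
    """Stage 1: transform each value."""
    return [v * v for v in values]


def total(values):
    """Stage 2: aggregate the transformed values."""
    return sum(values)


def disk_pipeline(data):
    """Write the stage 1 output to disk, then read it back for stage 2."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stage1.json")
        with open(path, "w") as f:
            json.dump(square(data), f)
        with open(path) as f:
            intermediate = json.load(f)
    return total(intermediate)


def memory_pipeline(data):
    """Keep the stage 1 output in memory and hand it straight to stage 2."""
    return total(square(data))


data = list(range(1, 6))
disk_result = disk_pipeline(data)
memory_result = memory_pipeline(data)
print(disk_result, memory_result)
```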
DISTRIBUTED COMPUTING USING SPARK
SPARK STREAMING
SPARK USE CASE (1) - STREAMING
SPARK USE CASE (2) - ETL
MACHINE LEARNING
WHY MACHINE LEARNING
• Find Patterns in Vast Amounts of Data. What Patterns? Behavior Patterns!
• Mine Data for hidden information and hidden patterns
• Mimic humans in the way they handle communication and semantics.
• Develop complex systems.
ML APPLICATIONS
MOTIVATING EXAMPLE
LEARNING TO FILTER SPAM
THE LEARNING PROCESS
LEARNING ALGORITHMS
MACHINE LEARNING EXAMPLE: EMAIL SPAM FILTER
DECISION TREE
DECISION TREE EXAMPLE
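To connect the decision tree idea with the spam filter example, here is a minimal sketch of a one-level decision tree (a "decision stump"). It assumes a tiny, made-up training set and picks the single word whose presence best separates spam from non-spam; a real tree would keep splitting and would usually come from a library.

```python
# Hypothetical labelled emails: (text, is_spam).
TRAIN = [
    ("win a free prize now", True),
    ("free offer click now", True),
    ("meeting agenda for monday", False),
    ("lunch on monday?", False),
    ("claim your free prize", True),
    ("project status report", False),
]


def best_split_word(examples):
    """Choose the word whose presence misclassifies the fewest examples."""
    vocab = {word for text, _ in examples for word in text.split()}

    def errors(word):
        # Rule under test: "email contains word" means spam.
        return sum((word in text.split()) != is_spam for text, is_spam in examples)

    # Sorting makes tie-breaking deterministic.
    return min(sorted(vocab), key=errors)


def classify(text, split_word):
    """Predict spam (True) if the chosen word appears in the email."""
    return split_word in text.split()


split_word = best_split_word(TRAIN)
print(split_word)
print(classify("free tickets inside", split_word))
print(classify("status meeting", split_word))
```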
THANK YOU
(TERIMA KASIH)
(C-Left) 2020
BY AKHMAD ZAKI ALSAFI
