What is Big Data?

WHAT IS BIG DATA?
BY AKHMAD ZAKI ALSAFI
(C-left) 2020

WHY BIG DATA
(FROM BUSINESS AND TECHNICAL PERSPECTIVE)
1. Cost Saving
• Maximizing Big Data Storage
• Using Cluster File System Mechanism
• By Optimising Hadoop
• Using Network Filesystem
2. Time Reductions
• Make Decisions Based on Real-Time Data
• Using Real-Time Streaming and Real-Time Visual Analytics Tools
• Like Kafka, Storm, Splunk, etc.
3. New Product Development
• Knowing What Customers Want from Their Digital Behavior
• Using Machine Learning Tools like Apache Spark, TensorFlow, Torch, GPU Computation, etc.
• Models like Clustering, Logistic Regression, Deep Learning
4. Understand Market Conditions
• What to promote? What strategy to maintain? Which customer loyalty program to give, and to whom?
• Models like Collaborative Filtering, etc.
5. Control Online Reputations
• Using Natural Language Processing to Do Sentiment Analysis and Make Decisions to Increase Customer Engagement.
• Tools like chatbots, etc.
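The collaborative filtering mentioned under item 4 can be sketched in a few lines. This is a minimal user-based version, assuming a small in-memory ratings table; the user names, items, ratings, and the `recommend` helper are all hypothetical and only illustrate the idea. At real scale a distributed library would do this work.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating}); illustrative data only.
ratings = {
    "ana": {"phone": 5, "case": 4, "charger": 1},
    "budi": {"phone": 4, "case": 5},
    "citra": {"charger": 5, "cable": 4},
    "dewi": {"phone": 5, "case": 4, "cable": 3},
}


def cosine(u, v):
    """Cosine similarity between two users over the items both have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by the similarity-weighted
    average rating given by the other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only suggest items the user has not rated yet
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:top_n]]


print(recommend("budi", ratings))
```

Here users who rated the same items similarly are treated as "neighbors", and their ratings decide what to promote to whom.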
We Do It Statistically and With Heavyweight Computation!!!
One More!
We Need: a BIG DATA CONVERGED INFRASTRUCTURE Platform for Unifying All Tools and Mechanisms.
And We Have to Present All Those Tools in That Platform in an End-User-Friendly Interface, Either as Part of the Platform if the Platform Has One, or by Using Third-Party Products.
JUST CHECK GARTNER FOR PRODUCTS

STRATEGIES FOR IMPLEMENTING BIG DATA
FOR END USER / CUSTOMER
1. Identify What You Want!
• Decide whether you want to:
1. increase the efficiency of customer reps
2. improve operational efficiency
3. increase revenues
4. provide better customer experience
5. improve marketing
2. Leverage a Proven Big Data Strategy!
• Performance Management
1. Use business intelligence tools
2. Get historical data from the database and store it on Hadoop
3. Analysis consists of grouping, aggregating, counting volumes, and other grouped summaries.
• Data Exploration
1. Gather information about customer behavior
2. Generate new products and revenue streams
• Social Analytics
1. Sentiment analysis
3. Identify Infrastructural Changes!
• Create Infrastructure that makes integration of data easy
4. Establish Talent Pool!
5. Obsess Over Customer Satisfaction!
6. Ensure Usability!
• The output of the Big Data process should be delivered in a format understandable by all staff and each department’s person in charge.
7. Be Agile! (Embrace all circumstances)
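The sentiment analysis listed under Social Analytics can be illustrated with a very small lexicon-based scorer. This is only a sketch: the word lists and the `sentiment` function are hypothetical, and a production system would more likely use a trained model or a curated lexicon.

```python
import re

# Hypothetical word lists, for illustration only.
POSITIVE = {"good", "great", "love", "fast", "helpful"}
NEGATIVE = {"bad", "slow", "hate", "broken", "rude"}


def sentiment(text):
    """Label text by counting positive words minus negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for comment in [
    "Great service, fast delivery!",
    "The app is slow and support was rude",
    "Order arrived on Tuesday",
]:
    print(comment, "->", sentiment(comment))
```

Counting labels per day or per product then gives a simple signal that staff outside the data team can read.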

SOURCE OF BIG DATA AND ITS FORMAT

A GLANCE AT THE LOCAL FILE SYSTEM
(Figure labels: Format, Failure, Theft, Deletion, Space Management)

HADOOP MECHANISMS
Hadoop FS management

EXAMPLE OF HADOOP BIG DATA STORAGE
APACHE HIVE

FEATURES OF HIVE
1. HDFS Storage
2. Designed for OLAP
3. SQL Interface
4. Fast, Scalable, Extensible
5. Supported by Big Data Execution Engines like Apache Spark, MapReduce, Apache Tez, etc.
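To give a feel for the OLAP-style SQL interface listed above, here is a small aggregate query. Python's built-in `sqlite3` module is used purely as a stand-in so the example runs without a cluster; it is not Hive. The `sales` table, its columns, and the sample rows are hypothetical. The query itself uses only standard `SELECT`, `GROUP BY`, and `ORDER BY`.

```python
import sqlite3

# Hypothetical sales records: (region, product, amount).
SAMPLE_ROWS = [
    ("west", "phone", 300.0),
    ("west", "case", 20.0),
    ("east", "phone", 280.0),
    ("east", "cable", 10.0),
    ("east", "case", 25.0),
]

QUERY = """
    SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM sales
    GROUP BY region
    ORDER BY revenue DESC
"""


def sales_by_region(rows):
    """Load rows into an in-memory table and aggregate them per region."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


print(sales_by_region(SAMPLE_ROWS))
```

In a Hive deployment the table would live on HDFS and the query would be handed to an execution engine, but the analyst still writes this kind of SQL.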

ARCHITECTURE EXPLAINED
1. User Interface
2. Metastore (Stores schema, metadata, etc.)
3. HiveQL (Querying Metastore for information)
4. Big Data Execution Engine (Distributed Computing Tools spread across the Hadoop Cluster)
5. HDFS/HBase, Hadoop's basic storage tools.

EXAMPLE OF HADOOP BIG DATA EXECUTION ENGINE

WHAT IS APACHE SPARK?
• Open Source Software
• Data Processing Engine
• Has Tools for streaming, transforming, and preprocessing data
• In-Memory Processing Engine
• Supports programming in Java, Scala, Python, and R

BETTER THAN MAP REDUCE
• MapReduce is an I/O-heavy process: it writes intermediate results to disk and reads them back between stages, whereas Spark is an in-memory processing engine.
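The map, shuffle, and reduce stages can be sketched with a word count in plain Python. This is a single-process illustration, not Hadoop or Spark code, and the function names are hypothetical. In Hadoop MapReduce the output of each stage would be written to disk, while here everything stays in memory.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three stages in order over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle(pairs))


print(word_count(["big data", "Big tools"]))
```

Each call to `map_phase` depends only on its own line, which is what allows a cluster to run the map stage on many machines at once.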

DISTRIBUTED COMPUTING USING SPARK

SPARK USE CASE (1) - STREAMING

WHY MACHINE LEARNING
• Find Patterns in Vast Amounts of Data. What Patterns? Behavior Patterns!
• Mine Data for hidden information and hidden patterns
• Mimic Humans in the way they handle communication and semantics.
• Develop complex systems.

MOTIVATING EXAMPLE: LEARNING TO FILTER SPAM

MACHINE LEARNING EXAMPLE: EMAIL SPAM FILTER
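One common way to learn a spam filter is a naive Bayes classifier over word counts. The sketch below is a minimal version with add-one smoothing, assuming a tiny hand-made training set. The slides do not say which algorithm they use, so this choice, the class name `NaiveBayesSpamFilter`, and the sample messages are all assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesSpamFilter:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Assumes at least one training message for each of "spam" and "ham".
    """

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def predict(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.LABELS:
            # Log prior: share of training messages carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Smoothed log likelihood of the word given the label.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training messages.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize claim now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch on monday", "ham")

print(spam_filter.predict("claim your free money"))
print(spam_filter.predict("agenda for lunch"))
```

The filter learns which words appear more often in spam than in normal mail, which is the "find hidden patterns" idea from the previous slide applied to email.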

THANK YOU
(TERIMA KASIH)
(C-Left) 2020
BY AKHMAD ZAKI ALSAFI
