Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
A National Workshop On
Big Data Analytics
Introduction & Logistics
1
Theme of this Course
Large-Scale Data
Management Big Data Analytics
Data Science and
Analytics
• How to manage very large ...
Introduction to Big Data
What is Big Data?
What makes data, “Big” Data?
3
Big Data Definition
• No single standard definition…
“Big Data” is data whose scale, diversity, and
complexity require new...
Characteristics of Big Data:
1-Scale (Volume)
• Data Volume
• 44x increase from 2009 2020
• From 0.8 zettabytes to 35zb
• ...
Characteristics of Big Data:
2-Complexity (Varity)
• Various formats, types, and
structures
• Text, numerical, images, aud...
Characteristics of Big Data:
3-Speed (Velocity)
• Data is begin generated fast and need to be processed
fast
• Online Data...
Big Data: 3V’s
8
Some Make it 4V’s
9
Harnessing Big Data
• OLTP: Online Transaction Processing (DBMSs)
• OLAP: Online Analytical Processing (Data Warehousing)
...
Who’s Generating Big Data
Social media and networks
(all of us are generating data)
Scientific instruments
(collecting all...
The Model Has
Changed…
• The Model of Generating/Consuming Data has Changed
Old Model: Few companies are generating data, ...
What’s driving Big Data
- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Smal...
Value of Big Data Analytics
• Big data is more real-time in
nature than traditional DW
applications
• Traditional DW archi...
Challenges in Handling Big
Data
• The Bottleneck is in technology
• New architecture, algorithms, techniques are needed
• ...
What Technology Do We Have
For Big Data ??
16
17
Big Data Technology
18
What You Will Learn…
• We focus on Hadoop/MapReduce technology
• Learn the platform (how it is designed and works)
• How b...
Course Logistics
20
Textbook & Reading List
• No specific textbook
• Big Data is a relatively new topic (so no fixed syllabus)
• Reading List
...
You’ve finished this document.
Download and read it offline.
Upcoming SlideShare
What is Big Data?
Next
Upcoming SlideShare
What is Big Data?
Next
Download to read offline and view in fullscreen.

Share

Presentation on Big Data Analytics

Download to read offline

Presentation on Big Data Analytics

Related Books

Free with a 30 day trial from Scribd

See all

Related Audiobooks

Free with a 30 day trial from Scribd

See all

Presentation on Big Data Analytics

  1. 1. A National Workshop On Big Data Analytics Introduction & Logistics 1
  2. 2. Theme of this Course Large-Scale Data Management Big Data Analytics Data Science and Analytics • How to manage very large amounts of data and extract value and knowledge from them 2
  3. 3. Introduction to Big Data What is Big Data? What makes data, “Big” Data? 3
  4. 4. Big Data Definition • No single standard definition… “Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it… 4
  5. 5. Characteristics of Big Data: 1-Scale (Volume) • Data Volume • 44x increase from 2009 2020 • From 0.8 zettabytes to 35zb • Data volume is increasing exponentially 5 Exponential increase in collected/generated data
  6. 6. Characteristics of Big Data: 2-Complexity (Varity) • Various formats, types, and structures • Text, numerical, images, audio, video, sequences, time series, social media data, multi-dim arrays, etc… • Static data vs. streaming data • A single application can be generating/collecting many types of data 6
  7. 7. Characteristics of Big Data: 3-Speed (Velocity) • Data is begin generated fast and need to be processed fast • Online Data Analytics • Late decisions  missing opportunities • Examples • E-Promotions: Based on your current location, your purchase history, what you like  send promotions right now for store next to you • Healthcare monitoring: sensors monitoring your activities and body  any abnormal measurements require immediate reaction 7
  8. 8. Big Data: 3V’s 8
  9. 9. Some Make it 4V’s 9
  10. 10. Harnessing Big Data • OLTP: Online Transaction Processing (DBMSs) • OLAP: Online Analytical Processing (Data Warehousing) • RTAP: Real-Time Analytics Processing (Big Data Architecture & technology) 10
  11. 11. Who’s Generating Big Data Social media and networks (all of us are generating data) Scientific instruments (collecting all sorts of data) Mobile devices (tracking all objects all the time) Sensor technology and networks (measuring all kinds of data) • The progress and innovation is no longer hindered by the ability to collect data • But, by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion 11
  12. 12. The Model Has Changed… • The Model of Generating/Consuming Data has Changed Old Model: Few companies are generating data, all others are consuming data New Model: all of us are generating data, and all of us are consuming data 12
  13. 13. What’s driving Big Data - Ad-hoc querying and reporting - Data mining techniques - Structured data, typical sources - Small to mid-size datasets - Optimizations and predictive analytics - Complex statistical analysis - All types of data, and many sources - Very large datasets - More of a real-time 13
  14. 14. Value of Big Data Analytics • Big data is more real-time in nature than traditional DW applications • Traditional DW architectures (e.g. Exadata, Teradata) are not well- suited for big data apps • Shared nothing, massively parallel processing, scale out architectures are well-suited for big data apps 14
  15. 15. Challenges in Handling Big Data • The Bottleneck is in technology • New architecture, algorithms, techniques are needed • Also in technical skills • Experts in using the new technology and dealing with big data 15
  16. 16. What Technology Do We Have For Big Data ?? 16
  17. 17. 17
  18. 18. Big Data Technology 18
  19. 19. What You Will Learn… • We focus on Hadoop/MapReduce technology • Learn the platform (how it is designed and works) • How big data are managed in a scalable, efficient way • Learn writing Hadoop jobs in different languages • Programming Languages: Java, C, Python • High-Level Languages: Apache Pig, Hive • Learn advanced analytics tools on top of Hadoop • RHadoop: Statistical tools for managing big data • Mahout: Data mining and machine learning tools over big data • Learn state-of-art technology from recent research papers • Optimizations, indexing techniques, and other extensions to Hadoop 19
  20. 20. Course Logistics 20
  21. 21. Textbook & Reading List • No specific textbook • Big Data is a relatively new topic (so no fixed syllabus) • Reading List • We will cover the state-of-art technology from research papers in big conferences • Many Hadoop-related papers are available on the course website • Related books: • Hadoop, The Definitive Guide [pdf] 21
  • AAanchalLaad

    Aug. 28, 2021
  • Charmi22

    Jun. 3, 2020
  • PreshCyber

    Aug. 19, 2019
  • JoanOlabisi1

    Aug. 15, 2019
  • VisvakS

    Jun. 12, 2018
  • NargisSultana14

    May. 14, 2018
  • SweetyS7

    Feb. 16, 2018
  • IsaraRampaigul

    Jan. 20, 2018
  • RashmiGB1

    Jan. 2, 2018
  • LijaJoy

    Nov. 13, 2017
  • ManuManu54

    Sep. 13, 2017
  • MohitMehta53

    Nov. 14, 2016
  • tamingchang

    Sep. 11, 2016
  • JosephPatterson2

    May. 3, 2016

Presentation on Big Data Analytics

Views

Total views

10,307

On Slideshare

0

From embeds

0

Number of embeds

11

Actions

Downloads

605

Shares

0

Comments

0

Likes

14

×