Diman Maharjan
070bex412
Pulchowk Campus
BIG DATA
An OVERVIEW
What is Big Data?
• Data sets with sizes beyond the ability of
commonly used software tools to capture,
curate, manage, and process the data within a
tolerable elapsed time.
• Difficult to process using on-hand database
management tools or traditional data
processing applications.
Big Data
• Big data may well be the next big thing in the
IT world.
• It calls for new techniques, tools, and architectures.
Big Data today:
• Facebook’s systems process 2.5 billion pieces
of content and 500+ terabytes of data each
day. They pull in 2.7 billion Like actions and
300 million photos per day, and scan
roughly 105 terabytes of data each half hour.
• Walmart handles more than 1 million
customer transactions every hour.
Characteristics of Big Data
3 V’s
VOLUME VARIETY VELOCITY
Volume
• The size of data is increasing day by day,
minute by minute
• Big data deals with extremely large volumes of data
• Facebook’s systems process 2.5 billion pieces of
content and 500+ terabytes of data each day. They
pull in 2.7 billion Like actions and 300 million
photos per day, and scan roughly 105
terabytes of data each half hour.
• Walmart handles more than 1 million customer
transactions every hour.
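
To make these volumes more concrete, the quoted figures can be converted into average per-second rates. The sketch below is only a back-of-the-envelope calculation: it assumes decimal units (1 TB = 10^12 bytes) and a perfectly even load over time, neither of which is stated on the slide.

```python
# Back-of-the-envelope rates derived from the figures quoted on the slide.
# Assumption: decimal units (1 TB = 10**12 bytes) and an evenly spread load.
TB = 10**12
GB = 10**9

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_HALF_HOUR = 30 * 60
SECONDS_PER_HOUR = 60 * 60


def average_rate(total: float, seconds: float) -> float:
    """Average amount per second needed to handle `total` in `seconds`."""
    return total / seconds


ingest_rate = average_rate(500 * TB, SECONDS_PER_DAY)        # bytes per second
scan_rate = average_rate(105 * TB, SECONDS_PER_HALF_HOUR)    # bytes per second
transactions_per_second = average_rate(1_000_000, SECONDS_PER_HOUR)

print(f"Ingest:  {ingest_rate / GB:.2f} GB/s")                # Ingest:  5.79 GB/s
print(f"Scan:    {scan_rate / GB:.2f} GB/s")                  # Scan:    58.33 GB/s
print(f"Walmart: {transactions_per_second:.0f} transactions/s")  # Walmart: 278 transactions/s
```

Even as averages, these rates are well beyond what a single disk or a single database server typically sustains, which is why the workload is spread across many machines.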
Velocity
• High-frequency stock trading algorithms reflect
market changes within microseconds
• Machine-to-machine processes exchange data
between billions of devices
• Infrastructure and sensors generate massive log
data in real time that has to be transferred and
processed with minimal delay
• Online gaming systems support millions of
concurrent users, each producing multiple inputs
per second
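
One common way to reason about velocity is to measure how many events arrive within a recent time window. The following is a minimal sketch of a sliding-window counter; the class name `SlidingWindowCounter` and the timestamps are hypothetical and chosen only for illustration.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events whose timestamps fall within the last `window_seconds`.

    Assumption: events are recorded in non-decreasing timestamp order.
    """

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def _evict(self, now: float) -> None:
        # Drop events that are at least one full window older than `now`.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()

    def record(self, timestamp: float) -> None:
        self._timestamps.append(timestamp)
        self._evict(timestamp)

    def count(self, now: float) -> int:
        self._evict(now)
        return len(self._timestamps)


# Hypothetical stream of input events (timestamps in seconds).
counter = SlidingWindowCounter(window_seconds=1.0)
for t in [0.0, 0.2, 0.5, 1.1, 1.3]:
    counter.record(t)

events_in_last_second = counter.count(now=1.3)
print(events_in_last_second)  # 3
```

Because old timestamps are evicted as new ones arrive, memory use stays proportional to the number of events in the window rather than to the whole stream.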
Variety
• Today, data comes in various formats, types, and
structures.
• Text, numerical, images, 3D graphics, audio,
video, time series, sequences
• Structured, semi-structured, and unstructured
data
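
The difference between structured, semi-structured, and unstructured data shows up in how much work is needed before the data can be queried. The sketch below uses small, made-up samples (all field names and values are hypothetical) and only the Python standard library.

```python
import csv
import io
import json
from collections import Counter

# Structured: a fixed schema, every row has the same columns (CSV here).
structured = "user_id,amount\n1,9.99\n2,24.50\n"
rows = list(csv.DictReader(io.StringIO(structured)))
total_amount = sum(float(row["amount"]) for row in rows)

# Semi-structured: self-describing, but records may carry different fields.
semi_structured = (
    '[{"user": 1, "tags": ["sale"]},'
    ' {"user": 2, "device": {"os": "android"}}]'
)
records = json.loads(semi_structured)
all_keys = sorted({key for record in records for key in record})

# Unstructured: free text with no schema; structure has to be derived.
unstructured = "Great service, fast delivery. Delivery was fast!"
cleaned = "".join(ch if ch.isalnum() else " " for ch in unstructured.lower())
word_counts = Counter(cleaned.split())

print(round(total_amount, 2))
print(all_keys)
print(word_counts["fast"], word_counts["delivery"])
```

The structured sample can be aggregated directly, the semi-structured sample first needs its varying keys discovered, and the free text needs tokenizing before anything can be counted.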
Generation of Big data
• Sensor technologies and networks
• Scientific instruments
• Social media and networks
• Online marketing and banking
Big Data Analytics
• Examining large amounts of data
• Appropriate information
• Identification of hidden patterns, unknown
correlations
• Competitive advantage
• Better business decisions: strategic and
operational
• Effective marketing, customer satisfaction,
increased revenue
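
As a small illustration of what "identifying unknown correlations" can mean in practice, the sketch below computes the Pearson correlation coefficient between two series. The data (`ad_spend`, `revenue`) is invented for the example, and Pearson correlation is only one of many possible measures; it captures linear relationships only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical weekly figures, for illustration only.
ad_spend = [10, 20, 30, 40, 50]
revenue = [25, 41, 62, 79, 101]

r = pearson(ad_spend, revenue)
print(f"correlation between ad spend and revenue: {r:.3f}")
```

A value close to +1 or -1 suggests a strong linear relationship worth investigating further; it does not by itself show that one quantity causes the other.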
Distributed System
• A distributed system is a model in which
components located on networked computers
communicate and coordinate their actions by
passing messages. The components interact
with each other in order to achieve a common
goal.
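
The message-passing idea can be sketched on a single machine. In the example below, two threads stand in for components on separate networked computers, and in-memory queues stand in for the network; this is a simplification for illustration, not a real distributed deployment. The function names `worker` and `run` are hypothetical.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """A component that only interacts with others through messages."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel message: no more work
            break
        outbox.put(("squared", message, message * message))


def run(numbers):
    """Coordinator: sends work to the worker and collects the replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()

    for number in numbers:
        inbox.put(number)
    inbox.put(None)
    thread.join()  # after this, the worker has sent all of its replies

    replies = []
    while not outbox.empty():
        replies.append(outbox.get())
    return replies


replies = run([1, 2, 3])
print(replies)  # [('squared', 1, 1), ('squared', 2, 4), ('squared', 3, 9)]
```

The coordinator and the worker share no variables; everything they know about each other arrives as a message, which is the property that lets such components be moved onto separate machines.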
Role of Distributed system in Big Data
• Distributed computing and parallel processing techniques can make
a significant difference in the latency experienced by customers,
suppliers, and partners. Many big data applications are dependent
on low latency because of the big data requirements for speed and
the volume and variety of the data.
• Provides the capability to process and analyze huge amounts of
data in near real time.
• Helps to meet big data demands
• Big data systems take advantage of available hardware by automating
processes like load balancing and optimization across a huge cluster
of nodes.
• Analysts are able to use and process all the data rather than settling for
snapshots.
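
The pattern behind processing all the data in parallel is to split it into partitions, process each partition independently, and then combine the partial results. The sketch below shows that pattern with a word count; a thread pool on one machine stands in for a cluster of nodes, and the function names and sample text are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(partition):
    """Process one partition independently of all the others."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(partitions, max_workers=4):
    """Fan the partitions out to workers, then merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for partial in pool.map(count_words, partitions):
            total.update(partial)
    return total


# Hypothetical data, already split into partitions.
partitions = [
    ["big data", "data velocity"],
    ["data variety"],
]
totals = parallel_word_count(partitions)
print(totals["data"], totals["big"])  # 3 1
```

Because each partition is counted without reference to the others, adding more workers (or, in a real cluster, more nodes) lets the same code cover more data in roughly the same elapsed time, up to the cost of merging the partial results.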
Who are the data scientists?
• Data scientists are a new breed of analytical
data experts who have the technical skills to
solve complex problems – and the curiosity to
explore what problems need to be solved.
• Part mathematician, part computer scientist
and part trend-spotter.
• Because they straddle both the business and
IT worlds, they’re highly sought-after and
well-paid.
Data Scientist Skillsets
Skills required for a data scientist
• Curious and explorative mindset
• Ability to question existing practices and
devise alternatives
• Strong analytical skills
• Effective communication skills for diverse
audiences
• Business problem-solving skills
• Cross-functional team management skills
Role and job duties of a Data scientist
• Collecting large amounts of unruly data and transforming it into a
more usable format.
• Solving business-related problems using data-driven techniques.
• Working with a variety of programming languages, including SAS, R
and Python.
• Having a solid grasp of statistics, including statistical tests and
distributions.
• Staying on top of analytical techniques such as machine learning,
deep learning and text analytics.
• Communicating and collaborating with both IT and business.
• Looking for order and patterns in data, as well as spotting trends
that can help a business’s bottom line.
Thank you