stef-bauer.com/2012/12/10/you-need-a-zetta-what
“Big Data”

Hadoop Introduction



     Stefan Bauer
A little about me…

   Data Warehouse Administrator
       Architect (logical/physical)
       DBA (monitoring, space management, etc)
       SSIS Developer (build it… run it… support it)
       SSAS/SSRS (performance tuning, supporting)
       Performance monitoring (is it all working?)
       I am a geek (Some people have pointed that out about me…
        judge for yourself)
What we will cover
   Why do you care (or at least why you should)?
   General overview
   Basic terms (get us on the same page)
   A Look at some of the technology (aka demo)




   All of the technical parts are in a multi-part
    series on my Blog
What kind of data do you sort
        through?
   Interesting technology… might not be for you
   Getting there… might be something interesting to start working out the details…
   You have big data… and you know it!
What is that Hadoop thing I
       keep hearing about?
   A Framework (collection of technologies)
   Complex processing
   Massively parallel
   Large amounts of data
   Commodity hardware
Hadoop… what it is not

   Ad hoc analytics
   Low latency between data arrival,
    analysis, and query usage
   “fast” (speed is a relative thing)
       Facebook has interactive queries on the Hadoop
        framework
   Good for small data
Terms
   Cloud
   Cluster
   Hadoop
   Hadoop Distributed File System (HDFS)
   Hue (Web Interface for MapReduce/Oozie)
   MapReduce
       Job Tracker
       Task Trackers (on Data Nodes)
   Oozie (Workflow Management)
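
To make the MapReduce term concrete, here is a minimal single-process sketch of the idea in Python. It is not Hadoop code and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration. On a real cluster the Job Tracker would hand the map and reduce work to Task Trackers on many Data Nodes, while here everything runs in one loop.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence (hypothetical driver)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data nodes"]))
```

The point of the split is that each map call only needs its own line and each reduce call only needs one key's values, which is what lets a framework spread the work across many machines.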
Terms
   Pig (Distributed Transformation Scripting)
   Beeswax (Wrapper for Hive)
   Hive
       EDW on 10s, 100s, or 1000s of servers
       HiveQL (Based on ANSI SQL)
       Reporting Tools/Business Analytics
   Name Node
       Data Nodes
   ZooKeeper (Distributed Configuration Management)
   Cloudera/MapR/Amazon/Hortonworks …
HDFS
Cloudera
Hive
Questions?

Stef-Bauer.com


@stefbauer


Stef_Bauer@hotmail.com
