Java one14 handsonhadoop

The document introduces Daniel Templeton and Inyoung Cho, who host a hands-on Hadoop lab. They define big data as any data that is difficult to store in a traditional database because of its size, frequently changing schemas, or lack of structure. The lab gives an overview of the core Hadoop components: HDFS is a distributed file system that chunks files and replicates them across nodes; MapReduce provides parallel processing in two phases, map and reduce; Hive allows SQL queries on Hadoop data by translating them into MapReduce jobs; Impala improves on Hive by removing the MapReduce layer; and Pig provides a scripting language that is also translated into MapReduce jobs. The hands-on lab is self-paced and

1
Hands on Hadoop
Daniel Templeton & Inyoung Cho
Cloudera, Inc.

2
Your Hosts
Daniel Templeton
• Certification Developer
• Crusty, old HPC guy
• Likes Perl
Inyoung Cho
• Certification Developer
• Recovering Java Evangelist
• Invented JavaOne Hands-on Labs
©2014 Cloudera, Inc. 2 All rights reserved.

3
What is “Big Data”?
• Super-cool marketing buzzword
• “Come see our new line of BIG DATA toasters…”
• “The Five V’s”
• Any data that is difficult to store in a traditional RDBMS
• Too big, changes schemas too often, unstructured, …

What is Hadoop?

6
HDFS in a Nutshell
• Distributed “file system” service
• Highly scalable and fault resilient
• Chunks files into “blocks” that are replicated and distributed across the cluster
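To make the chunk-and-replicate idea concrete, here is a minimal plain-Java sketch. It does not use the HDFS API. The class name `HdfsBlockSketch`, the node names, and the round-robin placement are all assumptions made for illustration; real HDFS placement is rack-aware and more involved. The 128 MB block size and replication factor of 3 in the usage example are illustrative values only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of block splitting and replica placement (hypothetical, not the HDFS API). */
class HdfsBlockSketch {

    /** One block of a file plus the nodes assumed to hold a copy of it. */
    static final class Block {
        final int index;
        final long length;
        final List<String> replicaNodes;

        Block(int index, long length, List<String> replicaNodes) {
            this.index = index;
            this.length = length;
            this.replicaNodes = replicaNodes;
        }

        @Override
        public String toString() {
            return "block " + index + " (" + length + " bytes) on " + replicaNodes;
        }
    }

    /**
     * Splits a file of fileSize bytes into blocks of at most blockSize bytes and
     * assigns each block to `replication` distinct nodes in round-robin order.
     */
    static List<Block> placeBlocks(long fileSize, long blockSize, int replication, List<String> nodes) {
        if (blockSize <= 0 || replication <= 0 || replication > nodes.size()) {
            throw new IllegalArgumentException("invalid block size or replication factor");
        }
        List<Block> blocks = new ArrayList<>();
        int index = 0;
        for (long offset = 0; offset < fileSize; offset += blockSize, index++) {
            // The last block may be shorter than blockSize.
            long length = Math.min(blockSize, fileSize - offset);
            List<String> replicas = new ArrayList<>();
            for (int r = 0; r < replication; r++) {
                replicas.add(nodes.get((index + r) % nodes.size()));
            }
            blocks.add(new Block(index, length, replicas));
        }
        return blocks;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        List<String> nodes = Arrays.asList("node1", "node2", "node3", "node4");
        for (Block b : placeBlocks(300 * mb, 128 * mb, 3, nodes)) {
            System.out.println(b);
        }
    }
}
```

Because every block lives on several nodes, losing one node leaves at least one other copy of each block available, which is the sense in which the service is fault resilient.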

7
MapReduce in a Nutshell
• Embarrassingly parallel batch execution engine
• Two phases: map and reduce
• https://www.youtube.com/watch?v=bcjSe0xCHbE
• Tasks are scheduled to run where the data is
• Jobs are written against a Java API
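The two phases can be illustrated with a word count that runs entirely in memory. This is a sketch in plain Java rather than a job written against the Hadoop API. The class `MapReduceSketch` and its method names are hypothetical, and the grouping step in the middle stands in for the shuffle that the framework performs between the map and reduce phases.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory illustration of the map, shuffle, and reduce steps (hypothetical, not the Hadoop API). */
class MapReduceSketch {

    /** Map phase: emit a (word, 1) pair for every word in one input line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    /** Reduce phase: sum all counts that were emitted for one word. */
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    /** Runs map over every line, groups values by key (the shuffle), then reduces each group. */
    static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("the quick brown fox", "the lazy dog", "the fox")));
    }
}
```

Each call to `map` depends only on its own line and each call to `reduce` only on its own key, so both phases can be spread across many machines without coordination. That independence is what "embarrassingly parallel" refers to.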

8
Hive in a Nutshell
• SQL engine for Hadoop
• Translates HiveQL into MapReduce jobs
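As a rough picture of what translating a query into MapReduce means, the sketch below lines up a simple aggregation with a map step and a reduce step in plain Java. The query, the `employees` table, its columns, and the class `GroupBySketch` are hypothetical, and this is only a conceptual analogy, not how Hive actually plans or executes queries.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Conceptual view of a GROUP BY query as map plus reduce (hypothetical, not Hive's planner). */
class GroupBySketch {

    // Hypothetical query being illustrated:
    //   SELECT dept, COUNT(*) FROM employees GROUP BY dept;

    /** Each row is {name, dept}. Returns the number of rows per dept, sorted by dept. */
    static Map<String, Long> countByDept(List<String[]> rows) {
        return rows.stream()
                .map(row -> row[1])               // map: emit the grouping key for each row
                .collect(Collectors.groupingBy(   // shuffle: bring identical keys together
                        dept -> dept,
                        TreeMap::new,
                        Collectors.counting()));  // reduce: COUNT(*) for each group
    }

    public static void main(String[] args) {
        List<String[]> employees = Arrays.asList(
                new String[] {"ann", "eng"},
                new String[] {"bob", "ops"},
                new String[] {"cy", "eng"});
        System.out.println(countByDept(employees));
    }
}
```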

9
Impala in a Nutshell
• Hive without the MapReduce layer

10
Pig in a Nutshell
• Script-like language for data operations
• Translates into MapReduce jobs

11
The Lab
• Self-paced
• Should take about 2 hours
• “Additional Exercises” if you finish early
• Inyoung and I are here to answer questions
• Have fun!

12 ©2014 Cloudera, Inc. All rights reserved.
Aaron Myers &
Daniel Templeton
