SlideShare a Scribd company logo
Lightweight wrapper for
 Hive on Amazon EMR	
        @stanaka
n 

n          CTO
      n 
Hive on Amazon EMR	
n    Amazon Elastic MapReduce

n    Hive
      n    HiveQL
EMR                                @   	
n               Hadoop
      n    20
n 

      n 

n 

n          MapReduce
      n    Hadoop Streaming
      n    Perl Mapper, Reducer
EMR         	
n               !
n                   !


n 

      n    S3
      n                 ?


n 
1:            S3            	
n                       S3        	
n    log2s3.pl
      n                  S3
      n    1        1      (logrotate)
Hive                  	
n    Hive                SerDe

n 

      n 

      n    label:value


n 
      n    LogFormat  "time:%tthost:%htreq:%rtstatus:%>s  ...
      n    time:[25/Sep/2011:12:17:51  +0900]              host:11.22.33.44
2:                                  	
n    Hive
      n 




n    Wrapper
      n                Perl
      n       Net::Amazon::EMR
      n    ruby elastic-mapreduce


n                      Net::Amazon::EMR::Wrapper
n 

n 

      n 

      n    SerDe
n    HiveQL
n 

      n    Perl
n                   !!!
use  Net::Amazon::EMR::Wrapper;

my  $emr  =  Net::Amazon::EMR::Wrapper-­‐>new({
        name  =>  ’testcluster',
        start_cluster  =>  1,
        num_instances  =>  5,
        slave_instance_type  =>  'm1.small',
        master_instance_type  =>  'm1.small',
        alive  =>  0,
        log  =>  Log::Dispatch::Config-­‐>instance,
        jar  =>  's3://somelogs/_lib/hadoop/hive/serde/hatena-­‐
serde.jar',
});
$emr-­‐>create_table('diary');
$emr-­‐>add_partition('diary',
      [  DateTime-­‐>new(year  =>  2011,  month  =>  5,  day  =>  19),
          DateTime-­‐>new(year  =>  2011,  month  =>  5,  day  =>  20)  ],
);

$emr-­‐>do_select(
      "select  *  from  logs  limit  10",    #  HiveQL
      “line_sample”);                                  #   
$emr-­‐>do_select(
      "select  count(1)  from  logs",
      "page  view");
$emr-­‐>do_select(
      "select  count(distinct  referer)  from  logs",  
      "unique  referer");

$emr-­‐>get_results;    #
%  perl  bin/sample.pl
[Sun  Sep  25  11:30:57  2011]  [notice]  j-­‐1IT82USV5OM0J  is  initiating
[Sun  Sep  25  11:37:00  2011]  [notice]  testcluster-­‐20449-­‐1317004256  
(j-­‐1IT82USV5OM0J)  is  ready.
[Sun  Sep  25  11:37:49  2011]  [notice]  Step  'create_table'  is  
finished.  (2  steps  left)
[Sun  Sep  25  11:47:20  2011]  [notice]  Step  'page  view'  is  finished.  
(1  steps  left)
[Sun  Sep  25  11:56:54  2011]  [notice]  Step  'unique  referer'  is  
finished.  (0  steps  left)

result:
$VAR1  =  {
                    'unique  referer'  =>  ’24235538',
                    'page  view'  =>  ’3596154323',
                };
[Sun  Sep  25  11:56:57  2011]  [notice]  Finished.
Wrapper              #1	
n 

      n    Perl
      n 

      n    cron
      n    HiveQL
Wrapper           #2	
n 

      n 

            n           ..
            n 


      n    S3
      n 
Lightweight wrapper for Hive on Amazon EMR

More Related Content

What's hot

Introduction to reactive programming & ReactiveCocoa
Introduction to reactive programming & ReactiveCocoaIntroduction to reactive programming & ReactiveCocoa
Introduction to reactive programming & ReactiveCocoa
Florent Pillet
 
DBD::Gofer 200809
DBD::Gofer 200809DBD::Gofer 200809
DBD::Gofer 200809
Tim Bunce
 
TLS305 Using DynamoDB with the AWS SDK for PHP - AWS re: Invent 2012
TLS305 Using DynamoDB with the AWS SDK for PHP - AWS re: Invent 2012TLS305 Using DynamoDB with the AWS SDK for PHP - AWS re: Invent 2012
TLS305 Using DynamoDB with the AWS SDK for PHP - AWS re: Invent 2012
Amazon Web Services
 
Going crazy with Node.JS and CakePHP
Going crazy with Node.JS and CakePHPGoing crazy with Node.JS and CakePHP
Going crazy with Node.JS and CakePHP
Mariano Iglesias
 
Rhebok, High Performance Rack Handler / Rubykaigi 2015
Rhebok, High Performance Rack Handler / Rubykaigi 2015Rhebok, High Performance Rack Handler / Rubykaigi 2015
Rhebok, High Performance Rack Handler / Rubykaigi 2015
Masahiro Nagano
 
Ruby meets Go
Ruby meets GoRuby meets Go
Ruby on Rails Oracle adaptera izstrāde
Ruby on Rails Oracle adaptera izstrādeRuby on Rails Oracle adaptera izstrāde
Ruby on Rails Oracle adaptera izstrāde
Raimonds Simanovskis
 
Puppet Camp Chicago 2014: Smoothing Troubles With Custom Types and Providers ...
Puppet Camp Chicago 2014: Smoothing Troubles With Custom Types and Providers ...Puppet Camp Chicago 2014: Smoothing Troubles With Custom Types and Providers ...
Puppet Camp Chicago 2014: Smoothing Troubles With Custom Types and Providers ...
Puppet
 
AnyMQ, Hippie, and the real-time web
AnyMQ, Hippie, and the real-time webAnyMQ, Hippie, and the real-time web
AnyMQ, Hippie, and the real-time web
clkao
 
Streams are Awesome - (Node.js) TimesOpen Sep 2012
Streams are Awesome - (Node.js) TimesOpen Sep 2012 Streams are Awesome - (Node.js) TimesOpen Sep 2012
Streams are Awesome - (Node.js) TimesOpen Sep 2012
Tom Croucher
 
Node.js - A Quick Tour
Node.js - A Quick TourNode.js - A Quick Tour
Node.js - A Quick Tour
Felix Geisendörfer
 
Don't Be Afraid of Abstract Syntax Trees
Don't Be Afraid of Abstract Syntax TreesDon't Be Afraid of Abstract Syntax Trees
Don't Be Afraid of Abstract Syntax Trees
Jamund Ferguson
 
Perl: Hate it for the Right Reasons
Perl: Hate it for the Right ReasonsPerl: Hate it for the Right Reasons
Perl: Hate it for the Right Reasons
Matt Follett
 
Workshop Infrastructure as Code - Suestra
Workshop Infrastructure as Code - SuestraWorkshop Infrastructure as Code - Suestra
Workshop Infrastructure as Code - Suestra
Mario IC
 
JavaScript on the GPU
JavaScript on the GPUJavaScript on the GPU
JavaScript on the GPU
Jarred Nicholls
 
Introduction to Nodejs
Introduction to NodejsIntroduction to Nodejs
Introduction to Nodejs
Gabriele Lana
 
Creating Reusable Puppet Profiles
Creating Reusable Puppet ProfilesCreating Reusable Puppet Profiles
Creating Reusable Puppet Profiles
Bram Vogelaar
 
Nodejs - A quick tour (v6)
Nodejs - A quick tour (v6)Nodejs - A quick tour (v6)
Nodejs - A quick tour (v6)
Felix Geisendörfer
 
Replacing "exec" with a type and provider: Return manifests to a declarative ...
Replacing "exec" with a type and provider: Return manifests to a declarative ...Replacing "exec" with a type and provider: Return manifests to a declarative ...
Replacing "exec" with a type and provider: Return manifests to a declarative ...
Puppet
 
Nodejs Explained with Examples
Nodejs Explained with ExamplesNodejs Explained with Examples
Nodejs Explained with Examples
Gabriele Lana
 

What's hot (20)

Introduction to reactive programming & ReactiveCocoa
Introduction to reactive programming & ReactiveCocoaIntroduction to reactive programming & ReactiveCocoa
Introduction to reactive programming & ReactiveCocoa
 
DBD::Gofer 200809
DBD::Gofer 200809DBD::Gofer 200809
DBD::Gofer 200809
 
TLS305 Using DynamoDB with the AWS SDK for PHP - AWS re: Invent 2012
TLS305 Using DynamoDB with the AWS SDK for PHP - AWS re: Invent 2012TLS305 Using DynamoDB with the AWS SDK for PHP - AWS re: Invent 2012
TLS305 Using DynamoDB with the AWS SDK for PHP - AWS re: Invent 2012
 
Going crazy with Node.JS and CakePHP
Going crazy with Node.JS and CakePHPGoing crazy with Node.JS and CakePHP
Going crazy with Node.JS and CakePHP
 
Rhebok, High Performance Rack Handler / Rubykaigi 2015
Rhebok, High Performance Rack Handler / Rubykaigi 2015Rhebok, High Performance Rack Handler / Rubykaigi 2015
Rhebok, High Performance Rack Handler / Rubykaigi 2015
 
Ruby meets Go
Ruby meets GoRuby meets Go
Ruby meets Go
 
Ruby on Rails Oracle adaptera izstrāde
Ruby on Rails Oracle adaptera izstrādeRuby on Rails Oracle adaptera izstrāde
Ruby on Rails Oracle adaptera izstrāde
 
Puppet Camp Chicago 2014: Smoothing Troubles With Custom Types and Providers ...
Puppet Camp Chicago 2014: Smoothing Troubles With Custom Types and Providers ...Puppet Camp Chicago 2014: Smoothing Troubles With Custom Types and Providers ...
Puppet Camp Chicago 2014: Smoothing Troubles With Custom Types and Providers ...
 
AnyMQ, Hippie, and the real-time web
AnyMQ, Hippie, and the real-time webAnyMQ, Hippie, and the real-time web
AnyMQ, Hippie, and the real-time web
 
Streams are Awesome - (Node.js) TimesOpen Sep 2012
Streams are Awesome - (Node.js) TimesOpen Sep 2012 Streams are Awesome - (Node.js) TimesOpen Sep 2012
Streams are Awesome - (Node.js) TimesOpen Sep 2012
 
Node.js - A Quick Tour
Node.js - A Quick TourNode.js - A Quick Tour
Node.js - A Quick Tour
 
Don't Be Afraid of Abstract Syntax Trees
Don't Be Afraid of Abstract Syntax TreesDon't Be Afraid of Abstract Syntax Trees
Don't Be Afraid of Abstract Syntax Trees
 
Perl: Hate it for the Right Reasons
Perl: Hate it for the Right ReasonsPerl: Hate it for the Right Reasons
Perl: Hate it for the Right Reasons
 
Workshop Infrastructure as Code - Suestra
Workshop Infrastructure as Code - SuestraWorkshop Infrastructure as Code - Suestra
Workshop Infrastructure as Code - Suestra
 
JavaScript on the GPU
JavaScript on the GPUJavaScript on the GPU
JavaScript on the GPU
 
Introduction to Nodejs
Introduction to NodejsIntroduction to Nodejs
Introduction to Nodejs
 
Creating Reusable Puppet Profiles
Creating Reusable Puppet ProfilesCreating Reusable Puppet Profiles
Creating Reusable Puppet Profiles
 
Nodejs - A quick tour (v6)
Nodejs - A quick tour (v6)Nodejs - A quick tour (v6)
Nodejs - A quick tour (v6)
 
Replacing "exec" with a type and provider: Return manifests to a declarative ...
Replacing "exec" with a type and provider: Return manifests to a declarative ...Replacing "exec" with a type and provider: Return manifests to a declarative ...
Replacing "exec" with a type and provider: Return manifests to a declarative ...
 
Nodejs Explained with Examples
Nodejs Explained with ExamplesNodejs Explained with Examples
Nodejs Explained with Examples
 

Viewers also liked

Securing big data (july 2012)
Securing big data (july 2012)Securing big data (july 2012)
Securing big data (july 2012)
Marc Vael
 
4 cebrations
4 cebrations4 cebrations
4 cebrations
Sergey70
 
Abr
AbrAbr
5 at the market.урок
5 at the market.урок5 at the market.урок
5 at the market.урок
Sergey70
 
Gato vs extraterestre presentacion numero2
Gato vs extraterestre presentacion numero2Gato vs extraterestre presentacion numero2
Gato vs extraterestre presentacion numero2
Fernando Espinel
 
Atelier Delplanque Fouré JFK2009
Atelier Delplanque Fouré JFK2009Atelier Delplanque Fouré JFK2009
Atelier Delplanque Fouré JFK2009
Pierre Trudelle
 
W.JONES Resume
W.JONES ResumeW.JONES Resume
W.JONES Resume
Willie Jones
 
Academic social club
Academic social clubAcademic social club
Academic social club
scharrlibrary
 
letter
letterletter
Handle list
Handle listHandle list
Handle list
Robin Luo
 
87 158-1-sm
87 158-1-sm87 158-1-sm
87 158-1-sm
Felipe Morgantini
 
Distribuciones de asientos
Distribuciones de asientosDistribuciones de asientos
Core Story - Markenbildung mittels Storytelling
Core Story - Markenbildung mittels StorytellingCore Story - Markenbildung mittels Storytelling
Core Story - Markenbildung mittels Storytelling
Pia Kleine Wieskamp
 
Poster moodboard
Poster moodboardPoster moodboard
Poster moodboard
afkbbs_
 
אלבום תמונות סדנא לבניית דידג'רידו 30.11.12
אלבום תמונות סדנא לבניית דידג'רידו 30.11.12אלבום תמונות סדנא לבניית דידג'רידו 30.11.12
אלבום תמונות סדנא לבניית דידג'רידו 30.11.12
gonen bar lev
 

Viewers also liked (16)

Securing big data (july 2012)
Securing big data (july 2012)Securing big data (july 2012)
Securing big data (july 2012)
 
4 cebrations
4 cebrations4 cebrations
4 cebrations
 
Abr
AbrAbr
Abr
 
5 at the market.урок
5 at the market.урок5 at the market.урок
5 at the market.урок
 
ASU Diploma
ASU DiplomaASU Diploma
ASU Diploma
 
Gato vs extraterestre presentacion numero2
Gato vs extraterestre presentacion numero2Gato vs extraterestre presentacion numero2
Gato vs extraterestre presentacion numero2
 
Atelier Delplanque Fouré JFK2009
Atelier Delplanque Fouré JFK2009Atelier Delplanque Fouré JFK2009
Atelier Delplanque Fouré JFK2009
 
W.JONES Resume
W.JONES ResumeW.JONES Resume
W.JONES Resume
 
Academic social club
Academic social clubAcademic social club
Academic social club
 
letter
letterletter
letter
 
Handle list
Handle listHandle list
Handle list
 
87 158-1-sm
87 158-1-sm87 158-1-sm
87 158-1-sm
 
Distribuciones de asientos
Distribuciones de asientosDistribuciones de asientos
Distribuciones de asientos
 
Core Story - Markenbildung mittels Storytelling
Core Story - Markenbildung mittels StorytellingCore Story - Markenbildung mittels Storytelling
Core Story - Markenbildung mittels Storytelling
 
Poster moodboard
Poster moodboardPoster moodboard
Poster moodboard
 
אלבום תמונות סדנא לבניית דידג'רידו 30.11.12
אלבום תמונות סדנא לבניית דידג'רידו 30.11.12אלבום תמונות סדנא לבניית דידג'רידו 30.11.12
אלבום תמונות סדנא לבניית דידג'רידו 30.11.12
 

Similar to Lightweight wrapper for Hive on Amazon EMR

Background Jobs - Com BackgrounDRb
Background Jobs - Com BackgrounDRbBackground Jobs - Com BackgrounDRb
Background Jobs - Com BackgrounDRb
Juan Maiz
 
Introduction to cloudforecast
Introduction to cloudforecastIntroduction to cloudforecast
Introduction to cloudforecast
Masahiro Nagano
 
Perl on Amazon Elastic MapReduce
Perl on Amazon Elastic MapReducePerl on Amazon Elastic MapReduce
Perl on Amazon Elastic MapReduce
Pedro Figueiredo
 
CLI Wizardry - A Friendly Intro To sed/awk/grep
CLI Wizardry - A Friendly Intro To sed/awk/grepCLI Wizardry - A Friendly Intro To sed/awk/grep
CLI Wizardry - A Friendly Intro To sed/awk/grep
All Things Open
 
Toolbox of a Ruby Team
Toolbox of a Ruby TeamToolbox of a Ruby Team
Toolbox of a Ruby Team
Arto Artnik
 
Profiling Ruby
Profiling RubyProfiling Ruby
Profiling Ruby
Ian Pointer
 
Hiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret Sauce
Jesse Vincent
 
fast prototyping with sinatra sequel w2tags
fast prototyping with sinatra sequel w2tagsfast prototyping with sinatra sequel w2tags
fast prototyping with sinatra sequel w2tags
widi harsojo
 
Integrating Flex And Rails With Ruby Amf
Integrating Flex And Rails With Ruby AmfIntegrating Flex And Rails With Ruby Amf
Integrating Flex And Rails With Ruby Amf
railsconf
 
Flex With Rubyamf
Flex With RubyamfFlex With Rubyamf
Flex With Rubyamf
Tony Hillerson
 
MLflow with R
MLflow with RMLflow with R
MLflow with R
Databricks
 
Hacking parse.y (RubyConf 2009)
Hacking parse.y (RubyConf 2009)Hacking parse.y (RubyConf 2009)
Hacking parse.y (RubyConf 2009)
ujihisa
 
Under the Hood with Docker Swarm Mode - Drew Erny and Nishant Totla, Docker
Under the Hood with Docker Swarm Mode - Drew Erny and Nishant Totla, DockerUnder the Hood with Docker Swarm Mode - Drew Erny and Nishant Totla, Docker
Under the Hood with Docker Swarm Mode - Drew Erny and Nishant Totla, Docker
Docker, Inc.
 
Wider than rails
Wider than railsWider than rails
Wider than rails
Alexey Nayden
 
Exactly once with spark streaming
Exactly once with spark streamingExactly once with spark streaming
Exactly once with spark streaming
Quentin Ambard
 
Serializing Ruby Objects in Redis
Serializing Ruby Objects in RedisSerializing Ruby Objects in Redis
Serializing Ruby Objects in Redis
Brian Kaney
 
React Development with the MERN Stack
React Development with the MERN StackReact Development with the MERN Stack
React Development with the MERN Stack
Troy Miles
 
Introduction to Rails - presented by Arman Ortega
Introduction to Rails - presented by Arman OrtegaIntroduction to Rails - presented by Arman Ortega
Introduction to Rails - presented by Arman Ortega
arman o
 
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceBeijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Jesse Vincent
 
OSC2007-niigata - mashup
OSC2007-niigata - mashupOSC2007-niigata - mashup
OSC2007-niigata - mashup
Yuichiro MASUI
 

Similar to Lightweight wrapper for Hive on Amazon EMR (20)

Background Jobs - Com BackgrounDRb
Background Jobs - Com BackgrounDRbBackground Jobs - Com BackgrounDRb
Background Jobs - Com BackgrounDRb
 
Introduction to cloudforecast
Introduction to cloudforecastIntroduction to cloudforecast
Introduction to cloudforecast
 
Perl on Amazon Elastic MapReduce
Perl on Amazon Elastic MapReducePerl on Amazon Elastic MapReduce
Perl on Amazon Elastic MapReduce
 
CLI Wizardry - A Friendly Intro To sed/awk/grep
CLI Wizardry - A Friendly Intro To sed/awk/grepCLI Wizardry - A Friendly Intro To sed/awk/grep
CLI Wizardry - A Friendly Intro To sed/awk/grep
 
Toolbox of a Ruby Team
Toolbox of a Ruby TeamToolbox of a Ruby Team
Toolbox of a Ruby Team
 
Profiling Ruby
Profiling RubyProfiling Ruby
Profiling Ruby
 
Hiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret Sauce
 
fast prototyping with sinatra sequel w2tags
fast prototyping with sinatra sequel w2tagsfast prototyping with sinatra sequel w2tags
fast prototyping with sinatra sequel w2tags
 
Integrating Flex And Rails With Ruby Amf
Integrating Flex And Rails With Ruby AmfIntegrating Flex And Rails With Ruby Amf
Integrating Flex And Rails With Ruby Amf
 
Flex With Rubyamf
Flex With RubyamfFlex With Rubyamf
Flex With Rubyamf
 
MLflow with R
MLflow with RMLflow with R
MLflow with R
 
Hacking parse.y (RubyConf 2009)
Hacking parse.y (RubyConf 2009)Hacking parse.y (RubyConf 2009)
Hacking parse.y (RubyConf 2009)
 
Under the Hood with Docker Swarm Mode - Drew Erny and Nishant Totla, Docker
Under the Hood with Docker Swarm Mode - Drew Erny and Nishant Totla, DockerUnder the Hood with Docker Swarm Mode - Drew Erny and Nishant Totla, Docker
Under the Hood with Docker Swarm Mode - Drew Erny and Nishant Totla, Docker
 
Wider than rails
Wider than railsWider than rails
Wider than rails
 
Exactly once with spark streaming
Exactly once with spark streamingExactly once with spark streaming
Exactly once with spark streaming
 
Serializing Ruby Objects in Redis
Serializing Ruby Objects in RedisSerializing Ruby Objects in Redis
Serializing Ruby Objects in Redis
 
React Development with the MERN Stack
React Development with the MERN StackReact Development with the MERN Stack
React Development with the MERN Stack
 
Introduction to Rails - presented by Arman Ortega
Introduction to Rails - presented by Arman OrtegaIntroduction to Rails - presented by Arman Ortega
Introduction to Rails - presented by Arman Ortega
 
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceBeijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
 
OSC2007-niigata - mashup
OSC2007-niigata - mashupOSC2007-niigata - mashup
OSC2007-niigata - mashup
 

More from Shinji Tanaka

Mackerelによる
簡単サーバー管理入門と発展形
Mackerelによる
簡単サーバー管理入門と発展形Mackerelによる
簡単サーバー管理入門と発展形
Mackerelによる
簡単サーバー管理入門と発展形
Shinji Tanaka
 
Using Windows Azure
Using Windows AzureUsing Windows Azure
Using Windows Azure
Shinji Tanaka
 
MySQL Multi-master on EC2
MySQL Multi-master on EC2MySQL Multi-master on EC2
MySQL Multi-master on EC2
Shinji Tanaka
 
Original Server Conference
Original Server ConferenceOriginal Server Conference
Original Server Conference
Shinji Tanaka
 
Performance and Scalability of Web Service
Performance and Scalability of Web ServicePerformance and Scalability of Web Service
Performance and Scalability of Web Service
Shinji Tanaka
 
How to use Virtualization Technology in Hatena
How to use Virtualization Technology in HatenaHow to use Virtualization Technology in Hatena
How to use Virtualization Technology in Hatena
Shinji Tanaka
 
Hatena's Infrastructure from the beginning
Hatena's Infrastructure from the beginningHatena's Infrastructure from the beginning
Hatena's Infrastructure from the beginning
Shinji Tanaka
 

More from Shinji Tanaka (8)

Mackerelによる
簡単サーバー管理入門と発展形
Mackerelによる
簡単サーバー管理入門と発展形Mackerelによる
簡単サーバー管理入門と発展形
Mackerelによる
簡単サーバー管理入門と発展形
 
Using Windows Azure
Using Windows AzureUsing Windows Azure
Using Windows Azure
 
MySQL Multi-master on EC2
MySQL Multi-master on EC2MySQL Multi-master on EC2
MySQL Multi-master on EC2
 
Original Server Conference
Original Server ConferenceOriginal Server Conference
Original Server Conference
 
Scala on Hadoop
Scala on HadoopScala on Hadoop
Scala on Hadoop
 
Performance and Scalability of Web Service
Performance and Scalability of Web ServicePerformance and Scalability of Web Service
Performance and Scalability of Web Service
 
How to use Virtualization Technology in Hatena
How to use Virtualization Technology in HatenaHow to use Virtualization Technology in Hatena
How to use Virtualization Technology in Hatena
 
Hatena's Infrastructure from the beginning
Hatena's Infrastructure from the beginningHatena's Infrastructure from the beginning
Hatena's Infrastructure from the beginning
 

Recently uploaded

Password Rotation in 2024 is still Relevant
Password Rotation in 2024 is still RelevantPassword Rotation in 2024 is still Relevant
Password Rotation in 2024 is still Relevant
Bert Blevins
 
Comparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdfComparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdf
Andrey Yasko
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
Priyanka Aash
 
Salesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot WorkshopSalesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot Workshop
CEPTES Software Inc
 
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptxDublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Kunal Gupta
 
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
Edge AI and Vision Alliance
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
Adam Dunkels
 
The Role of Technology in Payroll Statutory Compliance (1).pdf
The Role of Technology in Payroll Statutory Compliance (1).pdfThe Role of Technology in Payroll Statutory Compliance (1).pdf
The Role of Technology in Payroll Statutory Compliance (1).pdf
paysquare consultancy
 
Pigging Unit Lubricant Oil Blending Plant
Pigging Unit Lubricant Oil Blending PlantPigging Unit Lubricant Oil Blending Plant
Pigging Unit Lubricant Oil Blending Plant
LINUS PROJECTS (INDIA)
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
SynapseIndia
 
The Role of IoT in Australian Mobile App Development - PDF Guide
The Role of IoT in Australian Mobile App Development - PDF GuideThe Role of IoT in Australian Mobile App Development - PDF Guide
The Role of IoT in Australian Mobile App Development - PDF Guide
Shiv Technolabs
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
SynapseIndia
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
Tatiana Al-Chueyr
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Networks
 
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Torry Harris
 
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and OllamaTirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
Zilliz
 
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Bert Blevins
 
DealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 editionDealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 edition
Yevgen Sysoyev
 
WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
Lidia A.
 
How Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdfHow Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdf
HackersList
 

Recently uploaded (20)

Password Rotation in 2024 is still Relevant
Password Rotation in 2024 is still RelevantPassword Rotation in 2024 is still Relevant
Password Rotation in 2024 is still Relevant
 
Comparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdfComparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdf
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
 
Salesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot WorkshopSalesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot Workshop
 
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptxDublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
 
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
 
The Role of Technology in Payroll Statutory Compliance (1).pdf
The Role of Technology in Payroll Statutory Compliance (1).pdfThe Role of Technology in Payroll Statutory Compliance (1).pdf
The Role of Technology in Payroll Statutory Compliance (1).pdf
 
Pigging Unit Lubricant Oil Blending Plant
Pigging Unit Lubricant Oil Blending PlantPigging Unit Lubricant Oil Blending Plant
Pigging Unit Lubricant Oil Blending Plant
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
 
The Role of IoT in Australian Mobile App Development - PDF Guide
The Role of IoT in Australian Mobile App Development - PDF GuideThe Role of IoT in Australian Mobile App Development - PDF Guide
The Role of IoT in Australian Mobile App Development - PDF Guide
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
 
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
 
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and OllamaTirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
 
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
 
DealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 editionDealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 edition
 
WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
 
How Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdfHow Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdf
 

Lightweight wrapper for Hive on Amazon EMR

  • 1. Lightweight wrapper for Hive on Amazon EMR @stanaka
  • 2. n  n  CTO n 
  • 3. Hive on Amazon EMR n  Amazon Elastic MapReduce n  Hive n  HiveQL
  • 4. EMR @ n  Hadoop n  20 n  n  n  n  MapReduce n  Hadoop Streaming n  Perl Mapper, Reducer
  • 5. EMR n  ! n  ! n  n  S3 n  ? n 
  • 6. 1: S3 n  S3 n  log2s3.pl n  S3 n  1 1 (logrotate)
  • 7. Hive n  Hive SerDe n  n  n  label:value n  n  LogFormat  "time:%tthost:%htreq:%rtstatus:%>s  ... n  time:[25/Sep/2011:12:17:51  +0900]              host:11.22.33.44
  • 8. 2: n  Hive n  n  Wrapper n  Perl n  Net::Amazon::EMR n  ruby elastic-mapreduce n  Net::Amazon::EMR::Wrapper
  • 9. n  n  n  n  SerDe n  HiveQL n  n  Perl n  !!!
  • 10. use  Net::Amazon::EMR::Wrapper; my  $emr  =  Net::Amazon::EMR::Wrapper-­‐>new({        name  =>  ’testcluster',        start_cluster  =>  1,        num_instances  =>  5,        slave_instance_type  =>  'm1.small',        master_instance_type  =>  'm1.small',        alive  =>  0,        log  =>  Log::Dispatch::Config-­‐>instance,        jar  =>  's3://somelogs/_lib/hadoop/hive/serde/hatena-­‐ serde.jar', });
  • 11. $emr-­‐>create_table('diary'); $emr-­‐>add_partition('diary',      [  DateTime-­‐>new(year  =>  2011,  month  =>  5,  day  =>  19),          DateTime-­‐>new(year  =>  2011,  month  =>  5,  day  =>  20)  ], ); $emr-­‐>do_select(      "select  *  from  logs  limit  10",    #  HiveQL      “line_sample”);                                  #   $emr-­‐>do_select(      "select  count(1)  from  logs",      "page  view"); $emr-­‐>do_select(      "select  count(distinct  referer)  from  logs",        "unique  referer"); $emr-­‐>get_results;    #
  • 12. %  perl  bin/sample.pl [Sun  Sep  25  11:30:57  2011]  [notice]  j-­‐1IT82USV5OM0J  is  initiating [Sun  Sep  25  11:37:00  2011]  [notice]  testcluster-­‐20449-­‐1317004256   (j-­‐1IT82USV5OM0J)  is  ready. [Sun  Sep  25  11:37:49  2011]  [notice]  Step  'create_table'  is   finished.  (2  steps  left) [Sun  Sep  25  11:47:20  2011]  [notice]  Step  'page  view'  is  finished.   (1  steps  left) [Sun  Sep  25  11:56:54  2011]  [notice]  Step  'unique  referer'  is   finished.  (0  steps  left) result: $VAR1  =  {                    'unique  referer'  =>  ’24235538',                    'page  view'  =>  ’3596154323',                }; [Sun  Sep  25  11:56:57  2011]  [notice]  Finished.
  • 13. Wrapper #1 n  n  Perl n  n  cron n  HiveQL
  • 14. Wrapper #2 n  n  n  .. n  n  S3 n