SlideShare a Scribd company logo
Farzad Nozarian
4/25/15 @AUT
Purpose
This guide describes how to get Shark running locally. It creates a small Hive
installation on one machine and allows you to execute simple queries.
The only prerequisite for this guide is that you have Java and Scala
2.9.3 installed on your machine. If you don't have Scala 2.9.3, you can
download it by running:
2
$ wget http://www.scala-lang.org/files/archive/scala-2.9.3.tgz
$ tar xvfz scala-2.9.3.tgz
Running Shark In Other Modes
• You can also start your Shark in one of the three other supported modes:
• Running Shark on EC2
• Running Shark on a Cluster
• Running Shark with Tachyon
3
Let’s Start…(1/3)
• Download the binary distribution of Shark 0.8.
• The package contains two folders, shark-0.8.0 and hive-0.9.0-
shark-0.8.0-bin.
4
$ wget https://github.com/amplab/shark/releases/download/v0.8.0/shark-0.8.0-bin-
hadoop1.tgz # Hadoop 1/CDH3 - or -
$ wget https://github.com/amplab/shark/releases/download/v0.8.0/shark-0.8.0-bin-
cdh4.tgz # Hadoop 2/CDH4
$ tar xvfz shark-*-bin-*.tgz
$ cd shark-*-bin-*
• The Shark code is in the shark-0.8.0/ directory.
Let’s Start…(2/3)
• To setup your environment to run Shark locally, you need to set
HIVE_HOME and SCALA_HOME environmental variables in a file shark-
0.8.0/conf/shark-env.sh to point to the folders you just downloaded.
• Shark comes with a template file shark-env.sh.template that you can
copy and modify to get started:
5
$ cp shark-0.8.0/conf/shark-env.sh.template shark-0.8.0/conf/shark-env.sh
• Now edit the following two lines in shark-env.sh:
export HIVE_HOME=/path/to/hive-0.9.0-shark-0.8.0-bin
export SCALA_HOME=/path/to/scala-2.9.3
Let’s Start…(3/3)
• Next, create the default Hive warehouse directory. This is where Hive will
store table data for native tables:
6
$ sudo mkdir -p /user/hive/warehouse
$ sudo chmod 0777 /user/hive/warehouse # Or make your username the owner
• You can now start the Shark CLI:
$ ./bin/shark
• In addition to the Shark CLI, there are several executables in shark-0.8.0/bin:
bin/shark-withdebug
bin/shark-withinfo
: Runs Shark CLI with DEBUG level logs printed to the console.
: Runs Shark CLI with INFO level logs printed to the console.
Lab
Assignment
1. Launch the Shark shell.
2. Create a table called book … .
3. List all the columns of the table book.
4. Load the book table from the file books in
the local filesystem.
5. Create a table called novel, containing
those records from table book … .
6. Print out the list of available tables.
7. Count the number of records from the
table book.
8. Print out the total cost of the books with
authors who have the same last name.
9. Count the number of distinct last names.
10. Drop the tables.
7
Lab Assignment 5 (1/5)
1. Launch the Shark shell.
2. Create a table called book whose schema includes book's title,
description, author's first name, last name, and cost.
3. List all the columns of the table book.
8
shark
create table
book(title string, description string, firstname string, lastname string, cost int)
row format delimited fields terminated by 't';
describe book;
Lab Assignment 5 (2/5)
4. Load the book table from the file books in the local filesystem. The books
file has the following format:
9
load data local inpath 'books' into table book;
Speed love Long book about love Brian Dog 10
Long day Story about Monday Emily Blue 20
Flying Car Novel about airplanes Phil High 5
Short day Novel about a day Phil Dog 30
Lab Assignment 5 (3/5)
As an alternative solution, you can create the an external table. The
external keyword lets you to create a table and provide a location so that
Hive does not use a default location for this table. This would be useful if
you already have data generated.
10
create external table
exbook(title string, description string, firstname string, lastname string, cost int)
row format delimited fields terminated by 't'
location '<file location, excluding the name of the file>';
5. Create a table called novel, containing those records from table book
that have keyword “novel” in their description and cache it in memory.
create table novel TBLPROPERTIES('shark.cache'='MEMORY_ONLY')
as select * from book where description like "%Novel%";
Lab Assignment 5 (4/5)
6. Print out the list of available tables.
11
show tables;
select lastname, sum(cost) from book group by lastname;
7. Count the number of records from the table book.
select count(*) from book;
8. Print out the total cost of the books with authors who have the same last
name.
9. Count the number of distinct last names.
select count(distinct lastname) from book;
Lab Assignment 5 (5/5)
10. Drop the tables.
12
drop table book;
drop table novel;
References:
• https://github.com/amplab/shark/wiki/Running-Shark-Locally
13

More Related Content

What's hot

Web scraping with nutch solr
Web scraping with nutch solrWeb scraping with nutch solr
Web scraping with nutch solr
Mike Frampton
 
Friends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSFriends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFS
Saumitra Srivastav
 
TP2 Big Data HBase
TP2 Big Data HBaseTP2 Big Data HBase
TP2 Big Data HBase
Amal Abid
 
eZ Publish cluster unleashed revisited
eZ Publish cluster unleashed revisitedeZ Publish cluster unleashed revisited
eZ Publish cluster unleashed revisited
Bertrand Dunogier
 
Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export
Rupak Roy
 
Install Wordpress in Ubuntu Linux by Tushar B. Kute
Install Wordpress in Ubuntu Linux by Tushar B. KuteInstall Wordpress in Ubuntu Linux by Tushar B. Kute
Install Wordpress in Ubuntu Linux by Tushar B. Kute
Tushar B Kute
 
Install Drupal in Ubuntu by Tushar B. Kute
Install Drupal in Ubuntu by Tushar B. KuteInstall Drupal in Ubuntu by Tushar B. Kute
Install Drupal in Ubuntu by Tushar B. Kute
Tushar B Kute
 
Apache Hadoop & Hive installation with movie rating exercise
Apache Hadoop & Hive installation with movie rating exerciseApache Hadoop & Hive installation with movie rating exercise
Apache Hadoop & Hive installation with movie rating exercise
Shiva Rama Krishna Dasharathi
 
Drupal from scratch
Drupal from scratchDrupal from scratch
Drupal from scratch
Rovic Honrado
 
Hadoop single node setup
Hadoop single node setupHadoop single node setup
Hadoop single node setup
Mohammad_Tariq
 
Introduction to scoop and its functions
Introduction to scoop and its functionsIntroduction to scoop and its functions
Introduction to scoop and its functions
Rupak Roy
 
DSpace Manual for BALID Trainee
DSpace Manual for BALID Trainee DSpace Manual for BALID Trainee
DSpace Manual for BALID Trainee
Nur Ahammad
 
InfiniFlux collector
InfiniFlux collectorInfiniFlux collector
InfiniFlux collector
InfiniFlux
 
Perl Programming - 04 Programming Database
Perl Programming - 04 Programming DatabasePerl Programming - 04 Programming Database
Perl Programming - 04 Programming Database
Danairat Thanabodithammachari
 
Introduction to DSpace
Introduction to DSpaceIntroduction to DSpace
Introduction to DSpace
Hardy Pottinger
 
Advanced topics in hive
Advanced topics in hiveAdvanced topics in hive
Advanced topics in hive
Uday Vakalapudi
 
DSpace Tutorial : Open Source Digital Library
DSpace Tutorial : Open Source Digital LibraryDSpace Tutorial : Open Source Digital Library
DSpace Tutorial : Open Source Digital Library
rajivkumarmca
 

What's hot (17)

Web scraping with nutch solr
Web scraping with nutch solrWeb scraping with nutch solr
Web scraping with nutch solr
 
Friends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSFriends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFS
 
TP2 Big Data HBase
TP2 Big Data HBaseTP2 Big Data HBase
TP2 Big Data HBase
 
eZ Publish cluster unleashed revisited
eZ Publish cluster unleashed revisitedeZ Publish cluster unleashed revisited
eZ Publish cluster unleashed revisited
 
Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export
 
Install Wordpress in Ubuntu Linux by Tushar B. Kute
Install Wordpress in Ubuntu Linux by Tushar B. KuteInstall Wordpress in Ubuntu Linux by Tushar B. Kute
Install Wordpress in Ubuntu Linux by Tushar B. Kute
 
Install Drupal in Ubuntu by Tushar B. Kute
Install Drupal in Ubuntu by Tushar B. KuteInstall Drupal in Ubuntu by Tushar B. Kute
Install Drupal in Ubuntu by Tushar B. Kute
 
Apache Hadoop & Hive installation with movie rating exercise
Apache Hadoop & Hive installation with movie rating exerciseApache Hadoop & Hive installation with movie rating exercise
Apache Hadoop & Hive installation with movie rating exercise
 
Drupal from scratch
Drupal from scratchDrupal from scratch
Drupal from scratch
 
Hadoop single node setup
Hadoop single node setupHadoop single node setup
Hadoop single node setup
 
Introduction to scoop and its functions
Introduction to scoop and its functionsIntroduction to scoop and its functions
Introduction to scoop and its functions
 
DSpace Manual for BALID Trainee
DSpace Manual for BALID Trainee DSpace Manual for BALID Trainee
DSpace Manual for BALID Trainee
 
InfiniFlux collector
InfiniFlux collectorInfiniFlux collector
InfiniFlux collector
 
Perl Programming - 04 Programming Database
Perl Programming - 04 Programming DatabasePerl Programming - 04 Programming Database
Perl Programming - 04 Programming Database
 
Introduction to DSpace
Introduction to DSpaceIntroduction to DSpace
Introduction to DSpace
 
Advanced topics in hive
Advanced topics in hiveAdvanced topics in hive
Advanced topics in hive
 
DSpace Tutorial : Open Source Digital Library
DSpace Tutorial : Open Source Digital LibraryDSpace Tutorial : Open Source Digital Library
DSpace Tutorial : Open Source Digital Library
 

Viewers also liked

Apache Storm Tutorial
Apache Storm TutorialApache Storm Tutorial
Apache Storm Tutorial
Farzad Nozarian
 
Apache Hadoop MapReduce Tutorial
Apache Hadoop MapReduce TutorialApache Hadoop MapReduce Tutorial
Apache Hadoop MapReduce Tutorial
Farzad Nozarian
 
Big Data Processing in Cloud Computing Environments
Big Data Processing in Cloud Computing EnvironmentsBig Data Processing in Cloud Computing Environments
Big Data Processing in Cloud Computing Environments
Farzad Nozarian
 
Object Based Databases
Object Based DatabasesObject Based Databases
Object Based Databases
Farzad Nozarian
 
Big Data and Cloud Computing
Big Data and Cloud ComputingBig Data and Cloud Computing
Big Data and Cloud Computing
Farzad Nozarian
 
Apache Spark Tutorial
Apache Spark TutorialApache Spark Tutorial
Apache Spark Tutorial
Farzad Nozarian
 
S4: Distributed Stream Computing Platform
S4: Distributed Stream Computing PlatformS4: Distributed Stream Computing Platform
S4: Distributed Stream Computing Platform
Farzad Nozarian
 
Big data Clustering Algorithms And Strategies
Big data Clustering Algorithms And StrategiesBig data Clustering Algorithms And Strategies
Big data Clustering Algorithms And Strategies
Farzad Nozarian
 

Viewers also liked (8)

Apache Storm Tutorial
Apache Storm TutorialApache Storm Tutorial
Apache Storm Tutorial
 
Apache Hadoop MapReduce Tutorial
Apache Hadoop MapReduce TutorialApache Hadoop MapReduce Tutorial
Apache Hadoop MapReduce Tutorial
 
Big Data Processing in Cloud Computing Environments
Big Data Processing in Cloud Computing EnvironmentsBig Data Processing in Cloud Computing Environments
Big Data Processing in Cloud Computing Environments
 
Object Based Databases
Object Based DatabasesObject Based Databases
Object Based Databases
 
Big Data and Cloud Computing
Big Data and Cloud ComputingBig Data and Cloud Computing
Big Data and Cloud Computing
 
Apache Spark Tutorial
Apache Spark TutorialApache Spark Tutorial
Apache Spark Tutorial
 
S4: Distributed Stream Computing Platform
S4: Distributed Stream Computing PlatformS4: Distributed Stream Computing Platform
S4: Distributed Stream Computing Platform
 
Big data Clustering Algorithms And Strategies
Big data Clustering Algorithms And StrategiesBig data Clustering Algorithms And Strategies
Big data Clustering Algorithms And Strategies
 

Similar to Shark - Lab Assignment

Php introduction
Php introductionPhp introduction
Php introduction
Osama Ghandour Geris
 
Using Puppet on Linux, Windows, and Mac OSX
Using Puppet on Linux, Windows, and Mac OSXUsing Puppet on Linux, Windows, and Mac OSX
Using Puppet on Linux, Windows, and Mac OSX
Puppet
 
Gophers, whales and.. clouds? Oh my!
Gophers, whales and.. clouds? Oh my!Gophers, whales and.. clouds? Oh my!
Gophers, whales and.. clouds? Oh my!
Glenn 'devalias' Grant
 
Unleash your inner console cowboy
Unleash your inner console cowboyUnleash your inner console cowboy
Unleash your inner console cowboy
Kenneth Geisshirt
 
Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink
Slim Baltagi
 
FreeBSD Jail Complete Example
FreeBSD Jail Complete ExampleFreeBSD Jail Complete Example
FreeBSD Jail Complete Example
Mohammed Farrag
 
Ansible ex407 and EX 294
Ansible ex407 and EX 294Ansible ex407 and EX 294
Ansible ex407 and EX 294
IkiArif1
 
Tanel Poder - Scripts and Tools short
Tanel Poder - Scripts and Tools shortTanel Poder - Scripts and Tools short
Tanel Poder - Scripts and Tools short
Tanel Poder
 
04 02-2018--Slackware Wire Shark Installation
04 02-2018--Slackware Wire Shark Installation04 02-2018--Slackware Wire Shark Installation
04 02-2018--Slackware Wire Shark Installation
Alexander Bitar
 
Linux configer
Linux configerLinux configer
Linux configer
MD. AL AMIN
 
Directories description
Directories descriptionDirectories description
Directories description
Dr.M.Karthika parthasarathy
 
Geecon 2019 - Taming Code Quality in the Worst Language I Know: Bash
Geecon 2019 - Taming Code Quality  in the Worst Language I Know: BashGeecon 2019 - Taming Code Quality  in the Worst Language I Know: Bash
Geecon 2019 - Taming Code Quality in the Worst Language I Know: Bash
Michał Kordas
 
Unix primer
Unix primerUnix primer
Unix primer
dummy
 
Bash shell
Bash shellBash shell
Bash shell
xylas121
 
390aLecture05_12sp.ppt
390aLecture05_12sp.ppt390aLecture05_12sp.ppt
390aLecture05_12sp.ppt
mugeshmsd5
 
Introduction to linux
Introduction to linuxIntroduction to linux
Introduction to linux
QIANG XU
 
Ruby
RubyRuby
Docker perl build
Docker perl buildDocker perl build
Docker perl build
Workhorse Computing
 
Final Report - Spark
Final Report - SparkFinal Report - Spark
Final Report - Spark
Syed Danyal Khaliq
 
Introduction to linux2
Introduction to linux2Introduction to linux2
Introduction to linux2
Gourav Varma
 

Similar to Shark - Lab Assignment (20)

Php introduction
Php introductionPhp introduction
Php introduction
 
Using Puppet on Linux, Windows, and Mac OSX
Using Puppet on Linux, Windows, and Mac OSXUsing Puppet on Linux, Windows, and Mac OSX
Using Puppet on Linux, Windows, and Mac OSX
 
Gophers, whales and.. clouds? Oh my!
Gophers, whales and.. clouds? Oh my!Gophers, whales and.. clouds? Oh my!
Gophers, whales and.. clouds? Oh my!
 
Unleash your inner console cowboy
Unleash your inner console cowboyUnleash your inner console cowboy
Unleash your inner console cowboy
 
Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink
 
FreeBSD Jail Complete Example
FreeBSD Jail Complete ExampleFreeBSD Jail Complete Example
FreeBSD Jail Complete Example
 
Ansible ex407 and EX 294
Ansible ex407 and EX 294Ansible ex407 and EX 294
Ansible ex407 and EX 294
 
Tanel Poder - Scripts and Tools short
Tanel Poder - Scripts and Tools shortTanel Poder - Scripts and Tools short
Tanel Poder - Scripts and Tools short
 
04 02-2018--Slackware Wire Shark Installation
04 02-2018--Slackware Wire Shark Installation04 02-2018--Slackware Wire Shark Installation
04 02-2018--Slackware Wire Shark Installation
 
Linux configer
Linux configerLinux configer
Linux configer
 
Directories description
Directories descriptionDirectories description
Directories description
 
Geecon 2019 - Taming Code Quality in the Worst Language I Know: Bash
Geecon 2019 - Taming Code Quality  in the Worst Language I Know: BashGeecon 2019 - Taming Code Quality  in the Worst Language I Know: Bash
Geecon 2019 - Taming Code Quality in the Worst Language I Know: Bash
 
Unix primer
Unix primerUnix primer
Unix primer
 
Bash shell
Bash shellBash shell
Bash shell
 
390aLecture05_12sp.ppt
390aLecture05_12sp.ppt390aLecture05_12sp.ppt
390aLecture05_12sp.ppt
 
Introduction to linux
Introduction to linuxIntroduction to linux
Introduction to linux
 
Ruby
RubyRuby
Ruby
 
Docker perl build
Docker perl buildDocker perl build
Docker perl build
 
Final Report - Spark
Final Report - SparkFinal Report - Spark
Final Report - Spark
 
Introduction to linux2
Introduction to linux2Introduction to linux2
Introduction to linux2
 

Recently uploaded

Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
XfilesPro
 
WWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders AustinWWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders Austin
Patrick Weigel
 
Malibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed RoundMalibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed Round
sjcobrien
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
Ayan Halder
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
Remote DBA Services
 
Modelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - AmsterdamModelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - Amsterdam
Alberto Brandolini
 
socradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdfsocradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdf
SOCRadar
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
Quickdice ERP
 
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
rodomar2
 
Oracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptxOracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptx
Remote DBA Services
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
ICS
 
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s EcosystemUI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
Peter Muessig
 
zOS Mainframe JES2-JES3 JCL-JECL Differences
zOS Mainframe JES2-JES3 JCL-JECL DifferenceszOS Mainframe JES2-JES3 JCL-JECL Differences
zOS Mainframe JES2-JES3 JCL-JECL Differences
YousufSait3
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
TheSMSPoint
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
dakas1
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
brainerhub1
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
Green Software Development
 
316895207-SAP-Oil-and-Gas-Downstream-Training.pptx
316895207-SAP-Oil-and-Gas-Downstream-Training.pptx316895207-SAP-Oil-and-Gas-Downstream-Training.pptx
316895207-SAP-Oil-and-Gas-Downstream-Training.pptx
ssuserad3af4
 
Requirement Traceability in Xen Functional Safety
Requirement Traceability in Xen Functional SafetyRequirement Traceability in Xen Functional Safety
Requirement Traceability in Xen Functional Safety
Ayan Halder
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
Yara Milbes
 

Recently uploaded (20)

Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
 
WWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders AustinWWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders Austin
 
Malibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed RoundMalibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed Round
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
 
Modelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - AmsterdamModelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - Amsterdam
 
socradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdfsocradar-q1-2024-aviation-industry-report.pdf
socradar-q1-2024-aviation-industry-report.pdf
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
 
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
 
Oracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptxOracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptx
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
 
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s EcosystemUI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
 
zOS Mainframe JES2-JES3 JCL-JECL Differences
zOS Mainframe JES2-JES3 JCL-JECL DifferenceszOS Mainframe JES2-JES3 JCL-JECL Differences
zOS Mainframe JES2-JES3 JCL-JECL Differences
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
 
316895207-SAP-Oil-and-Gas-Downstream-Training.pptx
316895207-SAP-Oil-and-Gas-Downstream-Training.pptx316895207-SAP-Oil-and-Gas-Downstream-Training.pptx
316895207-SAP-Oil-and-Gas-Downstream-Training.pptx
 
Requirement Traceability in Xen Functional Safety
Requirement Traceability in Xen Functional SafetyRequirement Traceability in Xen Functional Safety
Requirement Traceability in Xen Functional Safety
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
 

Shark - Lab Assignment

  • 2. Purpose This guide describes how to get Shark running locally. It creates a small Hive installation on one machine and allows you to execute simple queries. The only prerequisite for this guide is that you have Java and Scala 2.9.3 installed on your machine. If you don't have Scala 2.9.3, you can download it by running: 2 $ wget http://www.scala-lang.org/files/archive/scala-2.9.3.tgz $ tar xvfz scala-2.9.3.tgz
  • 3. Running Shark In Other Modes • You can also start your Shark in one of the three other supported modes: • Running Shark on EC2 • Running Shark on a Cluster • Running Shark with Tachyon 3
  • 4. Let’s Start…(1/3) • Download the binary distribution of Shark 0.8. • The package contains two folders, shark-0.8.0 and hive-0.9.0- shark-0.8.0-bin. 4 $ wget https://github.com/amplab/shark/releases/download/v0.8.0/shark-0.8.0-bin- hadoop1.tgz # Hadoop 1/CDH3 - or - $ wget https://github.com/amplab/shark/releases/download/v0.8.0/shark-0.8.0-bin- cdh4.tgz # Hadoop 2/CDH4 $ tar xvfz shark-*-bin-*.tgz $ cd shark-*-bin-* • The Shark code is in the shark-0.8.0/ directory.
  • 5. Let’s Start…(2/3) • To setup your environment to run Shark locally, you need to set HIVE_HOME and SCALA_HOME environmental variables in a file shark- 0.8.0/conf/shark-env.sh to point to the folders you just downloaded. • Shark comes with a template file shark-env.sh.template that you can copy and modify to get started: 5 $ cp shark-0.8.0/conf/shark-env.sh.template shark-0.8.0/conf/shark-env.sh • Now edit the following two lines in shark-env.sh: export HIVE_HOME=/path/to/hive-0.9.0-shark-0.8.0-bin export SCALA_HOME=/path/to/scala-2.9.3
  • 6. Let’s Start…(3/3) • Next, create the default Hive warehouse directory. This is where Hive will store table data for native tables: 6 $ sudo mkdir -p /user/hive/warehouse $ sudo chmod 0777 /user/hive/warehouse # Or make your username the owner • You can now start the Shark CLI: $ ./bin/shark • In addition to the Shark CLI, there are several executables in shark-0.8.0/bin: bin/shark-withdebug bin/shark-withinfo : Runs Shark CLI with DEBUG level logs printed to the console. : Runs Shark CLI with INFO level logs printed to the console.
  • 7. Lab Assignment 1. Launch the Shark shell. 2. Create a table called book … . 3. List all the columns of the table book. 4. Load the book table from the file books in the local filesystem. 5. Create a table called novel, containing those records from table book … . 6. Print out the list of available tables. 7. Count the number of records from the table book. 8. Print out the total cost of the books with authors who have the same last name. 9. Count the number of distinct last names. 10. Drop the tables. 7
  • 8. Lab Assignment 5 (1/5) 1. Launch the Shark shell. 2. Create a table called book whose schema includes book's title, description, author's first name, last name, and cost. 3. List all the columns of the table book. 8 shark create table book(title string, description string, firstname string, lastname string, cost int) row format delimited fields terminated by 't'; describe book;
  • 9. Lab Assignment 5 (2/5) 4. Load the book table from the file books in the local filesystem. The books file has the following format: 9 load data local inpath 'books' into table book; Speed love Long book about love Brian Dog 10 Long day Story about Monday Emily Blue 20 Flying Car Novel about airplanes Phil High 5 Short day Novel about a day Phil Dog 30
  • 10. Lab Assignment 5 (3/5) As an alternative solution, you can create the an external table. The external keyword lets you to create a table and provide a location so that Hive does not use a default location for this table. This would be useful if you already have data generated. 10 create external table exbook(title string, description string, firstname string, lastname string, cost int) row format delimited fields terminated by 't' location '<file location, excluding the name of the file>'; 5. Create a table called novel, containing those records from table book that have keyword “novel” in their description and cache it in memory. create table novel TBLPROPERTIES('shark.cache'='MEMORY_ONLY') as select * from book where description like "%Novel%";
  • 11. Lab Assignment 5 (4/5) 6. Print out the list of available tables. 11 show tables; select lastname, sum(cost) from book group by lastname; 7. Count the number of records from the table book. select count(*) from book; 8. Print out the total cost of the books with authors who have the same last name. 9. Count the number of distinct last names. select count(distinct lastname) from book;
  • 12. Lab Assignment 5 (5/5) 10. Drop the tables. 12 drop table book; drop table novel;