pivotal.io
875 Howard Street, Fifth Floor, San Francisco, CA 94103
 
 
Pivotal HAWQ Hands-On Tutorial (1)
 
 
 
 
 
 
 
 
 
 
 

Phd tutorial hawq_v0.1

  • 24. DOCUMENT CONTROL
    For any questions regarding this document, contact:
    Name: Seungdon Choi
    E-mail: schoi@pivotal.io
    Document Revision History
    Date        Version  Description                Author  Reviewer
    01/02/2015  0.1      Draft for internal review
    03/02/2015  0.9      For Distribution
  • 25. Table of Contents
    DOCUMENT OVERVIEW
    PREREQUISITE
    [Example] HAWQ
  • 37. Big Data and Hadoop remain hot topics in the market, and Hadoop is establishing itself as a new data source and infrastructure not only at internet companies but also in enterprise environments. Operating and analyzing data on Hadoop, however, requires learning new technologies such as MapReduce, and that learning curve makes many companies hesitant to adopt it. Pivotal HAWQ is a SQL-on-Hadoop product. It provides the SQL interface that users and developers already know, so Hadoop workloads in a big data project can be handled as easily and quickly as with a traditional data warehouse. This document is a hands-on practice guide to the basic usage of HAWQ on the Pivotal HD Single Node VM.
  • 42. Pivotal HAWQ Development Guide
    Pivotal HD 2.1.0 Documentation: http://pivotalhd.docs.pivotal.io/doc/2100/index.html
    HAWQ Administration Guide: http://pivotalhd.docs.pivotal.io/doc/2100/webhelp/index.html#topics/HAWQAdministration.html
    PXF Guide: http://pivotalhd.docs.pivotal.io/doc/2100/webhelp/index.html#topics/PivotalExtensionFrameworkPXF.html
    Getting Started Data Tutorial: http://pivotalhd.docs.pivotal.io/tutorial/getting-started/overview.html
  • 48. Prepare the data set and sample scripts used in the exercises.
    (1) Download and start the Single Node VM used for testing. For details, see https://pivotalkr.wordpress.com/2015/01/02/phd-single-node-vm 활용기 1-vm- 이미지-다운로드-sample-code/
    (2) Download the sample data from the URL below and upload it to the VM.
    https://github.com/gopivotal/pivotal-samples/tree/master/sample-data
      [pivhdsne:~]$ ls retail_demo
      categories_dim.tsv.gz          date_dim.tsv.gz             orders.tsv.gz
      customer_addresses_dim.tsv.gz  email_addresses_dim.tsv.gz  payment_methods.tsv.gz
      customers_dim.tsv.gz           order_lineitems.tsv.gz      products_dim.tsv.gz
    Load these files into HDFS:
      hadoop fs -put retail_demo/ /
    or
      sh ./load_data_to_HDFS.sh
    Confirm that the files were loaded correctly:
      hadoop fsck /retail_demo
      [pivhdsne:~]$ hadoop fs -ls /retail_demo
      Found 9 items
      -rw-r--r-- 3 gpadmin hadoop       590 2015-01-30 14:50 /retail_demo/categories_dim.tsv.gz
      -rw-r--r-- 3 gpadmin hadoop  53995977 2015-01-30 14:50 /retail_demo/customer_addresses_dim.tsv.gz
      -rw-r--r-- 3 gpadmin hadoop   4646775 2015-01-30 14:50 /retail_demo/customers_dim.tsv.gz
      -rw-r--r-- 3 gpadmin hadoop     17772 2015-01-30 14:50 /retail_demo/date_dim.tsv.gz
      -rw-r--r-- 3 gpadmin hadoop   7760971 2015-01-30 14:50 /retail_demo/email_addresses_dim.tsv.gz
      -rw-r--r-- 3 gpadmin hadoop 137780165 2015-01-30 14:50 /retail_demo/order_lineitems.tsv.gz
      -rw-r--r-- 3 gpadmin hadoop  72797064 2015-01-30 14:50 /retail_demo/orders.tsv.gz
      -rw-r--r-- 3 gpadmin hadoop        99 2015-01-30 14:50 /retail_demo/payment_methods.tsv.gz
      -rw-r--r-- 3 gpadmin hadoop  23333203 2015-01-30 14:50 /retail_demo/products_dim.tsv.gz
    [Example] HAWQ
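For the HDFS upload step in the prerequisites, it may help to confirm that all nine sample files are present and intact before running `hadoop fs -put`. The following is a minimal sketch, not part of the tutorial's own scripts: the function name `check_sample_data` and the variable `EXPECTED_FILES` are hypothetical, and the file list is simply copied from the `ls retail_demo` output shown in the prerequisites.

```shell
#!/usr/bin/env bash
# Hypothetical pre-upload check; names here are illustrative only.
# The file list is taken from the retail_demo directory listing in this tutorial.
EXPECTED_FILES="categories_dim.tsv.gz customer_addresses_dim.tsv.gz customers_dim.tsv.gz date_dim.tsv.gz email_addresses_dim.tsv.gz order_lineitems.tsv.gz orders.tsv.gz payment_methods.tsv.gz products_dim.tsv.gz"

# Print one line for every expected file in directory $1 that is missing
# or fails the gzip integrity test. Prints nothing when all files are fine.
check_sample_data() {
    local dir="$1" f
    for f in $EXPECTED_FILES; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
        elif ! gzip -t "$dir/$f" 2>/dev/null; then
            echo "corrupt: $f"
        fi
    done
}

# Usage (assumes the hadoop CLI is on PATH, as on the Single Node VM):
#   [ -z "$(check_sample_data retail_demo)" ] && hadoop fs -put retail_demo/ /
```

The check runs only against the local directory, so it does not replace `hadoop fsck` or `hadoop fs -ls`, which confirm the state of the files once they are in HDFS.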
  • 53. Create the HAWQ internal tables and load the sample data into them.

      First, drop the existing retail_demo schema and recreate it:

          [pivhdsne:hawq_tables]$ psql
          psql (8.2.15)
          Type help for help.
          gpadmin=# drop schema retail_demo;
          gpadmin=# \i /pivotal-samples/hawq/hawq_tables/create_hawq_tables.sql
          DROP TABLE
          CREATE TABLE
          DROP TABLE
          CREATE TABLE
          DROP TABLE
          CREATE TABLE
          DROP TABLE
          CREATE TABLE
          DROP TABLE
          CREATE TABLE
          DROP TABLE
          CREATE TABLE
          DROP TABLE
          CREATE TABLE
          DROP TABLE
          CREATE TABLE
          DROP TABLE
          CREATE TABLE

      Load the sample data into each HAWQ table with the COPY command:

          [pivhdsne:hawq_tables]$ cd /home/gpadmin/retail_demo/
          [pivhdsne:retail_demo]$ ls -lrt
          total 293632
  • 54.
          -rw-r--r-- 1 gpadmin gpadmin       590 Jan 30 14:42 categories_dim.tsv.gz
          -rw-r--r-- 1 gpadmin gpadmin   7760971 Jan 30 14:42 email_addresses_dim.tsv.gz
          -rw-r--r-- 1 gpadmin gpadmin     17772 Jan 30 14:42 date_dim.tsv.gz
          -rw-r--r-- 1 gpadmin gpadmin   4646775 Jan 30 14:42 customers_dim.tsv.gz
          -rw-r--r-- 1 gpadmin gpadmin  53995977 Jan 30 14:42 customer_addresses_dim.tsv.gz
          -rw-r--r-- 1 gpadmin gpadmin 137780165 Jan 30 14:42 order_lineitems.tsv.gz
          -rw-r--r-- 1 gpadmin gpadmin  23333203 Jan 30 14:42 products_dim.tsv.gz
          -rw-r--r-- 1 gpadmin gpadmin        99 Jan 30 14:42 payment_methods.tsv.gz
          -rw-r--r-- 1 gpadmin gpadmin  72797064 Jan 30 14:42 orders.tsv.gz

          zcat customers_dim.tsv.gz | psql -c "COPY retail_demo.customers_dim_hawq FROM STDIN DELIMITER E'\t' NULL E'';"
          zcat categories_dim.tsv.gz | psql -c "COPY retail_demo.categories_dim_hawq FROM STDIN DELIMITER E'\t' NULL E'';"
          zcat order_lineitems.tsv.gz | psql -c "COPY retail_demo.order_lineitems_hawq FROM STDIN DELIMITER E'\t' NULL E'';"
          zcat orders.tsv.gz | psql -c "COPY retail_demo.orders_hawq FROM STDIN DELIMITER E'\t' NULL E'';"
          zcat customer_addresses_dim.tsv.gz | psql -c "COPY retail_demo.customer_addresses_dim_hawq FROM STDIN DELIMITER E'\t' NULL E'';"
          zcat email_addresses_dim.tsv.gz | psql -c "COPY retail_demo.email_addresses_dim_hawq FROM STDIN DELIMITER E'\t' NULL E'';"
          zcat products_dim.tsv.gz | psql -c "COPY retail_demo.products_dim_hawq FROM STDIN DELIMITER E'\t' NULL E'';"
          zcat payment_methods.tsv.gz | psql -c "COPY retail_demo.payment_methods_hawq FROM STDIN DELIMITER E'\t' NULL E'';"
          zcat date_dim.tsv.gz | psql -c "COPY retail_demo.date_dim_hawq FROM STDIN DELIMITER E'\t' NULL E'';"

      Check that the data was loaded correctly.
          [pivhdsne:hawq_tables]$ pwd
          /pivotal-samples/hawq/hawq_tables
          [pivhdsne:hawq_tables]$ sh ./verify_load_hawq_tables.sh
           Table Name                  | Count
          -----------------------------+------------------------
           customers_dim_hawq          | 401430
           categories_dim_hawq         | 56
           customer_addresses_dim_hawq | 1130639
           email_addresses_dim_hawq    | 401430
           order_lineitems_hawq        | 1024158
           orders_hawq                 | 512071
           payment_methods_hawq        | 5
           products_dim_hawq           | 698911
          -----------------------------+------------------------

      HAWQ is a SQL-on-Hadoop engine that implements the Greenplum engine (an MPP SQL engine built on PostgreSQL 8.2) on top of Hadoop HDFS, so the SQL syntax used with Greenplum and PostgreSQL works unchanged.
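      The nine COPY commands used for the load differ only in the table name, so they can be generated from a list. The following is a minimal dry-run sketch: it prints each command instead of executing it, and it assumes that every file <name>.tsv.gz maps to the table retail_demo.<name>_hawq, as in the commands of this tutorial. The function names copy_cmd and all_copy_cmds are illustrative and are not part of the pivotal-samples scripts.

```shell
#!/bin/sh
# Dry-run helper: print the load command for one table instead of running it.
# Assumption: <name>.tsv.gz loads into retail_demo.<name>_hawq.
copy_cmd() {
    table="$1"
    # '%s' prints the argument verbatim, so the backslash in E'\t' stays literal.
    printf '%s\n' "zcat ${table}.tsv.gz | psql -c \"COPY retail_demo.${table}_hawq FROM STDIN DELIMITER E'\\t' NULL E'';\""
}

# Print the commands for all nine sample tables, in the order used in the tutorial.
all_copy_cmds() {
    for t in customers_dim categories_dim order_lineitems orders \
             customer_addresses_dim email_addresses_dim products_dim \
             payment_methods date_dim; do
        copy_cmd "$t"
    done
}

all_copy_cmds
```

      Once the printed commands look right, piping the output to sh from /home/gpadmin/retail_demo would run the loads one after another.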
  • 55. Run a query against HAWQ. The following query computes the total amount paid and the total tax per postal code from the orders table.

          [pivhdsne:hawq_tables]$ psql
          psql (8.2.15)
          Type help for help.
          gpadmin=# select billing_address_postal_code,
                           sum(total_paid_amount::float8) as total,
                           sum(total_tax_amount::float8) as tax
                    from retail_demo.orders_hawq
                    group by billing_address_postal_code
                    order by total desc
                    limit 10;
           billing_address_postal_code |   total   |    tax
          -----------------------------+-----------+-----------
           48001                       | 111868.32 | 6712.0992
           15329                       | 107958.24 | 6477.4944
           42714                       | 103244.58 | 6194.6748
           41030                       |  101365.5 |   6081.93
           50223                       | 100511.64 | 6030.6984
           03106                       |  83566.41 |         0
           57104                       |  77383.63 | 3095.3452
           23002                       |  73673.66 |  3683.683
           25703                       |  68282.12 | 4096.9272
           26178                       |   66836.4 |  4010.184
          (10 rows)
          gpadmin=#

      [Example] PXF
  • 59. This section walks through creating HAWQ PXF external tables. With a PXF external table, HAWQ can read and write data sets stored on Pivotal HD in a variety of native formats (comma separated, tab delimited, plain text files, and so on).

      Check the data set to be loaded:

          [pivhdsne:retail_demo]$ hadoop fs -ls /retail_demo
          Found 9 items
          drwxr-xr-x   - gpadmin hadoop 0 2015-02-02 13:45 /retail_demo/categories_dim
  • 60.
          drwxr-xr-x   - gpadmin hadoop 0 2015-02-02 13:45 /retail_demo/customer_addresses_dim
          drwxr-xr-x   - gpadmin hadoop 0 2015-02-02 13:45 /retail_demo/customers_dim
          drwxr-xr-x   - gpadmin hadoop 0 2015-02-02 13:45 /retail_demo/date_dim
          drwxr-xr-x   - gpadmin hadoop 0 2015-02-02 13:45 /retail_demo/email_addresses_dim
          drwxr-xr-x   - gpadmin hadoop 0 2015-02-02 13:45 /retail_demo/order_lineitems
          drwxr-xr-x   - gpadmin hadoop 0 2015-02-02 13:45 /retail_demo/orders
          drwxr-xr-x   - gpadmin hadoop 0 2015-02-02 13:45 /retail_demo/payment_methods
          drwxr-xr-x   - gpadmin hadoop 0 2015-02-02 13:45 /retail_demo/products_dim
          [pivhdsne:retail_demo]$ hadoop fs -ls /retail_demo/categories_dim
          Found 1 items
          -rw-r--r--   3 gpadmin hadoop 590 2015-02-02 13:45 /retail_demo/categories_dim/categories_dim.tsv.gz

      Inspect the data:

          hadoop fs -cat /retail_demo/categories_dim/categories_dim.tsv.gz | zcat

      Create the external tables.

      Note: running this script raises a warning because the Fragmenter option is deprecated. Use the External Table Creation commands from http://pivotalhd.docs.pivotal.io/tutorial/getting-started/hawq/pxf-external-tables.html instead.

          [pivhdsne:pxf_tables]$ pwd
          /pivotal-samples/hawq/pxf_tables
          [pivhdsne:pxf_tables]$ psql
          psql (8.2.15)
          Type help for help.
          gpadmin=# \i create_pxf_tables.sql

      Look at the external table syntax. The table is defined with the predefined profile HdfsTextSimple. The list of predefined profiles is available at http://pivotalhd.docs.pivotal.io/doc/2100/webhelp/index.html#topics/PXFInstallationandAdministration.html.

          CREATE EXTERNAL TABLE retail_demo.payment_methods_pxf
          (
              payment_method_id smallint,
              payment_method_code character varying(20)
          )
          LOCATION ('pxf://pivhdsne:50070/retail_demo/payment_methods/payment_methods.tsv.gz?profile=HdfsTextSimple')
          FORMAT 'TEXT' (DELIMITER = E'\t');
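      The LOCATION string in the statement above has four parts: the host, the port, the HDFS path, and the profile. The following sketch assembles such a URI from those parts so the structure is easier to see. The function name pxf_location is hypothetical, and the layout is inferred only from the example in this tutorial, not from the full PXF URI syntax.

```shell
#!/bin/sh
# Build a PXF LOCATION URI of the form used in this tutorial:
#   pxf://<host>:<port>/<hdfs path>?profile=<profile>
# pxf_location is an illustrative helper, not a HAWQ or PXF command.
pxf_location() {
    host="$1"
    port="$2"
    hdfs_path="$3"
    profile="$4"
    # Strip one leading slash so the URI has exactly one slash after the port.
    printf 'pxf://%s:%s/%s?profile=%s\n' "$host" "$port" "${hdfs_path#/}" "$profile"
}

# Reproduces the LOCATION value of retail_demo.payment_methods_pxf.
pxf_location pivhdsne 50070 /retail_demo/payment_methods/payment_methods.tsv.gz HdfsTextSimple
```

      The same helper covers the other eight tables by changing only the path argument.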
  • 61. Check the catalog to confirm that the tables were created correctly.

          gpadmin=# \dx retail_demo.*_pxf
                                   List of relations
             Schema    |            Name            | Type  |  Owner  | Storage
          -------------+----------------------------+-------+---------+----------
           retail_demo | categories_dim_pxf         | table | gpadmin | external
           retail_demo | customer_addresses_dim_pxf | table | gpadmin | external
           retail_demo | customers_dim_pxf          | table | gpadmin | external
           retail_demo | date_dim_pxf               | table | gpadmin | external
           retail_demo | email_addresses_dim_pxf    | table | gpadmin | external
           retail_demo | order_lineitems_pxf        | table | gpadmin | external
           retail_demo | orders_pxf                 | table | gpadmin | external
           retail_demo | payment_methods_pxf        | table | gpadmin | external
           retail_demo | products_dim_pxf           | table | gpadmin | external
          (9 rows)

      Count the rows of the files in HDFS through the external tables.

          [pivhdsne:pxf_tables]$ pwd
          /pivotal-samples/hawq/pxf_tables
          [pivhdsne:pxf_tables]$ sh verify_load_pxf_tables.sh
           Table Name                  | Count
          -----------------------------+------------------------
           customers_dim_pxf           | 401430
           categories_dim_pxf          | 56
           customer_addresses_dim_pxf  | 1130639
           email_addresses_dim_pxf     | 401430
           order_lineitems_pxf         | 1024158
           orders_pxf                  | 512071
           payment_methods_pxf         | 5
           products_dim_pxf            | 698911
          -----------------------------+------------------------

      Run the earlier HAWQ query again, this time against the external table.

          gpadmin=# select billing_address_postal_code,
                           sum(total_paid_amount::float8) as total,
                           sum(total_tax_amount::float8) as tax
                    from retail_demo.orders_pxf
                    group by billing_address_postal_code
                    order by total desc
                    limit 10;
           billing_address_postal_code |   total   |    tax
  • 62.
          -----------------------------+-----------+-----------
           48001                       | 111868.32 | 6712.0992
           15329                       | 107958.24 | 6477.4944
           42714                       | 103244.58 | 6194.6748
           41030                       |  101365.5 |   6081.93
           50223                       | 100511.64 | 6030.6984
           03106                       |  83566.41 |         0
           57104                       |  77383.63 | 3095.3452
           23002                       |  73673.66 |  3683.683
           25703                       |  68282.12 | 4096.9272
           26178                       |   66836.4 |  4010.184
          (10 rows)

      Generating statistics

      Statistics can also be collected for PXF external tables, as in the example below. They help the planner build a better query plan when SQL is executed.

          [pivhdsne:~]$ seq 1 10000000 > /tmp/demo.txt     # generate sample data
          [pivhdsne:~]$ hadoop fs -put /tmp/demo.txt /      # load it into Hadoop
          [pivhdsne:~]$ psql
          psql (8.2.15)
          Type help for help.
          gpadmin=# \timing
          Timing is on.
          gpadmin=# CREATE EXTERNAL TABLE demo (val INT)     -- create the external table
          gpadmin-# LOCATION ('pxf://pivhdsne:50070/demo.txt?Fragmenter=com.pivotal.pxf.plugins.hdfs.HdfsDataFragmenter&Analyzer=com.pivotal.pxf.plugins.hdfs.HdfsAnalyzer&Accessor=com.pivotal.pxf.plugins.hdfs.TextFileAccessor&Resolver=com.pivotal.pxf.plugins.hdfs.TextResolver')
          gpadmin-# FORMAT 'TEXT' (DELIMITER = '|');
          CREATE EXTERNAL TABLE
          Time: 52.876 ms
          gpadmin=# select relpages,reltuples from pg_class where relname='demo';     -- check the statistics
           relpages | reltuples
          ----------+-----------
               1000 |     1e+06
          (1 row)
  • 63.
          Time: 137.892 ms
          gpadmin=# select val from demo where val=59999;     -- run the query
            val
          -------
           59999
          (1 row)
          Time: 3840.291 ms
          gpadmin=# analyze demo;     -- collect statistics
          ANALYZE
          Time: 258.789 ms
          gpadmin=# select relpages,reltuples from pg_class where relname='demo';
           relpages | reltuples
          ----------+-----------
               4096 |    161858
          (1 row)
          Time: 101.710 ms
          gpadmin=# select val from demo where val=59999;
            val
          -------
           59999
          (1 row)
          Time: 2734.797 ms
          gpadmin=#

      [Example] PXF
  • 67. PXF is not limited to plain text files on HDFS: sources such as Hive and HBase can also be accessed as external tables.

      (1) Create the Hive table
  • 68. Create the Hive table:

          $ hive
          hive> CREATE DATABASE IF NOT EXISTS retail_demo;
          USE retail_demo;
          CREATE TABLE retail_demo.order_lineitems_hive
          (
            Order_ID string
          , Order_Item_ID bigint
          , Product_ID int
          , Product_Name string
          , Customer_ID int
          , Store_ID int
          , Item_Shipment_Status_Code string
          , Order_Datetime timestamp
          , Ship_Datetime timestamp
          , Item_Return_Datetime timestamp
          , Item_Refund_Datetime timestamp
          , Product_Category_ID int
          , Product_Category_Name string
          , Payment_Method_Code string
          , Tax_Amount double
          , Item_Quantity int
          , Item_Price double
          , Discount_Amount double
          , Coupon_Code string
          , Coupon_Amount double
          , Ship_Address_Line1 string
          , Ship_Address_Line2 string
          , Ship_Address_Line3 string
          , Ship_Address_City string
          , Ship_Address_State string
          , Ship_Address_Postal_Code string
          , Ship_Address_Country string
          , Ship_Phone_Number string
          , Ship_Customer_Name string
          , Ship_Customer_Email_Address string
          , Ordering_Session_ID string
          , Website_URL string
  • 69.
          )
          -- PARTITIONED BY (Order_Datetime timestamp)
          ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
          STORED AS TEXTFILE
          LOCATION '/retail_demo/order_lineitems/';

          hive> select count(*) from retail_demo.customers_dim_hive;
          Total MapReduce jobs = 1
          ..
          2 seconds 940 msec
          Ended Job = job_1370914856264_0009
          MapReduce Jobs Launched:
          Job 0: Map: 1 Reduce: 1 Cumulative CPU: 2.94 sec HDFS Read: 4646997 HDFS Write: 7 SUCCESS
          Total MapReduce CPU Time Spent: 2 seconds 940 msec
          OK
          401430
          Time taken: 20.03 seconds

      (2) In HAWQ, create an external table and use it to read this Hive table.

          [pivhdsne:~]$ psql
          psql (8.2.15)
          Type help for help
          CREATE EXTERNAL TABLE retail_demo.order_lineitems_hive
          (
            Order_ID text
          , Order_Item_ID bigint
          , Product_ID int
          , Product_Name text
          , Customer_ID int
          , Store_ID int
          , Item_Shipment_Status_Code text
          , Order_Datetime timestamp
          , Ship_Datetime timestamp
          , Item_Return_Datetime timestamp
          , Item_Refund_Datetime timestamp
          , Product_Category_ID int
          , Product_Category_Name text
          , Payment_Method_Code text
  • 70.
          , Tax_Amount float8
          , Item_Quantity int
          , Item_Price float8
          , Discount_Amount float8
          , Coupon_Code text
          , Coupon_Amount float8
          , Ship_Address_Line1 text
          , Ship_Address_Line2 text
          , Ship_Address_Line3 text
          , Ship_Address_City text
          , Ship_Address_State text
          , Ship_Address_Postal_Code text
          , Ship_Address_Country text
          , Ship_Phone_Number text
          , Ship_Customer_Name text
          , Ship_Customer_Email_Address text
          , Ordering_Session_ID text
          , Website_URL text
          )
          LOCATION ('pxf://pivhdsne:50070/retail_demo.order_lineitems_hive?PROFILE=hive')
          FORMAT 'CUSTOM' (formatter='pxfwritable_import');

          gpadmin=# select count(*) from retail_demo.order_lineitems_hive;
            count
          ---------
           1024158
          (1 row)
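      Comparing the Hive definition with the HAWQ external table shows that only two column types change: string becomes text and double becomes float8, while bigint, int, and timestamp stay the same. The following sketch captures just that mapping. The function names hive_to_hawq_type and convert_columns are illustrative, and the mapping covers only the types that appear in this example, not the full set of Hive types.

```shell
#!/bin/sh
# Map a Hive column type to the HAWQ type used in this tutorial's external table.
# Only the five types seen in order_lineitems_hive are covered (assumption).
hive_to_hawq_type() {
    case "$1" in
        string) echo text ;;
        double) echo float8 ;;
        bigint|int|timestamp) echo "$1" ;;
        *) echo "unmapped type: $1" >&2; return 1 ;;
    esac
}

# Convert "name type" pairs read from stdin, one column per line.
convert_columns() {
    while read -r name type; do
        printf '%s %s\n' "$name" "$(hive_to_hawq_type "$type")"
    done
}

# A few columns from the Hive table definition.
convert_columns <<'EOF'
Order_ID string
Order_Item_ID bigint
Tax_Amount double
Order_Datetime timestamp
EOF
```

      A type outside this list prints a message on standard error and leaves the type blank in the output, which makes gaps in the mapping easy to spot.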