Greenplum Database Overview

Michael Crutcher
Greenplum Product Management




© Copyright 2012 EMC Corporation. All rights reserved.                        1
Greenplum Unified Analytic Platform




GREENPLUM DATABASE


Industry-Leading Database with Massively Parallel Performance to Empower Your Analytics





Extreme Performance for Analytics
• Optimized for BI and analytics
    – Deep integration with statistical packages
    – High-performance parallel implementations
• Simple and automatic
    – Just load and query like any database
    – Tables are automatically distributed across nodes
• Extremely scalable
    – MPP shared-nothing architecture
    – All nodes can scan and process in parallel
    – Linear scalability by adding nodes
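
The automatic distribution and parallel scan described above can be pictured with a small sketch. It assumes rows are assigned to segments by hashing a distribution key, a common shared-nothing approach. The segment count, the `customer_id` column, and the helpers `segment_for`, `distribute`, and `parallel_count` are illustrative assumptions, not Greenplum internals.

```python
import zlib
from collections import defaultdict

NUM_SEGMENTS = 4  # assumed cluster size, for illustration only


def segment_for(key, num_segments=NUM_SEGMENTS):
    """Map a distribution-key value to a segment with a stable hash."""
    # zlib.crc32 is deterministic across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(str(key).encode("utf-8")) % num_segments


def distribute(rows, key_column, num_segments=NUM_SEGMENTS):
    """Place each row on exactly one segment, chosen by its key."""
    segments = defaultdict(list)
    for row in rows:
        segments[segment_for(row[key_column], num_segments)].append(row)
    return segments


def parallel_count(segments):
    """Each segment counts its own rows; the partial counts are then summed."""
    return sum(len(rows) for rows in segments.values())


# Hypothetical table of 1000 rows keyed by customer_id.
rows = [{"customer_id": i, "amount": i * 10} for i in range(1000)]
segments = distribute(rows, "customer_id")
total = parallel_count(segments)
```

Because every row lives on exactly one segment, each segment can scan its share independently and the per-segment results only need to be combined at the end.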





Performance Through Parallelism

[Architecture diagram, top to bottom]
• Master Servers: query planning & dispatch
• Network Interconnect
• Segment Servers: query processing & data storage
• External Sources: loading, streaming, etc.

Greenplum Delivers Choice & Flexibility

Greenplum Data Computing Appliance
• Choose Greenplum Database and/or Hadoop modules in ¼ rack increments
• Scale up by adding your choice of additional modules
• Minimal time to value

Greenplum Software Solutions
• Greenplum Database, Hadoop, & Chorus on your x86 hardware
• Flexibility for any workload or environment
• Perpetual or subscription licenses




Core Functionality
                                       GREENPLUM DATABASE





Component Overview

CLIENT ACCESS & TOOLS
• Client Access: ODBC, JDBC, OLEDB, MapReduce, etc.
• 3rd Party Tools: BI Tools, ETL Tools, Data Mining, etc.
• Admin Tools: Greenplum Command Center, Greenplum Package Manager

PRODUCT FEATURES
• Loading & Ext. Access: Petabyte-Scale Loading, Trickle Micro-Batching, Anywhere Data Access
• Storage & Data Access: Hybrid Storage & Execution (Row- & Column-Oriented), In-Database Compression, Multi-Level Partitioning, Indexes (Btree, Bitmap, etc.), External Table Support
• Language Support: Comprehensive SQL, Native MapReduce, SQL 2003 OLAP Extensions, Programmable Analytics, Analytics Extensions (GeoSpatial, PL/R, PL/Java, PL/Python, PL/Perl)

GREENPLUM DATABASE ADAPTIVE SERVICES
• Multi-Level Fault Tolerance (RAID, Mirroring, DR with Data Domain Boost)
• Online System Expansion
• Workload Management

CORE MPP ARCHITECTURE
• Shared-Nothing MPP
• Parallel Dataflow Engine
• Parallel Query Optimizer
• gNet™ Software Interconnect
• Polymorphic Data Storage™
• Scatter/Gather Streaming™ Data Loading





Most Powerful Data Loading Capabilities
• Industry-leading performance at 10+ TB per hour per rack
• Scatter-Gather Streaming™ provides true linear scaling
• Support for both large-batch and continuous real-time loading strategies
• Enable complex data transformations "in-flight"
• Transparent interfaces to loading via support files, application, and services

[Chart: SINGLE RACK COMPARISON of Greenplum, Oracle Exadata, Netezza, Teradata]
Greenplum load rates scale linearly with the number of racks; others do not. For example, two racks = >20 TB/h.
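
The linear-scaling claim above reduces to simple arithmetic, sketched here. The sketch takes the slide's "10+ TB per hour per rack" as a lower bound of 10 TB/h and assumes perfectly linear scaling; the function names and the 100 TB workload are hypothetical, and real load rates depend on hardware and data.

```python
def load_rate_tb_per_hour(racks, per_rack_rate=10.0):
    """Aggregate load rate under the slide's linear-scaling claim."""
    if racks < 1:
        raise ValueError("racks must be at least 1")
    return racks * per_rack_rate


def hours_to_load(data_tb, racks, per_rack_rate=10.0):
    """Estimated hours to load data_tb terabytes at the aggregate rate."""
    return data_tb / load_rate_tb_per_hour(racks, per_rack_rate)


two_rack_rate = load_rate_tb_per_hour(2)        # 20.0 TB/h, matching "two racks = >20 TB/h"
hours_for_100_tb = hours_to_load(100, racks=2)  # 5.0 hours for a hypothetical 100 TB load
```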



Polymorphic Table Storage™

[Diagram: TABLE 'CUSTOMER' partitioned by month, Mar '11 through Nov '11, with partitions labeled "Column-oriented for COLD DATA" and "Row-oriented for HOT DATA"]

• Storage types can be mixed within a table or database
    – Four table types: heap, row-oriented AO, column-oriented AO, external
• Rich compression functionality, definable column by column
    – Block compression: Gzip (levels 1-9), QuickLZ
    – Stream compression: RLE (levels 1-4)
• Flexible indexing, partitioning, and more
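
To illustrate why run-length encoding (RLE) suits column-oriented data, here is a textbook sketch: values from one column are stored together, so runs of repeated values are common and collapse into (value, count) pairs. This is generic RLE, not Greenplum's on-disk format, and it does not model the compression levels listed above; the sample column is hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal adjacent values into a (value, count) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    decoded = []
    for value, count in pairs:
        decoded.extend([value] * count)
    return decoded


# A hypothetical low-cardinality column, such as a status flag.
column = ["A", "A", "A", "B", "B", "A", "C", "C", "C", "C"]
encoded = rle_encode(column)
```

Ten stored values shrink to four pairs here; a column with long runs compresses far better than one whose values alternate.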



gNet Software Interconnect
• A supercomputing-based "soft-switch" responsible for
    – Efficiently pumping streams of data between motion nodes during query-plan execution
    – Delivering messages, moving data, collecting results, and coordinating work among the segments in the system

[Diagram: gNet Software Interconnect]





Parallel Query Optimizer
• Cost-based optimization looks for the most efficient plan
• Physical plan contains scans, joins, sorts, aggregations, etc.
• Global planning avoids sub-optimal 'SQL pushing' to segments
• Directly inserts 'motion' nodes for inter-segment communication

PHYSICAL EXECUTION PLAN FROM SQL OR MAPREDUCE
[Plan diagram; operators shown: Gather Motion 4:1 (Slice 3), Sort, HashAggregate, HashJoin, Redistribute Motion 4:4 (Slice 1), Broadcast Motion 4:4 (Slice 2), Hash, Seq Scan on lineitem, Seq Scan on orders, Seq Scan on customer, Seq Scan on motion]
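
The three motion types named in the plan can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the optimizer's implementation: segments are plain Python lists, the tables `orders` and `customers` and the `cust_id` join key are hypothetical, and the helper names are invented for illustration.

```python
from collections import defaultdict

NUM_SEGMENTS = 4  # mirrors the "4:4" and "4:1" motions in the plan


def redistribute(segments, key):
    """Redistribute Motion (N:N): rehash every row so equal keys share a segment."""
    out = [[] for _ in range(len(segments))]
    for rows in segments:
        for row in rows:
            out[hash(row[key]) % len(segments)].append(row)
    return out


def broadcast(segments):
    """Broadcast Motion (N:N): every segment receives a full copy of the rows."""
    everything = [row for rows in segments for row in rows]
    return [list(everything) for _ in segments]


def gather(segments):
    """Gather Motion (N:1): collect every segment's rows in one place."""
    return [row for rows in segments for row in rows]


def local_hash_join(left_rows, right_rows, key):
    """Hash join executed independently on one segment."""
    table = defaultdict(list)
    for r in right_rows:
        table[r[key]].append(r)
    return [{**l, **r} for l in left_rows for r in table.get(l[key], [])]


# Hypothetical tables, spread round-robin so join keys are NOT co-located.
orders = [{"cust_id": i % 5, "order_id": i} for i in range(20)]
customers = [{"cust_id": i, "name": f"c{i}"} for i in range(5)]
orders_segs = [orders[s::NUM_SEGMENTS] for s in range(NUM_SEGMENTS)]
cust_segs = [customers[s::NUM_SEGMENTS] for s in range(NUM_SEGMENTS)]

# Strategy 1: redistribute both sides on the join key, then join locally.
o = redistribute(orders_segs, "cust_id")
c = redistribute(cust_segs, "cust_id")
joined_redistribute = gather(
    [local_hash_join(o[s], c[s], "cust_id") for s in range(NUM_SEGMENTS)]
)

# Strategy 2: leave the large side in place and broadcast the small side.
c_all = broadcast(cust_segs)
joined_broadcast = gather(
    [local_hash_join(orders_segs[s], c_all[s], "cust_id") for s in range(NUM_SEGMENTS)]
)
```

Both strategies produce the same join result; they differ in how many rows cross the interconnect, which is the kind of trade-off a cost-based planner weighs.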




Analytics Overview
                                       GREENPLUM DATABASE





Analytical Capabilities Overview

[Layered diagram, top to bottom]
• Data Access & Query Layer: ODBC, JDBC
• SQL
• Stored Procedures, SQL 2003 OLAP, MapReduce, In-Database Analytics
• Polymorphic Storage
• GREENPLUM DATABASE (shown alongside GREENPLUM HD)
• Greenplum gNet





In-Database Analytics: Categories

[Layered diagram, top to bottom]
• Data Access & Query Layer: ODBC, JDBC
• SQL
• In-Database Analytics
    – Embedded: GPDB Embedded Analytics
    – Partner: SAS Scoring Accelerator, SAS/HPA High Performance Analytics
    – Open-Source: Open Source Extensions
    – User-written: User-Written Analytical Algorithms
• GREENPLUM DATABASE




Analytics Highlight: MADlib
• Scalable in-database analytics
• Data-parallel
    – Mathematical algorithms
    – Statistical algorithms
    – Machine learning algorithms
    – Supports structured and unstructured data
• Open-source software
    – Source accessibility
    – Converge business, academic, and open-source communities
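
The "data-parallel" idea above can be sketched with simple linear regression: each segment computes small sufficient statistics over its own rows, the partial states are merged, and a final step produces the model. This is a generic pattern under assumed names (`segment_stats`, `merge`, `finalize`); it is not MADlib's API, and the data points are hypothetical.

```python
def segment_stats(points):
    """Per-segment pass: sufficient statistics for simple linear regression."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return (n, sx, sy, sxx, sxy)


def merge(a, b):
    """Combine two partial states; addition is associative, so order does not matter."""
    return tuple(u + v for u, v in zip(a, b))


def finalize(stats):
    """Turn the merged statistics into (slope, intercept)."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical data on the line y = 2x + 1, split across three "segments".
segments = [
    [(0, 1), (1, 3), (2, 5)],
    [(3, 7), (4, 9)],
    [(5, 11), (6, 13), (7, 15)],
]
state = (0, 0, 0, 0, 0)
for seg in segments:
    state = merge(state, segment_stats(seg))
slope, intercept = finalize(state)
```

Only five numbers per segment travel between nodes, regardless of how many rows each segment holds, which is what lets this style of algorithm scale with the data.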




Manageability, Extensions
                                       GREENPLUM DATABASE





Easy Manageability for Big Data
• Single console for both Database and Hadoop
• Administration
    – Start, Stop Database
    – Recover, Rebalance Segments
• Interactive view of System Metrics
    – Real-time
    – Historic (configurable by time period)
• In-depth view for System Health
    – Hardware health
    – Software (Database, Hadoop)
• Query Monitoring
    – Search, Prioritize, Cancel Queries
    – View Query's Execution Plan
• Workload Management
    – Configure Resource Queues
    – Prioritize Users





Easy Extension Installation
Greenplum Package Manager
Greenplum supports easy deployment of numerous extensions like MADlib, PL/Perl, PL/Java, PostGIS, etc.

[Diagram: Master Servers and Segment Servers]





High Performance gNet for Hadoop
Parallel Query Access
• Connect any data set in Hadoop to GP DB's SQL Engine
• Process Hadoop data in place
• Parallelize import/export of data from/to Hadoop thanks to GP DB's market-leading data sharing performance
• Supported formats:
    – Text (compressed and uncompressed)
    – Binary
    – Proprietary/user-defined
• GP HD 1.x, GP MR 1.x, CDH3u2

[Diagram: gNet for Hadoop with Text, Binary, and User-Defined formats]



High Availability, Backup, Support
                                        GREENPLUM DATABASE





High Availability
• GPDB cluster
    – 2 Master servers
    – Multiple Segment servers
• Segment servers support multiple database instances
    – Primary instances that actively process queries
    – Standby mirror instances
• Block-level mirroring
    – Low resource consumption
    – Differential resync capable for fast recovery

[Diagram: Set of Active Segment Instances]
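
The idea behind block-level mirroring with differential resync can be sketched as follows: while a mirror is unavailable, the primary remembers which blocks changed, and recovery copies only those blocks instead of the whole data set. The class `MirroredSegment` and its methods are hypothetical and greatly simplified; the slide does not describe Greenplum's actual mechanism.

```python
class MirroredSegment:
    """Toy primary/mirror pair that tracks blocks changed while the mirror is down."""

    def __init__(self):
        self.primary = {}      # block_id -> bytes
        self.mirror = {}
        self.mirror_up = True
        self.dirty = set()     # blocks written while the mirror was unavailable

    def write(self, block_id, data):
        self.primary[block_id] = data
        if self.mirror_up:
            self.mirror[block_id] = data  # normal operation: mirror every block write
        else:
            self.dirty.add(block_id)      # remember only what changed

    def fail_mirror(self):
        self.mirror_up = False

    def resync(self):
        """Copy only the dirty blocks, then resume mirroring. Returns blocks copied."""
        copied = len(self.dirty)
        for block_id in self.dirty:
            self.mirror[block_id] = self.primary[block_id]
        self.dirty.clear()
        self.mirror_up = True
        return copied


seg = MirroredSegment()
for i in range(100):
    seg.write(i, b"v1")
seg.fail_mirror()
seg.write(3, b"v2")
seg.write(42, b"v2")
seg.write(3, b"v3")   # rewriting a block does not add to the resync work
copied = seg.resync()
```

In this run only 2 of the 100 blocks are copied during recovery, which is why a differential resync can be much faster than a full copy.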




Backup/Restore with EMC Data Domain
• Integration options
    – NFS: Data Domain device mounted as NFS storage
    – DD Boost: Native, client-side deduplication. Supported in GPDB 4.2 and higher
• Drastic reduction in backup storage requirement
• Back up all segment servers in parallel directly to Data Domain
• Data Domain integrates seamlessly into standard Greenplum full backup data export and data restore procedures

[Diagram: Full Appliance + Data Domain, connected via Boost or NFS over 2 x 10 GBit IP]
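
Client-side deduplication, mentioned above for DD Boost, can be illustrated in general terms: the client fingerprints each chunk and sends only chunks the backup target does not already hold. The sketch below is a generic illustration, not DD Boost's protocol or API; `BackupTarget`, `backup`, `restore`, and the tiny fixed-size chunks are assumptions chosen to keep the example readable.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks so the example is easy to follow


class BackupTarget:
    """Stand-in for the backup appliance: stores each unique chunk once."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def has(self, digest):
        return digest in self.chunks

    def put(self, digest, chunk):
        self.chunks[digest] = chunk


def backup(data, target):
    """Send only chunks the target lacks; return (recipe, bytes_sent)."""
    recipe, sent = [], 0
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if not target.has(digest):   # the client checks before sending
            target.put(digest, chunk)
            sent += len(chunk)
        recipe.append(digest)
    return recipe, sent


def restore(recipe, target):
    """Rebuild the original bytes from the ordered list of digests."""
    return b"".join(target.chunks[d] for d in recipe)


target = BackupTarget()
data = b"AAAABBBBAAAACCCCAAAA"
recipe, sent = backup(data, target)
recipe2, sent2 = backup(data, target)  # a second full backup of unchanged data
```

The first backup sends 12 of 20 bytes because the repeated chunk is stored once; the second sends nothing, which is the source of the storage and network savings the slide refers to.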





Backup/Restore with EMC Data Domain
Backup and restore between remote and primary sites

[Diagram: two sites, each with a Greenplum DCA and a Data Domain system, linked by Data Domain Replication over LAN/WAN]

• Ideal for configurations with RPO and RTO requirements that can be specified in hours
• Supports:
    – Collection Replication for DD Boost backup
    – Directory-level replication for NFS backup
    – Encryption over the WAN





Customer Support Services
• Remote Technical Support
    – 24x7 technical support and remote troubleshooting
    – Customer-managed case severity level
    – Four-hour response objective
• Onsite Support (DCA Only)
    – Installation of replacement parts
    – Replacement parts shipped for next business day arrival
    – GP SW upgrade included
• Proactive Service
    – Secure remote monitoring for hardware (DCA)
    – Notification of engineering technical advisories
    – Built-in tools maximize stability and performance
• Secure Self-Help
    – 24x7 access to eService support tools including knowledgebase, forums, and appropriately licensed software updates





Other Relevant Greenplum Sessions
Session                                                  Presenter         Times
Unified Analytics Platform Introduction                  Brian Wilson      Tues 10:00-11:00   Thurs 1:00-2:00
Greenplum Hadoop Overview                                Susheel Kaushik   Mon 10:00-11:00    Wed 4:15-5:15
Greenplum DCA Overview                                   Hanxi Chen        Mon 4:00-5:00      Thurs 10:00-11:00
Greenplum Analytics Workbench                            Apurva Desai      Wed 8:30-9:30      Thurs 10:00-11:00
Analytics on Hadoop                                      Don Miner         Tues 11:30-12:30   Thurs 8:30-9:30
Big Data Driven Businesses in Action:                    Mike Maxey        Wed 4:15-5:15      Thurs 11:30-12:30
Creating Real Business Value Using
Greenplum UAP (Panel w/4 Customers)
Analytics for Business Value: Collaboration              Josh Klahr        Mon 10:00-11:00    Wed 2:45-3:45
Disruptive Data Science — How Data                       Annika Jimenez    Tues 4:15-5:15     Thurs 11:30-12:30
Science and Big Data are Transforming                    David Dietrich
Business, IT and People




Thank You




Greenplum Database Overview

More Related Content

What's hot

High Availability in MySQL 8 using InnoDB Cluster
High Availability in MySQL 8 using InnoDB ClusterHigh Availability in MySQL 8 using InnoDB Cluster
High Availability in MySQL 8 using InnoDB ClusterSven Sandberg
 
MySQL Group Replication
MySQL Group ReplicationMySQL Group Replication
MySQL Group ReplicationUlf Wendel
 
Introduction to MySQL InnoDB Cluster
Introduction to MySQL InnoDB ClusterIntroduction to MySQL InnoDB Cluster
Introduction to MySQL InnoDB ClusterFrederic Descamps
 
MySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & ClusterMySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & ClusterKenny Gryp
 
MySQL InnoDB Cluster - New Features in 8.0 Releases - Best Practices
MySQL InnoDB Cluster - New Features in 8.0 Releases - Best PracticesMySQL InnoDB Cluster - New Features in 8.0 Releases - Best Practices
MySQL InnoDB Cluster - New Features in 8.0 Releases - Best PracticesKenny Gryp
 
EMR 플랫폼 기반의 Spark 워크로드 실행 최적화 방안 - 정세웅, AWS 솔루션즈 아키텍트:: AWS Summit Online Ko...
EMR 플랫폼 기반의 Spark 워크로드 실행 최적화 방안 - 정세웅, AWS 솔루션즈 아키텍트::  AWS Summit Online Ko...EMR 플랫폼 기반의 Spark 워크로드 실행 최적화 방안 - 정세웅, AWS 솔루션즈 아키텍트::  AWS Summit Online Ko...
EMR 플랫폼 기반의 Spark 워크로드 실행 최적화 방안 - 정세웅, AWS 솔루션즈 아키텍트:: AWS Summit Online Ko...Amazon Web Services Korea
 
Introduction to Greenplum
Introduction to GreenplumIntroduction to Greenplum
Introduction to GreenplumDave Cramer
 
Hadoop & Greenplum: Why Do Such a Thing?
Hadoop & Greenplum: Why Do Such a Thing?Hadoop & Greenplum: Why Do Such a Thing?
Hadoop & Greenplum: Why Do Such a Thing?Ed Kohlwey
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
DataOpsbarcelona 2019: Deep dive into MySQL Group Replication... the magic e...
DataOpsbarcelona 2019:  Deep dive into MySQL Group Replication... the magic e...DataOpsbarcelona 2019:  Deep dive into MySQL Group Replication... the magic e...
DataOpsbarcelona 2019: Deep dive into MySQL Group Replication... the magic e...Frederic Descamps
 
Overview of new features in Apache Ranger
Overview of new features in Apache RangerOverview of new features in Apache Ranger
Overview of new features in Apache RangerDataWorks Summit
 
Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)
Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)
Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)NTT DATA OSS Professional Services
 
MariaDB 제품 소개
MariaDB 제품 소개MariaDB 제품 소개
MariaDB 제품 소개NeoClova
 
Percona Live 2022 - The Evolution of a MySQL Database System
Percona Live 2022 - The Evolution of a MySQL Database SystemPercona Live 2022 - The Evolution of a MySQL Database System
Percona Live 2022 - The Evolution of a MySQL Database SystemFrederic Descamps
 
Using Terraform to manage the configuration of a Cisco ACI fabric.
Using Terraform to manage the configuration of a Cisco ACI fabric.Using Terraform to manage the configuration of a Cisco ACI fabric.
Using Terraform to manage the configuration of a Cisco ACI fabric.Joel W. King
 
ClickHouse Defense Against the Dark Arts - Intro to Security and Privacy
ClickHouse Defense Against the Dark Arts - Intro to Security and PrivacyClickHouse Defense Against the Dark Arts - Intro to Security and Privacy
ClickHouse Defense Against the Dark Arts - Intro to Security and PrivacyAltinity Ltd
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsAlexander Korotkov
 
uReplicator: Uber Engineering’s Scalable, Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable,  Robust Kafka ReplicatoruReplicator: Uber Engineering’s Scalable,  Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable, Robust Kafka ReplicatorMichael Hongliang Xu
 

What's hot (20)

High Availability in MySQL 8 using InnoDB Cluster
High Availability in MySQL 8 using InnoDB ClusterHigh Availability in MySQL 8 using InnoDB Cluster
High Availability in MySQL 8 using InnoDB Cluster
 
MySQL Group Replication
MySQL Group ReplicationMySQL Group Replication
MySQL Group Replication
 
Introduction to MySQL InnoDB Cluster
Introduction to MySQL InnoDB ClusterIntroduction to MySQL InnoDB Cluster
Introduction to MySQL InnoDB Cluster
 
MySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & ClusterMySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & Cluster
 
MySQL InnoDB Cluster - New Features in 8.0 Releases - Best Practices
MySQL InnoDB Cluster - New Features in 8.0 Releases - Best PracticesMySQL InnoDB Cluster - New Features in 8.0 Releases - Best Practices
MySQL InnoDB Cluster - New Features in 8.0 Releases - Best Practices
 
EMR 플랫폼 기반의 Spark 워크로드 실행 최적화 방안 - 정세웅, AWS 솔루션즈 아키텍트:: AWS Summit Online Ko...
EMR 플랫폼 기반의 Spark 워크로드 실행 최적화 방안 - 정세웅, AWS 솔루션즈 아키텍트::  AWS Summit Online Ko...EMR 플랫폼 기반의 Spark 워크로드 실행 최적화 방안 - 정세웅, AWS 솔루션즈 아키텍트::  AWS Summit Online Ko...
EMR 플랫폼 기반의 Spark 워크로드 실행 최적화 방안 - 정세웅, AWS 솔루션즈 아키텍트:: AWS Summit Online Ko...
 
Introduction to Greenplum
Introduction to GreenplumIntroduction to Greenplum
Introduction to Greenplum
 
Hadoop & Greenplum: Why Do Such a Thing?
Hadoop & Greenplum: Why Do Such a Thing?Hadoop & Greenplum: Why Do Such a Thing?
Hadoop & Greenplum: Why Do Such a Thing?
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
DataOpsbarcelona 2019: Deep dive into MySQL Group Replication... the magic e...
DataOpsbarcelona 2019:  Deep dive into MySQL Group Replication... the magic e...DataOpsbarcelona 2019:  Deep dive into MySQL Group Replication... the magic e...
DataOpsbarcelona 2019: Deep dive into MySQL Group Replication... the magic e...
 
Node Labels in YARN
Node Labels in YARNNode Labels in YARN
Node Labels in YARN
 
Overview of new features in Apache Ranger
Overview of new features in Apache RangerOverview of new features in Apache Ranger
Overview of new features in Apache Ranger
 
Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)
Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)
Apache Sparkに手を出してヤケドしないための基本 ~「Apache Spark入門より」~ (デブサミ 2016 講演資料)
 
InnoDb Vs NDB Cluster
InnoDb Vs NDB ClusterInnoDb Vs NDB Cluster
InnoDb Vs NDB Cluster
 
MariaDB 제품 소개
MariaDB 제품 소개MariaDB 제품 소개
MariaDB 제품 소개
 
Percona Live 2022 - The Evolution of a MySQL Database System
Percona Live 2022 - The Evolution of a MySQL Database SystemPercona Live 2022 - The Evolution of a MySQL Database System
Percona Live 2022 - The Evolution of a MySQL Database System
 
Using Terraform to manage the configuration of a Cisco ACI fabric.
Using Terraform to manage the configuration of a Cisco ACI fabric.Using Terraform to manage the configuration of a Cisco ACI fabric.
Using Terraform to manage the configuration of a Cisco ACI fabric.
 
ClickHouse Defense Against the Dark Arts - Intro to Security and Privacy
ClickHouse Defense Against the Dark Arts - Intro to Security and PrivacyClickHouse Defense Against the Dark Arts - Intro to Security and Privacy
ClickHouse Defense Against the Dark Arts - Intro to Security and Privacy
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
 
uReplicator: Uber Engineering’s Scalable, Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable,  Robust Kafka ReplicatoruReplicator: Uber Engineering’s Scalable,  Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable, Robust Kafka Replicator
 

Greenplum Database Overview

  • 1. Greenplum Database Overview. Michael Crutcher, Greenplum Product Management. © Copyright 2012 EMC Corporation. All rights reserved.
  • 5. Greenplum Unified Analytic Platform
  • 6. GREENPLUM DATABASE: Industry-Leading Database with Massively Parallel Performance to Empower Your Analytics
  • 7. GREENPLUM DATABASE: Extreme Performance for Analytics
      • Optimized for BI and analytics
          – Deep integration with statistical packages
          – High performance parallel implementations
      • Simple and automatic
          – Just load and query like any database
          – Tables are automatically distributed across nodes
      • Extremely scalable
          – MPP shared-nothing architecture
          – All nodes can scan and process in parallel
          – Linear scalability by adding nodes
  • 8. GREENPLUM DATABASE: Performance Through Parallelism
      • Master Servers: query planning & dispatch
      • Network Interconnect
      • Segment Servers: query processing & data storage
      • External Sources: loading, streaming, etc.
  • 9. GREENPLUM DATABASE: Greenplum Delivers Choice & Flexibility
      • Greenplum Data Computing Appliance
          – Choose Greenplum Database and/or Hadoop modules in ¼ rack increments
          – Scale up by adding your choice of additional modules
          – Minimal time to value
      • Greenplum Software Solutions
          – Greenplum Database, Hadoop, & Chorus on your x86 hardware
          – Flexibility for any workload or environment
          – Perpetual or subscription licenses
  • 10. GREENPLUM DATABASE: Core Functionality
  • 11. GREENPLUM DATABASE: Component Overview
      • CLIENT ACCESS & TOOLS
          – Client access: ODBC, JDBC, OLEDB, MapReduce, etc.
          – 3rd party tools: BI Tools, ETL Tools, Data Mining, etc.
          – Admin tools: Greenplum Command Center, Greenplum Package Manager
      • PRODUCT FEATURES
          – Loading & ext. access: Petabyte-Scale Loading, Trickle Micro-Batching, Anywhere Data Access, External Table Support
          – Storage & data access: Hybrid Storage & Execution (Row- & Column-Oriented), In-Database Compression, Multi-Level Partitioning, Indexes (Btree, Bitmap, etc.)
          – Language support: Comprehensive SQL, Native MapReduce, SQL 2003 OLAP Extensions, Programmable Analytics, Analytics Extensions (GeoSpatial, PL/R, PL/Java, PL/Python, PL/Perl)
      • ADAPTIVE SERVICES
          – Multi-Level Fault Tolerance (RAID, Mirroring, DR with Data Domain Boost)
          – Online System Expansion
          – Workload Management
      • CORE MPP ARCHITECTURE
          – Shared-Nothing MPP
          – Parallel Dataflow Engine
          – Parallel Query Optimizer
          – gNet™ Software Interconnect
          – Polymorphic Data Storage™
          – Scatter/Gather Streaming™ Data Loading
  • 12. GREENPLUM DATABASE: Most Powerful Data Loading Capabilities
      • Industry-leading performance at 10+ TB per hour per rack
      • Scatter-Gather Streaming™ provides true linear scaling
      • Support for both large-batch and continuous real-time loading strategies
      • Enable complex data transformations "in-flight"
      • Transparent interfaces to loading via support files, application, and services
      • Chart: single rack comparison of Greenplum, Oracle Exadata, Netezza, and Teradata. Greenplum load rates scale linearly with the number of racks, others do not. For example, two racks = >20 TB/h
  • 13. GREENPLUM DATABASE: Polymorphic Table Storage™
      • Diagram: table 'CUSTOMER' partitioned by month from Mar '11 to Nov '11; column-oriented for COLD DATA, row-oriented for HOT DATA
      • Storage types can be mixed within a table or database
          – Four table types: heap, row-oriented AO, column-oriented AO, external
      • Rich compression functionality, definable column by column
          – Block compression: Gzip (levels 1-9), QuickLZ
          – Stream compression: RLE (levels 1-4)
      • Flexible indexing, partitioning, and more
  • 14. GREENPLUM DATABASE: gNet Software Interconnect
      • A supercomputing-based "soft-switch" responsible for
          – Efficiently pumping streams of data between motion nodes during query-plan execution
          – Delivering messages, moving data, collecting results, and coordinating work among the segments in the system
  • 15. GREENPLUM DATABASE: Parallel Query Optimizer
      • Cost-based optimization looks for the most efficient plan
      • Physical plan contains scans, joins, sorts, aggregations, etc.
      • Global planning avoids sub-optimal 'SQL pushing' to segments
      • Directly inserts 'motion' nodes for inter-segment communication
      • Diagram: physical execution plan from SQL or MapReduce. Operators shown: Gather Motion 4:1 (Slice 3), Sort, HashAggregate, HashJoin, Redistribute Motion 4:4 (Slice 1), Hash, Broadcast Motion 4:4 (Slice 2), Seq Scan on lineitem, Seq Scan on customer, Seq Scan on orders, Seq Scan on motion
  • 16. GREENPLUM DATABASE: Analytics Overview
  • 17. GREENPLUM DATABASE: Analytical Capabilities Overview
      • Data Access & Query Layer: ODBC, JDBC
      • SQL, Stored Procedures, SQL 2003 OLAP, In-Database Analytics, MapReduce
      • Polymorphic Storage
      • Greenplum gNet
      • GREENPLUM HD
  • 18. GREENPLUM DATABASE In-Database Analytics: Categories Data Access & Query Layer ODBC JDBC SQL In-Database Analytics Embedded SAS Scoring Accelerator Partner GPDB User-Written Open Source Embedded Analytical Extensions Analytics SAS/HPA Algorithms Open-Source High Performance Analytics User-written GREENPLUM DATABASE
  • 19. GREENPLUM DATABASE: Analytics Highlight: MADlib
      • Scalable in-database analytics
      • Data-parallel
          – Mathematical algorithms
          – Statistical algorithms
          – Machine learning algorithms
          – Supports structured and unstructured data
      • Open-source software
          – Source accessibility
          – Converge business, academic, and open-source communities
  • 20. GREENPLUM DATABASE: Manageability, Extensions
  • 21. GREENPLUM DATABASE: Easy Manageability for Big Data
      • Single console for both Database and Hadoop
      • Administration
          – Start, Stop Database
          – Recover, Rebalance Segments
      • Interactive view of System Metrics
          – Real-time
          – Historic (configurable by time period)
      • In-depth view for System Health
          – Hardware health
          – Software (Database, Hadoop)
      • Query Monitoring
          – Search, Prioritize, Cancel Queries
          – View Query's Execution Plan
      • Workload Management
          – Configure Resource Queues
          – Prioritize Users
  • 22. GREENPLUM DATABASE: Easy Extension Installation
      • Greenplum Package Manager: Greenplum supports easy deployment of numerous extensions like MADlib, PL/Perl, PL/Java, PostGIS, etc.
      • Diagram: Master Servers and Segment Servers
  • 23. GREENPLUM DATABASE: High Performance gNet for Hadoop
      • Parallel Query Access
      • Connect any data set in Hadoop to GP DB's SQL Engine
      • Process Hadoop data in place
      • Parallelize data import/export from/to Hadoop thanks to GP DB's market-leading data sharing performance
      • Supported formats:
          – Text (compressed and uncompressed)
          – Binary
          – Proprietary/user-defined
      • GP HD 1.x, GP MR 1.x, CDH3u2
  • 24. GREENPLUM DATABASE: High Availability, Backup, Support
  • 25. GREENPLUM DATABASE: High Availability
      • GPDB cluster
          – 2 Master servers
          – Multiple Segment servers
      • Segment servers support multiple database instances
          – Primary instances that actively process queries
          – Standby mirror instances
      • Block level mirroring
          – Low resource consumption
          – Differential resynch capable for fast recovery
      • Diagram: set of active segment instances
  • 26. GREENPLUM DATABASE: Backup/Restore with EMC Data Domain
      • Integration options
          – NFS: Data Domain device mounted as NFS storage
          – DD Boost: native, client-side deduplication. Supported in GPDB 4.2 and higher
      • Drastic reduction in backup storage requirement
      • Back up all segment servers in parallel directly to Data Domain
      • Data Domain integrates seamlessly into standard Greenplum full backup, data export, and data restore procedures
      • [Diagram labels: Full Appliance + Data Domain; Boost or NFS; 2 x 10 Gbit IP]
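To show why client-side deduplication reduces both stored and transferred data, here is a toy sketch. It is not the DD Boost algorithm, which the slides do not describe; it assumes fixed-size chunks identified by SHA-256 digests, and the names `DedupStore`, `backup`, and `restore` are hypothetical.

```python
import hashlib


def split_chunks(data: bytes, size: int) -> list[bytes]:
    """Cut the backup stream into fixed-size pieces (an assumption of this sketch)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class DedupStore:
    """Stand-in for a backup target that keeps each unique chunk once."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}

    def backup(self, data: bytes, chunk_size: int = 4) -> tuple[list[str], int]:
        """Store a backup; return its recipe and the bytes actually sent."""
        recipe = []
        sent = 0
        for piece in split_chunks(data, chunk_size):
            digest = hashlib.sha256(piece).hexdigest()
            if digest not in self.chunks:
                # Only chunks the target has never seen cross the network.
                self.chunks[digest] = piece
                sent += len(piece)
            recipe.append(digest)
        return recipe, sent

    def restore(self, recipe: list[str]) -> bytes:
        """Rebuild the original stream from its list of chunk digests."""
        return b"".join(self.chunks[d] for d in recipe)


if __name__ == "__main__":
    store = DedupStore()
    first, sent_first = store.backup(b"AAAABBBBAAAA")
    second, sent_second = store.backup(b"AAAABBBBCCCC")
    print(sent_first, sent_second)  # bytes sent for each backup
```

A second backup that shares most of its content with the first sends only the chunks that changed, which is the effect behind the "drastic reduction in backup storage requirement" bullet.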
  • 27. GREENPLUM DATABASE: Backup/Restore with EMC Data Domain
      Backup and restore between remote and primary sites
      • [Diagram labels: Greenplum DCA and Data Domain at each site; LAN/WAN; Data Domain Replication]
      • Ideal for configurations with RPO and RTO requirements that can be specified in hours
      • Supports:
          – Collection replication for DD Boost backup
          – Directory-level replication for NFS backup
          – Encryption over the WAN
  • 28. GREENPLUM DATABASE: Customer Support Services
      • Remote Technical Support
          – 24x7 technical support and remote troubleshooting
          – Customer-managed case severity level
          – Four-hour response objective
      • Onsite Support (DCA only)
          – Installation of replacement parts
          – Replacement parts shipped for next-business-day arrival
          – GP SW upgrade included
      • Proactive Service
          – Secure remote monitoring for hardware (DCA)
          – Notification of engineering technical advisories
          – Built-in tools maximize stability and performance
      • Secure Self-Help
          – 24x7 access to eService support tools including knowledgebase, forums, and appropriately licensed software updates
  • 29. GREENPLUM DATABASE: Other Relevant Greenplum Sessions
      Session | Presenter | Times
      • Unified Analytics Platform Introduction | Brian Wilson | Tues 10:00-11:00, Thurs 1:00-2:00
      • Greenplum Hadoop Overview | Susheel Kaushik | Mon 10:00-11:00, Wed 4:15-5:15
      • Greenplum DCA Overview | Hanxi Chen | Mon 4:00-5:00, Thurs 10:00-11:00
      • Greenplum Analytics Workbench | Apurva Desai | Wed 8:30-9:30, Thurs 10:00-11:00
      • Analytics on Hadoop | Don Miner | Tues 11:30-12:30, Thurs 8:30-9:30
      • Big Data Driven Businesses in Action: Creating Real Business Value Using Greenplum UAP (Panel w/4 Customers) | Mike Maxey | Wed 4:15-5:15, Thurs 11:30-12:30
      • Analytics for Business Value: Collaboration | Josh Klahr | Mon 10:00-11:00, Wed 2:45-3:45
      • Disruptive Data Science: How Data Science and Big Data are Transforming Business, IT and People | Annika Jimenez, David Dietrich | Tues 4:15-5:15, Thurs 11:30-12:30
  • 30. Thank You