Data Computing Division
© Copyright 2011 EMC Corporation. All rights reserved.
Presented by Brijesh Kumar Awasthi (IMP2014002)
What is Greenplum?
•Greenplum, the company, was founded in
September 2003 by Scott Yara and Luke
Lonergan.
•It was a merger of two smaller companies
Metapa in Los Angeles and Didera
in Fairfax, Virginia
•Greenplum, based in San Mateo,
California, released its database
management system software in April 2005,
calling it Bizgres.
EMC ACQUIRES GREENPLUM
Greenplum Becomes the Foundation of
EMCʼs Data Computing Division
“Greenplum, with expertise in the massively parallel arena, will
give the storage giant a boost in big-data computing.”
– InformationWeek –
“For three years, Gartner has identified Greenplum as
the most advanced vendor in the visionary
quadrant of its data warehouse DBMS Magic Quadrant….”
– Gartner
What the COO of EMC said about Greenplum and BI
New Realities…
New Demands!
• Do it faster
– Ingest more data
– Ingest it faster
– Keep it unsummarised, keep it for longer
• Be more responsive
– Unpredictable queries, rapidly evolving bespoke analytics
– New tools: Hadoop, MapReduce, Hive, HBase, “R”
• Manage new data types
– Manage and allow queries across structured, semi-structured and unstructured data
• Do it at a lower cost
Big Data will revolutionize
Data Warehousing and analysis.
Why Greenplum?
• Fast Data Loading
• Extreme Performance & Elastic Scalability
• Unified Data Access
• EMC Greenplum is a shared nothing, massively parallel
processing (MPP) data warehouse system
• Core principle of data computing is to move the processing
dramatically closer to the data and to the people
Architecture diagram components:
• Clients: standard business intelligence and analytical tools, connected via SQL
• Master server: query planning & dispatch
• Network interconnect
• Segment servers: query processing & data storage
• Data sources: loading, streaming, etc.
– External files, URLs, Hadoop (HDFS), web services (including from other DBs), O/S pipes (including from other DBs)
• Hadoop MapReduce
Diagram callouts:
• Clients see a single database
• Primary server, plus hot failover
• Queries distributed across all available resources
• Shared-nothing, massively parallel processing means no bottlenecks and linear scalability
• Data loading also takes advantage of the MPP architecture
• Greenplum handles structured, semi-structured and unstructured data
Why is MPP different?
Greenplum is a Scale-Out Architecture on standard commodity
hardware
MPP
• Queries are shipped to each node simultaneously
• Each segment instance executes in parallel
• Multiple pipelines of data
• Highly scalable topology
• Locks and buffers are not shared
Traditional
• A single database buffer is used by all user
operations
• More locks mean a more complex lock
management system
• Single pipe to data
• Limited Scalability
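The MPP column above can be sketched in a few lines of Python. This is only an illustration of the scatter-and-combine idea, not Greenplum code: the `SEGMENTS` data, the function names and the use of a thread pool to stand in for separate segment servers are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list is the slice of one table column
# held by one segment. Nothing is shared between the slices.
SEGMENTS = [
    [12, 111, 42],
    [64, 32, 12],
    [34, 213, 15],
    [102, 82, 55],
]


def segment_partial_sum(rows):
    """Work done independently on one segment: scan local rows only."""
    return sum(rows), len(rows)


def master_average(segments):
    """Master role: ship the same task to every segment, then combine."""
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        partials = list(pool.map(segment_partial_sum, segments))
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count


if __name__ == "__main__":
    print(master_average(SEGMENTS))
```

Each worker touches only its own slice, so adding a segment adds another independent pipe to the data rather than more contention on a single buffer.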
Partitioning: The Key to Parallelism
Strategy: Spread data evenly across
as many nodes (and disks) as possible
Greenplum Database
High Speed Loader
Table: Order
Order #   Order Date    Customer ID
43        Oct 20 2005   12
64        Oct 20 2005   111
45        Oct 20 2005   42
46        Oct 20 2005   64
77        Oct 20 2005   32
48        Oct 20 2005   12
50        Oct 20 2005   34
56        Oct 20 2005   213
63        Oct 20 2005   15
44        Oct 20 2005   102
53        Oct 20 2005   82
55        Oct 20 2005   55
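A minimal sketch of the "spread data evenly" strategy, using the order rows from the slide. The number of segments (four) and the modulo rule are assumptions chosen for readability; a real system would use a proper hash function on the distribution key, and the slide does not say which column is the key.

```python
from collections import Counter

# (order_number, order_date, customer_id) rows from the slide's Order table.
ORDERS = [
    (43, "Oct 20 2005", 12), (64, "Oct 20 2005", 111),
    (45, "Oct 20 2005", 42), (46, "Oct 20 2005", 64),
    (77, "Oct 20 2005", 32), (48, "Oct 20 2005", 12),
    (50, "Oct 20 2005", 34), (56, "Oct 20 2005", 213),
    (63, "Oct 20 2005", 15), (44, "Oct 20 2005", 102),
    (53, "Oct 20 2005", 82), (55, "Oct 20 2005", 55),
]

NUM_SEGMENTS = 4  # assumed cluster size, not taken from the slides


def segment_for(order_number, num_segments=NUM_SEGMENTS):
    """Pick a segment from the distribution key (modulo as a stand-in hash)."""
    return order_number % num_segments


def distribute(rows, num_segments=NUM_SEGMENTS):
    """Assign every row to exactly one segment based on its order number."""
    segments = {i: [] for i in range(num_segments)}
    for row in rows:
        segments[segment_for(row[0], num_segments)].append(row)
    return segments


def rows_per_segment(rows, num_segments=NUM_SEGMENTS):
    """Count how many rows land on each segment, to check for skew."""
    return Counter(segment_for(row[0], num_segments) for row in rows)


if __name__ == "__main__":
    for segment, rows in distribute(ORDERS).items():
        print(segment, [row[0] for row in rows])
```

Distributing on a high-cardinality column such as the order number keeps the slices close in size; distributing on the order date would put all twelve rows on one segment, because every row shares the same date.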
Greenplum Database
Powerful Data Loading Capabilities
• Industry leading performance:
– >10TB per hour per rack
• Innovative, parallel-everything architecture:
– Scatter-Gather Streaming™ provides true linear scaling
– Support for both large-batch and continuous real-time loading
strategies
– Enable complex data transformations “in-flight”
– Transparent interfaces to loading via support for files, applications and
services
Traditional Loading vs Greenplum DB Parallel Loading
[Diagram: conventional loading compared with Greenplum parallel loading. Labels: ETL servers, interconnect, segment nodes.]
Advanced pipeline process for fast operation
[Diagram: a client sends a sort request to the master server, which passes it to three segment servers.]
Before the sort, each column holds the values on one segment server:
 9   6  10
 2  11   5
 4   3  12
 1   7   8
After the sort, each segment server holds its own values in order:
 1   3   5
 2   6   8
 4   7  10
 9  11  12
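The two grids above can be reproduced with a short sketch. It reads each column of the grid as the data on one segment server; the final merge on the master is an assumption, since the slides only show the per-segment result. All names are hypothetical.

```python
import heapq

# One list per segment server: the columns of the first grid on the slide.
SEGMENT_DATA = [
    [9, 2, 4, 1],
    [6, 11, 3, 7],
    [10, 5, 12, 8],
]


def segment_sort(values):
    """Each segment sorts only the values it stores locally."""
    return sorted(values)


def master_merge(sorted_runs):
    """Assumed master step: merge the already-sorted runs into one stream."""
    return list(heapq.merge(*sorted_runs))


def parallel_sort(segments):
    """Sort on every segment, then merge the sorted runs."""
    runs = [segment_sort(values) for values in segments]
    return runs, master_merge(runs)


if __name__ == "__main__":
    runs, merged = parallel_sort(SEGMENT_DATA)
    print(runs)
    print(merged)
```

Because every run arrives sorted, the merge only ever compares the head of each run, so the expensive sorting work stays on the segments.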
Greenplum Database
Extreme Performance
• Optimized for BI and Analytics
– Rich eco-system of partners
• Provides automatic parallelization
– Just load and query like any database
– Tables are automatically distributed across
nodes
– No need for manual partitioning or tuning
• Extremely scalable MPP shared-nothing
Architecture
– All nodes can scan and process in parallel
– Linear scalability by adding nodes
Platform Independence
Delivers Choice and Flexibility
Virtualized Infrastructure
• Pool resources
• Elastic scalability
Data Computing Appliance
• Optimized Price/Performance
• Minimum time-to-value
• Ideal for Production Environments
Software-Only
• On your x86 hardware
• Flexibility for any workload
[Diagram: table ‘Customer’ partitioned by month, Jan ’09 to Nov ’09, with a different storage type per group of partitions.]
• Column-oriented, archival compression
• Column-oriented, fast compression
• Row-oriented, fast compression
Greenplum Polymorphic Data Storage
• Greenplum Databaseʼs engine provides a flexible storage model
– Four table types: heap, row-oriented, column-oriented, external
– Block compression: Gzip (levels 1-9), QuickLZ
• Storage types can be mixed within a database, and even within a table
– Fully configurable via table DDL and partitioning syntax
– You may also choose to index some partitions and not others
• Gives customers the choice of processing model for any table or partition
– Tables/partitions of different storage types can be joined together without restriction
– Highly tuned – e.g. columnar does efficient pre-projection and parallel execution
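A small sketch of why a column-oriented layout helps queries that touch few columns (the "pre-projection" mentioned above). It is a toy in-memory model, not Greenplum's storage engine; the field names and the "values read" counter are assumptions used to make the difference visible.

```python
# Hypothetical rows, using the first four orders from the earlier Order table.
ROWS = [
    {"order_no": 43, "order_date": "2005-10-20", "customer_id": 12},
    {"order_no": 64, "order_date": "2005-10-20", "customer_id": 111},
    {"order_no": 45, "order_date": "2005-10-20", "customer_id": 42},
    {"order_no": 46, "order_date": "2005-10-20", "customer_id": 64},
]


def to_columns(rows):
    """Pivot row-oriented records into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_store(rows, column):
    """Row layout: every field of every row is read to reach one column."""
    total = 0
    values_read = 0
    for row in rows:
        values_read += len(row)
        total += row[column]
    return total, values_read


def sum_column_store(columns, column):
    """Column layout: only the requested column is read."""
    values = columns[column]
    return sum(values), len(values)


if __name__ == "__main__":
    print(sum_row_store(ROWS, "customer_id"))
    print(sum_column_store(to_columns(ROWS), "customer_id"))
```

Both layouts return the same answer; the column layout simply reads fewer values, which is the trade-off that makes it attractive for wide analytical tables and less so for whole-row lookups.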
Unified Data Access Across The Enterprise
• Workload Management
– Connection management controls how many
users can be connected and assigns them to a
queue
– User-based resource queues allow for control of the
total number or cost of queries allowed at any point
in time.
• Dynamic Query Prioritization
– Patent pending technique of dynamically
balancing resources across running queries
– Allows DBAs to control query priorities in real-
time, or determine default priorities by resource
queue
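The resource-queue idea can be sketched as an admission controller that caps the number of running queries and makes the rest wait. This is a simplified model under stated assumptions: the class, its method names and the first-in-first-out release rule are hypothetical, and cost-based limits and dynamic prioritization are left out.

```python
from collections import deque


class ResourceQueue:
    """Toy admission control: at most `active_limit` queries run at once."""

    def __init__(self, name, active_limit):
        self.name = name
        self.active_limit = active_limit
        self.running = set()
        self.waiting = deque()

    def submit(self, query_id):
        """Start the query if a slot is free, otherwise park it."""
        if len(self.running) < self.active_limit:
            self.running.add(query_id)
            return "running"
        self.waiting.append(query_id)
        return "waiting"

    def finish(self, query_id):
        """Release a slot and promote the oldest waiting query, if any."""
        self.running.remove(query_id)
        if self.waiting:
            promoted = self.waiting.popleft()
            self.running.add(promoted)
            return promoted
        return None


if __name__ == "__main__":
    queue = ResourceQueue("adhoc", active_limit=2)
    for query in ("q1", "q2", "q3"):
        print(query, queue.submit(query))
    print("promoted:", queue.finish("q1"))
```

Assigning each user to a queue like this bounds the load any one group can place on the system, at the cost of some queries waiting for a slot.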
Greenplum Performance Monitor
Highly interactive web-based performance monitoring
Real-time and historic views of:
• Resource utilization
• Queries and query internals
Key Technical Requirements for HPA
• Technical Values
– Performance: massively parallel architecture
– Load speeds: 10TB/hr
– Integration with SAS
– In-database analytics using Java, PL/R, etc.
– Integration with many more BI and analytical tools
– Integration with Hadoop for unstructured data analysis
• Financial Value
– Lower total cost of ownership
– Best price/performance ratio in the industry for EDW/analytical appliance
• Operational Values
– No index maintenance
– Backup and recovery solution
– Most robust disaster recovery solution in the industry
– Best technical and customer support organization backing
Greenplum Customers -- Government
• Pacific Northwest National Labs
(Dept. of Energy) does
cyberanalytics.
• USAspending.gov traces the
outlays of the US Federal
Government.
• The Federal Reserve Bank of
Kansas City does economic
analysis mostly related to the
housing market.
• Recently, the Internal Revenue
Service purchased a DCA to do
work related to fraudulent tax
returns.
• ATO uses Greenplum as an investigatory
tool in their Compliance and Audit
Logging Unit.
High Performance Analytics
‘The power to know fast’
Thank you
Questions?
Data Computing Division
© Copyright 2011 EMC Corpora2on. All rights reserved. 22

More Related Content

What's hot

Containers and Big Data
Containers and Big DataContainers and Big Data
Containers and Big Data
DataWorks Summit
 
Spectrum Scale final
Spectrum Scale finalSpectrum Scale final
Spectrum Scale final
Joe Krotz
 
Running Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an Open Source Hybrid Cloud Data ArchitectureRunning Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
DataWorks Summit
 
Data Warehouse Design on Cloud ,A Big Data approach Part_One
Data Warehouse Design on Cloud ,A Big Data approach Part_OneData Warehouse Design on Cloud ,A Big Data approach Part_One
Data Warehouse Design on Cloud ,A Big Data approach Part_One
Panchaleswar Nayak
 
The Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data CentricThe Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data Centric
DataWorks Summit
 
Deep learning 101
Deep learning 101Deep learning 101
Deep learning 101
DataWorks Summit
 
IBM eX5 Workload Optimized x86 Servers
IBM eX5 Workload Optimized x86 ServersIBM eX5 Workload Optimized x86 Servers
IBM eX5 Workload Optimized x86 Servers
Cliff Kinard
 
New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13
EDB
 
Storage Efficiency Customer Success Stories Sept 2010 power point
Storage Efficiency Customer Success Stories Sept 2010 power pointStorage Efficiency Customer Success Stories Sept 2010 power point
Storage Efficiency Customer Success Stories Sept 2010 power point
Michael Hudak
 
Revolutionising Storage for your Future Business Requirements
Revolutionising Storage for your Future Business RequirementsRevolutionising Storage for your Future Business Requirements
Revolutionising Storage for your Future Business Requirements
NetApp
 
How HPC Transforms the Corporate Information Technology Ecosystem
How HPC Transforms the Corporate Information Technology EcosystemHow HPC Transforms the Corporate Information Technology Ecosystem
How HPC Transforms the Corporate Information Technology Ecosystem
inside-BigData.com
 
Achieving a 360-degree view of manufacturing via open source industrial data ...
Achieving a 360-degree view of manufacturing via open source industrial data ...Achieving a 360-degree view of manufacturing via open source industrial data ...
Achieving a 360-degree view of manufacturing via open source industrial data ...
DataWorks Summit
 
Scalable data pipeline
Scalable data pipelineScalable data pipeline
Scalable data pipeline
GreenM
 
IBM Power Systems at FIS InFocus 2019
IBM Power Systems at FIS InFocus 2019IBM Power Systems at FIS InFocus 2019
IBM Power Systems at FIS InFocus 2019
Paula Koziol
 
IBM #Softlayer infographic 2016
IBM #Softlayer infographic 2016IBM #Softlayer infographic 2016
IBM #Softlayer infographic 2016
Patrick Bouillaud
 
Air Flow Presentation Pdf
Air Flow  Presentation PdfAir Flow  Presentation Pdf
Air Flow Presentation Pdf
Danny Newman
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bk
IBM Switzerland
 
Break Free from Oracle
Break Free from OracleBreak Free from Oracle
Break Free from Oracle
EDB
 
Webinar: How Snapshots CAN be Backups
Webinar: How Snapshots CAN be BackupsWebinar: How Snapshots CAN be Backups
Webinar: How Snapshots CAN be Backups
Storage Switzerland
 
EDB Postgres & Tools in a Smart City Project
EDB Postgres & Tools in a Smart City ProjectEDB Postgres & Tools in a Smart City Project
EDB Postgres & Tools in a Smart City Project
EDB
 

What's hot (20)

Containers and Big Data
Containers and Big DataContainers and Big Data
Containers and Big Data
 
Spectrum Scale final
Spectrum Scale finalSpectrum Scale final
Spectrum Scale final
 
Running Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an Open Source Hybrid Cloud Data ArchitectureRunning Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
 
Data Warehouse Design on Cloud ,A Big Data approach Part_One
Data Warehouse Design on Cloud ,A Big Data approach Part_OneData Warehouse Design on Cloud ,A Big Data approach Part_One
Data Warehouse Design on Cloud ,A Big Data approach Part_One
 
The Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data CentricThe Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data Centric
 
Deep learning 101
Deep learning 101Deep learning 101
Deep learning 101
 
IBM eX5 Workload Optimized x86 Servers
IBM eX5 Workload Optimized x86 ServersIBM eX5 Workload Optimized x86 Servers
IBM eX5 Workload Optimized x86 Servers
 
New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13
 
Storage Efficiency Customer Success Stories Sept 2010 power point
Storage Efficiency Customer Success Stories Sept 2010 power pointStorage Efficiency Customer Success Stories Sept 2010 power point
Storage Efficiency Customer Success Stories Sept 2010 power point
 
Revolutionising Storage for your Future Business Requirements
Revolutionising Storage for your Future Business RequirementsRevolutionising Storage for your Future Business Requirements
Revolutionising Storage for your Future Business Requirements
 
How HPC Transforms the Corporate Information Technology Ecosystem
How HPC Transforms the Corporate Information Technology EcosystemHow HPC Transforms the Corporate Information Technology Ecosystem
How HPC Transforms the Corporate Information Technology Ecosystem
 
Achieving a 360-degree view of manufacturing via open source industrial data ...
Achieving a 360-degree view of manufacturing via open source industrial data ...Achieving a 360-degree view of manufacturing via open source industrial data ...
Achieving a 360-degree view of manufacturing via open source industrial data ...
 
Scalable data pipeline
Scalable data pipelineScalable data pipeline
Scalable data pipeline
 
IBM Power Systems at FIS InFocus 2019
IBM Power Systems at FIS InFocus 2019IBM Power Systems at FIS InFocus 2019
IBM Power Systems at FIS InFocus 2019
 
IBM #Softlayer infographic 2016
IBM #Softlayer infographic 2016IBM #Softlayer infographic 2016
IBM #Softlayer infographic 2016
 
Air Flow Presentation Pdf
Air Flow  Presentation PdfAir Flow  Presentation Pdf
Air Flow Presentation Pdf
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bk
 
Break Free from Oracle
Break Free from OracleBreak Free from Oracle
Break Free from Oracle
 
Webinar: How Snapshots CAN be Backups
Webinar: How Snapshots CAN be BackupsWebinar: How Snapshots CAN be Backups
Webinar: How Snapshots CAN be Backups
 
EDB Postgres & Tools in a Smart City Project
EDB Postgres & Tools in a Smart City ProjectEDB Postgres & Tools in a Smart City Project
EDB Postgres & Tools in a Smart City Project
 

Viewers also liked

IIT Delhi Presentation for Alumni
IIT Delhi Presentation for AlumniIIT Delhi Presentation for Alumni
IIT Delhi Presentation for Alumni
amit.kumar
 
Iiit allahabad case study
Iiit allahabad   case studyIiit allahabad   case study
Iiit allahabad case study
MOHAMMAD FARZAN
 
Signage system at IIT Guwahati, Thesis Report
Signage system at IIT Guwahati, Thesis ReportSignage system at IIT Guwahati, Thesis Report
Signage system at IIT Guwahati, Thesis Report
Deepak Kumar
 
Iiit delhi case study
Iiit delhi   case studyIiit delhi   case study
Iiit delhi case study
MOHAMMAD FARZAN
 
Iit delhi case study
Iit delhi case studyIit delhi case study
Iit delhi case study
MOHAMMAD FARZAN
 
IITK case study
IITK case studyIITK case study
IITK case study
Yeshu Rao
 
Louis i kahan
Louis i kahanLouis i kahan
Louis i kahan
vikashsaini78
 
Louis i kahn
Louis i kahnLouis i kahn
Louis i kahn
Viji Ramesh
 
Louis i kahn iim ahmedabad
Louis i kahn iim ahmedabadLouis i kahn iim ahmedabad
Louis i kahn iim ahmedabad
Tanzil Faraz
 
Architectural case study of IIM ahemdabad by louis i khan
Architectural case study of IIM ahemdabad by louis i khanArchitectural case study of IIM ahemdabad by louis i khan
Architectural case study of IIM ahemdabad by louis i khan
Rajat Katarne
 

Viewers also liked (10)

IIT Delhi Presentation for Alumni
IIT Delhi Presentation for AlumniIIT Delhi Presentation for Alumni
IIT Delhi Presentation for Alumni
 
Iiit allahabad case study
Iiit allahabad   case studyIiit allahabad   case study
Iiit allahabad case study
 
Signage system at IIT Guwahati, Thesis Report
Signage system at IIT Guwahati, Thesis ReportSignage system at IIT Guwahati, Thesis Report
Signage system at IIT Guwahati, Thesis Report
 
Iiit delhi case study
Iiit delhi   case studyIiit delhi   case study
Iiit delhi case study
 
Iit delhi case study
Iit delhi case studyIit delhi case study
Iit delhi case study
 
IITK case study
IITK case studyIITK case study
IITK case study
 
Louis i kahan
Louis i kahanLouis i kahan
Louis i kahan
 
Louis i kahn
Louis i kahnLouis i kahn
Louis i kahn
 
Louis i kahn iim ahmedabad
Louis i kahn iim ahmedabadLouis i kahn iim ahmedabad
Louis i kahn iim ahmedabad
 
Architectural case study of IIM ahemdabad by louis i khan
Architectural case study of IIM ahemdabad by louis i khanArchitectural case study of IIM ahemdabad by louis i khan
Architectural case study of IIM ahemdabad by louis i khan
 

Similar to Green Plum IIIT- Allahabad

EMC Unified Analytics Platform. Gintaras Pelenis
EMC Unified Analytics Platform. Gintaras PelenisEMC Unified Analytics Platform. Gintaras Pelenis
EMC Unified Analytics Platform. Gintaras Pelenis
Lietuvos kompiuterininkų sąjunga
 
Greenplum feature
Greenplum featureGreenplum feature
Greenplum feature
Ahmad Yani Emrizal
 
EMC Greenplum Database version 4.2
EMC Greenplum Database version 4.2 EMC Greenplum Database version 4.2
EMC Greenplum Database version 4.2
EMC
 
Greenplum Architecture
Greenplum ArchitectureGreenplum Architecture
Greenplum Architecture
Alexey Grishchenko
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-Purpose
DATAVERSITY
 
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
Paul Hofmann
 
IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015
Doug O'Flaherty
 
Software Defined Infrastructure
Software Defined InfrastructureSoftware Defined Infrastructure
Software Defined Infrastructure
inside-BigData.com
 
S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3
Tony Pearson
 
Data Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data LakeData Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data Lake
Denodo
 
How Yellowbrick Data Integrates to Existing Environments Webcast
How Yellowbrick Data Integrates to Existing Environments WebcastHow Yellowbrick Data Integrates to Existing Environments Webcast
How Yellowbrick Data Integrates to Existing Environments Webcast
Yellowbrick Data
 
Oracle databáze – Konsolidovaná Data Management Platforma
Oracle databáze – Konsolidovaná Data Management PlatformaOracle databáze – Konsolidovaná Data Management Platforma
Oracle databáze – Konsolidovaná Data Management Platforma
MarketingArrowECS_CZ
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
DATAVERSITY
 
Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)
Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)
Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)
MarketingArrowECS_CZ
 
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
MarketingArrowECS_CZ
 
A5 oracle exadata-the game changer for online transaction processing data w...
A5   oracle exadata-the game changer for online transaction processing data w...A5   oracle exadata-the game changer for online transaction processing data w...
A5 oracle exadata-the game changer for online transaction processing data w...
Dr. Wilfred Lin (Ph.D.)
 
The Future of Data Warehousing, Data Science and Machine Learning
The Future of Data Warehousing, Data Science and Machine LearningThe Future of Data Warehousing, Data Science and Machine Learning
The Future of Data Warehousing, Data Science and Machine Learning
ModusOptimum
 
Back to The Future V
Back to The Future VBack to The Future V
Back to The Future V
Magnus Backman
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
DATAVERSITY
 
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
Trivadis
 

Similar to Green Plum IIIT- Allahabad (20)

EMC Unified Analytics Platform. Gintaras Pelenis
EMC Unified Analytics Platform. Gintaras PelenisEMC Unified Analytics Platform. Gintaras Pelenis
EMC Unified Analytics Platform. Gintaras Pelenis
 
Greenplum feature
Greenplum featureGreenplum feature
Greenplum feature
 
EMC Greenplum Database version 4.2
EMC Greenplum Database version 4.2 EMC Greenplum Database version 4.2
EMC Greenplum Database version 4.2
 
Greenplum Architecture
Greenplum ArchitectureGreenplum Architecture
Greenplum Architecture
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-Purpose
 
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
 
IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015
 
Software Defined Infrastructure
Software Defined InfrastructureSoftware Defined Infrastructure
Software Defined Infrastructure
 
S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3
 
Data Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data LakeData Virtualization: An Essential Component of a Cloud Data Lake
Data Virtualization: An Essential Component of a Cloud Data Lake
 
How Yellowbrick Data Integrates to Existing Environments Webcast
How Yellowbrick Data Integrates to Existing Environments WebcastHow Yellowbrick Data Integrates to Existing Environments Webcast
How Yellowbrick Data Integrates to Existing Environments Webcast
 
Oracle databáze – Konsolidovaná Data Management Platforma
Oracle databáze – Konsolidovaná Data Management PlatformaOracle databáze – Konsolidovaná Data Management Platforma
Oracle databáze – Konsolidovaná Data Management Platforma
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 
Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)
Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)
Oracle databáze - zkonsolidovat, ochránit a ještě ušetřit! (1. část)
 
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?Jak konsolidovat Vaše databáze s využitím Cloud služeb?
Jak konsolidovat Vaše databáze s využitím Cloud služeb?
 
A5 oracle exadata-the game changer for online transaction processing data w...
A5   oracle exadata-the game changer for online transaction processing data w...A5   oracle exadata-the game changer for online transaction processing data w...
A5 oracle exadata-the game changer for online transaction processing data w...
 
The Future of Data Warehousing, Data Science and Machine Learning
The Future of Data Warehousing, Data Science and Machine LearningThe Future of Data Warehousing, Data Science and Machine Learning
The Future of Data Warehousing, Data Science and Machine Learning
 
Back to The Future V
Back to The Future VBack to The Future V
Back to The Future V
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
 

More from IIIT ALLAHABAD

Case Study IndiGo Airbus 250 A320neo aircraft Deal
    Case Study IndiGo Airbus  250 A320neo aircraft Deal    Case Study IndiGo Airbus  250 A320neo aircraft Deal
Case Study IndiGo Airbus 250 A320neo aircraft Deal
IIIT ALLAHABAD
 
Data Mining
Data MiningData Mining
Data Mining
IIIT ALLAHABAD
 
Art of war presentation 16273893930333333
Art of war presentation 16273893930333333Art of war presentation 16273893930333333
Art of war presentation 16273893930333333
IIIT ALLAHABAD
 
Mba it 2014-15
Mba it 2014-15Mba it 2014-15
Mba it 2014-15
IIIT ALLAHABAD
 
Summer Internship At investors Clinic Lucknow
Summer Internship At investors Clinic LucknowSummer Internship At investors Clinic Lucknow
Summer Internship At investors Clinic Lucknow
IIIT ALLAHABAD
 
E commerce-131110221615-phpapp02
E commerce-131110221615-phpapp02E commerce-131110221615-phpapp02
E commerce-131110221615-phpapp02
IIIT ALLAHABAD
 
Flipkart Big Billion Day
Flipkart Big Billion DayFlipkart Big Billion Day
Flipkart Big Billion Day
IIIT ALLAHABAD
 

More from IIIT ALLAHABAD (7)

Case Study IndiGo Airbus 250 A320neo aircraft Deal
    Case Study IndiGo Airbus  250 A320neo aircraft Deal    Case Study IndiGo Airbus  250 A320neo aircraft Deal
Case Study IndiGo Airbus 250 A320neo aircraft Deal
 
Data Mining
Data MiningData Mining
Data Mining
 
Art of war presentation 16273893930333333
Art of war presentation 16273893930333333Art of war presentation 16273893930333333
Art of war presentation 16273893930333333
 
Mba it 2014-15
Mba it 2014-15Mba it 2014-15
Mba it 2014-15
 
Summer Internship At investors Clinic Lucknow
Summer Internship At investors Clinic LucknowSummer Internship At investors Clinic Lucknow
Summer Internship At investors Clinic Lucknow
 
E commerce-131110221615-phpapp02
E commerce-131110221615-phpapp02E commerce-131110221615-phpapp02
E commerce-131110221615-phpapp02
 
Flipkart Big Billion Day
Flipkart Big Billion DayFlipkart Big Billion Day
Flipkart Big Billion Day
 


Green Plum IIIT- Allahabad

  • 1. Data Computing Division. © Copyright 2011 EMC Corporation. All rights reserved. Presented by Brijesh Kumar Awasthi, IMP2014002
  • 2. What is Greenplum?
      • Greenplum, the company, was founded in September 2003 by Scott Yara and Luke Lonergan.
      • It was a merger of two smaller companies: Metapa in Los Angeles and Didera in Fairfax, Virginia.
      • Greenplum, based in San Mateo, California, released its database management system software in April 2005, calling it Bizgres.
  • 3. EMC Acquires Greenplum: Greenplum Becomes the Foundation of EMC’s Data Computing Division
      “Greenplum, with expertise in the massively parallel arena, will give the storage giant a boost in big-data computing.” – InformationWeek
      “For three years, Gartner has identified Greenplum as the most advanced vendor in the visionary quadrant of its data warehouse DBMS Magic Quadrant…” – Gartner
  • 4. What the COO of EMC said about Greenplum and BI
  • 5. New Realities… New Demands!
      • Do it faster
          – Ingest more data
          – Ingest it faster
          – Keep it unsummarised, keep it for longer
      • Be more responsive
          – Unpredictable queries, rapidly evolving bespoke analytics
          – New tools: Hadoop, MapReduce, Hive, HBase, “R”
      • Manage new data types
          – Manage and allow queries across structured, semi-structured and unstructured data
      • Do it at a lower cost
      Big Data will revolutionize data warehousing and analysis.
  • 6. Why Greenplum? Fast Data Loading; Extreme Performance & Elastic Scalability; Unified Data Access
      • EMC Greenplum is a shared-nothing, massively parallel processing (MPP) data warehouse system.
      • The core principle of data computing is to move the processing dramatically closer to the data and to the people.
  • 7. [Architecture diagram]
      • Master Server: query planning & dispatch
      • Segment Servers: query processing & data storage
      • Network Interconnect
      • Hadoop MapReduce
      • Data sources (loading, streaming, etc.): external files, URLs, Hadoop (HDFS), web services (including from other DBs), O/S pipes (including from other DBs)
      • Clients: SQL, standard business intelligence (BI) and analytical tools
      Callouts:
      • Queries are distributed across all available resources.
      • Shared Nothing, Massively Parallel Processing means no bottlenecks and linear scalability.
      • Data loading also takes advantage of the MPP architecture.
      • Greenplum handles structured, semi-structured and unstructured data.
      • Clients see a single database primary server, plus hot failover.
  • 8. Why is MPP different? Greenplum is a Scale-Out Architecture on standard commodity hardware
      MPP
      • Queries shipped to each node simultaneously
      • Execute in parallel on each segment instance
      • Multiple pipelines of data
      • Highly scalable topology
      • Locks and buffers not shared
      Traditional
      • Single database buffer used by all user operations
      • More locks mean a more complex lock management system
      • Single pipe to data
      • Limited scalability
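The MPP side of this comparison (the same query shipped to every node, executed in parallel, nothing shared) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four in-memory lists stand in for segment instances, `segment_partial_sum` and `master_average` are hypothetical names, and a thread pool stands in for the network interconnect. It is not how Greenplum itself is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical segments: each owns a private slice of the rows (shared nothing).
SEGMENTS = [
    [12, 111, 42],
    [64, 32, 12],
    [34, 213, 15],
    [102, 82, 55],
]


def segment_partial_sum(rows):
    """Work done locally on one segment: scan only the rows it owns."""
    return sum(rows), len(rows)


def master_average(segments):
    """Ship the same task to every segment at once, then combine the partials."""
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        partials = list(pool.map(segment_partial_sum, segments))
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count


if __name__ == "__main__":
    print(master_average(SEGMENTS))
```

Each segment touches only its own rows, so no lock or buffer is shared between workers; the master combines one small partial result per segment.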
  • 9. Partitioning: The Key to Parallelism
      Strategy: Spread data evenly across as many nodes (and disks) as possible
      [Diagram: Greenplum Database High Speed Loader distributing the rows of table “Order”]
      Order#   Order Date    Customer ID
      43       Oct 20 2005   12
      64       Oct 20 2005   111
      45       Oct 20 2005   42
      46       Oct 20 2005   64
      77       Oct 20 2005   32
      48       Oct 20 2005   12
      50       Oct 20 2005   34
      56       Oct 20 2005   213
      63       Oct 20 2005   15
      44       Oct 20 2005   102
      53       Oct 20 2005   82
      55       Oct 20 2005   55
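One common way to spread rows evenly across nodes is to hash a distribution key and take the result modulo the number of segments. The sketch below applies that idea to the order numbers from this slide. The slides do not say which hash function Greenplum uses, so `zlib.crc32` is a stand-in chosen only because it is deterministic, and `segment_for`, `distribute` and the segment count of 4 are assumptions made for illustration.

```python
import zlib

# Order numbers taken from the slide's "Order" table.
ORDER_IDS = [43, 64, 45, 46, 77, 48, 50, 56, 63, 44, 53, 55]


def segment_for(key, num_segments):
    """Map a distribution key to a segment index (CRC32 is a stand-in hash)."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return digest % num_segments


def distribute(keys, num_segments):
    """Place every key on exactly one segment, chosen by its hash."""
    segments = [[] for _ in range(num_segments)]
    for key in keys:
        segments[segment_for(key, num_segments)].append(key)
    return segments


placement = distribute(ORDER_IDS, 4)

if __name__ == "__main__":
    for index, rows in enumerate(placement):
        print(f"segment {index}: {rows}")
```

Because the placement depends only on the key, any node can work out where a row lives without consulting a central directory. How even the spread is depends on the hash and on the key values.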
  • 10. Greenplum Database: Powerful Data Loading Capabilities
      • Industry-leading performance:
          – >10 TB per hour per rack
      • Innovative, parallel-everything architecture:
          – Scatter-Gather Streaming™ provides true linear scaling
          – Support for both large-batch and continuous real-time loading strategies
          – Enable complex data transformations “in-flight”
          – Transparent interfaces to loading via support files, application and services
  • 11. Traditional Loading vs Greenplum DB Parallel Loading
      [Diagram: in both cases, ETL servers connect through the interconnect to the segment nodes; one side is labelled “Conventional Loading”.]
  • 12. Advanced pipeline process for fast operation
      [Diagram: the client sends a sort request to the Master Server, which passes it to the Segment Servers. Values held on the segments before sorting: 9 6 10 2 11 5 4 3 12 1 7 8]
  • 13. Advanced pipeline process for fast operation
      [Diagram: Client, Master Server and Segment Servers. Values on the segments at this step: 1 3 5 2 6 8 4 7 10 9 11 12]
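The two pipeline slides do not spell out the individual steps, but a sort request in a shared-nothing system is often handled by sorting locally on every segment and merging the sorted streams on the master. The sketch below shows that pattern using the twelve values from the first pipeline slide. Grouping them three per segment is an assumption, and `parallel_sort` is a hypothetical helper rather than anything Greenplum exposes.

```python
import heapq

# Values from the slide, grouped three per segment (the grouping is assumed).
UNSORTED_SEGMENTS = [[9, 6, 10], [2, 11, 5], [4, 3, 12], [1, 7, 8]]


def parallel_sort(segments):
    """Sort each segment's rows locally, then merge the sorted streams."""
    # Step 1: every segment sorts only its own rows (parallel in a real system).
    locally_sorted = [sorted(rows) for rows in segments]
    # Step 2: the master lazily merges the already-sorted streams.
    return list(heapq.merge(*locally_sorted))


if __name__ == "__main__":
    print(parallel_sort(UNSORTED_SEGMENTS))
```

`heapq.merge` consumes its inputs lazily, so the merging side holds only one candidate row per segment at a time instead of re-sorting the full result.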
  • 14. Greenplum Database: Extreme Performance
      • Optimized for BI and analytics
          – Rich ecosystem of partners
      • Provides automatic parallelization
          – Just load and query like any database
          – Tables are automatically distributed across nodes
          – No need for manual partitioning or tuning
      • Extremely scalable MPP shared-nothing architecture
          – All nodes can scan and process in parallel
          – Linear scalability by adding nodes
      [Diagram labels: Interconnect, Loading]
  • 15. Platform Independence Delivers Choice and Flexibility
      Virtualized Infrastructure
      • Pool resources
      • Elastic scalability
      Data Computing Appliance
      • Optimized price/performance
      • Minimum time-to-value
      • Ideal for production environments
      Software-Only
      • On your x86 hardware
      • Flexibility for any workload
  • 16. Greenplum Polymorphic Data Storage
      [Diagram: table ‘Customer’ partitioned by month, Jan ’09 to Nov ’09, with partitions stored as column-oriented with archival compression, column-oriented with fast compression, or row-oriented with fast compression.]
      • Greenplum Database’s engine provides a flexible storage model
          – Four table types: heap, row-oriented, column-oriented, external
          – Block compression: Gzip (levels 1-9), QuickLZ
      • Storage types can be mixed within a database, and even within a table
          – Fully configurable via table DDL and partitioning syntax
          – You may also choose to index some partitions and not others
      • Gives customers the choice of processing model for any table or partition
          – Tables/partitions of different storage types can be joined together without restriction
          – Highly tuned, e.g. columnar does efficient pre-projection and parallel execution
  • 17. Unified Data Access Across The Enterprise
      • Workload Management
          – Connection management controls how many users can be connected and assigns them to a queue
          – User-based resource queues allow for control of the total number or cost of queries allowed at any point in time
      • Dynamic Query Prioritization
          – Patent-pending technique of dynamically balancing resources across running queries
          – Allows DBAs to control query priorities in real time, or determine default priorities by resource queue
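The resource-queue idea on this slide, capping how many queries may run at once and holding the rest until a slot frees up, can be illustrated with a small Python model. This is a toy sketch: the class `ResourceQueue`, its `active_limit` parameter and the queue name are hypothetical, and it models only a count limit, not the cost-based limits or priorities the slide also mentions.

```python
from collections import deque


class ResourceQueue:
    """Toy model of a queue that caps the number of concurrently running queries."""

    def __init__(self, name, active_limit):
        self.name = name
        self.active_limit = active_limit
        self.running = set()
        self.waiting = deque()

    def submit(self, query_id):
        """Start the query if a slot is free, otherwise park it in FIFO order."""
        if len(self.running) < self.active_limit:
            self.running.add(query_id)
            return "running"
        self.waiting.append(query_id)
        return "queued"

    def finish(self, query_id):
        """Release a slot and promote the oldest waiting query, if any."""
        self.running.remove(query_id)
        if self.waiting:
            promoted = self.waiting.popleft()
            self.running.add(promoted)
            return promoted
        return None


if __name__ == "__main__":
    queue = ResourceQueue("adhoc", active_limit=2)
    for query in ("q1", "q2", "q3"):
        print(query, queue.submit(query))
    print("promoted:", queue.finish("q1"))
```

Assigning each user to one such queue gives a DBA a single knob per workload class, which is the kind of control the slide describes.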
  • 18. Greenplum Performance Monitor
      Highly interactive web-based performance monitoring. Real-time and historic views of:
      • Resource utilization
      • Queries and query internals
  • 19. Key Technical Requirements for HPA
      • Technical Values
          – Performance: massively parallel architecture
          – Load speeds: 10 TB/hr
          – Integration with SAS
          – In-database analytics using Java, PL/R, etc.
          – Integration with many more BI and analytical tools
          – Integration with Hadoop for unstructured data analysis
      • Financial Value
          – Lower total cost of ownership
          – Best price/performance ratio in the industry for EDW/analytical appliance
      • Operational Values
          – No index maintenance
          – Backup recovery solution
          – Most robust disaster recovery solution in the industry
          – Best technical and customer support organization backing
  • 20. Greenplum Customers: Government
      • Pacific Northwest National Labs (Dept. of Energy) does cyberanalytics.
      • USAspending.gov traces the outlays of the US Federal Government.
      • The Federal Reserve Bank of Kansas City does economic analysis, mostly related to the housing market.
      • Recently, the Internal Revenue Service purchased a DCA to do work related to fraudulent tax returns.
      • ATO uses GP as an investigatory tool in their Compliance and Audit Logging Unit.
  • 21. High Performance Analytics: ‘The power to know fast’. Thank you
  • 22. Questions?