Vertica
Distributed database in the cloud wild west
Zvika Gutkin
Big data dbOps
zvika.gutkin@convertro.com
Vertica on AWS | Convertro
We have 2 identical instances,
although one clearly performs 30% better than the other.
AWS MAGIC
AGENDA
Vertica on AWS
Vertica?
How we use
10x ~ 100x performance enhancements
High compression rates
Sorted data
Object segmentation / replication
Optimizer aware
Create Table …..
[Figure: rows stored block by block, with values from different columns interleaved in each block (N. America, USA, Sq. Miles, 385, Dallas next to Asia, Israel, Acres, Tel Aviv, 450000).]
Bad compression…
Continent    Country   Size type   City size   City Name    Population
Asia         Israel    Acres       52000       Tel Aviv     450000
Asia         Israel    Acres       78000       Jerusalem    800000
Asia         Israel    Acres       63000       Haifa        268000
N. America   USA       Sq. Miles   385         Dallas       1200000
N. America   USA       Sq. Miles   468         New York     8200000
N. America   USA       Sq. Miles   8700        New Jersey   8800000
Encoding per column:
Continent: RLE encoding (Asia, 3; N. America, 3)
Country: RLE encoding (Israel, 3; USA, 3)
Size type: RLE encoding (Acres, 3; Sq. Miles, 3)
City size: DeltaVal encoding (52000, 385, +83, +8315, +62615, +77615)
City Name: LZO encoding (!@#$@a $%##! *&&^ !@#$)
Population: DeltaVal encoding (450000, 268000, +532000, +932000, +7932000, +8532000)
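The RLE and DeltaVal ideas from this slide can be illustrated with a short Python sketch. This is not Vertica's storage code: the function names (rle_encode, delta_val_encode, delta_val_decode) are made up for illustration, and the sketch assumes DeltaVal means "store the smallest value once, then each value as an offset from it", which matches the offsets shown on the slide (+83, +8315, ...).

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode: collapse each run of equal values into (value, count)."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def delta_val_encode(values):
    """Keep the smallest value once, then every value as an offset from it."""
    base = min(values)
    return base, [v - base for v in values]


def delta_val_decode(base, offsets):
    """Rebuild the original values from the base and the offsets."""
    return [base + off for off in offsets]


# Column values taken from the example table on the slides.
continent = ["Asia"] * 3 + ["N. America"] * 3
city_size = [52000, 78000, 63000, 385, 468, 8700]

print(rle_encode(continent))        # two runs of three values each
print(delta_val_encode(city_size))  # base 385 plus small offsets

# RLE only pays off when equal values sit next to each other,
# which is why sorted data matters.
shuffled = ["Asia", "N. America"] * 3
print(len(rle_encode(shuffled)), "runs unsorted,",
      len(rle_encode(sorted(shuffled))), "runs sorted")
```

The last lines show why the deck lists sorted data next to high compression rates: the same six values need six runs when interleaved and only two when sorted.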
How we use
[Diagram: Stream COPY into TEMP tables; MOVE PARTITIONS into a Unified Temp Table; MOVE PARTITIONS into the Target Table/Partition.]
Load parameters:
1. Number of parallel loads
2. Number of parallel nodes
3. Chunk size per load
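The three load parameters named on this slide (parallel loads, parallel nodes, chunk size) can be pictured with a small planning sketch. Everything in it is hypothetical: plan_loads, LoadTask, the node names and the round-robin assignment are not from the slides, and the code does not talk to Vertica. It only shows how a batch of rows might be cut into chunks and spread over load streams and nodes.

```python
from dataclasses import dataclass


@dataclass
class LoadTask:
    stream: int     # which parallel load stream handles this chunk
    node: str       # hypothetical node that would receive the COPY
    first_row: int  # inclusive
    last_row: int   # exclusive


def plan_loads(total_rows, chunk_size, parallel_loads, nodes):
    """Cut total_rows into chunks and assign them round-robin (an assumption)."""
    if chunk_size <= 0 or parallel_loads <= 0 or not nodes:
        raise ValueError("chunk_size, parallel_loads and nodes must be non-empty/positive")
    tasks = []
    for i, start in enumerate(range(0, total_rows, chunk_size)):
        tasks.append(LoadTask(
            stream=i % parallel_loads,
            node=nodes[i % len(nodes)],
            first_row=start,
            last_row=min(start + chunk_size, total_rows),
        ))
    return tasks


tasks = plan_loads(
    total_rows=1_000_000,
    chunk_size=300_000,
    parallel_loads=2,
    nodes=["node1", "node2", "node3"],  # placeholder names
)
for task in tasks:
    print(task)
```

In a real pipeline each task would presumably become one COPY into its own temp table; the right values for the three parameters depend on the cluster and have to be measured.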
Vertica: The Convertro way
Real Time ETR
select A from B
where C=‘D’
Business Logic
Topology
Sampling
Lookup
Aggregate
Hydro
Web Service
Hydro
Vertica on AWS
Production Future
Setup
VPC
Compatible instances
Enhanced network
Placement group
EBS
Convertro VPC
Convertro VPC
Placement group
Convertro VPC
Throughput
Network
Disks
EBS
Convertro VPC
Vertica Cluster RAC
Production Future
Nodes fail
AWS magic
Setup
Production Future
Setup
Up to date with AWS features
Replay database
Cluster elasticity
Thank You
zvika.gutkin@convertro.com
https://github.com/Convertro/Hydro
https://www.linkedin.com/pulse/vertica-convertro-way-zvika-gutkin

Editor's Notes

  • #3 Gotham, The O.C., Gladiator, A Beautiful Mind
  • #9 70% reduction in storage
  • #11 150K rows per sec. Regular days: 10 – 15 billion rows per day. 40 billion rows per day. New Vertica feature => Vhash
  • #12 Out-of-the-box improvements => denormalize => data model changes are a game changer! Vertica can handle big joins => merge joins. MMM => measure, measure, measure => Data Collector tables.
  • #14 to #20 Enhanced networking – EBS-optimized instance or an instance with 10 Gigabit network connectivity