All content is the property and proprietary interest of CloudZone, The removal of any proprietary notices, including attribution information, is strictly prohibited.
Big Data Month 2016 – Up Next…
14.11, 15.11, 22.11, 22.11, 28.11, 30.11
Master AWS Redshift - Agenda
13:00 – 13:20 Intro to Amazon Redshift by ironSource
13:20 – 15:00 LAB I – Using Amazon Redshift
15:00 – 15:15 Break
15:15 – 17:25 LAB II – Table Layout and Schema Design with Amazon Redshift
17:25 – 17:30 Your next steps on AWS by CloudZone
Shimon Tolts
General Manager, Data Solutions
Atom
Data Pipeline Processing 200B events with Node.js and Docker on AWS
About ironSource: Hypergrowth
● 800M People Reached Each Month
● 4200 Apps Installed Every Minute with the ironSource Platform
● 200B Registered & Analyzed Data Events Every Month
[Chart: registered and analyzed data events per month, Jun 2015 to May 2016, y-axis 0 to 200B]
Our Business Challenge
We needed a way to manage this data:
Collect, Process, Store
Collection
● Multi region layer - Latency based routing
● Low latency from client to Atom servers
● High Availability - AWS regions do fail!
● Storing raw data + headers upon receiving
Data Enrichment
● Enrich data before storing in your Data Lake and/or Warehouse
○ IP to Country
○ Currency conversion
○ Decrypt data
○ User Agent parsing - OS, Browser, Device...
● Any custom logic you would like! - fully extendible
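The enrichment steps above can be sketched as a chain of small functions, each taking an event and returning an enriched copy. This is only an illustration in Python, not Atom's actual implementation: the lookup tables, field names (`ip`, `currency`, `amount`, `user_agent`) and the `enrich` helper are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Tuple

Event = Dict[str, Any]
Enricher = Callable[[Event], Event]

# Hypothetical static lookup tables. A real pipeline would use a GeoIP
# database, a live exchange-rate feed and a full user-agent parser.
IP_PREFIX_TO_COUNTRY = {"81.218.": "IL", "52.": "US"}
USD_RATES = {"USD": 1.0, "EUR": 1.1, "ILS": 0.26}
# Order matters: Android user agents also contain "Linux",
# and iPhone user agents also contain "Mac OS X".
OS_TOKENS = [("Android", "Android"), ("iPhone", "iOS"), ("Windows", "Windows"),
             ("Mac OS X", "macOS"), ("Linux", "Linux")]
# Chrome user agents also contain "Safari", so Chrome is checked first.
BROWSER_TOKENS = [("Firefox", "Firefox"), ("Chrome", "Chrome"), ("Safari", "Safari")]


def _first_match(text: str, tokens: List[Tuple[str, str]]) -> str:
    """Return the name of the first token found in text, or "Other"."""
    return next((name for token, name in tokens if token in text), "Other")


def ip_to_country(event: Event) -> Event:
    ip = str(event.get("ip", ""))
    for prefix, country in IP_PREFIX_TO_COUNTRY.items():
        if ip.startswith(prefix):
            return {**event, "country": country}
    return {**event, "country": None}


def convert_currency(event: Event) -> Event:
    rate = USD_RATES.get(str(event.get("currency")))
    if rate is None or "amount" not in event:
        return event  # nothing to convert
    return {**event, "amount_usd": round(float(event["amount"]) * rate, 2)}


def parse_user_agent(event: Event) -> Event:
    ua = str(event.get("user_agent", ""))
    return {**event,
            "os": _first_match(ua, OS_TOKENS),
            "browser": _first_match(ua, BROWSER_TOKENS)}


DEFAULT_ENRICHERS: List[Enricher] = [ip_to_country, convert_currency, parse_user_agent]


def enrich(event: Event, enrichers: List[Enricher]) -> Event:
    """Apply each enricher in order; the input event is never mutated."""
    for step in enrichers:
        event = step(event)
    return event
```

Custom logic fits the same shape: any function that takes an event and returns an event can be appended to the list passed to `enrich`.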
Data Targets
● Near real-time data insertion - 1 minute!
● Stream data to Google Storage and/or AWS S3
● Smart insertion of data into AWS Redshift
○ Set the number of parallel COPYs
○ Configure priority on tables
● BigQuery - streaming data in via batch file imports (saves 20% cost)
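The two Redshift insertion settings above (number of parallel COPYs, priority per table) can be pictured as a priority queue that hands out at most a fixed number of COPY jobs at a time. The sketch below is a hypothetical Python model of that idea, not the pipeline's real code: `CopyQueue`, `CopyJob`, the priority numbers and the S3 prefixes are invented for illustration, and the queue only decides ordering, it does not talk to Redshift.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(order=True)
class CopyJob:
    priority: int                      # lower number = loaded first
    seq: int                           # tie-breaker: submission order
    table: str = field(compare=False)
    s3_prefix: str = field(compare=False)


class CopyQueue:
    """Orders pending COPY jobs by table priority and caps parallelism."""

    def __init__(self, max_parallel: int, table_priority: Dict[str, int],
                 default_priority: int = 100) -> None:
        if max_parallel < 1:
            raise ValueError("max_parallel must be at least 1")
        self._max_parallel = max_parallel
        self._table_priority = table_priority
        self._default_priority = default_priority
        self._heap: List[CopyJob] = []
        self._seq = itertools.count()

    def submit(self, table: str, s3_prefix: str) -> None:
        priority = self._table_priority.get(table, self._default_priority)
        heapq.heappush(self._heap, CopyJob(priority, next(self._seq), table, s3_prefix))

    def next_batch(self) -> List[CopyJob]:
        """Pop up to max_parallel jobs, highest-priority tables first."""
        batch: List[CopyJob] = []
        while self._heap and len(batch) < self._max_parallel:
            batch.append(heapq.heappop(self._heap))
        return batch
```

A loader would call `next_batch()` roughly once per insertion interval, run the returned COPY jobs concurrently, and only ask for the next batch when they finish.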
Micro-Services Architecture
● Everything is a service
● Decoupling
● Distributed systems
● Separate lifecycle
● Communication using RESTful APIs / Queues / Streams
Docker
● Linux Container
● Save provisioning time
● Infrastructure as code
● Dev-Test-Production - identical container
● Ship easily
Cloud infrastructure
● Pay as you go - (grow)
● SaaS services
● Auto-scaling-groups
● DynamoDB
● RDS *SQL
● Redshift data warehouse
Continuous Integration
● From commit to production
● Jenkins commit hook
● Git branching model
● AWS dynamic slaves
● Unit tests
● Docker builds
● Updating live environment
Diagram
STARTING POINT
● Xplenty - Hadoop service - ~40min query
● One big cluster - 96 xlarge nodes
● No WLM configuration
● CSV copy
● No reserved nodes
● A different ETL process implemented by every department.
SOLUTION:
● Using 8xl nodes if needed
● Redshift cluster per department
● “Hot and cold” clusters - SSD: fast and furious, HDD: slow but cheap
● WLM configuration
● Reserved Nodes
● JSON copy
● One pipeline to rule them all - ironBeast - currently supporting over 50B events per month, inserting data into more than 10 Redshift clusters.
WORK LOAD MANAGEMENT
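The deck does not show the actual WLM settings, so the following is only a hedged sketch of what a manual WLM setup could look like. It builds the JSON value for Redshift's `wlm_json_configuration` cluster parameter in Python; the group names (`etl`, `analysts`), concurrency levels and memory split are hypothetical numbers chosen for the example.

```python
import json
from typing import Any, Dict, List

# Hypothetical queue layout: a small, memory-heavy queue for ETL loads,
# a wider queue for analysts, and the default queue (last, no groups).
WLM_QUEUES: List[Dict[str, Any]] = [
    {"user_group": ["etl"], "query_concurrency": 2, "memory_percent_to_use": 40},
    {"user_group": ["analysts"], "query_concurrency": 5, "memory_percent_to_use": 40},
    {"query_concurrency": 3, "memory_percent_to_use": 20},
]


def build_wlm_config(queues: List[Dict[str, Any]]) -> str:
    """Validate the queue list and serialize it for wlm_json_configuration."""
    for queue in queues:
        if int(queue["query_concurrency"]) < 1:
            raise ValueError("each queue needs at least one concurrency slot")
    total_memory = sum(int(q["memory_percent_to_use"]) for q in queues)
    if total_memory > 100:
        raise ValueError(f"queues claim {total_memory}% of memory, more than 100%")
    return json.dumps(queues)
```

The resulting string would be set as the `wlm_json_configuration` value in the cluster's parameter group; the right split depends on the workload of each department's cluster.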
THINGS WE LEARNED ALONG THE WAY
● https://github.com/awslabs/amazon-redshift-utils (AdminViews)
● User permissions do not apply to new tables created in a schema
● Vacuum Vacuum Vacuum
● Avoid parallel inserts (especially on 8xl nodes) - if you copy to multiple tables, it is better to implement a COPY queue
● STL_LOAD_ERRORS - money on the floor
● Columnar datastore does not mean you can use as many columns as you want - it is better to split into multiple tables.
● Encode your columns - ‘analyze compression’
● Instances that query Redshift should use MTU 1500
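Several of the lessons above map to short Redshift SQL statements. The Python helpers below only build those statements as strings, as a hedged illustration; the function names and the example schema, group and table names are assumptions, and running them is left to whatever database client is in use.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _ident(name: str) -> str:
    """Allow only plain identifiers so names can be embedded in SQL safely."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a plain SQL identifier: {name!r}")
    return name


def default_select_grant(schema: str, group: str) -> str:
    # Makes future tables in the schema readable by the group. Note that
    # default privileges cover tables created by the user who runs this
    # statement (or the user named with FOR USER).
    return (f"ALTER DEFAULT PRIVILEGES IN SCHEMA {_ident(schema)} "
            f"GRANT SELECT ON TABLES TO GROUP {_ident(group)};")


def vacuum(table: str) -> str:
    # Reclaims space and re-sorts rows after deletes and loads.
    return f"VACUUM {_ident(table)};"


def analyze_compression(table: str) -> str:
    # Reports a suggested encoding per column for an existing table.
    return f"ANALYZE COMPRESSION {_ident(table)};"


def recent_load_errors(limit: int = 20) -> str:
    # STL_LOAD_ERRORS records rows that COPY rejected and why.
    return ("SELECT starttime, filename, line_number, colname, err_code, err_reason "
            f"FROM stl_load_errors ORDER BY starttime DESC LIMIT {int(limit)};")
```

For example, `default_select_grant("analytics", "analysts")` yields a statement that addresses the permissions lesson, and `recent_load_errors()` is a quick way to see which rows a COPY dropped.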
Redshift use cases
10 Million Free Monthly Events
Thank you!
ironsrc.com/atom
shimont@ironsrc.com @shimontolts

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 

CloudZone Big Data Month 2016 Agenda for Mastering AWS Redshift

  • 1. All content is the property and proprietary interest of CloudZone, The removal of any proprietary notices, including attribution information, is strictly prohibited.
  • 2. Big Data Month 2016 – Up Next… 15.11 22.11 22.11 28.11 30.11 14.11
  • 3. Master AWS Redshift - Agenda: 13:00 – 13:20 Intro to Amazon Redshift by ironSource; 13:20 – 15:00 LAB I – Using Amazon Redshift; 15:00 – 15:15 Break; 15:15 – 17:25 LAB II – Table Layout and Schema Design with Amazon Redshift; 17:25 – 17:30 Your next steps on AWS by CloudZone
  • 4. Shimon Tolts General Manager, Data Solutions Atom Data Pipeline Processing 200B events with Node.js And Docker On AWS
  • 5. About ironSource: Hypergrowth ● 800M people reached each month ● 4,200 apps installed every minute with the ironSource Platform ● 200B data events registered and analyzed every month [Chart: monthly data events, Jun 2015 to May 2016, axis from 0 to 200B]
  • 6. Our Business Challenge: We needed a way to manage this data: Collect, Process, Store
  • 8. Collection ● Multi-region layer - Latency-based routing ● Low latency from client to Atom servers ● High Availability - AWS regions do fail! ● Storing raw data + headers upon receiving
  • 9. Data Enrichment ● Enrich data before storing in your Data Lake and/or Warehouse ○ IP to Country ○ Currency conversion ○ Decrypt data ○ User Agent parsing - OS, Browser, Device... ● Any custom logic you would like! - fully extendible
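
The enrichment steps listed on this slide can be pictured as a chain of small functions applied to each event before it is stored. The following is a minimal sketch under stated assumptions, not the Atom implementation: the field names (`ip`, `currency`, `amount`, `user_agent`), the lookup tables, and the function names are all hypothetical, and a real pipeline would use a GeoIP database, live exchange rates, and a proper user-agent parser.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Enricher = Callable[[Event], Event]

# Hypothetical lookup tables used only for illustration.
IP_PREFIX_TO_COUNTRY = {"203.0.113.": "AU", "198.51.100.": "US"}
RATES_TO_USD = {"USD": 1.0, "EUR": 1.1}
OS_TOKENS = (("Android", "Android"), ("iPhone", "iOS"), ("Windows", "Windows"))


def ip_to_country(event: Event) -> Event:
    """Add a 'country' field based on a prefix match of the client IP."""
    ip = event.get("ip", "")
    country = next(
        (c for prefix, c in IP_PREFIX_TO_COUNTRY.items() if ip.startswith(prefix)),
        None,
    )
    return {**event, "country": country}


def convert_currency(event: Event) -> Event:
    """Add 'amount_usd' when both an amount and a known currency are present."""
    rate = RATES_TO_USD.get(event.get("currency"))
    if rate is None or "amount" not in event:
        return event
    return {**event, "amount_usd": round(event["amount"] * rate, 2)}


def parse_user_agent(event: Event) -> Event:
    """Add a coarse 'os' field by looking for known tokens in the user agent."""
    ua = event.get("user_agent", "")
    os_name = next((name for token, name in OS_TOKENS if token in ua), "unknown")
    return {**event, "os": os_name}


def enrich(event: Event, enrichers: List[Enricher]) -> Event:
    """Run the event through each enricher in order; custom steps can be appended."""
    for step in enrichers:
        event = step(event)
    return event


PIPELINE: List[Enricher] = [ip_to_country, convert_currency, parse_user_agent]
```

Because every step takes an event and returns a new one, custom logic is added by appending another function to `PIPELINE`, which is one way to read "fully extendible".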
  • 10. Data Targets ● Near real-time data insertion - 1 minute! ● Stream data to Google Storage and/or AWS S3 ● Smart insertion of data into AWS Redshift ○ Set the number of parallel COPYs ○ Configure priority on tables ● BigQuery - Streaming data using batch file imports (saves 20% cost)
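
One way to read "set the number of parallel COPYs" and "configure priority on tables" is a queue that orders pending loads by table priority and caps how many run at once. The sketch below is an assumption about how such a scheduler could look, not the product's code: `run_copy_queue`, `build_copy_sql`, the priority map, and the `execute` callback are hypothetical names, and the callback stands in for whatever database client actually runs the statement.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

DEFAULT_PRIORITY = 100  # Tables without an explicit priority load last.


def build_copy_sql(table: str, s3_path: str, iam_role: str) -> str:
    """Build a Redshift COPY statement for JSON files under an S3 prefix.

    Names are interpolated directly, so they must come from trusted config.
    """
    return f"COPY {table} FROM '{s3_path}' IAM_ROLE '{iam_role}' FORMAT AS JSON 'auto';"


def run_copy_queue(
    jobs: List[Tuple[str, str]],
    priorities: Dict[str, int],
    execute: Callable[[str, str], str],
    max_parallel: int = 1,
) -> List[str]:
    """Run (table, s3_path) jobs in priority order with bounded parallelism.

    Lower priority numbers start first; ties keep their submission order.
    With max_parallel=1 the loads are fully serialized.
    """
    ordered = sorted(
        enumerate(jobs),
        key=lambda item: (priorities.get(item[1][0], DEFAULT_PRIORITY), item[0]),
    )
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # The executor starts tasks in submission order, so sorting first
        # gives high-priority tables the first free worker.
        futures = [pool.submit(execute, table, path) for _, (table, path) in ordered]
        return [future.result() for future in futures]
```

Keeping `max_parallel` low trades insertion latency for less contention on the cluster, which is the knob the slide describes.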
  • 12. Micro-Services Architecture ● Everything is a service ● Decoupling ● Distributed systems ● Separate lifecycle ● Communication using RESTful / Queue / Streams
  • 13. Docker ● Linux Container ● Save provisioning time ● Infrastructure as code ● Dev-Test-Production - identical container ● Ship easily
  • 14. Cloud infrastructure ● Pay as you go - (grow) ● SaaS services ● Auto-scaling-groups ● DynamoDB ● RDS *SQL ● Redshift data warehouse
  • 15. Continuous Integration ● From commit to production ● Jenkins commit hook ● Git branching model ● AWS dynamic slaves ● Unit tests ● Docker builds ● Updating live environment
  • 18. STARTING POINT ● Xplenty - Hadoop service - ~40 min query ● One big cluster - 96 xlarge nodes ● No WLM configuration ● CSV copy ● No reserved nodes ● Different ETL process implemented by every department
  • 21. SOLUTION: ● Using 8xl nodes if needed ● Redshift cluster per department ● “Hot and cold” clusters - SSD: fast and furious, HDD: slow but cheap ● WLM configuration ● Reserved Nodes ● JSON copy ● One pipeline to rule them all - ironBeast - currently supporting over 50B events per month and inserting data into more than 10 Redshift clusters
  • 23. THINGS WE LEARNED ALONG THE WAY ● https://github.com/awslabs/amazon-redshift-utils (AdminViews) ● User permissions do not apply to new tables created in a schema ● Vacuum Vacuum Vacuum ● Avoid parallel inserts (especially on 8xl nodes) - if you copy to multiple tables, it is better to implement a COPY queue ● STL_LOAD_ERRORS - money on the floor ● A columnar datastore does not mean you can use as many columns as you want - it is better to split into multiple tables ● Encode your columns - ‘analyze compression’ ● Instances that query Redshift should use MTU 1500
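
Several of these lessons map onto standard Redshift SQL statements. The helper below is a hedged sketch that only builds the statements as strings; the function name, the schema, table, and group names passed in, and the choice of `SELECT` as the granted privilege are all illustrative assumptions. Note that `ALTER DEFAULT PRIVILEGES` affects only tables created afterwards by the user it is run for, so existing tables still need an explicit `GRANT`.

```python
from typing import List


def maintenance_statements(
    schema: str, table: str, reader_group: str, error_limit: int = 20
) -> List[str]:
    """Return Redshift SQL statements matching the lessons on the slide.

    Identifiers are interpolated directly, so they must come from trusted
    configuration rather than user input.
    """
    qualified = f"{schema}.{table}"
    return [
        # New tables in the schema do not inherit grants unless defaults are set.
        f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} "
        f"GRANT SELECT ON TABLES TO GROUP {reader_group};",
        # Reclaim space and re-sort rows after deletes and updates.
        f"VACUUM {qualified};",
        # Ask Redshift to suggest a column encoding for each column.
        f"ANALYZE COMPRESSION {qualified};",
        # Inspect rows that recent COPY commands rejected.
        "SELECT starttime, filename, line_number, colname, err_reason "
        "FROM stl_load_errors "
        f"ORDER BY starttime DESC LIMIT {error_limit};",
    ]
```

The returned list can be printed for review or passed one statement at a time to whichever SQL client connects to the cluster.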
  • 25. 10 Million Free Monthly Events Thank you! ironsrc.com/atom shimont@ironsrc.com @shimontolts