
Containers #101: Optimize CI/CD for Big Data Solutions

Recording posted here: https://codefresh.io/blog/containers-101-meetup-docker-accelerates-continuous-development/
Shimon Tolts, General Manager/CTO of Data Solutions at ironSource, joined us to talk about how they leverage Docker to simplify their workflow and deliver Big Data solutions to their customers faster. He shared their experience running Docker containers in production and how they took one of their base systems, considered "the backbone of the company," and transformed it using containers.

Published in: Technology

  1. Containers #101: Optimize CI/CD for Big Data Solutions (Oct 2016)
  2. Shimon Tolts, General Manager, Data Solutions. Atom Data Pipeline: Processing 200B events with Node.js and Docker on AWS
  3. About ironSource: Hypergrowth
     ● 800M people reached each month
     ● 4200 apps installed every minute with the ironSource platform
     ● 200B data events registered and analyzed every month
     [Chart: registered and analyzed data events per month, Jun 2015 to May 2016, axis from 0 to 200B]
  4. Our Business Challenge: We needed a way to manage this data (Collect, Process, Store)
  5. Collection
     ● Multi-region layer - latency-based routing
     ● Low latency from client to Atom servers
     ● High availability - AWS regions do fail!
     ● Storing raw data + headers upon receiving
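The routing idea on the Collection slide can be illustrated with a short sketch. It is not taken from the talk: latency-based routing is normally handled at the DNS layer rather than in application code, and the function name, region names, and latency figures below are all made up for illustration.

```javascript
// Hypothetical sketch: choose the healthy region with the lowest measured latency.
// Region names and latency numbers are illustrative only.
function pickRegion(regions) {
  const healthy = regions.filter((r) => r.healthy);
  if (healthy.length === 0) {
    throw new Error('no healthy region available');
  }
  // reduce keeps whichever candidate has the smaller latency
  return healthy.reduce((best, r) => (r.latencyMs < best.latencyMs ? r : best)).name;
}

const regions = [
  { name: 'us-east-1', latencyMs: 35, healthy: false }, // simulated regional outage
  { name: 'eu-west-1', latencyMs: 80, healthy: true },
  { name: 'ap-southeast-1', latencyMs: 210, healthy: true },
];

// The closest region is down, so the next closest healthy one is chosen.
console.log(pickRegion(regions));
```

Filtering on health before comparing latency is what gives the failover behavior the slide alludes to: a failed region simply drops out of the candidate set.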
  6. Data Enrichment
     ● Enrich data before storing it in your Data Lake and/or Warehouse
       ○ IP to country
       ○ Currency conversion
       ○ Decrypt data
       ○ User-Agent parsing - OS, browser, device...
     ● Any custom logic you would like! - fully extensible
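One way to picture the enrichment step is as a chain of small functions, each adding fields to an event. The sketch below is an assumption about the general shape, not Atom's actual code: the lookup tables, exchange rates, field names, and function names are all hypothetical, and a real system would use a GeoIP database, a live rate source, and a proper User-Agent parser.

```javascript
// Hypothetical lookup tables with made-up values, for illustration only.
const COUNTRY_BY_IP_PREFIX = { '81.2.': 'GB', '8.8.': 'US' };
const USD_RATE = { USD: 1, EUR: 1.25 };

// Each enricher takes an event and returns a new event with extra fields.
const ipToCountry = (event) => {
  const prefix = Object.keys(COUNTRY_BY_IP_PREFIX).find((p) => event.ip.startsWith(p));
  return { ...event, country: prefix ? COUNTRY_BY_IP_PREFIX[prefix] : 'unknown' };
};

const toUsd = (event) => ({
  ...event,
  amountUsd: event.amount * USD_RATE[event.currency],
});

// Deliberately rough User-Agent check; real parsing is far more involved.
const parseUserAgent = (event) => {
  const ua = event.userAgent;
  let os = 'other';
  if (/Android/.test(ua)) os = 'Android';
  else if (/iPhone|iPad/.test(ua)) os = 'iOS';
  return { ...event, os };
};

// Run the event through every enricher in order.
const enrich = (event, enrichers) => enrichers.reduce((acc, fn) => fn(acc), event);

const enriched = enrich(
  { ip: '8.8.4.4', amount: 8, currency: 'EUR', userAgent: 'Mozilla/5.0 (Linux; Android 6.0)' },
  [ipToCountry, toUsd, parseUserAgent]
);
console.log(enriched.country, enriched.amountUsd, enriched.os);
```

Because every enricher has the same event-in, event-out shape, custom logic can be added by appending another function to the array, which is one reading of the "fully extensible" bullet.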
  7. Data Targets
     ● Near real-time data insertion - 1 minute!
     ● Stream data to Google Storage and/or AWS S3
     ● Smart insertion of data into AWS Redshift
       ○ Set the number of parallel COPYs
       ○ Configure priority on tables
     ● BigQuery - streaming data using batch file imports (saves 20% cost)
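The slide does not say how parallel COPYs and table priorities interact, so the following is only a guess at the idea: order pending table loads by priority and run them in waves no larger than the configured parallelism. The function name `planLoads`, the table names, and the priority values are hypothetical.

```javascript
// Hypothetical sketch: group pending table loads into waves of at most
// `maxParallel` loads each, highest priority first.
function planLoads(tables, maxParallel) {
  if (!Number.isInteger(maxParallel) || maxParallel < 1) {
    throw new RangeError('maxParallel must be a positive integer');
  }
  // Copy before sorting so the caller's array is left untouched.
  const ordered = [...tables].sort((a, b) => b.priority - a.priority);
  const waves = [];
  for (let i = 0; i < ordered.length; i += maxParallel) {
    waves.push(ordered.slice(i, i + maxParallel).map((t) => t.name));
  }
  return waves;
}

const pending = [
  { name: 'events', priority: 1 },
  { name: 'purchases', priority: 10 },
  { name: 'sessions', priority: 5 },
];

// With two parallel loads, the two highest-priority tables go first.
console.log(planLoads(pending, 2));
```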
  8. Micro-Services Architecture
     ● Everything is a service
     ● Decoupling
     ● Distributed systems, separate lifecycle
     ● Communication using RESTful / Queue / Streams
  9. Docker
     ● Linux container
     ● Save provisioning time
     ● Infrastructure as code
     ● Dev-Test-Production - identical container
     ● Ship easily
  10. Cloud infrastructure
     ● Pay as you go - (grow)
     ● SaaS services
     ● Auto-scaling groups
     ● DynamoDB
     ● RDS *SQL
     ● Redshift data warehouse
  11. Continuous Integration
     ● From commit to production
     ● Jenkins commit hook
     ● Git branching model
     ● AWS dynamic slaves
     ● Unit tests
     ● Docker builds
     ● Updating live environment
  12. Diagram
  13. Starting Point
     ● Pre-baked images - AMIs
     ● Supervisor
     ● Nginx reverse proxy
     ● Node.js * cpu-count
     ● Provisioning time * instances
     ● Bash provisioning scripts
  14. Minimum Viable Product
     ● Infrastructure as code
     ● Nginx
     ● Node.js * cpu-count
     ● Supervisor
     ● Docker Hub
     ● No Bash scripts!
     ● No provisioning time * instances
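"Node.js * cpu-count" means one Node.js process per CPU core. The slides list Supervisor as the process manager; as an alternative illustration of the same idea, the sketch below uses Node's built-in `cluster` module. It is not the configuration from the talk, and `workerCount`, `start`, and the response text are hypothetical names chosen here.

```javascript
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

// One worker per CPU core, with a floor of one.
function workerCount(cpuCount = os.cpus().length) {
  return Math.max(1, cpuCount);
}

// Hypothetical entry point: the primary process forks the workers,
// and each worker runs its own HTTP server on the shared port.
function start(port) {
  if (cluster.isPrimary) {
    for (let i = 0; i < workerCount(); i += 1) {
      cluster.fork();
    }
    // Replace a worker that dies so capacity stays at one process per core.
    cluster.on('exit', () => cluster.fork());
  } else {
    http
      .createServer((req, res) => res.end(`handled by pid ${process.pid}\n`))
      .listen(port);
  }
}
```

Calling `start(8080)` from the application's entry file would launch the workers; the port number is arbitrary. In this arrangement Nginx can proxy to a single port while the cluster module distributes connections among the workers.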
  15. https://github.com/ironSource/docker-config/blob/bb6be85b97132cbdd10084305ee1ee2f414b0b50/Dockerfile
  16. Interactive Cycle
     ● Nginx
     ● Supervisor
     ● Infrastructure as code
     ● Node.js * cpu-count
     ● Docker Hub
     ● No Bash scripts!
     ● No provisioning time * instances
  17. https://github.com/ironSource/docker-config/blob/c4bbad11a323fd6e36ff31505c43e7c8dc51b1eb/Dockerfile-iojs-cluster
  18. User Data
  19. https://github.com/ironSource/docker-config/blob/2f4ccc7c277850de928cc432f47b2fc58fb8732a/Dockerfile-nodejs-cluster
  20. Docker Compose Example #1 (using ‘extends’): docker-common.yml, docker-compose.yml https://stash.ironsrc.com/projects/INFRA-IB/repos/ironbeastcompserter/browse/docker-compose.yml
  21. User Data
  22. Docker Compose Example #2 (using ‘links’)
  23. Thank you! 10 Million Free Monthly Events. ironsrc.com/atom shimont@ironsrc.com @shimontolts
