SlideShare a Scribd company logo
Photos at Meetup
processing / storage / serving
Greg Whalin (@gwhalin)
Meetup
founded in 2002 in NYC
81k+ paying groups
1.2m+ RSVPs per months
25k+ Meetups per week
6 straight quarters of profitability
team of 70 people working on MEME
we are hiring
Photos (processing and storage)
~500k photo uploads per month
processing
3 - 5 different dimensions
orientation (not dealing with for now)
exif
storage
cheap
easy to scale
performance not critical because ...
serving
fast
but CDN ok
local cache
Processing
ImageMagick (or GraphicsMagick)
CPU intensive
No processing in core application
for large uploads, can hold up request
separate service (backed by distributed cluster)
should be able to work behind dumb round-robin load
balancer
Batch for background processing
rapid turnaround still important
Storage
Requirements: cheap, scalable, redundant, easy
local RAID
easy, but not scalable, redundant or cheap
SAN
scalable, easy, and redundant, but not cheap
NAS
scalable, easy, and cheap, but not super redundant
Distributed
can be cheap, scalable, redundant, and easy
We went w/ MogileFS
Cheap and scalable: jbod on entire network
Redundant: auto-replication schemes (rack and chassis
aware), no SPOF
Easy: kinda, but no POSIX interface; web api (GET/PUT)
Photo upload life cycle
1. User POSTs photo(s) from computer to Meetup App
servers
2. Record inserted into db in a pending state
3. App server POSTs photo w/ some metadata to staging
cluster (behind dumb load balancer)
4. lighttpd+fastCGI process running on each stager stores
photo and meta to filesystem
5. Stager job monitors directory (using inotify) and wakes up
when new photo to process
6. Stager code (Perl) processes photo, stores in MogileFS and
updates DB that it is complete
Current Setup
2 x dedicated upload app servers machines
rarely ever restarted so as to avoid interrupting long
uploads
java + tomcat
8 proc Opteron w/ 16 GB
2 x dedicated staging/processing boxes
lighttpd + perl daemon
4 proc Opteron w/ 8 GB
9 x storage nodes (Mogile)
4 proc Opteron w/ 8 GB
JBOD (old boxes 8 x 500GB, new 16 x 1TB)
total capacity now 86TB (72 used)
2 x db boxes for meta info (running mysql)
Serving
2 serving nodes
2 proc Opterons w/ 4 GB (old cheap boxes)
lighttpd + fastCGI
retrieve photo from MogileFS, and return it
not high speed at serving so...
Akamai
photos served w/ long TTL for cache
even so, 20% of all requests hit our origin servers (lots of
edge?) so ...
Varnish cache in front of origin servers (coming soon)
Akamai midgress
Links
http://www.danga.com/mogilefs/
http://www.imagemagick.org/script/index.php

More Related Content

What's hot

CFWheels - Pragmatic, Beautiful Code
CFWheels - Pragmatic, Beautiful CodeCFWheels - Pragmatic, Beautiful Code
CFWheels - Pragmatic, Beautiful Code
indiver
 
Apache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basicsApache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basics
Gladson Manuel
 
Overview of sheepdog
Overview of sheepdogOverview of sheepdog
Overview of sheepdogLiu Yuan
 
MySQL Rebuild using Logical Backups
MySQL Rebuild using Logical Backups MySQL Rebuild using Logical Backups
MySQL Rebuild using Logical Backups
Mydbops
 
MySQL Performance Schema in Action
MySQL Performance Schema in Action MySQL Performance Schema in Action
MySQL Performance Schema in Action
Mydbops
 
Massively Scaled High Performance Web Services with PHP
Massively Scaled High Performance Web Services with PHPMassively Scaled High Performance Web Services with PHP
Massively Scaled High Performance Web Services with PHP
Demin Yin
 
JahiaOne - Performance Tuning
JahiaOne - Performance TuningJahiaOne - Performance Tuning
JahiaOne - Performance Tuning
Jahia Solutions Group
 
Insight on MongoDB Change Stream - Abhishek.D, Mydbops Team
Insight on MongoDB Change Stream - Abhishek.D, Mydbops TeamInsight on MongoDB Change Stream - Abhishek.D, Mydbops Team
Insight on MongoDB Change Stream - Abhishek.D, Mydbops Team
Mydbops
 
Using advanced options in MariaDB Connector/J
Using advanced options in MariaDB Connector/JUsing advanced options in MariaDB Connector/J
Using advanced options in MariaDB Connector/J
MariaDB plc
 
JSR107 Come, Code, Cache, Compute!
JSR107 Come, Code, Cache, Compute! JSR107 Come, Code, Cache, Compute!
JSR107 Come, Code, Cache, Compute!
Payara
 
OpenNebula Conf 2014 | Lightning talk: Proactive Autonomic Management Feature...
OpenNebula Conf 2014 | Lightning talk: Proactive Autonomic Management Feature...OpenNebula Conf 2014 | Lightning talk: Proactive Autonomic Management Feature...
OpenNebula Conf 2014 | Lightning talk: Proactive Autonomic Management Feature...
NETWAYS
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the World
SATOSHI TAGOMORI
 
Evolution of MySQL Parallel Replication
Evolution of MySQL Parallel Replication Evolution of MySQL Parallel Replication
Evolution of MySQL Parallel Replication
Mydbops
 
Building an Angular 2 App
Building an Angular 2 AppBuilding an Angular 2 App
Building an Angular 2 App
Felix Gessert
 
Automate MongoDB with MongoDB Management Service
Automate MongoDB with MongoDB Management ServiceAutomate MongoDB with MongoDB Management Service
Automate MongoDB with MongoDB Management Service
MongoDB
 
OpenNebula Conf 2014 | Lightning talk: OpenNebula at Etnetera by Jan Horacek
OpenNebula Conf 2014 | Lightning talk: OpenNebula at Etnetera by Jan HoracekOpenNebula Conf 2014 | Lightning talk: OpenNebula at Etnetera by Jan Horacek
OpenNebula Conf 2014 | Lightning talk: OpenNebula at Etnetera by Jan Horacek
NETWAYS
 
Running virtualized Galera instances for fun and profit
Running virtualized Galera instances for fun and profitRunning virtualized Galera instances for fun and profit
Running virtualized Galera instances for fun and profit
Raghavendra Prabhu
 
MariaDB Auto-Clustering, Vertical and Horizontal Scaling within Jelastic PaaS
MariaDB Auto-Clustering, Vertical and Horizontal Scaling within Jelastic PaaSMariaDB Auto-Clustering, Vertical and Horizontal Scaling within Jelastic PaaS
MariaDB Auto-Clustering, Vertical and Horizontal Scaling within Jelastic PaaS
Jelastic Multi-Cloud PaaS
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache Mesos
Joe Stein
 
OpenNebulaconf2017US: Rapid scaling of research computing to over 70,000 cor...
OpenNebulaconf2017US:  Rapid scaling of research computing to over 70,000 cor...OpenNebulaconf2017US:  Rapid scaling of research computing to over 70,000 cor...
OpenNebulaconf2017US: Rapid scaling of research computing to over 70,000 cor...
OpenNebula Project
 

What's hot (20)

CFWheels - Pragmatic, Beautiful Code
CFWheels - Pragmatic, Beautiful CodeCFWheels - Pragmatic, Beautiful Code
CFWheels - Pragmatic, Beautiful Code
 
Apache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basicsApache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basics
 
Overview of sheepdog
Overview of sheepdogOverview of sheepdog
Overview of sheepdog
 
MySQL Rebuild using Logical Backups
MySQL Rebuild using Logical Backups MySQL Rebuild using Logical Backups
MySQL Rebuild using Logical Backups
 
MySQL Performance Schema in Action
MySQL Performance Schema in Action MySQL Performance Schema in Action
MySQL Performance Schema in Action
 
Massively Scaled High Performance Web Services with PHP
Massively Scaled High Performance Web Services with PHPMassively Scaled High Performance Web Services with PHP
Massively Scaled High Performance Web Services with PHP
 
JahiaOne - Performance Tuning
JahiaOne - Performance TuningJahiaOne - Performance Tuning
JahiaOne - Performance Tuning
 
Insight on MongoDB Change Stream - Abhishek.D, Mydbops Team
Insight on MongoDB Change Stream - Abhishek.D, Mydbops TeamInsight on MongoDB Change Stream - Abhishek.D, Mydbops Team
Insight on MongoDB Change Stream - Abhishek.D, Mydbops Team
 
Using advanced options in MariaDB Connector/J
Using advanced options in MariaDB Connector/JUsing advanced options in MariaDB Connector/J
Using advanced options in MariaDB Connector/J
 
JSR107 Come, Code, Cache, Compute!
JSR107 Come, Code, Cache, Compute! JSR107 Come, Code, Cache, Compute!
JSR107 Come, Code, Cache, Compute!
 
OpenNebula Conf 2014 | Lightning talk: Proactive Autonomic Management Feature...
OpenNebula Conf 2014 | Lightning talk: Proactive Autonomic Management Feature...OpenNebula Conf 2014 | Lightning talk: Proactive Autonomic Management Feature...
OpenNebula Conf 2014 | Lightning talk: Proactive Autonomic Management Feature...
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the World
 
Evolution of MySQL Parallel Replication
Evolution of MySQL Parallel Replication Evolution of MySQL Parallel Replication
Evolution of MySQL Parallel Replication
 
Building an Angular 2 App
Building an Angular 2 AppBuilding an Angular 2 App
Building an Angular 2 App
 
Automate MongoDB with MongoDB Management Service
Automate MongoDB with MongoDB Management ServiceAutomate MongoDB with MongoDB Management Service
Automate MongoDB with MongoDB Management Service
 
OpenNebula Conf 2014 | Lightning talk: OpenNebula at Etnetera by Jan Horacek
OpenNebula Conf 2014 | Lightning talk: OpenNebula at Etnetera by Jan HoracekOpenNebula Conf 2014 | Lightning talk: OpenNebula at Etnetera by Jan Horacek
OpenNebula Conf 2014 | Lightning talk: OpenNebula at Etnetera by Jan Horacek
 
Running virtualized Galera instances for fun and profit
Running virtualized Galera instances for fun and profitRunning virtualized Galera instances for fun and profit
Running virtualized Galera instances for fun and profit
 
MariaDB Auto-Clustering, Vertical and Horizontal Scaling within Jelastic PaaS
MariaDB Auto-Clustering, Vertical and Horizontal Scaling within Jelastic PaaSMariaDB Auto-Clustering, Vertical and Horizontal Scaling within Jelastic PaaS
MariaDB Auto-Clustering, Vertical and Horizontal Scaling within Jelastic PaaS
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache Mesos
 
OpenNebulaconf2017US: Rapid scaling of research computing to over 70,000 cor...
OpenNebulaconf2017US:  Rapid scaling of research computing to over 70,000 cor...OpenNebulaconf2017US:  Rapid scaling of research computing to over 70,000 cor...
OpenNebulaconf2017US: Rapid scaling of research computing to over 70,000 cor...
 

Similar to Meetup photo processing, storage, and serving - NYC Tech Talks Meetup

mogpres
mogpresmogpres
mogpresxlight
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
Chester Chen
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
Jean-François Gagné
 
Why Wordnik went non-relational
Why Wordnik went non-relationalWhy Wordnik went non-relational
Why Wordnik went non-relational
Tony Tam
 
Galera webinar migration to galera cluster from my sql async replication
Galera webinar migration to galera cluster from my sql async replicationGalera webinar migration to galera cluster from my sql async replication
Galera webinar migration to galera cluster from my sql async replication
Codership Oy - Creators of Galera Cluster
 
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
Fred de Villamil
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
HostedbyConfluent
 
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
varien
 
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...MagentoImagine
 
-Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahar...
-Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahar...-Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahar...
-Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahar...Гриднев Виталий
 
Galaxy Big Data with MariaDB
Galaxy Big Data with MariaDBGalaxy Big Data with MariaDB
Galaxy Big Data with MariaDB
MariaDB Corporation
 
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
OpenStack Korea Community
 
Joomla! Performance on Steroids
Joomla! Performance on SteroidsJoomla! Performance on Steroids
Joomla! Performance on Steroids
SiteGround.com
 
Scabiv0.2
Scabiv0.2Scabiv0.2
Scabiv0.2
Dilshad Mustafa
 
IOUG Data Integration SIG w/ Oracle GoldenGate Solutions and Configuration
IOUG Data Integration SIG w/ Oracle GoldenGate Solutions and ConfigurationIOUG Data Integration SIG w/ Oracle GoldenGate Solutions and Configuration
IOUG Data Integration SIG w/ Oracle GoldenGate Solutions and Configuration
Bobby Curtis
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
Jean-François Gagné
 
Oracle Golden Gate Interview Questions
Oracle Golden Gate Interview QuestionsOracle Golden Gate Interview Questions
Oracle Golden Gate Interview Questions
Arun Sharma
 
Doag data replication with oracle golden gate: Looking behind the scenes
Doag data replication with oracle golden gate: Looking behind the scenesDoag data replication with oracle golden gate: Looking behind the scenes
Doag data replication with oracle golden gate: Looking behind the scenes
Trivadis
 

Similar to Meetup photo processing, storage, and serving - NYC Tech Talks Meetup (20)

mogpres
mogpresmogpres
mogpres
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
 
mogpres
mogpresmogpres
mogpres
 
11g R2
11g R211g R2
11g R2
 
Why Wordnik went non-relational
Why Wordnik went non-relationalWhy Wordnik went non-relational
Why Wordnik went non-relational
 
Galera webinar migration to galera cluster from my sql async replication
Galera webinar migration to galera cluster from my sql async replicationGalera webinar migration to galera cluster from my sql async replication
Galera webinar migration to galera cluster from my sql async replication
 
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
 
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...
 
-Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahar...
-Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahar...-Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahar...
-Kerberos and Health Checks and Bare Metal, Oh My! Updates to OpenStack Sahar...
 
Galaxy Big Data with MariaDB
Galaxy Big Data with MariaDBGalaxy Big Data with MariaDB
Galaxy Big Data with MariaDB
 
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
 
Joomla! Performance on Steroids
Joomla! Performance on SteroidsJoomla! Performance on Steroids
Joomla! Performance on Steroids
 
Scabiv0.2
Scabiv0.2Scabiv0.2
Scabiv0.2
 
IOUG Data Integration SIG w/ Oracle GoldenGate Solutions and Configuration
IOUG Data Integration SIG w/ Oracle GoldenGate Solutions and ConfigurationIOUG Data Integration SIG w/ Oracle GoldenGate Solutions and Configuration
IOUG Data Integration SIG w/ Oracle GoldenGate Solutions and Configuration
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
 
Oracle Golden Gate Interview Questions
Oracle Golden Gate Interview QuestionsOracle Golden Gate Interview Questions
Oracle Golden Gate Interview Questions
 
Doag data replication with oracle golden gate: Looking behind the scenes
Doag data replication with oracle golden gate: Looking behind the scenesDoag data replication with oracle golden gate: Looking behind the scenes
Doag data replication with oracle golden gate: Looking behind the scenes
 

Recently uploaded

Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
Rohit Gautam
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 

Recently uploaded (20)

Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 

Meetup photo processing, storage, and serving - NYC Tech Talks Meetup

  • 1. Photos at Meetup processing / storage / serving Greg Whalin (@gwhalin)
  • 2. Meetup founded in 2002 in NYC 81k+ paying groups 1.2m+ RSVPs per months 25k+ Meetups per week 6 straight quarters of profitability team of 70 people working on MEME we are hiring
  • 3. Photos (processing and storage) ~500k photo uploads per month processing 3 - 5 different dimensions orientation (not dealing with for now) exif storage cheap easy to scale performance not critical because ... serving fast but CDN ok local cache
  • 4. Processing ImageMagick (or GraphicsMagick) CPU intensive No processing in core application for large uploads, can hold up request separate service (backed by distributed cluster) should be able to work behind dumb round-robin load balancer Batch for background processing rapid turnaround still important
  • 5. Storage Requirements: cheap, scalable, redundant, easy local RAID easy, but not scalable, redundant or cheap SAN scalable, easy, and redundant, but not cheap NAS scalable, easy, and cheap, but not super redundant Distributed can be cheap, scalable, redundant, and easy We went w/ MogileFS Cheap and scalable: jbod on entire network Redundant: auto-replication schemes (rack and chassis aware), no SPOF Easy: kinda, but no POSIX interface; web api (GET/PUT)
  • 6. Photo upload life cycle 1. User POSTs photo(s) from computer to Meetup App servers 2. Record inserted into db in a pending state 3. App server POSTs photo w/ some metadata to staging cluster (behind dumb load balancer) 4. lighttpd+fastCGI process running on each stager stores photo and meta to filesystem 5. Stager job monitors directory (using inotify) and wakes up when new photo to process 6. Stager code (Perl) processes photo, stores in MogileFS and updates DB that it is complete
  • 7.
  • 8. Current Setup 2 x dedicated upload app servers machines rarely ever restarted so as to avoid interrupting long uploads java + tomcat 8 proc Opteron w/ 16 GB 2 x dedicated staging/processing boxes lighttpd + perl daemon 4 proc Opteron w/ 8 GB 9 x storage nodes (Mogile) 4 proc Opteron w/ 8 GB JBOD (old boxes 8 x 500GB, new 16 x 1TB) total capacity now 86TB (72 used) 2 x db boxes for meta info (running mysql)
  • 9.
  • 10.
  • 11. Serving 2 serving nodes 2 proc Opterons w/ 4 GB (old cheap boxes) lighttpd + fastCGI retrieve photo from MogileFS, and return it not high speed at serving so... Akamai photos served w/ long TTL for cache even so, 20% of all requests hit our origin servers (lots of edge?) so ... Varnish cache in front of origin servers (coming soon) Akamai midgress
  • 12.
  • 13.