Prospero Media Storage
Managing 100TB of small files…
IGT – Event, July 2011
Numbers
  • 70TB used space
  • 700 million files
  • 200GB and 250,000 files uploaded every day
  • 1200Mbps bandwidth throughput at peak
  • 180TB of data served out monthly
  • 3700 hits per second at peak
  • 40 storage node servers, 300TB raw space
  • $0.13 per GB
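As a rough sanity check, the averages implied by these figures can be worked out directly. The sketch below assumes decimal units (1 TB = 10^12 bytes) and a 30-day month; the slide states neither, so the results are approximate.

```python
# Back-of-the-envelope averages derived from the slide's figures.
# Assumptions (not stated on the slide): decimal units, 30-day month.
TB = 10**12
GB = 10**9

used_bytes = 70 * TB
total_files = 700_000_000
daily_upload_bytes = 200 * GB
daily_upload_files = 250_000
monthly_served_bytes = 180 * TB

avg_stored_file_kb = used_bytes / total_files / 1000
avg_uploaded_file_kb = daily_upload_bytes / daily_upload_files / 1000
avg_served_mbps = monthly_served_bytes * 8 / (30 * 24 * 3600) / 10**6

print(f"average stored file:   {avg_stored_file_kb:.0f} KB")
print(f"average uploaded file: {avg_uploaded_file_kb:.0f} KB")
print(f"average outbound rate: {avg_served_mbps:.0f} Mbps")
```

Under those assumptions the average stored file is about 100 KB, the average uploaded file about 800 KB, and the average outbound rate roughly 556 Mbps against the quoted 1200 Mbps peak.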
Motivation
Web 2.0 content serving paradigm shift
  • Too many files: 12M users x 1 file = very long tail
  • Too many connections: 1M users + keepalive = 1M connections
  • Living with modern content in web 2.0: 1 file x (thumbnail + iPhone + Mac) = 3 file copies
Traditional Architecture
[Diagram sequence: an HTTP tier performing IO against centralized storage (NAS, SAN, DAS etc.). Successive slides flag "too many connections" at the HTTP tier and "too much IO" at the storage tier, then add a cache to the picture.]
“There are only two hard things in Computer Science: cache invalidation and naming things.” -- Tim Bray quoting Phil Karlton
Architecture goals
  • Symmetric identical server nodes
  • Simplified management and scaling
  • Linear scaling out
  • No functional / role servers
  • No single point of failure
  • No performance bottlenecks
  • Multiple datacenters support
  • DRP support
  • Geo load distribution
Meet Prospero
  • Distributed Web content storage system
  • Full blown HTTP support
  • Runs on low cost commodity hardware
  • Adjustable file level replication: controls redundancy policy for every content type
  • Provides dynamic image manipulation
How do we do it?
Designed to fail
  • Fallback for every operation: geographical, machine, storage medium
  • Write never fails: all files will reach their destination
  • Journaling: tracking all uploaded files
  • Pending jobs: guaranteed file distribution
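One way to read "write never fails", journaling and pending jobs together is sketched below. This is only an illustration under stated assumptions, not Prospero's implementation: the `JournaledStore` class, the journal file format and the replica callables are all hypothetical. A write is acknowledged once the file and its journal entry are on local disk, and distribution to other nodes is retried until it succeeds.

```python
import json
import os
import tempfile


class JournaledStore:
    """Toy store: a write succeeds once the file and its journal entry are on
    local disk; distribution to other nodes happens later via pending jobs."""

    def __init__(self, root, replicas):
        self.root = root
        self.replicas = replicas      # hypothetical: replica name -> callable(name, data)
        self.journal_path = os.path.join(root, "journal.log")
        self.pending = []             # (file name, replica name) jobs not yet done

    def write(self, name, data):
        with open(os.path.join(self.root, name), "wb") as f:
            f.write(data)
        # Journal the upload so pending work could be rebuilt after a restart
        # (the recovery step itself is not shown here).
        with open(self.journal_path, "a") as j:
            j.write(json.dumps({"file": name, "targets": sorted(self.replicas)}) + "\n")
        self.pending.extend((name, r) for r in sorted(self.replicas))
        return True

    def run_pending(self):
        """Try every pending job once; keep the failures for a later retry."""
        still_pending = []
        for name, replica in self.pending:
            with open(os.path.join(self.root, name), "rb") as f:
                data = f.read()
            try:
                self.replicas[replica](name, data)
            except OSError:
                still_pending.append((name, replica))
        self.pending = still_pending
        return len(self.pending)


received = {}
attempts = {"n": 0}


def healthy(name, data):
    received["healthy:" + name] = data


def flaky(name, data):
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise OSError("replica unreachable")
    received["flaky:" + name] = data


store = JournaledStore(tempfile.mkdtemp(), {"a": healthy, "b": flaky})
accepted = store.write("37D815B5.jpg", b"image-bytes")
first = store.run_pending()    # the flaky replica fails once, so one job stays pending
second = store.run_pending()   # the retry succeeds and nothing is left pending
```

The design choice illustrated is that the upload path never depends on a remote node being reachable; only the background job does.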
How do we achieve this
  • Control the input: define the only unified API
  • Functional process isolation: every function deserves its own process by default
  • Watchdogs, monitors, alerts
get 37D815B5.jpg
  • Go to the "37" range servers
  • Fallback if not found
[Diagram: eight nodes, 0.static through 7.static, reached over HTTP; the ranges shown are 00-1f, 20-3f, 40-5f and 60-7f.]
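The lookup on this slide can be sketched as follows. The sketch assumes that the first two hex characters of the file name select a range of width 0x20, which matches the ranges shown. Which node serves which range is not recoverable from the slide, so the placement table, the `fetch` helper and the `get` callback are hypothetical.

```python
RANGE_WIDTH = 0x20  # each range on the slide spans 0x20 prefixes (00-1f, 20-3f, ...)


def range_label(filename):
    """Return the prefix range, e.g. '20-3f', that a file name falls into."""
    prefix = int(filename[:2], 16)              # '37D815B5.jpg' -> 0x37
    low = (prefix // RANGE_WIDTH) * RANGE_WIDTH
    return f"{low:02x}-{low + RANGE_WIDTH - 1:02x}"


def fetch(filename, range_servers, fallback_servers, get):
    """Ask the servers that own the range first, then fall back to the rest."""
    for host in range_servers.get(range_label(filename), []) + fallback_servers:
        body = get(host, filename)              # hypothetical: returns bytes or None
        if body is not None:
            return host, body
    return None, None


# Hypothetical placement: the node-to-range mapping is not given in the source.
range_servers = {"20-3f": ["2.static"]}
fallback_servers = ["0.static"]
stored = {("0.static", "37D815B5.jpg"): b"jpeg"}  # only the fallback node has the file


def fake_get(host, filename):
    return stored.get((host, filename))


label = range_label("37D815B5.jpg")
host, body = fetch("37D815B5.jpg", range_servers, fallback_servers, fake_get)
```

Here the range node misses, so the request is answered by the fallback node.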
Fallback Example
Node Architecture
Real Life
It’s all about performance
  • Non-blocking IO, readiness notification (epoll)
  • Asynchronous file IO (AIO)
  • Zero copy (sendfile)
  • Memory maps
  • Inter-process binary protocols
  • UNIX socket
  • Minimize dynamic memory allocation
  • lighttpd memory footprint: 50MB
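Readiness notification with non-blocking sockets can be illustrated in a few lines. This is a minimal sketch, not the deck's server code: it uses Python's standard `selectors` module, which selects epoll on Linux and another mechanism elsewhere, and a local socket pair stands in for a client connection.

```python
import selectors
import socket

sel = selectors.DefaultSelector()    # epoll on Linux, kqueue/poll/select elsewhere
left, right = socket.socketpair()    # stand-in for a client/server connection
left.setblocking(False)
right.setblocking(False)
sel.register(right, selectors.EVENT_READ)

idle = sel.select(timeout=0)         # nothing written yet, so no ready events
left.send(b"GET /37D815B5.jpg")
ready = sel.select(timeout=1)        # the kernel now reports `right` as readable
received = [key.fileobj.recv(1024) for key, _ in ready]

sel.unregister(right)
sel.close()
left.close()
right.close()
```

The process only touches a socket after the kernel reports it ready, so one process can watch many idle keepalive connections without blocking on any of them.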
Lessons learnt
  • Be symmetric
  • Control the input
  • Design for failure
  • Performance matters, again
  • Simple is hard but a must

Wix 10M Users Event - Prospero Media Storage


Editor's Notes

  1. AIO: Asynchronous I/O overlaps application processing with I/O operations for improved utilization of CPU and devices and improved application performance, in a dynamic/adaptive manner, especially under high loads involving large numbers of I/O operations.
     Zero-copy: Hardware that supports gather can assemble data from multiple memory locations, eliminating another copy. Step one: the sendfile system call causes the file contents to be copied into a kernel buffer by the DMA engine. Step two: no data is copied into the socket buffer. Instead, only descriptors with information about the whereabouts and length of the data are appended to the socket buffer. The DMA engine passes data directly from the kernel buffer to the protocol engine, thus eliminating the remaining final copy.
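The sendfile path described in this note can be tried from user space with a short sketch. It uses Python's `socket.sendfile`, which calls `os.sendfile` where the platform supports it and otherwise falls back to ordinary `send()`. The local socket pair and the 4 KB payload are illustrative assumptions, and a UNIX socket pair will not exercise the DMA gather step, so this shows only the shape of the call.

```python
import os
import socket
import tempfile

payload = b"x" * 4096   # small enough to fit in the socket buffer without a reader

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

sender, receiver = socket.socketpair()

with open(path, "rb") as f:
    # The file data is handed to the socket without being read into a
    # user-space buffer first (when os.sendfile is available).
    sent = sender.sendfile(f)
sender.close()

received = b""
while True:
    chunk = receiver.recv(65536)
    if not chunk:          # empty read: the sender has closed its end
        break
    received += chunk
receiver.close()
os.unlink(path)
```

The application never holds the file contents in its own memory on the sending side, which is the copy that sendfile removes.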