SlideShare a Scribd company logo
1 of 55
Download to read offline
Introduction   Storage         Processing     Monitoring   Review




                   Scaling at Showyou
                             Operations




                         September 26, 2011
Introduction           Storage     Processing   Monitoring   Review




      I’m Kyle Kingsbury

               Handle aphyr
                Code http://github.com/aphyr
                Email kyle@remixation.com
                Focus Backend, API, ops
Introduction   Storage       Processing    Monitoring   Review




                   What the hell is Showyou?
Introduction   Storage           Processing      Monitoring   Review




                         Nontrivial complexity
Introduction         Storage    Processing   Monitoring   Review




      Challenges
          ˆ Scanning social networks
          ˆ Feeds
          ˆ Search
          ˆ Trends
          ˆ Responsive client experience
Introduction         Storage      Processing   Monitoring   Review




      Challenges
          ˆ Scanning social networks
          ˆ Feeds
          ˆ Search
          ˆ Trends
          ˆ Responsive client experience
          ˆ Everything fails all the time
Introduction   Storage    Processing   Monitoring   Review




                         Storage
Introduction    Storage   Processing   Monitoring   Review




      We left MySQL
Introduction          Storage   Processing     Monitoring   Review




      We left MySQL
          ˆ Changing the schema requires downtime
          ˆ Crashes
          ˆ Master-slave lag
          ˆ Slow restarts
          ˆ Node replacements difficult
          ˆ Fully normalized queries complex, slow
Introduction   Storage            Processing       Monitoring   Review




                           MySQL does scale


                         But there are tradeoffs
Introduction        Storage         Processing    Monitoring   Review




      Riak
          ˆ Key/value store
          ˆ Homogenous
          ˆ Scales linearly with nodes
          ˆ Excellent durability/recoverability
          ˆ Eventually consistent
Introduction        Storage       Processing     Monitoring   Review




      We use Riak as our durable datastore
          ˆ Users, feeds, videos, etc
          ˆ Highly denormalized
          ˆ Limited MR queries (feeds, etc)
              ˆ Latency-bounded MR jobs are Erlang
              ˆ Hot-deployable
          ˆ Extensive use of conflict resolution
              ˆ Made possible by Risky
Introduction        Storage      Processing      Monitoring   Review




      Riak at Showyou
          ˆ 51 million keys (153 M replicated)
          ˆ 100 GB of data (300 GB replicated)
          ˆ 260 gets/sec (baseline)
          ˆ 75 puts/sec (baseline)
          ˆ Capable of over 3000 ops/sec
Introduction          Storage     Processing           Monitoring    Review




      SSDs are amazing


          WD 7200RPM                           Micron RealSSD P300
               ˆ 100 ops/sec                    ˆ 1000+ ops/sec
               ˆ 95%: 100-300ms                 ˆ 95%: 3-5ms
Introduction       Storage       Processing    Monitoring        Review




      When Riak fails,
          ˆ Another node takes up the slack
          ˆ Clients connected to that node reconnect to others
          ˆ Typically no service interruption
              ˆ However, latencies may rise
              ˆ Especially for MR jobs
Introduction        Storage      Processing        Monitoring   Review




      Riak has downsides
          ˆ Difficult to debug
          ˆ Membership changes are dangerous
          ˆ Significantly slower than MySQL
          ˆ (Bitcask) All keys must fit in memory
          ˆ Mapreduce is only appropriate for known keys
          ˆ List-keys can take down your cluster

      Long story short: it’s only a KV store
Introduction   Storage    Processing   Monitoring   Review




                         +Redis
Introduction         Storage     Processing    Monitoring   Review




      We use Redis for fast, temporary state
          ˆ List of users
          ˆ List of videos
          ˆ Counters
          ˆ Queues

      Incredibly fast, excellent primitives
Introduction           Storage   Processing     Monitoring     Review




      When Redis fails,
          ˆ Daemons using those indexes pause
          ˆ Frontend service continues
          ˆ Bitcask scanners and incremental updaters repair
               any lost data
Introduction           Storage   Processing     Monitoring     Review




      When Redis fails,
          ˆ Daemons using those indexes pause
          ˆ Frontend service continues
          ˆ Bitcask scanners and incremental updaters repair
               any lost data
      Eventually consistent.
Introduction        Storage     Processing   Monitoring   Review




      We also use SOLR extensively
          ˆ Supplements Riak
          ˆ Complex indices
          ˆ Full-text search
          ˆ Analytics

      More on that later. . .
Introduction   Storage     Processing   Monitoring   Review




                         Processing
Introduction        Storage         Processing    Monitoring   Review




      Do one thing well

      Lots of small processes handling well-defined tasks
          ˆ Easier to debug
          ˆ Easier to test
          ˆ Simplifies parallelism
          ˆ Simplifies error handling
          ˆ Less likely to cause total system failure
Introduction        Storage      Processing    Monitoring   Review




      Minimize Shared State
          ˆ Vector clocks for concurrent modification
          ˆ Queues for message passing
          ˆ Riak for durable storage
          ˆ Redis for fast synchronous state
Introduction       Storage       Processing       Monitoring   Review




      Crash by Default
          ˆ Someone else will take your work
          ˆ Repair constantly
          ˆ Assume everybody is out to kill you
Introduction             Storage   Processing    Monitoring     Review




      Distribute
          ˆ Multiple threads, processes, hosts
          ˆ Failover IPs with Heartbeat
          ˆ Rolling restarts mean frequent deploys and nobody
               notices
          ˆ Losing a node is no big deal
          ˆ Scaling out is easy
Introduction   Storage     Processing   Monitoring   Review




                         Monitoring
Introduction   Storage      Processing     Monitoring   Review




                  UState: A state aggregator
Introduction           Storage      Processing   Monitoring   Review




      Receive states over protobufs
                 Host backend1.showyou.com
               Service feed merger rate
                 Time unix epoch seconds
                State ok
                Metric 12.5
      Description 12.5 feed items/sec
Introduction        Storage       Processing      Monitoring   Review




      Query states
          ˆ state = "warning" or state = "critical"
          ˆ service =∼ "api %" and host != null
Introduction        Storage       Processing     Monitoring   Review




          ˆ Combine states together (sum, average, . . . )
          ˆ Send email on changes
          ˆ Forward to another UState server
          ˆ Forward to Graphite
          ˆ Dashboard
Introduction   Storage      Processing    Monitoring   Review




               Understand application behavior
Introduction   Storage         Processing     Monitoring   Review




                         When can we. . . ?
Introduction   Storage        Processing   Monitoring   Review




                         It’s 23:15 PST.
Introduction      Storage        Processing   Monitoring   Review




                            It’s 23:15 PST.

               Do you know where YOUR database is?
Introduction   Storage     Processing     Monitoring   Review




               http://github.com/aphyr/ustate
Introduction        Storage        Processing   Monitoring   Review




      Recap
          ˆ Robust, discrete components
          ˆ Highly distributed
          ˆ Message passing
          ˆ Eventual consistency
          ˆ Comprehensive monitoring
Introduction       Storage      Processing   Monitoring   Review




      Thanks!
          ˆ Basho (esp. Pharkmillups!)
          ˆ Formspring
          ˆ Bump

More Related Content

What's hot

Spectre/Meltdown security vulnerabilities FAQ
Spectre/Meltdown security vulnerabilities FAQSpectre/Meltdown security vulnerabilities FAQ
Spectre/Meltdown security vulnerabilities FAQDavid Pasek
 
A day in the life of a VSAN I/O - STO7875
A day in the life of a VSAN I/O - STO7875A day in the life of a VSAN I/O - STO7875
A day in the life of a VSAN I/O - STO7875Duncan Epping
 
Webinar slides: ClusterControl 1.4: The MySQL Replication & MongoDB Edition -...
Webinar slides: ClusterControl 1.4: The MySQL Replication & MongoDB Edition -...Webinar slides: ClusterControl 1.4: The MySQL Replication & MongoDB Edition -...
Webinar slides: ClusterControl 1.4: The MySQL Replication & MongoDB Edition -...Severalnines
 
Why Use Oracle VM for Oracle Databases? Revera Presentation
Why Use Oracle VM for Oracle Databases? Revera PresentationWhy Use Oracle VM for Oracle Databases? Revera Presentation
Why Use Oracle VM for Oracle Databases? Revera PresentationFrancisco Alvarez
 
Monitoring and Tuning GlassFish
Monitoring and Tuning GlassFishMonitoring and Tuning GlassFish
Monitoring and Tuning GlassFishC2B2 Consulting
 
STO7535 Virtual SAN Proof of Concept - VMworld 2016
STO7535 Virtual SAN Proof of Concept - VMworld 2016STO7535 Virtual SAN Proof of Concept - VMworld 2016
STO7535 Virtual SAN Proof of Concept - VMworld 2016Cormac Hogan
 
VMworld 2015: Explaining Advanced Virtual Volumes Configurations
VMworld 2015: Explaining Advanced Virtual Volumes ConfigurationsVMworld 2015: Explaining Advanced Virtual Volumes Configurations
VMworld 2015: Explaining Advanced Virtual Volumes ConfigurationsVMworld
 
VMworld 2014: vSphere Distributed Switch
VMworld 2014: vSphere Distributed SwitchVMworld 2014: vSphere Distributed Switch
VMworld 2014: vSphere Distributed SwitchVMworld
 
VMworld 2016: Virtualize Active Directory, the Right Way!
VMworld 2016: Virtualize Active Directory, the Right Way! VMworld 2016: Virtualize Active Directory, the Right Way!
VMworld 2016: Virtualize Active Directory, the Right Way! VMworld
 
VMworld Europe 2014: Top 10 Do’s / Don’ts of Data Protection For VMware vSphere
VMworld Europe 2014: Top 10 Do’s / Don’ts of Data Protection For VMware vSphereVMworld Europe 2014: Top 10 Do’s / Don’ts of Data Protection For VMware vSphere
VMworld Europe 2014: Top 10 Do’s / Don’ts of Data Protection For VMware vSphereVMworld
 
Salt Cloud vmware-orchestration
Salt Cloud vmware-orchestrationSalt Cloud vmware-orchestration
Salt Cloud vmware-orchestrationMo Rawi
 
VMworld 2015: Networking Virtual SAN's Backbone
VMworld 2015: Networking Virtual SAN's BackboneVMworld 2015: Networking Virtual SAN's Backbone
VMworld 2015: Networking Virtual SAN's BackboneVMworld
 
VMworld 2015: Virtualize Active Directory, the Right Way!
VMworld 2015: Virtualize Active Directory, the Right Way!VMworld 2015: Virtualize Active Directory, the Right Way!
VMworld 2015: Virtualize Active Directory, the Right Way!VMworld
 
VMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep DiveVMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep DiveVMworld
 
Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin  Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin Kuberton
 
Sql server 2012 ha dr 24_hop_final
Sql server 2012 ha dr 24_hop_finalSql server 2012 ha dr 24_hop_final
Sql server 2012 ha dr 24_hop_finalJoseph D'Antoni
 
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...VMworld
 
VMworld 2015: Extreme Performance Series - vCenter Performance Best Practices
VMworld 2015: Extreme Performance Series - vCenter Performance Best PracticesVMworld 2015: Extreme Performance Series - vCenter Performance Best Practices
VMworld 2015: Extreme Performance Series - vCenter Performance Best PracticesVMworld
 
VMworld 2015: Advanced SQL Server on vSphere
VMworld 2015: Advanced SQL Server on vSphereVMworld 2015: Advanced SQL Server on vSphere
VMworld 2015: Advanced SQL Server on vSphereVMworld
 
Kscope 2013 delphix
Kscope 2013 delphixKscope 2013 delphix
Kscope 2013 delphixKyle Hailey
 

What's hot (20)

Spectre/Meltdown security vulnerabilities FAQ
Spectre/Meltdown security vulnerabilities FAQSpectre/Meltdown security vulnerabilities FAQ
Spectre/Meltdown security vulnerabilities FAQ
 
A day in the life of a VSAN I/O - STO7875
A day in the life of a VSAN I/O - STO7875A day in the life of a VSAN I/O - STO7875
A day in the life of a VSAN I/O - STO7875
 
Webinar slides: ClusterControl 1.4: The MySQL Replication & MongoDB Edition -...
Webinar slides: ClusterControl 1.4: The MySQL Replication & MongoDB Edition -...Webinar slides: ClusterControl 1.4: The MySQL Replication & MongoDB Edition -...
Webinar slides: ClusterControl 1.4: The MySQL Replication & MongoDB Edition -...
 
Why Use Oracle VM for Oracle Databases? Revera Presentation
Why Use Oracle VM for Oracle Databases? Revera PresentationWhy Use Oracle VM for Oracle Databases? Revera Presentation
Why Use Oracle VM for Oracle Databases? Revera Presentation
 
Monitoring and Tuning GlassFish
Monitoring and Tuning GlassFishMonitoring and Tuning GlassFish
Monitoring and Tuning GlassFish
 
STO7535 Virtual SAN Proof of Concept - VMworld 2016
STO7535 Virtual SAN Proof of Concept - VMworld 2016STO7535 Virtual SAN Proof of Concept - VMworld 2016
STO7535 Virtual SAN Proof of Concept - VMworld 2016
 
VMworld 2015: Explaining Advanced Virtual Volumes Configurations
VMworld 2015: Explaining Advanced Virtual Volumes ConfigurationsVMworld 2015: Explaining Advanced Virtual Volumes Configurations
VMworld 2015: Explaining Advanced Virtual Volumes Configurations
 
VMworld 2014: vSphere Distributed Switch
VMworld 2014: vSphere Distributed SwitchVMworld 2014: vSphere Distributed Switch
VMworld 2014: vSphere Distributed Switch
 
VMworld 2016: Virtualize Active Directory, the Right Way!
VMworld 2016: Virtualize Active Directory, the Right Way! VMworld 2016: Virtualize Active Directory, the Right Way!
VMworld 2016: Virtualize Active Directory, the Right Way!
 
VMworld Europe 2014: Top 10 Do’s / Don’ts of Data Protection For VMware vSphere
VMworld Europe 2014: Top 10 Do’s / Don’ts of Data Protection For VMware vSphereVMworld Europe 2014: Top 10 Do’s / Don’ts of Data Protection For VMware vSphere
VMworld Europe 2014: Top 10 Do’s / Don’ts of Data Protection For VMware vSphere
 
Salt Cloud vmware-orchestration
Salt Cloud vmware-orchestrationSalt Cloud vmware-orchestration
Salt Cloud vmware-orchestration
 
VMworld 2015: Networking Virtual SAN's Backbone
VMworld 2015: Networking Virtual SAN's BackboneVMworld 2015: Networking Virtual SAN's Backbone
VMworld 2015: Networking Virtual SAN's Backbone
 
VMworld 2015: Virtualize Active Directory, the Right Way!
VMworld 2015: Virtualize Active Directory, the Right Way!VMworld 2015: Virtualize Active Directory, the Right Way!
VMworld 2015: Virtualize Active Directory, the Right Way!
 
VMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep DiveVMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep Dive
 
Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin  Monitoring&Logging - Stanislav Kolenkin
Monitoring&Logging - Stanislav Kolenkin
 
Sql server 2012 ha dr 24_hop_final
Sql server 2012 ha dr 24_hop_finalSql server 2012 ha dr 24_hop_final
Sql server 2012 ha dr 24_hop_final
 
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...
 
VMworld 2015: Extreme Performance Series - vCenter Performance Best Practices
VMworld 2015: Extreme Performance Series - vCenter Performance Best PracticesVMworld 2015: Extreme Performance Series - vCenter Performance Best Practices
VMworld 2015: Extreme Performance Series - vCenter Performance Best Practices
 
VMworld 2015: Advanced SQL Server on vSphere
VMworld 2015: Advanced SQL Server on vSphereVMworld 2015: Advanced SQL Server on vSphere
VMworld 2015: Advanced SQL Server on vSphere
 
Kscope 2013 delphix
Kscope 2013 delphixKscope 2013 delphix
Kscope 2013 delphix
 

Viewers also liked

Scaling operations digital catapult
Scaling operations   digital catapultScaling operations   digital catapult
Scaling operations digital catapultDavid Norris
 
Scaling operations in the Subscription Economy (Brandwatch)
Scaling operations in the Subscription Economy (Brandwatch)Scaling operations in the Subscription Economy (Brandwatch)
Scaling operations in the Subscription Economy (Brandwatch)Zuora, Inc.
 
Scaling operations
Scaling operationsScaling operations
Scaling operationsDavid Norris
 
The Operations Perspective: Scaling Operations in the Subscription Economy
The Operations Perspective: Scaling Operations in the Subscription EconomyThe Operations Perspective: Scaling Operations in the Subscription Economy
The Operations Perspective: Scaling Operations in the Subscription EconomyZuora, Inc.
 
You Can't Change Culture, But You Can Change Behavior (DevOpsDays Rome 2012)
You Can't Change Culture, But You Can Change Behavior (DevOpsDays Rome 2012)You Can't Change Culture, But You Can Change Behavior (DevOpsDays Rome 2012)
You Can't Change Culture, But You Can Change Behavior (DevOpsDays Rome 2012)dev2ops
 
Continuously Deploying Culture: Scaling Culture at Etsy - Velocity Europe 2012
Continuously Deploying Culture: Scaling Culture at Etsy - Velocity Europe 2012Continuously Deploying Culture: Scaling Culture at Etsy - Velocity Europe 2012
Continuously Deploying Culture: Scaling Culture at Etsy - Velocity Europe 2012Patrick McDonnell
 

Viewers also liked (7)

Scaling operations digital catapult
Scaling operations   digital catapultScaling operations   digital catapult
Scaling operations digital catapult
 
Scaling operations in the Subscription Economy (Brandwatch)
Scaling operations in the Subscription Economy (Brandwatch)Scaling operations in the Subscription Economy (Brandwatch)
Scaling operations in the Subscription Economy (Brandwatch)
 
Scaling operations
Scaling operationsScaling operations
Scaling operations
 
The Operations Perspective: Scaling Operations in the Subscription Economy
The Operations Perspective: Scaling Operations in the Subscription EconomyThe Operations Perspective: Scaling Operations in the Subscription Economy
The Operations Perspective: Scaling Operations in the Subscription Economy
 
You Can't Change Culture, But You Can Change Behavior (DevOpsDays Rome 2012)
You Can't Change Culture, But You Can Change Behavior (DevOpsDays Rome 2012)You Can't Change Culture, But You Can Change Behavior (DevOpsDays Rome 2012)
You Can't Change Culture, But You Can Change Behavior (DevOpsDays Rome 2012)
 
Continuously Deploying Culture: Scaling Culture at Etsy - Velocity Europe 2012
Continuously Deploying Culture: Scaling Culture at Etsy - Velocity Europe 2012Continuously Deploying Culture: Scaling Culture at Etsy - Velocity Europe 2012
Continuously Deploying Culture: Scaling Culture at Etsy - Velocity Europe 2012
 
Scaling Operations At Spotify
Scaling Operations At SpotifyScaling Operations At Spotify
Scaling Operations At Spotify
 

Similar to Scaling at Showyou: Operations

Always On - Wydajność i bezpieczeństwo naszych danych - High Availability SQL...
Always On - Wydajność i bezpieczeństwo naszych danych - High Availability SQL...Always On - Wydajność i bezpieczeństwo naszych danych - High Availability SQL...
Always On - Wydajność i bezpieczeństwo naszych danych - High Availability SQL...SQLExpert.pl
 
Log-Structured File System (LSFS) as a weapon to fight “I/O Blender” virtuali...
Log-Structured File System (LSFS) as a weapon to fight “I/O Blender” virtuali...Log-Structured File System (LSFS) as a weapon to fight “I/O Blender” virtuali...
Log-Structured File System (LSFS) as a weapon to fight “I/O Blender” virtuali...StarWind Software
 
A Very Biased Comparison of MVC Libraries
A Very Biased Comparison of MVC LibrariesA Very Biased Comparison of MVC Libraries
A Very Biased Comparison of MVC LibrariesBrian Moschel
 
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012Michael Noel
 
Planning For High Performance Web Application
Planning For High Performance Web ApplicationPlanning For High Performance Web Application
Planning For High Performance Web ApplicationYue Tian
 
Azure SQL Managed Instance - SqlBits 2019
Azure SQL Managed Instance - SqlBits 2019Azure SQL Managed Instance - SqlBits 2019
Azure SQL Managed Instance - SqlBits 2019Jovan Popovic
 
VMworld 2013: Storage IO Control: Concepts, Configuration and Best Practices ...
VMworld 2013: Storage IO Control: Concepts, Configuration and Best Practices ...VMworld 2013: Storage IO Control: Concepts, Configuration and Best Practices ...
VMworld 2013: Storage IO Control: Concepts, Configuration and Best Practices ...VMworld
 
Summer training oracle
Summer training   oracle Summer training   oracle
Summer training oracle Arshit Rai
 
Observability in real time at scale
Observability in real time at scaleObservability in real time at scale
Observability in real time at scaleBalvinder Hira
 
Jakarta EE Test Strategies (2022)
Jakarta EE Test Strategies (2022)Jakarta EE Test Strategies (2022)
Jakarta EE Test Strategies (2022)Ryan Cuprak
 
Summer training oracle
Summer training   oracle Summer training   oracle
Summer training oracle Arshit Rai
 
Use the SAP Content Server for Your Document Imaging and Archiving Needs!
Use the SAP Content Server for Your Document Imaging and Archiving Needs!Use the SAP Content Server for Your Document Imaging and Archiving Needs!
Use the SAP Content Server for Your Document Imaging and Archiving Needs!Verbella CMG
 
Focus on your app with Amazon RDS
Focus on your app with Amazon RDSFocus on your app with Amazon RDS
Focus on your app with Amazon RDSAmazon Web Services
 
Experience In Building Scalable Web Sites Through Infrastructure's View
Experience In Building Scalable Web Sites Through Infrastructure's ViewExperience In Building Scalable Web Sites Through Infrastructure's View
Experience In Building Scalable Web Sites Through Infrastructure's ViewPhuwadon D
 
SteelEye 201409 표준 제안서
SteelEye 201409 표준 제안서SteelEye 201409 표준 제안서
SteelEye 201409 표준 제안서Yong-uk Choe
 
MySQL Alta Disponibilidade com Replicação
 MySQL Alta Disponibilidade com Replicação MySQL Alta Disponibilidade com Replicação
MySQL Alta Disponibilidade com ReplicaçãoMySQL Brasil
 
VMworld 2013: Troubleshooting at Cox Communications with VMware vCenter Log I...
VMworld 2013: Troubleshooting at Cox Communications with VMware vCenter Log I...VMworld 2013: Troubleshooting at Cox Communications with VMware vCenter Log I...
VMworld 2013: Troubleshooting at Cox Communications with VMware vCenter Log I...VMworld
 
Enterprise Storage NAS - Dual Controller
Enterprise Storage NAS - Dual ControllerEnterprise Storage NAS - Dual Controller
Enterprise Storage NAS - Dual ControllerFernando Barrientos
 
#lspe: Dynamic Scaling
#lspe: Dynamic Scaling #lspe: Dynamic Scaling
#lspe: Dynamic Scaling steveshah
 

Similar to Scaling at Showyou: Operations (20)

Always On - Wydajność i bezpieczeństwo naszych danych - High Availability SQL...
Always On - Wydajność i bezpieczeństwo naszych danych - High Availability SQL...Always On - Wydajność i bezpieczeństwo naszych danych - High Availability SQL...
Always On - Wydajność i bezpieczeństwo naszych danych - High Availability SQL...
 
Log-Structured File System (LSFS) as a weapon to fight “I/O Blender” virtuali...
Log-Structured File System (LSFS) as a weapon to fight “I/O Blender” virtuali...Log-Structured File System (LSFS) as a weapon to fight “I/O Blender” virtuali...
Log-Structured File System (LSFS) as a weapon to fight “I/O Blender” virtuali...
 
A Very Biased Comparison of MVC Libraries
A Very Biased Comparison of MVC LibrariesA Very Biased Comparison of MVC Libraries
A Very Biased Comparison of MVC Libraries
 
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012
 
Planning For High Performance Web Application
Planning For High Performance Web ApplicationPlanning For High Performance Web Application
Planning For High Performance Web Application
 
Azure SQL Managed Instance - SqlBits 2019
Azure SQL Managed Instance - SqlBits 2019Azure SQL Managed Instance - SqlBits 2019
Azure SQL Managed Instance - SqlBits 2019
 
VMworld 2013: Storage IO Control: Concepts, Configuration and Best Practices ...
VMworld 2013: Storage IO Control: Concepts, Configuration and Best Practices ...VMworld 2013: Storage IO Control: Concepts, Configuration and Best Practices ...
VMworld 2013: Storage IO Control: Concepts, Configuration and Best Practices ...
 
Summer training oracle
Summer training   oracle Summer training   oracle
Summer training oracle
 
Jvm fundamentals
Jvm fundamentalsJvm fundamentals
Jvm fundamentals
 
Observability in real time at scale
Observability in real time at scaleObservability in real time at scale
Observability in real time at scale
 
Jakarta EE Test Strategies (2022)
Jakarta EE Test Strategies (2022)Jakarta EE Test Strategies (2022)
Jakarta EE Test Strategies (2022)
 
Summer training oracle
Summer training   oracle Summer training   oracle
Summer training oracle
 
Use the SAP Content Server for Your Document Imaging and Archiving Needs!
Use the SAP Content Server for Your Document Imaging and Archiving Needs!Use the SAP Content Server for Your Document Imaging and Archiving Needs!
Use the SAP Content Server for Your Document Imaging and Archiving Needs!
 
Focus on your app with Amazon RDS
Focus on your app with Amazon RDSFocus on your app with Amazon RDS
Focus on your app with Amazon RDS
 
Experience In Building Scalable Web Sites Through Infrastructure's View
Experience In Building Scalable Web Sites Through Infrastructure's ViewExperience In Building Scalable Web Sites Through Infrastructure's View
Experience In Building Scalable Web Sites Through Infrastructure's View
 
SteelEye 201409 표준 제안서
SteelEye 201409 표준 제안서SteelEye 201409 표준 제안서
SteelEye 201409 표준 제안서
 
MySQL Alta Disponibilidade com Replicação
 MySQL Alta Disponibilidade com Replicação MySQL Alta Disponibilidade com Replicação
MySQL Alta Disponibilidade com Replicação
 
VMworld 2013: Troubleshooting at Cox Communications with VMware vCenter Log I...
VMworld 2013: Troubleshooting at Cox Communications with VMware vCenter Log I...VMworld 2013: Troubleshooting at Cox Communications with VMware vCenter Log I...
VMworld 2013: Troubleshooting at Cox Communications with VMware vCenter Log I...
 
Enterprise Storage NAS - Dual Controller
Enterprise Storage NAS - Dual ControllerEnterprise Storage NAS - Dual Controller
Enterprise Storage NAS - Dual Controller
 
#lspe: Dynamic Scaling
#lspe: Dynamic Scaling #lspe: Dynamic Scaling
#lspe: Dynamic Scaling
 

Recently uploaded

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Scaling at Showyou: Operations

  • 1. Introduction Storage Processing Monitoring Review Scaling at Showyou Operations September 26, 2011
  • 2. Introduction Storage Processing Monitoring Review I’m Kyle Kingsbury Handle aphyr Code http://github.com/aphyr Email kyle@remixation.com Focus Backend, API, ops
  • 3. Introduction Storage Processing Monitoring Review What the hell is Showyou?
  • 4.
  • 5. Introduction Storage Processing Monitoring Review Nontrivial complexity
  • 6. Introduction Storage Processing Monitoring Review Challenges ˆ Scanning social networks ˆ Feeds ˆ Search ˆ Trends ˆ Responsive client experience
  • 7. Introduction Storage Processing Monitoring Review Challenges ˆ Scanning social networks ˆ Feeds ˆ Search ˆ Trends ˆ Responsive client experience ˆ Everything fails all the time
  • 8.
  • 9. Introduction Storage Processing Monitoring Review Storage
  • 10.
  • 11. Introduction Storage Processing Monitoring Review We left MySQL
  • 12. Introduction Storage Processing Monitoring Review We left MySQL ˆ Changing the schema requires downtime ˆ Crashes ˆ Master-slave lag ˆ Slow restarts ˆ Node replacements difficult ˆ Fully normalized queries complex, slow
  • 13. Introduction Storage Processing Monitoring Review MySQL does scale But there are tradeoffs
  • 14. Introduction Storage Processing Monitoring Review Riak ˆ Key/value store ˆ Homogenous ˆ Scales linearly with nodes ˆ Excellent durability/recoverability ˆ Eventually consistent
  • 15. Introduction Storage Processing Monitoring Review We use Riak as our durable datastore ˆ Users, feeds, videos, etc ˆ Highly denormalized ˆ Limited MR queries (feeds, etc) ˆ Latency-bounded MR jobs are Erlang ˆ Hot-deployable ˆ Extensive use of conflict resolution ˆ Made possible by Risky
  • 16. Introduction Storage Processing Monitoring Review Riak at Showyou ˆ 51 million keys (153 M replicated) ˆ 100 GB of data (300 GB replicated) ˆ 260 gets/sec (baseline) ˆ 75 puts/sec (baseline) ˆ Capable of over 3000 ops/sec
  • 17. Introduction Storage Processing Monitoring Review SSDs are amazing WD 7200RPM Micron RealSSD P300 ˆ 100 ops/sec ˆ 1000+ ops/sec ˆ 95%: 100-300ms ˆ 95%: 3-5ms
  • 18. Introduction Storage Processing Monitoring Review When Riak fails, ˆ Another node takes up the slack ˆ Clients connected to that node reconnect to others ˆ Typically no service interruption ˆ However, latencies may rise ˆ Especially for MR jobs
  • 19. Introduction Storage Processing Monitoring Review Riak has downsides ˆ Difficult to debug ˆ Membership changes are dangerous ˆ Significantly slower than MySQL ˆ (Bitcask) All keys must fit in memory ˆ Mapreduce is only appropriate for known keys ˆ List-keys can take down your cluster Long story short: it’s only a KV store
  • 20. Introduction Storage Processing Monitoring Review +Redis
  • 21. Introduction Storage Processing Monitoring Review We use Redis for fast, temporary state ˆ List of users ˆ List of videos ˆ Counters ˆ Queues Incredibly fast, excellent primitives
  • 22. Introduction Storage Processing Monitoring Review When Redis fails, ˆ Daemons using those indexes pause ˆ Frontend service continues ˆ Bitcask scanners and incremental updaters repair any lost data
  • 23. Introduction Storage Processing Monitoring Review When Redis fails, ˆ Daemons using those indexes pause ˆ Frontend service continues ˆ Bitcask scanners and incremental updaters repair any lost data Eventually consistent.
  • 24. Introduction Storage Processing Monitoring Review We also use SOLR extensively ˆ Supplements Riak ˆ Complex indices ˆ Full-text search ˆ Analytics More on that later. . .
  • 25. Introduction Storage Processing Monitoring Review Processing
  • 26.
  • 27. Introduction Storage Processing Monitoring Review Do one thing well Lots of small processes handling well-defined tasks ˆ Easier to debug ˆ Easier to test ˆ Simplifies parallelism ˆ Simplifies error handling ˆ Less likely to cause total system failure
  • 28. Introduction Storage Processing Monitoring Review Minimize Shared State ˆ Vector clocks for concurrent modification ˆ Queues for message passing ˆ Riak for durable storage ˆ Redis for fast synchronous state
  • 29. Introduction Storage Processing Monitoring Review Crash by Default ˆ Someone else will take your work ˆ Repair constantly ˆ Assume everybody is out to kill you
  • 30. Introduction Storage Processing Monitoring Review Distribute ˆ Multiple threads, processes, hosts ˆ Failover IPs with Heartbeat ˆ Rolling restarts mean frequent deploys and nobody notices ˆ Losing a node is no big deal ˆ Scaling out is easy
  • 31.
  • 32. Introduction Storage Processing Monitoring Review Monitoring
  • 33.
  • 34. Introduction Storage Processing Monitoring Review UState: A state aggregator
  • 35. Introduction Storage Processing Monitoring Review Receive states over protobufs Host backend1.showyou.com Service feed merger rate Time unix epoch seconds State ok Metric 12.5 Description 12.5 feed items/sec
  • 36. Introduction Storage Processing Monitoring Review Query states ˆ state = "warning" or state = "critical" ˆ service =∼ "api %" and host != null
  • 37. Introduction Storage Processing Monitoring Review ˆ Combine states together (sum, average, . . . ) ˆ Send email on changes ˆ Forward to another UState server ˆ Forward to Graphite ˆ Dashboard
  • 38.
  • 39.
  • 40. Introduction Storage Processing Monitoring Review Understand application behavior
  • 41.
  • 42. Introduction Storage Processing Monitoring Review When can we. . . ?
  • 43.
  • 44. Introduction Storage Processing Monitoring Review It’s 23:15 PST.
  • 45. Introduction Storage Processing Monitoring Review It’s 23:15 PST. Do you know where YOUR database is?
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53. Introduction Storage Processing Monitoring Review http://github.com/aphyr/ustate
  • 54. Introduction Storage Processing Monitoring Review Recap ˆ Robust, discrete components ˆ Highly distributed ˆ Message passing ˆ Eventual consistency ˆ Comprehensive monitoring
  • 55. Introduction Storage Processing Monitoring Review Thanks! ˆ Basho (esp. Pharkmillups!) ˆ Formspring ˆ Bump