This talk provides an overview of the large platform architecture used at State51 to deliver independent music and media content. It discusses using technologies like Varnish, Nginx, MogileFS, and Memcached to serve content efficiently from cheap hardware. Jobs and queues are used to process workflows like encoding files. The architecture is designed to be redundant and handle failures across the web, application, storage, and processing layers. Logging and monitoring are important for debugging in these complex systems.
Large platform architecture in (mostly) perl - an illustrated tour
1. Large platform architecture in (mostly) perl - an illustrated tour
Tomas (t0m) Doran
São Paulo.pm perl workshop 2010
YAPC::EU Pisa 2010
2. This talk
• Is mostly a ramble
• About what I do for a living
• Good bits
• and bad bits (probably mostly bad bits)
• And when I say ‘illustrated’, I’m not very
good at diagrams, sorry...
3. Making money from
independent music
• IMPOSSIBLE
• No, no it isn’t. But we’re very lucky to have
people who know the music industry
• A startup would tank
• Last.fm guys “keep losing less money”
4. The state51 conspiracy
Consolidated Independent
Media Service Provider
• Several (largely profitable) businesses based
on the same technology platform
• East London (Brick Lane), a warehouse.
• > 60% of UK independent content goes
through us somewhere
5. Being S3 on the cheap
• WAV files are big. Videos are bigger.
• Transcodes aren’t small, especially when
you have 15 of them.
• My music collection is several hundred
terabytes
• Need to be able to serve this stuff fast and
concurrently.
6. MogileFS
• Is free.
• Runs on cheap hardware
• Cheaper than S3.
• Not so awesome if you aren’t Livejournal
7. Data center design
• 8 amp racks. Seriously, WTF!?!?!
• Electricity is more expensive than servers,
ergo rolling hardware upgrades trivially pay
for themselves.
• Transit is really, really expensive.
• Worth buying fiber to other locations to
peer if you need lots of bandwidth.
9. Web architecture
• App servers apache, apps FastCGI, port 81
• Varnish + ESI, caching, port 80
• 1 varnish per host, talks to all the apaches
• 1 VIP per host
• Host fail: VIP transfer
• Apache/app fail (or overload), varnish
rebalances/retries.
10. Web architecture (cont)
• Varnish doesn’t cache media, just provides
failover.
• nginx sends the hit to FastCGI app.
• Returns X-Accel-Redirect.
• nginx talks to MogileFS, handles delivery.
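The hand-off described on this slide can be sketched from the application side. This is a minimal illustration in Python rather than the platform's Perl, and it makes several assumptions that are not in the talk: the `/protected/media/` prefix stands for an nginx location marked `internal`, `is_authorised` is a placeholder permission check, and the `X-Demo-User` request header is invented for the demo. The point is only that the app decides who may fetch a file and returns an empty body, leaving nginx to move the bytes.

```python
# Hypothetical prefix of an nginx location declared `internal`.
INTERNAL_PREFIX = "/protected/media/"


def is_authorised(environ):
    """Placeholder check; a real app would look at the session or a token."""
    return environ.get("HTTP_X_DEMO_USER") is not None


def app(environ, start_response):
    """WSGI app that authorises a request, then hands delivery to nginx."""
    media_id = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]
    if not media_id or not is_authorised(environ):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"forbidden\n"]
    # Empty body: nginx reads X-Accel-Redirect and serves the file itself.
    start_response("200 OK", [
        ("Content-Type", "audio/mpeg"),
        ("X-Accel-Redirect", INTERNAL_PREFIX + media_id),
    ])
    return [b""]
```

The application process is free again as soon as the header is sent, which is why this pattern scales better than streaming the file through the app.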
12. Storage architecture
• Lots of boxes with lots of disk.
• Many additional roles to storage. (Mogile
tracker, memcache node, metal encoding,
VMWare, SOAP Service)
• Not all the boxes do all the roles.
• All the roles can safely fall over and die.
• Which is good, as they do. Or the box falls
over. Or a, then b.
14. WAV files
• WAV is a container format.
• Loosely defined.
• You can stuff XML documents in WAV files
• Some encoders (oh hai flac) very picky.
• ‘dirty’ and ‘clean’ WAV files.
16. Win32
• We’re running ActiveState for hysterical
raisins.
• No XS modules
• Thin as possible
17. Encoding
[Diagram. Labels: HTTP Nodes (x3), Encoding Service, Uploading Service, GET & PUT, SOAP, media, Downloader, Uploader, Local Disk, Encoder, Encoder (mp3), Encoder (wma), Win32, Unix]
18. Snakes On A Plane
• SOAP actually works ok here, as we
control both ends.
• Old version of SOAP::Lite
• Wouldn’t recommend interoperating
19. Logging
• Used to be terribly hard to debug
• Push logs into syslog
• Aggregate in splunk - time correlated from
encoding machines, web service machines,
etc.
• Much easier to work out what happened.
20. Hardware is shit
• When you have several hundred TB, undetected
bit error rate of magnetic media is actually
significant.
• See also networks, memory, etc.
21. Things will always fail
• If you need reliability, you have to design it
in from the start.
• Not only will you have (a lot of) hardware
failures, all the software will break in
unexpected ways. Let’s not talk about
networks..
• Maybe you don’t need this..
22. Queueing
• We have work queues of different types of
media (e.g. mp3/wma/aac etc)
• In the database.
• Don’t do this.
23. MySQL sucks
• 1 type of JOIN
• No query rewriting
• Not enough stats for the planner to be
sane
24. This can hurt
• File Transform table:
• Master (File)
• Result (File)
• Status (pending/complete/failed/running)
• TransformStep (from/to)
• Leads to bad join order, massive fail
26. How to fail
• SELECT all file transforms that lead to wma
(millions).
• JOIN all files, ever (millions). Filter to find
those in state ‘pending’
• All pending looks like a bad bet - cardinality
of ‘all wmas’ looks better than cardinality of
‘all pending’.
• JOIN in the wrong order, nested loop,
screwed..
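A toy model of why the join order matters, in Python. It is deliberately simplified and is not the real schema: the field names, the row counts and the "one file lookup per driving row" cost model are all assumptions made for illustration. Both plans return the same rows; what differs is how many file lookups the nested loop performs on the way.

```python
N = 30_000

# Toy stand-ins for the tables on the slides (field names are assumptions).
files_by_id = {i: {"id": i, "name": f"file{i}.wav"} for i in range(N)}
transforms = [
    {
        "id": i,
        "master": i,
        "target": "wma" if i % 3 == 0 else "mp3",
        "status": "pending" if i % 1000 == 0 else "complete",
    }
    for i in range(N)
]


def wma_first(transforms, files_by_id):
    """Drive from 'leads to wma', join files, filter on status last."""
    lookups, rows = 0, []
    for t in transforms:
        if t["target"] != "wma":
            continue
        master = files_by_id[t["master"]]  # one join probe per wma transform
        lookups += 1
        if t["status"] == "pending":
            rows.append((t["id"], master["name"]))
    return rows, lookups


def pending_first(transforms, files_by_id):
    """Drive from 'pending', join files, filter on target last."""
    lookups, rows = 0, []
    for t in transforms:
        if t["status"] != "pending":
            continue
        master = files_by_id[t["master"]]  # one join probe per pending transform
        lookups += 1
        if t["target"] == "wma":
            rows.append((t["id"], master["name"]))
    return rows, lookups


rows_a, cost_a = wma_first(transforms, files_by_id)
rows_b, cost_b = pending_first(transforms, files_by_id)
print(len(rows_a), cost_a, cost_b)
```

With this toy data the wma-first plan does a few hundred times more lookups for an identical result, which is the shape of the failure the slide describes once the tables hold millions of rows.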
27. Queueing
• Did I mention queues in the DB suck?
• Even if you’re not screwing it up.
• Get a Message Queue (or at least an async
job server)
• If your problem is simple - Gearman.
Harder or you need interop - RabbitMQ.
28. Mutable state
• Mutable state is the enemy
• Too many things rw.
• No idea how an object got to this state
29. Anemic domain model
Object-oriented programming (OOP) is a
programming paradigm that uses "objects" –
data structures consisting of data fields and
methods together with their
interactions – to design applications and
computer programs. Programming techniques
may include features such as data
abstraction, encapsulation, modularity,
polymorphism, and inheritance.
30. Anemic domain model
• Superset of too much mutable state
• Able to create invalid objects
• Able to make previously valid objects
invalid
• Violation of the encapsulation and
information hiding principles.
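One common way to avoid both problems is to validate in the constructor and make state changes return new objects. The sketch below is in Python, not the platform's Perl. The `Transform` class and its transition table are hypothetical: the four status names come from the File Transform slide, but which moves are legal is an assumption.

```python
from dataclasses import dataclass, replace

# Assumed transition rules; only the status names come from the talk.
ALLOWED = {
    "pending": {"running"},
    "running": {"complete", "failed"},
    "failed": {"pending"},
    "complete": set(),
}


@dataclass(frozen=True)
class Transform:
    master: str
    target_format: str
    status: str = "pending"

    def __post_init__(self):
        # An invalid object can never be constructed.
        if not self.master:
            raise ValueError("master file is required")
        if self.status not in ALLOWED:
            raise ValueError(f"unknown status: {self.status!r}")

    def move_to(self, new_status):
        """Return a new Transform; the original is left untouched."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"cannot go from {self.status} to {new_status}")
        return replace(self, status=new_status)


job = Transform("album/track01.wav", "mp3")
running = job.move_to("running")
print(job.status, running.status)
```

Because the object is frozen, nothing outside the class can push a valid object into an invalid state, and every state change goes through one method that can be logged.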
31. scripts
• Lots of our business logic was in scripts
that manipulated objects
• You need people to run scripts (in screen
sessions)
• Ewwww, ewwwww.
32. Jobs
• Moved to a job based approach
• Jobs started by file creation, or changing
state of something in a web app
• Jobs sent via message queuing.
• Results go via message queueing
• Jobs trigger other jobs
33. Jobs Example
• Validate XLS file supplied with order.
• Valid files trigger another job to create
objects for each thing in the XLS
• This then triggers another job to create
transforms, which are then done...
• ... etc ...
• Can’t do this workflow in a web request.
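The chain in this example can be sketched as handlers that each return the follow-up jobs they trigger. This is a Python illustration with an in-process `deque` standing in for the message queue; the handler names, payload fields and the two output formats are all hypothetical.

```python
from collections import deque


def validate_xls(payload):
    """A valid file triggers object creation; an invalid one is reported."""
    rows = payload["rows"]
    if not rows or any("title" not in row for row in rows):
        return [("report_invalid", payload)]
    return [("create_objects", payload)]


def create_objects(payload):
    """One follow-up job per thing listed in the file."""
    return [("create_transforms", {"title": row["title"]}) for row in payload["rows"]]


def create_transforms(payload):
    """One encode job per output format (formats assumed for the demo)."""
    return [("encode", {"title": payload["title"], "format": fmt})
            for fmt in ("mp3", "wma")]


def encode(payload):
    return []  # end of the chain in this sketch


def report_invalid(payload):
    return []


HANDLERS = {
    "validate_xls": validate_xls,
    "create_objects": create_objects,
    "create_transforms": create_transforms,
    "encode": encode,
    "report_invalid": report_invalid,
}


def run(first_job):
    """Process jobs until the queue drains; return job names in run order."""
    queue = deque([first_job])
    processed = []
    while queue:
        name, payload = queue.popleft()
        processed.append(name)
        queue.extend(HANDLERS[name](payload))
    return processed


order = run(("validate_xls", {"rows": [{"title": "A"}, {"title": "B"}]}))
print(order)
```

No handler knows about the whole workflow, only about what it triggers next, so a step can be retried or moved to another machine without touching the others.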
34. Jobs Future
• More automation of things people run
scripts for.
• Automatic job regeneration (you will lose
messages).
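One possible shape for job regeneration, sketched in Python under stated assumptions: the database row is treated as the source of truth, each row records when its job was last queued, and jobs are safe to run twice. The record layout, the one-hour threshold and the `enqueue` callback are hypothetical, not taken from the talk.

```python
from datetime import datetime, timedelta


def regenerate_lost_jobs(records, now, enqueue, max_age=timedelta(hours=1)):
    """Re-send jobs that were queued long ago and never reported back."""
    resent = []
    for record in records:
        unfinished = record["status"] in ("pending", "running")
        if unfinished and now - record["queued_at"] > max_age:
            enqueue(record["id"])
            record["queued_at"] = now  # avoid re-sending on the next sweep
            resent.append(record["id"])
    return resent


now = datetime(2010, 8, 4, 12, 0)
records = [
    {"id": 1, "status": "pending", "queued_at": now - timedelta(hours=3)},
    {"id": 2, "status": "pending", "queued_at": now - timedelta(minutes=5)},
    {"id": 3, "status": "complete", "queued_at": now - timedelta(hours=3)},
]
sent = []
print(regenerate_lost_jobs(records, now, sent.append))
```

Run periodically, a sweep like this turns a lost message into a delay rather than a stuck order.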
35. Lava flow
• Old (possibly unclean/invalid) data
• Old (unused/unmaintained) code
• “What harm does it do”
37. Data consistency
• This should theoretically be the same thing
as relational integrity.
• In practice...
38. Mumble View Crap
• Too much logic in templates
• Copy & paste
• Business objects viewed as unchangeable
• Deleted 3000 lines from 2 simple
workflows. This fixed a dozen bugs.
39. Tangram
• No LEFT JOIN
• Displaying a product list becomes an x n
problem.
• OUCH
• Keep stupid - put the entire DB hot in
memcache!
40. Don’t do web design
• You are a programmer
• Make people pay for a design/CSS/HTML
person
• Work with them
• Be happy
41. Love your sysadmins
• Help them out.
• Build packages, or local::libs or something
• Keep everything in revision control
• Allow things to be sensibly configured.
• DOCUMENT THE POSSIBLE SETTINGS
• Use systems management - Puppet?
42. Love your logs
• Active feedback
• Aggregate in splunk
• Actively prune useless stuff
• Actively add useful stuff after a production
incident
43. ESI
• Is really awesome
• Make the pain go away
• PURGE requests
• Keep everything hot all the time
44. memcache everything
• Keep the entire database hot in memcache
• We mostly ask trivial questions, so just
cache those paths.
• 30 GB of RAM isn’t actually much (3
boxes..)
45. memcache
• IS A CACHE
• Use sequential port numbers and CNAMEs
• E.g. cache0:11210, cache1:11211,
cache2:11212 etc..
• Run several per machine
• Allows you to scale capacity and rebalance
without entire cache flush.
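A sketch of how this naming scheme might work, in Python. This is one reading of the slide, not the platform's actual client code: keys hash to a fixed list of logical names, and only the CNAME targets change when hardware is added. The box names and the choice of SHA-256 are assumptions.

```python
import hashlib

# Logical bucket names never change; only what they resolve to does.
BUCKETS = [f"cache{i}:{11210 + i}" for i in range(6)]


def bucket_for(key, buckets=BUCKETS):
    """Map a key to a logical bucket by hashing; stable across rebalances."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return buckets[int.from_bytes(digest[:4], "big") % len(buckets)]


def host_for(key, cnames):
    """Resolve the key's bucket name through a CNAME table (hypothetical)."""
    name = bucket_for(key).split(":")[0]
    return cnames[name]


# Six memcached instances on two machines...
cnames_before = {f"cache{i}": ("box-a" if i < 3 else "box-b") for i in range(6)}
# ...then a third machine takes over two of the instances.
cnames_after = dict(cnames_before, cache2="box-c", cache5="box-c")

keys = [f"product:{i}" for i in range(500)]
cold = {bucket_for(k).split(":")[0] for k in keys
        if host_for(k, cnames_before) != host_for(k, cnames_after)}
print(sorted(cold))
```

Because the client's bucket list never changes, adding a machine only empties the instances that moved; keys in the other buckets stay warm.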
46. Don’t push bytes
• X-Sendfile and X-Accel-Redirect
• I already talked about file delivery like this
• Using 100 MB of RAM to proxy web
requests does not scale.
47. Test everything
• Redundant systems need testing
• You’ll still die unexpectedly in production
• If you can manage it, make responsibility for
deployment SEP.