The Secrets of Building Realtime Big Data Systemsnathanmarz
The architectural principles behind building systems that scale to vast amounts of data and operate on that data in realtime.
Presented at POSSCON '11.
Two popular tools for doing Machine Learning on top of JVM ecosystem is H2O and SparkML. This presentation compares these two tools as Machine Learning libraries (Didn't consider Spark's Data Munjing perspective). This work was done during June of 2018.
Performance Benchmarking: Tips, Tricks, and Lessons LearnedTim Callaghan
Presentation covering 25 years worth of lessons learned while performance benchmarking applications and databases. Presented at Percona Live London in November 2014.
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Amazon Web Services
"This is a technical architect's case study of how Loggly has employed the latest social-media-scale technologies as the backbone ingestion processing for our multi-tenant, geo-distributed, and real-time log management system. This presentation describes design details of how we built a second-generation system fully leveraging AWS services including Amazon Route 53 DNS with heartbeat and latency-based routing, multi-region VPCs, Elastic Load Balancing, Amazon Relational Database Service, and a number of pro-active and re-active approaches to scaling computational and indexing capacity.
The talk includes lessons learned in our first generation release, validated by thousands of customers; speed bumps and the mistakes we made along the way; various data models and architectures previously considered; and success at scale: speeds, feeds, and an unmeltable log processing engine."
The Secrets of Building Realtime Big Data Systemsnathanmarz
The architectural principles behind building systems that scale to vast amounts of data and operate on that data in realtime.
Presented at POSSCON '11.
Two popular tools for doing Machine Learning on top of JVM ecosystem is H2O and SparkML. This presentation compares these two tools as Machine Learning libraries (Didn't consider Spark's Data Munjing perspective). This work was done during June of 2018.
Performance Benchmarking: Tips, Tricks, and Lessons LearnedTim Callaghan
Presentation covering 25 years worth of lessons learned while performance benchmarking applications and databases. Presented at Percona Live London in November 2014.
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Amazon Web Services
"This is a technical architect's case study of how Loggly has employed the latest social-media-scale technologies as the backbone ingestion processing for our multi-tenant, geo-distributed, and real-time log management system. This presentation describes design details of how we built a second-generation system fully leveraging AWS services including Amazon Route 53 DNS with heartbeat and latency-based routing, multi-region VPCs, Elastic Load Balancing, Amazon Relational Database Service, and a number of pro-active and re-active approaches to scaling computational and indexing capacity.
The talk includes lessons learned in our first generation release, validated by thousands of customers; speed bumps and the mistakes we made along the way; various data models and architectures previously considered; and success at scale: speeds, feeds, and an unmeltable log processing engine."
PHP Backends for Real-Time User Interaction using Apache Storm.DECK36
Engaging users in real-time is the topic of our times. Whether it’s a game, a shop, or a content-network, the aim remains the same: providing a personalized experience. In this workshop we will look under the hood of Apache Storm and lay a firm foundation on how to use it with PHP. By that, you can leverage your existing codebase and PHP expertise for an entirely new world: real-time analytics and business logic working on message streams. During the course of the workshop, we will introduce Apache Storm and take a look at all of its components. We will then skyrocket the applicability of Storm by showing you how to implement their components with PHP. All exercises will be conducted using an example project, the infamous and most exhilarating lolcat kitten game ever conceived: Plan 9 From Outer Kitten. In order to follow the hands-on excercises, you will need a development VM prepared by us with all relevant system components and our project repositories. To make the workshop experience as smooth as possible for all participants, please bring a prepared computer to the workshop, as there will be no time to deal with installation and setup issues. Please download all prerequisites and install them as described: VM, Plan 9 webapp, Plan 9 storm backend, (Tutorial: https://github.com/DECK36/plan9_workshop_tutorial ).
Functional Comparison and Performance Evaluation of Streaming FrameworksHuafeng Wang
A report covers the functional comparison and performance evaluation between Apache Flink, Apache Spark Streaming, Apache Storm and Apache Gearpump(incubating)
Slides of erlang factory 2011 London talk "Designing for performance with erlang"
Video of this presentation available at http://vimeo.com/26715793#at=0
Erlang factory SF 2011 "Erlang and the big switch in social games"Paolo Negri
talk given at erlang factory 2011 about using erlang to build social games backends
Watch the video of this presentation http://vimeo.com/22144057#at=0
PHP Backends for Real-Time User Interaction using Apache Storm.DECK36
Engaging users in real-time is the topic of our times. Whether it’s a game, a shop, or a content-network, the aim remains the same: providing a personalized experience. In this workshop we will look under the hood of Apache Storm and lay a firm foundation on how to use it with PHP. By that, you can leverage your existing codebase and PHP expertise for an entirely new world: real-time analytics and business logic working on message streams. During the course of the workshop, we will introduce Apache Storm and take a look at all of its components. We will then skyrocket the applicability of Storm by showing you how to implement their components with PHP. All exercises will be conducted using an example project, the infamous and most exhilarating lolcat kitten game ever conceived: Plan 9 From Outer Kitten. In order to follow the hands-on excercises, you will need a development VM prepared by us with all relevant system components and our project repositories. To make the workshop experience as smooth as possible for all participants, please bring a prepared computer to the workshop, as there will be no time to deal with installation and setup issues. Please download all prerequisites and install them as described: VM, Plan 9 webapp, Plan 9 storm backend, (Tutorial: https://github.com/DECK36/plan9_workshop_tutorial ).
Functional Comparison and Performance Evaluation of Streaming FrameworksHuafeng Wang
A report covers the functional comparison and performance evaluation between Apache Flink, Apache Spark Streaming, Apache Storm and Apache Gearpump(incubating)
Slides of erlang factory 2011 London talk "Designing for performance with erlang"
Video of this presentation available at http://vimeo.com/26715793#at=0
Erlang factory SF 2011 "Erlang and the big switch in social games"Paolo Negri
talk given at erlang factory 2011 about using erlang to build social games backends
Watch the video of this presentation http://vimeo.com/22144057#at=0
.Net Microservices with Event Sourcing, CQRS, Docker and... Windows Server 20...Javier García Magna
Good technical practices you can follow with (micro)services but can be applied to almost anything: discovery (microphone/consul), security, resilience (polly), composition, ssecurity (jwt/oauth2)... And then an example with a CQRS application, and how docker can be used in Windows 2016. Lastly a brief summary of what Service Fabric is and its programming models.
A high level overview of MeteorJS, Amazon Web Services, and how to scale MeteorJS on Amazon's cloud to handle tends of thousands of concurrent websocket connections.
DCSF19 Container Security: Theory & Practice at NetflixDocker, Inc.
Michael Wardrop, Netflix
Usage of containers has undergone rapid growth at Netflix and it is still accelerating. Our container story started organically with developers downloading Docker and using it to improve their developer experience. The first production workloads were simple batch jobs, pioneering micro-services followed, then status as a first class platform running critical workloads.
As the types of workloads changed and their importance increased, the security of our container ecosystem needed to evolve and adapt. This session will cover some security theory, architecture, along with practical considerations, and lessons we learnt along the way.
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDogRedis Labs
Think you have big data? What about high availability
requirements? At DataDog we process billions of data points every day including metrics and events, as we help the world
monitor the their applications and infrastructure. Being the world’s monitoring system is a big responsibility, and thanks to
Redis we are up to the task. Join us as we discuss how the DataDog team monitors and scales Redis to power our SaaS based monitoring offering. We will discuss our usage and deployment patterns, as well as dive into monitoring best practices for production Redis workloads
AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly SolarWinds Loggly
April 2014 update to this presentation: Loggly removed Storm from its architecture. Details here: https://www.loggly.com/blog/what-we-learned-about-scaling-with-apache-storm/
This is a technical architect's case study of how Loggly has employed the latest social-media-scale technologies as the backbone ingestion processing for our multi-tenant, geo-distributed, and real-time log management system. Given by Jim Nisbet and Philip O'Toole, this presentation describes design details of how we built a second-generation system fully leveraging AWS services including Amazon Route 53 DNS with heartbeat and latency-based routing, multi-region VPCs, Elastic Load Balancing, Amazon Relational Database Service, and a number of pro-active and re-active approaches to scaling computational and indexing capacity.
The talk includes lessons learned in our first generation release, validated by thousands of customers; speed bumps and the mistakes we made along the way; various data models and architectures previously considered; and success at scale: speeds, feeds, and an unmeltable log processing engine.
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...Codemotion
At the Cloud, Big Data and Internet division of the Dutch National Police, 4 DevOps teams use the latest open source technology to build high tech, cloud native web applications using Spring Boot, Angular 5, Spark, Kafka and Jenkins 2. I'll share our experiences and real-world use cases for microservices. I’ll show how 4 teams work together on one product and I’ll talk about how we apply the principles of DevOps and Continuous Delivery. I’ll show how we handle security, build pipelines, test automation, performance tests, service discovery, automated deployments, monitoring and more!
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInLinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn. This was a presentation made at QCon 2009 and is embedded on LinkedIn's blog - http://blog.linkedin.com/
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
2. John Adams / @netik
• Early Twitter employee
• Operations + Security Team
• Keynote Speaker: O’Reilly Velocity 2009, 2010
• O’Reilly Web 2.0 Speaker (2008, 2010)
• Previous companies: Inktomi, Apple, c|net
3. Operations
• Support the site and the developers
• Make it performant
• Capacity Planning (metrics-driven)
• Configuration Management
• Improve existing architecture
4. 2,940 3,085
Tweets per Second Tweets per Second
Japan Scores! Lakers Win!
6. Where’s the cloud?
• Amazon S3 for Avatar, Background, and Asset
Storage. Few if any EC2 instances (test only.)
• Direct S3 is expensive
• Cost for data in, data out, data stored
• Cloudfront is cheaper than direct S3
• Cloudfront to S3 origins = free
• Use International CDNs on top of Cloudfront
7. CDNs and Versioning
• Hard to purge quickly from CDNs
• DMCA, image takedowns, emergency changes
• CDN images, javascript, and other static files
get long-lived expire times from origin
• Version entire static asset store on deploys
• /static/NNNNNN/directory/file...
9. Nothing works the
first time.
• Scale site using best available technologies
• Plan to build everything more than once.
• Most solutions work to a certain level of scale,
and then you must re-evaluate to grow.
• This is a continual process.
10. UNIX friends fail at scale
• Cron
• Add NTP, and many machines executing the
same thing cause “micro” outages across the site.
• Syslog
• Truncation, data loss, aggregation issues
• RRD
• Data rounding over time
16. Sysadmin 2.0 (Devops)
• Don’t be a just a sysadmin anymore.
• Think of Systems management as a
programming task (puppet, chef, cfengine...)
• No more silos, or lobbing things over the wall
• We’re all on the same side. Work Together!
17. Monitoring
• Twitter graphs and reports critical metrics in
as near to real time as possible
• If you build tools against our API, you should
too.
• Use this data to inform the public
• dev.twitter.com - API availability
• status.twitter.com
18. Data Analysis
• Instrumenting the world pays off.
• “Data analysis, visualization, and other
techniques for seeing patterns in data are
going to be an increasingly valuable skill set.
Employers take notice!”
“Web Squared: Web 2.0 Five Years On”, Tim O’Reilly, Web 2.0 Summit, 2009
20. Forecasting
Curve-fitting for capacity planning
(R, fityk, Mathematica, CurveFit)
unsigned int (32 bit)
Twitpocolypse
status_id
signed int (32 bit)
Twitpocolypse
r2=0.99
21. Loony (tool)
• Accesses central machine database (MySQL)
• Python, Django, Paraminko SSH
• Ties into LDAP
• Filter and list machines, find asset data
• On demand changes across machines
22. Murder
• Bittorrent based replication for deploys
(Python w/libtorrent)
• ~30-60 seconds to update >1k machines
• Uses our machine database to find destination
hosts
• Legal P2P
23. Configuration Management
• Start automated configuration management
EARLY in your company.
• Don’t wait until it’s too late.
• Twitter started within the first few months.
24. Virtualization
• A great start, but...
• Images are not a stable way to replicate
configuration
• Changes made to deployed images cannot be
easily detected and repaired
• Typically the only solution is to redeploy
25. Puppet
• Puppet + SVN
• Hundreds of modules
• Runs constantly
• Post-Commit hooks for config+sanity chceks
• No one logs into machines
• Centralized Change
28. Logging
• Syslog doesn’t work at high traffic rates
• No redundancy, no ability to recover from
daemon failure
• Moving large files around is painful
• Solution:
• Scribe
29. Scribe
• Fault-tolerant data transmission
• Simple data model, easy to extend
• Log locally, then scribe to aggregation nodes
• Slow-start recovery when aggregator goes down
• Twitter patches
• LZO compression and direct HDFS
30. Hadoop for Ops
• Once the data’s scribed to HDFS you can:
• Aggregate reports across thousands of
servers
• Produce application level metrics
• Use map-reduce to gain insight into your
systems.
31. Analyze
• Turn data into information
• Where is the code base going?
• Are things worse than they were?
• Understand the impact of the last software
deploy
• Run check scripts during and after deploys
• Capacity Planning, not Fire Fighting!
32. Dashboard
• “Criticals” view
• Smokeping/MRTG
• Google Analytics
• Not just for HTTP
200s/SEO
• XML Feeds from
managed services
33. Whale Watcher
• Simple shell script, Huge Win
• Whale = HTTP 503 (timeout)
• Robot = HTTP 500 (error)
• Examines last 60 seconds of aggregated
daemon / www logs
• “Whales per Second” > Wthreshold
• Thar be whales! Call in ops.
34. Deploy Watcher
Sample window: 300.0 seconds
First start time:
Mon Apr 5 15:30:00 2010 (Mon Apr 5 08:30:00 PDT 2010)
Second start time:
Tue Apr 6 02:09:40 2010 (Mon Apr 5 19:09:40 PDT 2010)
PRODUCTION APACHE: ALL OK
PRODUCTION OTHER: ALL OK
WEB049 CANARY APACHE: ALL OK
WEB049 CANARY BACKEND SERVICES: ALL OK
DAEMON031 CANARY BACKEND SERVICES: ALL OK
DAEMON031 CANARY OTHER: ALL OK
35. Deploys
• Block deploys if site in error state
• Graph time-of-deploy along side server CPU
and Latency
• Display time-of-last-deploy on dashboard
• Communicate deploys in Campfire to teams
^^ last deploy times ^^
36. Feature “Decider”
• Controls to enable and disable
computationally or I/O heavy features
• Our “Emergency Stop” button, and feature
release control. All features go through it.
• Changes logged and reported to all teams
• Hundreds of knobs we can turn.
• Not boolean, expressed as % of userbase
38. request flow
Load Balancers
Apache
Rails (Unicorn)
FlockDB Kestrel Memcached
MySQL Cassandra
Monitoring Daemons Mail Servers
39. Many limiting factors in the request pipeline
Apache Rails
Worker Model (unicorn)
MaxClients
2:1 oversubscribed
TCP Listen queue depth
to cores
Memcached
# connections
MySQL
Varnish (search) # db connections
# threads
40. Unicorn Rails Server
• Connection push to socket polling model
• Deploys without Downtime
• Less memory and 30% less CPU
• Shift from ProxyPass to Proxy Balancer
• mod_proxy_balancer lies about usage
• Race condition in counters patched
41. Rails
• Front-end (Scala/Java back-end)
• Not to blame for our issues. Analysis found:
• Caching + Cache invalidation problems
• Bad queries generated by ActiveRecord,
resulting in slow queries against the db
• Garbage Collection issues (20-25%)
• Replication Lag
42. memcached
• Network Memory Bus isn’t infinite
• Evictions make the cache unreliable for
important configuration data
(loss of darkmode flags, for example)
• Segmented into pools for better performance
• Examine slab allocation and watch for high
use/eviction rates on individual slabs using
peep. Adjust slab factors and size accordingly.
43. Decomposition
• SOA: Service Orientated Architecture
• Take application and decompose into services
• Admin the services as separate units
• Decouple the services from each other
44. Asynchronous Requests
• Executing work during the web request is
expensive
• The request pipeline should not be used to
handle 3rd party communications or
back-end work.
• Move work to queues
• Run daemons against queues
45. Thrift
• Cross-language services framework
• Originally developed at Facebook
• Now an Apache project
• Seamless operation between C++, Java,
Python, PHP, Ruby, Erlang, Perl, Haskell, C#,
Cocoa, Smalltalk, OCaml (phew!)
46. Kestrel
• Works like memcache (same protocol)
• SET = enqueue | GET = dequeue
• No strict ordering of jobs
• No shared state between servers
• Written in Scala. Open Source.
47. Daemons
• Many different types at Twitter.
• # of daemons have to match the workload
• Early Kestrel would crash if queues filled
• “Seppaku” patch
• Kill daemons after n requests
• Long-running leaky daemons = low memory
48. Flock DB Flock DB
• Gizzard sharding
framework Gizzard
• Billions of edges
• MySQL backend
Mysql Mysql Mysql
• Open Source
• http://github.com/twitter/gizzard
49. Disk is the new Tape.
• Social Networking application profile has
many O(ny) operations.
• Page requests have to happen in < 500mS or
users start to notice. Goal: 250-300mS
• Web 2.0 isn’t possible without lots of RAM
• What to do?
50. Caching
• We’re “real time”, but still lots of caching
opportunity
• Most caching strategies rely on long TTLs (>60 s)
• Separate memcache pools for different data types
to prevent eviction
• Optimize Ruby Gem to libmemcached + FNV
Hash instead of Ruby + MD5
• Twitter largest contributor to libmemcached
51. Caching
• “Cache Everything!” not the best policy, as
• Invalidating caches at the right time is
difficult.
• Cold Cache problem; What happens after
power or system failure?
• Use cache to augment db, not to replace
52. MySQL
• We have many MySQL servers
• Increasingly used more and more as key/value
store
• Many instances spread out through the
Gizzard sharding framework
53. MySQL Challenges
• Replication Delay
• Single threaded replication = pain.
• Social Networking not good for RDBMS
• N x N relationships and social graph / tree
traversal - we have FlockDB for that
• Disk issues
• FS Choice, noatime, scheduling algorithm
54. Database Replication
• Major issues around users and statuses tables
• Multiple functional masters (FRP, FWP)
• Make sure your code reads and writes to the
write DBs. Reading from master = slow death
• Monitor the DB. Find slow / poorly designed
queries
• Kill long running queries before they kill you
(mkill)
55. Key Points
• Use cloud where necessary
• Instrument everything.
• Use metrics to make decisions, not intuition.
• Decompose your app and remove
dependencies between services
• Process asynchronously when possible