This document discusses Cloudflare's global network and the process of building and managing over 80 points of presence (PoPs). It outlines Cloudflare's globally distributed network and anycast routing system, and describes the benefits this architecture provides, including performance, resiliency, and attack mitigation. It then details the strategic planning, challenges, and solutions involved in continuously deploying new PoPs around the world at a rate of one per week.
3. Protect and accelerate any website online
● 4+ million zones/domains
● 43+ billion DNS queries/day
● How?
○ Orange cloud
○ Globally distributed network in 80+ locations, still growing fast!
○ Anycast routing
4. Benefit of orange cloud
● Direct visitors to the nearest entry point
○ Fast!
■ Fewer hops
■ Reduced latency
■ Improved performance
● Save bandwidth!
○ Fewer requests to origin
■ Typically 50% of the resources on any given web page are cacheable
○ Mitigate malicious visitors and DDoS attacks
■ Stop them before they get to the origin web server
● Resiliency
○ 80+ locations!
8. Strategic Planning
● Agreement/Negotiation
● Location
○ Peering Exchanges
○ Cost
○ Support
● Size
○ Traffic analysis
■ Number of Racks
■ Equipment types
■ Transits/Peering Exchanges
● How many?
● How big are the pipes?
9. Challenges
● Installation
○ Regulation
■ Import policy
○ Transits
■ Different carriers have different setup/policies
○ Language barriers
● Human factors
○ Configuration errors!
■ Anycast
● Traffic turnup
○ How to ensure it causes no impact
■ No outages please!
10. Solutions
● Out of band network is a must!
○ Acting as last resort
○ Upgrades/downgrades
○ Maintenance
● Configuration template
○ Auto configuration
■ Anycast!
○ Peer review
● Global Network Engineering
○ Round-the-clock deployment
■ Fewer bottlenecks
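The configuration-template idea above can be sketched as follows. This is a minimal illustration, not Cloudflare's tooling: the template text is vendor-neutral and hypothetical, and the hostname, AS number, and prefix are documentation-only placeholder values. The point it shows is that rendering from one shared template fails loudly when a value is missing, instead of producing a partial configuration.

```python
from string import Template

# Hypothetical, vendor-neutral template. Real router syntax depends on the platform.
POP_TEMPLATE = Template(
    "hostname $hostname\n"
    "local-as $asn\n"
    "announce $anycast_prefix\n"
)


def render_pop_config(values: dict) -> str:
    """Render a PoP config from the shared template.

    Template.substitute() raises KeyError when a placeholder has no value,
    so an incomplete set of inputs never yields a half-filled config.
    """
    return POP_TEMPLATE.substitute(values)


# Placeholder values (documentation AS number and prefix), assumed for illustration.
config = render_pop_config(
    {"hostname": "edge01.example", "asn": 64500, "anycast_prefix": "192.0.2.0/24"}
)
print(config)
```

The rendered text would still go through peer review before being applied, as the slide notes.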
11. Testing with providers
● Circuit testing
○ Point-to-point extended ping test
■ Test all physical ports
○ Failover testing
■ Redundancy
● Make sure failover does not create a blackhole!
● Use a test prefix
○ Global versus domestic
■ RIPE Atlas measurement
■ Public route servers
○ Correct BGP configuration
■ It does what it is supposed to do
12. Traffic Turnup
● Do not announce all prefixes in one go!
○ Start with a few prefixes
○ Check the routing to these few prefixes
■ Global traffic analysis
● No big drop in traffic at other locations
● Traffic comes from the right countries
○ Monitor for 24 hours
■ Confirm that no anomalies are observed
● At the new location
● Globally
○ Announce all prefixes
■ In batches
■ Repeat the same steps above!
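The staged turnup above can be sketched as a loop over batches. This is an illustrative sketch under stated assumptions: `announce` and `healthy` are hypothetical callbacks standing in for the real BGP announcement and for the 24-hour traffic checks, the batch sizes are arbitrary, and stopping on the first unhealthy check is one possible policy that the slides do not specify.

```python
from typing import Callable, List


def batches(prefixes: List[str], first: int, size: int) -> List[List[str]]:
    """Split prefixes into a small initial batch followed by fixed-size batches."""
    result = [prefixes[:first]]
    for i in range(first, len(prefixes), size):
        result.append(prefixes[i:i + size])
    return [b for b in result if b]


def staged_turnup(
    prefixes: List[str],
    announce: Callable[[str], None],
    healthy: Callable[[List[str]], bool],
    first: int = 2,
    size: int = 3,
) -> List[str]:
    """Announce prefixes batch by batch, checking health after each batch.

    Returns the prefixes announced so far. Stops before the next batch as
    soon as the health check fails (an assumed policy).
    """
    announced: List[str] = []
    for batch in batches(prefixes, first, size):
        for prefix in batch:
            announce(prefix)
        announced.extend(batch)
        if not healthy(announced):
            break
    return announced


# Hypothetical usage with placeholder prefix names.
example_prefixes = [f"prefix-{n}" for n in range(6)]
log: List[str] = []
done = staged_turnup(example_prefixes, log.append, lambda announced: True)
print(done)
```

In practice the `healthy` check would cover both the new location and global traffic, matching the two monitoring scopes listed on the slide.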
13. Traffic Turnup
● Get the providers to be involved
○ Especially if it is single-homed
○ Inform them of the schedule
■ Get them to understand what to expect
■ Troubleshoot and fix problems faster!
○ Their users might see problems sooner
16. Building a Resilient Network
● Stable hardware and software
● Automated configuration templates and peer review
● Solid monitoring system
● Network automation
● Global network engineering
17. Hardware and Software
● Proper evaluation and testing
○ Fits requirements
○ Bug-free
○ Scalable
● Global standardization
○ Same hardware models
○ Same software versions
● No mass software upgrade!
○ Small PoPs first
○ Deploy in batches
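The "small PoPs first, in batches" rule can be expressed as a simple ordering step. This is a sketch only: PoP size is assumed here to be a rack count, and the PoP names and wave size are hypothetical.

```python
from typing import Dict, List


def upgrade_waves(pop_sizes: Dict[str, int], wave_size: int) -> List[List[str]]:
    """Order PoPs from smallest to largest and group them into upgrade waves.

    Ties are broken by name so the plan is deterministic.
    """
    ordered = sorted(pop_sizes, key=lambda name: (pop_sizes[name], name))
    return [ordered[i:i + wave_size] for i in range(0, len(ordered), wave_size)]


# Hypothetical PoPs with assumed rack counts.
plan = upgrade_waves({"pop_a": 10, "pop_b": 2, "pop_c": 5}, wave_size=2)
print(plan)
```

Each wave would be upgraded and observed before the next one starts, so a software bug surfaces at a small site first.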
18. Solid Monitoring System
● Reduce unwanted alerts
○ Only receive relevant alerts
○ Silence PoPs/ports during maintenance
● Monitor the performance of transit providers
○ Detect packet loss on their backbone
○ Automatically provide related traceroutes
○ Take actions based on severity
■ Disabling the PoP automatically
■ Disabling traffic on the related transit provider automatically
■ Suggesting actions to take
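The severity-based actions above can be sketched as a small decision function. The thresholds, the transit names, and the rule that a PoP is disabled only when every transit is badly degraded are all assumptions made for illustration; the slides name the three actions but not the criteria.

```python
from enum import Enum
from typing import Dict, Tuple


class Action(Enum):
    NONE = "no action"
    SUGGEST = "suggest actions to an engineer"
    DISABLE_TRANSIT = "disable traffic on the affected transit provider"
    DISABLE_POP = "disable the PoP"


def decide(
    loss_by_transit: Dict[str, float],
    warn: float = 0.01,      # assumed threshold: 1% packet loss
    critical: float = 0.05,  # assumed threshold: 5% packet loss
) -> Tuple[Action, Dict[str, Action]]:
    """Map measured packet loss per transit provider to actions.

    Returns a PoP-level action and a per-transit action map.
    """
    per_transit: Dict[str, Action] = {}
    for name, loss in loss_by_transit.items():
        if loss >= critical:
            per_transit[name] = Action.DISABLE_TRANSIT
        elif loss >= warn:
            per_transit[name] = Action.SUGGEST
        else:
            per_transit[name] = Action.NONE
    # Assumed rule: with no healthy transit left, take the whole PoP out.
    if per_transit and all(a is Action.DISABLE_TRANSIT for a in per_transit.values()):
        return Action.DISABLE_POP, per_transit
    return Action.NONE, per_transit


pop_action, transit_actions = decide({"transit_a": 0.10, "transit_b": 0.0})
print(pop_action, transit_actions)
```

With anycast routing, traffic for a disabled PoP is served by other locations, which is what makes automatic disabling a reasonable response.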
26. Global Network Engineering
● Follow the sun approach
○ San Francisco -> Singapore -> London -> San Francisco
● Covering everything
○ Technical operations
○ Network engineering
○ Network expansion projects
○ New PoPs deployment
○ Peering
● Very fast response to network issues and escalation
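The follow-the-sun rotation can be sketched as a lookup from UTC hour to the office on shift. The office order comes from the slide; the three equal 8-hour shifts and their start hours are assumptions chosen so that each office covers roughly its local working day.

```python
from typing import List, Tuple

# (shift start hour in UTC, office). Start hours are assumed, not from the source.
SHIFTS: List[Tuple[int, str]] = [
    (0, "Singapore"),
    (8, "London"),
    (16, "San Francisco"),
]


def on_shift(utc_hour: int) -> str:
    """Return the office that owns the network at the given UTC hour (0-23)."""
    if not 0 <= utc_hour < 24:
        raise ValueError("utc_hour must be in the range 0-23")
    office = SHIFTS[0][1]
    for start, name in SHIFTS:
        if utc_hour >= start:
            office = name
    return office


print([on_shift(h) for h in (3, 8, 23)])
```

Because handover always goes to a team in its daytime, an issue raised at any hour reaches engineers who are already at work.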