TV and video on the
Internet
How we created a CDN to host
video clips and live broadcasts.
Presentation plan
•Our motivation
•A little bit about what we wanted to achieve
•How did we do it?
– the project and objectives
– edge and origin nodes
– redirector
– traffic modeling (sounds smart, doesn’t it? :) )
– replication and file management
– statistics
– live-streaming
•What already works
•… and what doesn’t, and what the plans are
Our motivation
•We’ve created a free CDN before - videosteam.pl
– which made us aware that creating such systems is not so
simple :)
•We were given a chance to create a large video portal - TiVi.pl
•We simply wanted to create something interesting
What we wanted to achieve
•Distribution network (edge) and storage (origin) based on ordinary
PCs - if possible - keeping prices low for customers
•Effective division of traffic, matching routes to the customer's
network - as much as possible
+ use of multiple datacenters
•Data replication instead of backups
+ automatic shutdown of broken servers and replacing them with
new ones
•Supporting not only static files but also live broadcasts
•We wanted the service to comply with standards (e.g.
HTTP, WebDAV, RTMP …), making it easier to use
Why ordinary PCs? http://www.manageability.org/blog/stuff/economics-google-hardware-infrastructure
Assumptions
•To ensure redundancy wherever possible
•To use proven software and adapt it where needed
– Varnish, nginx, Apache, MogileFS
•Write only the missing components
– the more code, the more errors :)
– redirector, file management (WebDAV), statistics, billing
– Java, Python + WSGI; key elements rewritten in C only when
necessary
The project
Secure archive: the origin servers are organized into clusters
(which may be in different datacenters), each equipped with a
replicated file system. The clusters are independent. The system
stores a minimum of 2-3 copies of each uploaded file.
Edge nodes located in different networks (TPSA, PLIX, WIX…) serve
the traffic going out to customers.
The redirector selects the best node according to load and the
customer's network location.
Edge and origin nodes
•We decided to separate the node roles: there can be fewer storage
(origin) nodes, because traffic from/to the end customer reaches
only the edge nodes, which are simpler and can be more numerous
•Edge nodes act as proxies, using modified Varnish and nginx to
retrieve data from the origin nodes
– we needed software which redirects the user to a working,
non-overloaded node
•On the origin nodes we use nginx and MogileFS, which replicates
data and automatically restores missing copies, doing a lot of
good work for us
– we needed software to manage the files for customers
•Initially, edge and origin nodes may be the same machine
Redirector
•Accepts all read requests (both files and live)
– uses HTTP redirection: it updates faster than DNS and is simpler
than BGP; live broadcasts are handled separately in the player
•Knows the status of every edge node: state, load, bandwidth limit,
bandwidth usage, system load and placement (datacenter / ASN /
network)
•Has information about customers' networks and their "distance"
from each DC (ping, hops, number of ASNs); it knows which network
a given customer comes from by IP address
– on this basis it can choose the most favorable server for the
customer and redirect their request there
•Runs on a minimum of 2 nodes (+ a hardware load balancer dividing
traffic between redirectors), relies heavily on caching and is
written in Python + WSGI; we have achieved approx. 2000 req/s per
server
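The HTTP redirection step can be sketched as a tiny WSGI application. This is only an illustration under stated assumptions, not the code from the talk: the `EDGE_NODES` table, the `pick_node` helper and the host names are hypothetical, and a real redirector would choose the node from live load and network data.

```python
# Hypothetical edge nodes: host name -> whether the node is currently healthy.
EDGE_NODES = {
    "edge1.example.net": True,
    "edge2.example.net": False,
}


def pick_node(nodes):
    """Return the first healthy node (placeholder for the real selection logic)."""
    for host, healthy in nodes.items():
        if healthy:
            return host
    return None


def redirector_app(environ, start_response):
    """WSGI app: answer every read request with an HTTP 302 to an edge node."""
    host = pick_node(EDGE_NODES)
    if host is None:
        start_response("503 Service Unavailable",
                       [("Content-Type", "text/plain")])
        return [b"no edge node available\n"]
    # Keep the requested path so the edge node serves the same file.
    path = environ.get("PATH_INFO", "/")
    location = "http://" + host + path
    start_response("302 Found",
                   [("Location", location), ("Content-Length", "0")])
    return []
```

For local experiments, an app like this can be served with `wsgiref.simple_server.make_server` from the standard library.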
Traffic modelling
• The redirector’s main task
• For each request:
– it takes the customer's address and checks which network the
customer is from
– it checks the weight/distance of that network from each DC and
selects the best one; the weights are updated every 5 minutes
by separate applications running on the nodes, and an
additional weight allows manual traffic modeling according to
network policies
– we take into account hops and the route / number of ASNs along
the way (only the more significant ones); we initially
measured distances for individual servers, but the distance
from the DC is enough
– it selects the group of servers that support the particular
request type (livestreaming, pseudo-streaming, static
files/buffered video)
– it selects the least loaded server from the group (random with
weights), returns it to the customer and caches the choice
for less than a minute
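The per-request steps above can be sketched roughly as follows. All data here is made up for illustration (networks, datacenter names, loads, the 30-second cache lifetime), and the weighting formula is an assumption; the talk does not describe the exact formula used.

```python
import ipaddress
import random
import time

# Hypothetical distances: customer network -> weight per datacenter
# (lower means closer). In the talk these are refreshed every 5 minutes.
NETWORK_DISTANCES = {
    ipaddress.ip_network("10.0.0.0/8"): {"dc-a": 1, "dc-b": 5},
    ipaddress.ip_network("192.168.0.0/16"): {"dc-a": 4, "dc-b": 2},
}

# Hypothetical servers: (name, datacenter, supported request types, load 0..1).
SERVERS = [
    ("edge-a1", "dc-a", {"static", "live"}, 0.20),
    ("edge-a2", "dc-a", {"static"}, 0.90),
    ("edge-b1", "dc-b", {"static", "live"}, 0.50),
]

CACHE_TTL = 30  # seconds, i.e. "less than a minute" (assumed value)
_cache = {}


def find_network(client_ip):
    """Return the known network containing client_ip, or None."""
    addr = ipaddress.ip_address(client_ip)
    for net in NETWORK_DISTANCES:
        if addr in net:
            return net
    return None


def choose_server(client_ip, kind, rng=random, now=time.time):
    """Pick a server: nearest DC, matching group, weighted by free capacity."""
    net = find_network(client_ip)
    key = (net, kind)
    hit = _cache.get(key)
    if hit and now() - hit[1] < CACHE_TTL:
        return hit[0]
    best_dc = None  # unknown network: consider every datacenter
    if net is not None:
        distances = NETWORK_DISTANCES[net]
        best_dc = min(distances, key=distances.get)
    group = [s for s in SERVERS
             if kind in s[2] and (best_dc is None or s[1] == best_dc)]
    if not group:
        return None
    # Weighted random choice: less loaded servers are picked more often.
    weights = [max(1.0 - s[3], 0.01) for s in group]
    server = rng.choices(group, weights=weights, k=1)[0][0]
    _cache[key] = (server, now())
    return server
```

A weighted random pick, rather than always taking the single least loaded server, avoids sending every request to the same node between load updates.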
Replication and management
• Replication is provided by MogileFS: each file must exist in 2-3
replicas (depending on the replication class); if a node fails, its
files are replicated to other servers
• File management: software written in Python, the so-called Bridge,
provides a WebDAV interface. In the future, we would like to add
support for the S3 API. The Bridge mediates between the customer
and the MogileFS Tracker and MogileFS HTTP interface (nginx in our
case)
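To illustrate what "replicated to other servers" means when a node fails, here is a small sketch that plans which files need new copies. It is not MogileFS code or its API; the function name, the host names and the per-file copy counts are all hypothetical.

```python
def plan_rereplication(file_locations, min_copies, alive_hosts):
    """Return {file: [hosts]} listing where new copies should be created.

    file_locations: file name -> hosts that held a copy before the failure
    min_copies:     file name -> required number of replicas (2 if missing)
    alive_hosts:    set of hosts that are currently up
    """
    plan = {}
    for name, hosts in file_locations.items():
        live = [h for h in hosts if h in alive_hosts]
        missing = min_copies.get(name, 2) - len(live)
        if missing <= 0:
            continue  # enough replicas survived
        # Candidate targets: alive hosts that do not hold the file yet.
        spare = sorted(h for h in alive_hosts if h not in live)
        plan[name] = spare[:missing]
    return plan


# Example: host s1 has failed.
locations = {"a.flv": ["s1", "s2"], "b.flv": ["s1", "s2", "s3"]}
required = {"a.flv": 2, "b.flv": 3}
plan = plan_rereplication(locations, required, {"s2", "s3", "s4"})
```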
Replication and management
• The Bridge runs on two servers (+ hardware load balancer) and
a MySQL database (master + several slaves)
• Alongside the redirector, the Bridge is a key element; we test it
automatically with scripts that perform basic operations through
curl immediately after each deployment (through SVN)
• The Bridge also protects files: it tells edge nodes whether a
particular file is available (tokens, expired customer accounts,
in future also access rights to files)
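The talk does not say how the tokens are built. One common approach, shown here purely as an assumption, is an HMAC-signed token with an expiry time. The secret, the token layout and the function names are hypothetical.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical secret shared by the Bridge and the site


def make_token(path, expires, secret=SECRET):
    """Sign a path together with an expiry timestamp: 'expires:hexdigest'."""
    msg = f"{path}:{expires}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    return f"{expires}:{digest}"


def is_allowed(path, token, account_active, now=None, secret=SECRET):
    """True if the account is active and the token is valid and not expired."""
    if not account_active:
        return False
    try:
        expires_str, digest = token.split(":", 1)
        expires = int(expires_str)
    except ValueError:
        return False  # malformed token
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = make_token(path, expires, secret).split(":", 1)[1]
    # Constant-time comparison to avoid leaking digest prefixes.
    return hmac.compare_digest(digest.encode(), expected.encode())
```

An edge node would ask for a check like `is_allowed(path, token, account_active)` before serving a file.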
Statistics
•There are really a lot of logs (for the time being approx. 50 req/s
per server, and everything goes to the access logs), so we process
them with a simple map/reduce
•We collect statistics from the edge servers and the Bridge: usage
per customer (transfer, storage, hits) and usage per stream (number
of broadcasts, number of customers)
•We had to write several small programs:
– logalyzer pre-processes the statistics (on log rotation) and
uploads them to SimpleStorage in CSV format,
– logcollector downloads the log packs and loads them into the
database in bulk,
– statscalc aggregates the data into subtotals; non-aggregated data
is removed over time.
•As a result, a user sees statistics on a page that is updated
hourly
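The "simple map/reduce" over access logs can be illustrated with a few lines of Python. The CSV layout (customer, hour, bytes sent) and the sample rows are invented for this sketch; the real format used by logalyzer is not given in the talk.

```python
import csv
import io
from collections import defaultdict

# Hypothetical pre-processed log rows: customer,hour,bytes_sent
SAMPLE = """acme,2000-01-01T10,1000
acme,2000-01-01T10,2500
acme,2000-01-01T11,700
tivi,2000-01-01T10,4000
"""


def map_rows(csv_text):
    """Map step: emit ((customer, hour), (hits, bytes)) for every row."""
    for customer, hour, sent in csv.reader(io.StringIO(csv_text)):
        yield (customer, hour), (1, int(sent))


def reduce_rows(pairs):
    """Reduce step: sum hits and bytes for each (customer, hour) key."""
    totals = defaultdict(lambda: [0, 0])
    for key, (hits, sent) in pairs:
        totals[key][0] += hits
        totals[key][1] += sent
    return {key: tuple(value) for key, value in totals.items()}


hourly = reduce_rows(map_rows(SAMPLE))
```

Hourly subtotals like these are the kind of aggregate a step such as statscalc could keep once the raw rows are removed.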
Livestreaming
•The system is running in production
•Compatible with RTMP/RTMPT/RTMPE: an existing Java server with
our changes (statistics) is installed on all the nodes
•Traffic is divided by a proxy at the video application level
(Java code sends the video from the server the customer is
broadcasting from to the target servers, so we can dynamically
choose servers for a given customer)
•We've added broadcaster authorization (tokens - supported
by the Bridge)
Livestreaming
•… as well as viewer authorization (tokens can be handled
individually by the web service of a particular site, e.g.
VoD)
•The redirector also manages redirection to streams; RTMP doesn’t
support redirects, so support was necessary in the player
•It would be great to also deliver streams over HTTP: buffering,
among other things, would then work and there would be no
problems with firewalls
– soon, we would like to add support for smooth-streaming
and streaming on iPhones (mpeg-ts); we can do it by
transcoding on the fly (we’ve already conducted trials) or
with dedicated servers such as Wowza or IIS, although a
heterogeneous environment is a disadvantage
Plans for development?
• file authorization
• optimizing and improving the traffic modeling algorithm
• streaming through HTTP and smooth-streaming
• streaming for iPhones
• introducing more nodes – e.g. to PL-IX
• expansion and acceleration of statistics
• …
Contact
Piotr Karwatka
pkarwatka@divante.pl
THANK YOU FOR YOUR ATTENTION

Tv and video on the Internet

  • 1. TV and video on the Internet How we created a CDN to host video clips and live broadcasts.
  • 2. Presentation plan •Our motivation •A little bit about what we wanted to achieve •How did we do it? – the project and objectives – edge and origin nodes – redirector – traffic modeling (sounds smart, doesn’t it?  ) – replication and file management – statistics – live-streaming •What does already work, •… and what doesn’t and what are the plans
  • 3. Our motivation •We’ve created a free CDN before - videosteam.pl – which made us aware that creating such systems is not so simple  •We were given a chance to create a large video portal - TiVi.pl •We simply wanted to create something interesting
  • 4. What we wanted to achieve •Distribution network (edge) and storage (origin) based on ordinary PCs - if possible - keeping prices low for customers •Effective division of traffic, matching routes to the customer's network - as much as possible + use of multiple datacenters •Data replication instead of backups + automatic shutdown of broken servers and replacing them with new ones •Supporting not only static files but also live broadcasts •We wanted the service to be consistent with the standards (e.g. HTTP, WebDAV, RTMP …) – making it easier to use Why ordinary PCs? http://www.manageability.org/blog/stuff/economics-google-hardware-infrastructure
  • 5. Assumptions •To ensure redundancy wherever possible •To use proven software and possibly convert/adjust it – Varnish, nginx, Apache, MogileFS •Write only the missing components – the more code, the more errors  – redirector, file management (WebDAV), statistics, billings – Java, Python + WSGI – only when necessary, rewrite the key elements in C
  • 6. The project Secure Archive of the Origin servers is composed of clusters (which may be in different DataCenters) equipped with a replicated system of files. The clusters are independent. The system stores a minimum of 2-3 copies of each uploaded file. Edge nodes located in different networks (TPSA, PLIX, WIX…) service the traffic outgoing to customers Redirector selects the best node position according to load and network location of a customer
  • 7. Edge and origin nodes •We decided to separate the nodes – there may be less storage nodes (origin) - traffic from/to the end customer reaches only the edge nodes that are simpler and there can be more of them •Edge nodes act as a proxy using the processed Varnish and nginx retrieving data from the origin nodes – we needed software which redirects the user to an operating and not overloaded node •On the origin nodes we use nginx and MogileFS, which replicates data and automatically renews copies - doing a lot of good work, – we needed software to manage the files for customers •Initially, edge and origin nodes may be the same machine
  • 8. Redirector •Accepts all read requests (both files and live) – HTTP redirection – updates faster than DNS, simpler than BGP – live broadcasts are handled separately in the player •Knows the status of the edge nodes: state, load, bandwidth limit, bandwidth usage, system load and placement (datacenter / ASN / network) •Knows the customers' networks and the "distance" between each of them and the DC (ping, hops, number of ASNs) - it identifies the customer's network from the IP address – on this basis it can choose the most favorable server for the customer and redirect the request there •It runs on a minimum of 2 nodes (+ a hardware load balancer dividing traffic between redirectors), uses caching heavily and is written in Python + WSGI - we have achieved approx. 2000 req/s per server
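A minimal sketch of the HTTP redirection idea from this slide, written as a plain WSGI application. It is not the authors' code: the node table, the host names, the 30-second cache TTL and the "least loaded node wins" rule are all assumptions made for illustration.

```python
import time

# Hypothetical edge node table; the real redirector would fill this from
# live node status (state, load, bandwidth usage).
EDGE_NODES = {
    "edge1.example.net": {"up": True, "load": 0.30},
    "edge2.example.net": {"up": True, "load": 0.75},
    "edge3.example.net": {"up": False, "load": 0.00},
}

CACHE_TTL = 30  # seconds; an assumed value
_cache = {}     # client IP -> (expiry timestamp, chosen node)


def pick_node(client_ip, now=None):
    """Return the least loaded working node, cached briefly per client."""
    now = time.time() if now is None else now
    hit = _cache.get(client_ip)
    if hit and hit[0] > now and EDGE_NODES[hit[1]]["up"]:
        return hit[1]
    alive = {name: n for name, n in EDGE_NODES.items() if n["up"]}
    if not alive:
        return None
    node = min(alive, key=lambda name: alive[name]["load"])
    _cache[client_ip] = (now + CACHE_TTL, node)
    return node


def application(environ, start_response):
    """WSGI entry point: answer every read request with a 302 redirect."""
    node = pick_node(environ.get("REMOTE_ADDR", ""))
    if node is None:
        start_response("503 Service Unavailable", [("Content-Length", "0")])
        return [b""]
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    location = "http://%s%s" % (node, path) + ("?" + query if query else "")
    start_response("302 Found", [("Location", location), ("Content-Length", "0")])
    return [b""]
```

For local experiments the app can be served with `wsgiref.simple_server.make_server("", 8000, application)` from the standard library; a production setup would sit behind the load balancer mentioned on the slide.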
  • 9. Traffic modelling • The redirector's main task • For each request: – it takes the customer's address and checks which network the customer comes from – it checks the weight/distance of that network from each DC and selects the best one - the weights are updated every 5 minutes by separate applications running on the nodes, and an additional weight allows manual traffic shaping according to network policies – we take into account hops and the route / number of ASNs along the way (only the more significant ones); we initially measured distances between servers, but the distance from the DC is enough – it selects the group of servers that support the particular request type (livestreaming, pseudo-streaming, static files/buffered video) – it selects the least loaded server from the group (random with weights) and returns it to the customer + caches the result for less than a minute
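The per-request steps on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not the production algorithm: the network prefixes, datacenter weights, server list and the "1 - load" weighting are invented for the example, and lower weight is assumed to mean "closer".

```python
import ipaddress
import random

# Hypothetical data: customer networks, per-datacenter weights (lower is
# closer) and edge servers with the request kinds they support.
NETWORKS = {
    "net-a": ipaddress.ip_network("192.0.2.0/24"),
    "net-b": ipaddress.ip_network("198.51.100.0/24"),
}
DC_WEIGHT = {
    "net-a": {"dc1": 1.0, "dc2": 5.0},
    "net-b": {"dc1": 4.0, "dc2": 2.0},
}
SERVERS = [
    {"host": "e1", "dc": "dc1", "kinds": {"static", "live"}, "load": 0.2},
    {"host": "e2", "dc": "dc1", "kinds": {"static"}, "load": 0.8},
    {"host": "e3", "dc": "dc2", "kinds": {"static", "live"}, "load": 0.1},
]


def network_of(ip):
    """Step 1: map the customer's IP address to a known network."""
    addr = ipaddress.ip_address(ip)
    for name, net in NETWORKS.items():
        if addr in net:
            return name
    return None


def choose_server(ip, kind, rng=random):
    """Steps 2-4: best DC, servers supporting `kind`, weighted random pick."""
    weights = DC_WEIGHT.get(network_of(ip))
    if weights is None:
        # Unknown network: treat every datacenter as equally distant.
        weights = {s["dc"]: 1.0 for s in SERVERS}
    # Try datacenters from closest to farthest.
    for dc in sorted(weights, key=weights.get):
        group = [s for s in SERVERS if s["dc"] == dc and kind in s["kinds"]]
        if group:
            # Less loaded servers get a proportionally higher chance.
            free = [max(1.0 - s["load"], 0.01) for s in group]
            return rng.choices(group, weights=free, k=1)[0]["host"]
    return None
```

Weighted random selection, rather than always taking the single least loaded server, spreads requests that arrive between two load updates instead of sending them all to one machine.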
  • 10. Replication and management • Replication is provided by MogileFS - each file must exist in 2-3 replicas (different replication classes) - if a node fails, its files are replicated to other servers • File management - software written in Python, the so-called Bridge, provides a WebDAV interface. In the future, we would like to add support for the S3 API. The Bridge mediates between the customer and the MogileFS Tracker and the MogileFS HTTP interface (nginx in our case)
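To make the "replicate to other servers when a node fails" rule concrete, here is a small sketch of how missing copies could be planned. It does not reproduce MogileFS internals; the replica map, host names and the `replication_plan` helper are hypothetical.

```python
def replication_plan(replicas, alive_hosts, min_copies=2):
    """Return {file key: [hosts to copy to]} for files below min_copies.

    `replicas` maps a file key to the set of hosts holding a copy
    (a hypothetical structure, not the MogileFS schema). Files with no
    surviving copy are skipped because there is nothing to copy from.
    """
    plan = {}
    for key, hosts in sorted(replicas.items()):
        live = hosts & alive_hosts
        missing = min_copies - len(live)
        if missing > 0 and live:
            candidates = sorted(alive_hosts - live)
            plan[key] = candidates[:missing]
    return plan


# Example: host s2 has failed and s4 is a fresh replacement node.
replicas = {"a": {"s1", "s2"}, "b": {"s2", "s3"}, "c": {"s1", "s3"}}
plan = replication_plan(replicas, alive_hosts={"s1", "s3", "s4"})
```

In this example files `a` and `b` each lost one copy and get a new target host, while `c` still has two live copies and is left alone.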
  • 11. Replication and management • The Bridge runs on two servers (+ a hardware load balancer) with a MySQL database (master + several slaves) • Alongside the redirector, the Bridge is a key element; we test it automatically with scripts that perform basic operations through curl immediately after each deployment (through SVN) • The Bridge also protects files – it tells edge nodes whether a particular file is available (tokens, expired customer accounts, in future also access rights to files)
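The slides do not say how the tokens are built. One common approach, shown here purely as an assumed example, is an HMAC-signed token that binds a file path to an expiry time; the secret, the token layout and both function names are hypothetical.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical secret shared by the Bridge and edge nodes


def make_token(path, expires, secret=SECRET):
    """Build '<expiry>-<hex signature>' for a path (assumed format)."""
    msg = ("%s:%d" % (path, expires)).encode("utf-8")
    sig = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    return "%d-%s" % (expires, sig)


def check_token(path, token, now=None, secret=SECRET):
    """Return True only for an unexpired token signed for this path."""
    now = time.time() if now is None else now
    try:
        expires_str, sig = token.split("-", 1)
        expires = int(expires_str)
    except ValueError:
        return False  # malformed token
    if expires < now:
        return False  # token has expired
    expected = make_token(path, expires, secret).split("-", 1)[1]
    # Constant-time comparison avoids leaking signature bytes via timing.
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
```

With a scheme like this an edge node could verify a token locally, but checks such as expired customer accounts would still need to ask the Bridge, as the slide describes.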
  • 13. Statistics •There are really a lot of logs (for the time being approx. 50 req/s per server - everything goes to access logs) - a simple map/reduce •We collect statistics from edge servers and the Bridge - usage per customer (transfer, storage, hits), stream usage (number of broadcasts, number of customers) •We had to write several small programs: – statistics are first processed by logalyzer (with rotating logs) and uploaded to SimpleStorage in CSV format, – logcollector downloads log packs and loads them into the database in bulk, – statscalc aggregates data into subtotals; non-aggregated data is removed over time. •As a result, a user sees changes in statistics on a page updated hourly
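The "simple map/reduce" over access logs can be illustrated with a short sketch that rolls CSV rows up into hourly subtotals per customer. The CSV column layout, the sample rows and the function names are assumptions; the actual logalyzer and statscalc formats are not given in the slides.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout: ISO timestamp, customer id, bytes sent.
SAMPLE = """\
2010-05-01T10:02:11,tivi,1000
2010-05-01T10:47:00,tivi,500
2010-05-01T10:59:59,other,200
2010-05-01T11:00:03,tivi,300
"""


def map_row(row):
    """Map: one log row -> ((hour, customer), (bytes, hits))."""
    ts, customer, nbytes = row
    return (ts[:13], customer), (int(nbytes), 1)


def reduce_rows(pairs):
    """Reduce: sum bytes and hits for each (hour, customer) key."""
    totals = defaultdict(lambda: [0, 0])
    for key, (nbytes, hits) in pairs:
        totals[key][0] += nbytes
        totals[key][1] += hits
    return {k: tuple(v) for k, v in totals.items()}


def aggregate(csv_text):
    """Run map and reduce over CSV text, skipping blank lines."""
    reader = csv.reader(io.StringIO(csv_text))
    return reduce_rows(map_row(r) for r in reader if r)


hourly = aggregate(SAMPLE)
```

Keying on the hour matches the hourly page update mentioned on the slide; once subtotals like these are stored, the raw rows can be dropped over time.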
  • 14. Livestreaming •The system is running in production •Compatible with RTMP/RTMPT/RTMPE – an existing Java server with our modifications (statistics), installed on all the nodes •Traffic division is done by a proxy at the level of the video application (Java code sends the video from the server the customer is broadcasting to on to the target servers – we can dynamically choose servers for a given customer) •We've added broadcaster authorization (tokens - supported by the Bridge)
  • 15. Livestreaming •… as well as watcher authorization (tokens can be handled individually by the webservice of a particular service, e.g. VoD) •The redirector manages redirection to streaming servers (support was necessary in the player) - RTMP doesn't support redirects •It would be great to carry streams over HTTP as well – buffering, among other things, would then work and there would be no problems with firewalls – soon, we would like to add support for smooth-streaming and streaming on iPhones (MPEG-TS) - we can do it by transcoding on the fly (we've already conducted trials) or with dedicated servers such as Wowza or IIS - the drawback is a heterogeneous environment
  • 17. Plans for development? • file authorization • optimizing and improving the traffic modeling algorithm • streaming through HTTP and smooth-streaming • streaming for iPhones • introducing more nodes – e.g. to PL-IX • expanding and speeding up statistics • …