SlideShare a Scribd company logo
Architecture of PBS.org
 DCPython - June 7, 2011
PBS is…

• PBS is a national federation of independently owned and
  operated public television stations and producers
   – Each with their own management and development resources


• 1500+ highly trafficked websites:
   –   http://www.pbs.org/
   –   http://www.pbs.org/nova/
   –   http://pbskids.org/
   –   http://pbskids.org/sesame/
   –   http://video.pbs.org/

• Enterprise services/APIs
PBS is not!



• Radio is easy… We do television!




• Or any of the other ~200 local stations.
What we do

• Technology leadership within public
  broadcasting community
• Distribution of national programming content
• Services to local stations
• Core application development. Yeah!!!
A few of our sites
History of PBS.org




      Early 1990’s: Hand rolled static html
       Late 1990’s: Hand crafted static html + CGI!
    Most of 2000’s: Zope/Plone CMS generated static html
          2008-10: Django generated static html
Launched Oct 2010: Django all the way
COVE API

• Contains the metadata for all PBS videos online
  including pointers to streaming video
• Needed to be:
   – Secure
   – Fast
   – Scalable
COVE API – Technology Stack

•   Amazon Elastic Cluster Computing (EC2)
•   Amazon Relational Database Service (RDS)
•   Linux
•   Python
•   Django
•   Piston for REST API
COVE API - Architecture
                         Internet

                    Elastic Load Balancer

 Auto Scale Array
         App Server 1      …        App Server N

                         HA Proxy

                                                      S3
           RDS Master               RDS Slave 11
                                     RDS Slave
                                      RDS Slave 1   Backups

App Sync Server
COVE API – Management Tools

• Amazon Web Service Console
• RightScale
• Splunk
COVE API – Interesting Stuff

• Easy to load test
  – Duplicate environment for several days
• Easy to scale
  – Autoscale array grows automatically
• Easy to upgrade
  – Each server built from vanilla base
COVE API – Lessons learned
• Use normalized data for administration and de-
  normalized data for API
COVE API – Lessons learned
• Piston is fine, but lacks flexibility without
  significant customization
   – TastyPie?
• JSON is probably good enough
• Don’t get fancy with your endpoints
• Stick to REST principles
• Don’t get fancy with your authentication
   – Use OAuth2 or simple token
PBS.org and Merlin API

• PBS.org
   – Slim, fast layer
   – Pulls data from Merlin API
   – Uses memcache extensively
   – Currently Django, but could be anything (Flask?)

• Merlin API
  – Aggregate content from distributed CMSes
  – Expose via standardized API
  – Power PBS.org and more
Merlin API – Technology stack

•   Python       • Amazon Web Services (“cloud”)
•   Django         – EC2
•   MySQL          – RDS - Relational Database Service
                      – ELB - Elastic Load Balancing
•   Piston
                          – Cloudfront CDN
•   Solr                      – S3Storage
•   Celery
•   RabbitMQ
Data flow




RSS Feed   Standardized
Ingestor       API
Merlin API architecture

   API Endpoint – Django Piston

          Search service             Indexing service
         Django-haystack                   Solr

     Data layer – MySQL (RDS)

Administration      Feed ingestion
Django admin           Celery
Merlin API server topology

          Internet



     Elastic Load Balancer


Autoscaling     App #N
                 App #N                Solr   Master
      array       App #N     Celery
                    App #n            Index   DB RDS


               S3 backups
Merlin API – Management Tools

• Amazon Web Service Console
• RightScale
• Splunk
API - Piston/Haystack/Solr
class WebObjectIndexHandler(BaseHandler):
    ...
    def get_queryset(self):
        ...
        return PistonSearchQuerySet().models(*models)

from haystack.query import SearchQuerySet
class PistonSearchQuerySet(SearchQuerySet):
    ...
    def __getitem__(self, k):
        ...
        return [IndexSerializer(i) for i in
super(PistonSearchQuerySet, self).__getitem__(k)]
Feed ingestor - Celery
from celery.decorators import task, periodic_task

@periodic_task(run_every=timedelta(seconds=300))
def update_webobject_states():
    ...
solr_visible = WebObject.children.filter(visible=True)
solr_visible = solr_visible.exclude(
flag__api_visible=True, available__isnull=True)
    ...
    updated = solr_visible.update(visible=False,
is_indexed = False)
    ...
signals.bulk_update.send('tasks.update_webobject_states')
Merlin API - Lessons learned

• Memcached was not necessary
• Denormalized search data via Solr index is much faster
  than querying database
• Asynchronous task delegation is awesome
• Celery prone to memory leaks
• App server array for easy horizontal scaling
   – Even if not autoscaling, increase min servers
• Never trust data you don’t control (validate!)
Resources

•   http://lucene.apache.org/solr/
•   http://haystacksearch.org/
•   http://celeryproject.org/
•   http://celeryproject.org/docs/django-celery/
•   http://aws.amazon.com/
PBS Developer Community

• Dedicated to making open.PBS the industry
  standard in open development communities.

                  http://open.pbs.org/
                  https://github.com/pbs

                  open@pbs.org
Questions?

  Drew Engelson
  drew@engelson.net
  http://tomatohater.com



  Edgar Roman
  emroman@pbs.org

More Related Content

What's hot

Local Testing and Deployment Best Practices for Serverless Applications - AWS...
Local Testing and Deployment Best Practices for Serverless Applications - AWS...Local Testing and Deployment Best Practices for Serverless Applications - AWS...
Local Testing and Deployment Best Practices for Serverless Applications - AWS...
Amazon Web Services
 
Designing Fault Tolerant Applications on AWS - Janakiram MSV
Designing Fault Tolerant Applications on AWS - Janakiram MSVDesigning Fault Tolerant Applications on AWS - Janakiram MSV
Designing Fault Tolerant Applications on AWS - Janakiram MSVAmazon Web Services
 
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
Amazon Web Services
 
Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017Amazon Web Services
 
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
Amazon Web Services
 
Amazon Elastic Beanstalk
Amazon Elastic BeanstalkAmazon Elastic Beanstalk
Amazon Elastic BeanstalkEberhard Wolff
 
Transforming Software Development
Transforming Software DevelopmentTransforming Software Development
Transforming Software Development
Amazon Web Services
 
AppScale Talk at SBonRails
AppScale Talk at SBonRailsAppScale Talk at SBonRails
AppScale Talk at SBonRails
Chris Bunch
 
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic BeanstalkDeploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
Amazon Web Services
 
Building Serverless Web Applications - May 2017 AWS Online Tech Talks
Building Serverless Web Applications  - May 2017 AWS Online Tech TalksBuilding Serverless Web Applications  - May 2017 AWS Online Tech Talks
Building Serverless Web Applications - May 2017 AWS Online Tech Talks
Amazon Web Services
 
AppScale @ LA.rb
AppScale @ LA.rbAppScale @ LA.rb
AppScale @ LA.rb
Chris Bunch
 
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
Amazon Web Services
 
Alfresco WCM For High Scalability
Alfresco WCM For High ScalabilityAlfresco WCM For High Scalability
Alfresco WCM For High Scalability
Alfresco Software
 
Active Cloud DB at CloudComp '10
Active Cloud DB at CloudComp '10Active Cloud DB at CloudComp '10
Active Cloud DB at CloudComp '10Chris Bunch
 
從劍宗到氣宗 - 談AWS ECS與Serverless最佳實踐
從劍宗到氣宗  - 談AWS ECS與Serverless最佳實踐從劍宗到氣宗  - 談AWS ECS與Serverless最佳實踐
從劍宗到氣宗 - 談AWS ECS與Serverless最佳實踐
Pahud Hsieh
 
AppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDBAppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDB
Chris Bunch
 
Appscale at CLOUDCOMP '09
Appscale at CLOUDCOMP '09Appscale at CLOUDCOMP '09
Appscale at CLOUDCOMP '09
Chris Bunch
 
Amazon ECS Deep Dive
Amazon ECS Deep DiveAmazon ECS Deep Dive
Amazon ECS Deep Dive
Amazon Web Services
 
Auto scaling with Ruby, AWS, Jenkins and Redis
Auto scaling with Ruby, AWS, Jenkins and RedisAuto scaling with Ruby, AWS, Jenkins and Redis
Auto scaling with Ruby, AWS, Jenkins and Redis
Yi Hsuan (Jeddie) Chuang
 
Managing Your Cloud Assets
Managing Your Cloud AssetsManaging Your Cloud Assets
Managing Your Cloud Assets
Amazon Web Services
 

What's hot (20)

Local Testing and Deployment Best Practices for Serverless Applications - AWS...
Local Testing and Deployment Best Practices for Serverless Applications - AWS...Local Testing and Deployment Best Practices for Serverless Applications - AWS...
Local Testing and Deployment Best Practices for Serverless Applications - AWS...
 
Designing Fault Tolerant Applications on AWS - Janakiram MSV
Designing Fault Tolerant Applications on AWS - Janakiram MSVDesigning Fault Tolerant Applications on AWS - Janakiram MSV
Designing Fault Tolerant Applications on AWS - Janakiram MSV
 
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
 
Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017
 
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
 
Amazon Elastic Beanstalk
Amazon Elastic BeanstalkAmazon Elastic Beanstalk
Amazon Elastic Beanstalk
 
Transforming Software Development
Transforming Software DevelopmentTransforming Software Development
Transforming Software Development
 
AppScale Talk at SBonRails
AppScale Talk at SBonRailsAppScale Talk at SBonRails
AppScale Talk at SBonRails
 
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic BeanstalkDeploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
 
Building Serverless Web Applications - May 2017 AWS Online Tech Talks
Building Serverless Web Applications  - May 2017 AWS Online Tech TalksBuilding Serverless Web Applications  - May 2017 AWS Online Tech Talks
Building Serverless Web Applications - May 2017 AWS Online Tech Talks
 
AppScale @ LA.rb
AppScale @ LA.rbAppScale @ LA.rb
AppScale @ LA.rb
 
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
 
Alfresco WCM For High Scalability
Alfresco WCM For High ScalabilityAlfresco WCM For High Scalability
Alfresco WCM For High Scalability
 
Active Cloud DB at CloudComp '10
Active Cloud DB at CloudComp '10Active Cloud DB at CloudComp '10
Active Cloud DB at CloudComp '10
 
從劍宗到氣宗 - 談AWS ECS與Serverless最佳實踐
從劍宗到氣宗  - 談AWS ECS與Serverless最佳實踐從劍宗到氣宗  - 談AWS ECS與Serverless最佳實踐
從劍宗到氣宗 - 談AWS ECS與Serverless最佳實踐
 
AppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDBAppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDB
 
Appscale at CLOUDCOMP '09
Appscale at CLOUDCOMP '09Appscale at CLOUDCOMP '09
Appscale at CLOUDCOMP '09
 
Amazon ECS Deep Dive
Amazon ECS Deep DiveAmazon ECS Deep Dive
Amazon ECS Deep Dive
 
Auto scaling with Ruby, AWS, Jenkins and Redis
Auto scaling with Ruby, AWS, Jenkins and RedisAuto scaling with Ruby, AWS, Jenkins and Redis
Auto scaling with Ruby, AWS, Jenkins and Redis
 
Managing Your Cloud Assets
Managing Your Cloud AssetsManaging Your Cloud Assets
Managing Your Cloud Assets
 

Similar to Architecture at PBS

DCPython: Architecture at PBS (Jun 7, 2011)
DCPython: Architecture at PBS (Jun 7, 2011)DCPython: Architecture at PBS (Jun 7, 2011)
DCPython: Architecture at PBS (Jun 7, 2011)
Drew Engelson
 
NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013
aspyker
 
A 60-mn tour of AWS compute (March 2016)
A 60-mn tour of AWS compute (March 2016)A 60-mn tour of AWS compute (March 2016)
A 60-mn tour of AWS compute (March 2016)
Julien SIMON
 
Agile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic BeanstalkAgile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic Beanstalk
Amazon Web Services
 
Amazon Web Services - Elastic Beanstalk
Amazon Web Services - Elastic BeanstalkAmazon Web Services - Elastic Beanstalk
Amazon Web Services - Elastic Beanstalk
Amazon Web Services
 
Journey Towards Scaling Your Application to 10 million users
Journey Towards Scaling Your Application to 10 million usersJourney Towards Scaling Your Application to 10 million users
Journey Towards Scaling Your Application to 10 million users
Amazon Web Services
 
DevOpsCon Cloud Workshop
DevOpsCon Cloud Workshop DevOpsCon Cloud Workshop
DevOpsCon Cloud Workshop
Sascha Möllering
 
Agile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic BeanstalkAgile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic Beanstalk
Amazon Web Services
 
DevOps on AWS
DevOps on AWSDevOps on AWS
DevOps on AWS
CloudHesive
 
Journey Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million UsersJourney Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million Users
Adrian Hornsby
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million Users
Amazon Web Services
 
04 integrate entityframework
04 integrate entityframework04 integrate entityframework
04 integrate entityframework
Erhwen Kuo
 
AWS and Serverless with Alexa
AWS and Serverless with AlexaAWS and Serverless with Alexa
AWS and Serverless with Alexa
Rory Preddy
 
Evolving IGN’s New APIs with Scala
 Evolving IGN’s New APIs with Scala Evolving IGN’s New APIs with Scala
Evolving IGN’s New APIs with Scala
Manish Pandit
 
AWS as platform for scalable applications
AWS as platform for scalable applicationsAWS as platform for scalable applications
AWS as platform for scalable applications
Roman Gomolko
 
What is Amazon Web Services & How to Start to deploy your apps ?
What is Amazon Web Services & How to Start to deploy your apps ?What is Amazon Web Services & How to Start to deploy your apps ?
What is Amazon Web Services & How to Start to deploy your apps ?
Sébastien ☁ Stormacq
 
Scaling up to Your First 10 Million Users
Scaling up to Your First 10 Million UsersScaling up to Your First 10 Million Users
Scaling up to Your First 10 Million Users
Amazon Web Services
 
Rails as iOS Application Backend
Rails as iOS Application BackendRails as iOS Application Backend
Rails as iOS Application Backend
maximeguilbot
 
Serverless Web Apps using API Gateway, Lambda and DynamoDB
Serverless Web Apps using API Gateway, Lambda and DynamoDBServerless Web Apps using API Gateway, Lambda and DynamoDB
Serverless Web Apps using API Gateway, Lambda and DynamoDB
Amazon Web Services
 

Similar to Architecture at PBS (20)

DCPython: Architecture at PBS (Jun 7, 2011)
DCPython: Architecture at PBS (Jun 7, 2011)DCPython: Architecture at PBS (Jun 7, 2011)
DCPython: Architecture at PBS (Jun 7, 2011)
 
NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013
 
A 60-mn tour of AWS compute (March 2016)
A 60-mn tour of AWS compute (March 2016)A 60-mn tour of AWS compute (March 2016)
A 60-mn tour of AWS compute (March 2016)
 
Agile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic BeanstalkAgile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic Beanstalk
 
Amazon Web Services - Elastic Beanstalk
Amazon Web Services - Elastic BeanstalkAmazon Web Services - Elastic Beanstalk
Amazon Web Services - Elastic Beanstalk
 
Journey Towards Scaling Your Application to 10 million users
Journey Towards Scaling Your Application to 10 million usersJourney Towards Scaling Your Application to 10 million users
Journey Towards Scaling Your Application to 10 million users
 
[Jun AWS 201] Technical Workshop
[Jun AWS 201] Technical Workshop[Jun AWS 201] Technical Workshop
[Jun AWS 201] Technical Workshop
 
DevOpsCon Cloud Workshop
DevOpsCon Cloud Workshop DevOpsCon Cloud Workshop
DevOpsCon Cloud Workshop
 
Agile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic BeanstalkAgile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic Beanstalk
 
DevOps on AWS
DevOps on AWSDevOps on AWS
DevOps on AWS
 
Journey Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million UsersJourney Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million Users
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million Users
 
04 integrate entityframework
04 integrate entityframework04 integrate entityframework
04 integrate entityframework
 
AWS and Serverless with Alexa
AWS and Serverless with AlexaAWS and Serverless with Alexa
AWS and Serverless with Alexa
 
Evolving IGN’s New APIs with Scala
 Evolving IGN’s New APIs with Scala Evolving IGN’s New APIs with Scala
Evolving IGN’s New APIs with Scala
 
AWS as platform for scalable applications
AWS as platform for scalable applicationsAWS as platform for scalable applications
AWS as platform for scalable applications
 
What is Amazon Web Services & How to Start to deploy your apps ?
What is Amazon Web Services & How to Start to deploy your apps ?What is Amazon Web Services & How to Start to deploy your apps ?
What is Amazon Web Services & How to Start to deploy your apps ?
 
Scaling up to Your First 10 Million Users
Scaling up to Your First 10 Million UsersScaling up to Your First 10 Million Users
Scaling up to Your First 10 Million Users
 
Rails as iOS Application Backend
Rails as iOS Application BackendRails as iOS Application Backend
Rails as iOS Application Backend
 
Serverless Web Apps using API Gateway, Lambda and DynamoDB
Serverless Web Apps using API Gateway, Lambda and DynamoDBServerless Web Apps using API Gateway, Lambda and DynamoDB
Serverless Web Apps using API Gateway, Lambda and DynamoDB
 

More from Public Broadcasting Service

Cloud Orchestration is Broken
Cloud Orchestration is BrokenCloud Orchestration is Broken
Cloud Orchestration is Broken
Public Broadcasting Service
 
Simplified Localization+ Presentation
Simplified Localization+ PresentationSimplified Localization+ Presentation
Simplified Localization+ Presentation
Public Broadcasting Service
 
PBS Localization+ API Webinar
PBS Localization+ API WebinarPBS Localization+ API Webinar
PBS Localization+ API Webinar
Public Broadcasting Service
 
I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...
I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...
I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...
Public Broadcasting Service
 
SQL Injection Defense in Python
SQL Injection Defense in PythonSQL Injection Defense in Python
SQL Injection Defense in Python
Public Broadcasting Service
 
PBS Tech Con 2011 API Workshop
PBS Tech Con 2011 API WorkshopPBS Tech Con 2011 API Workshop
PBS Tech Con 2011 API Workshop
Public Broadcasting Service
 

More from Public Broadcasting Service (10)

Cloud Orchestration is Broken
Cloud Orchestration is BrokenCloud Orchestration is Broken
Cloud Orchestration is Broken
 
Pycon2013
Pycon2013Pycon2013
Pycon2013
 
Simplified Localization+ Presentation
Simplified Localization+ PresentationSimplified Localization+ Presentation
Simplified Localization+ Presentation
 
PBS Localization+ API Webinar
PBS Localization+ API WebinarPBS Localization+ API Webinar
PBS Localization+ API Webinar
 
Mobile Presentation at PBS TECH CON 2011
Mobile Presentation at PBS TECH CON 2011Mobile Presentation at PBS TECH CON 2011
Mobile Presentation at PBS TECH CON 2011
 
PBS Presentation at AWS Summit 2012
PBS Presentation at AWS Summit 2012PBS Presentation at AWS Summit 2012
PBS Presentation at AWS Summit 2012
 
I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...
I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...
I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...
 
SQL Injection Defense in Python
SQL Injection Defense in PythonSQL Injection Defense in Python
SQL Injection Defense in Python
 
PBS Tech Con 2011 API Workshop
PBS Tech Con 2011 API WorkshopPBS Tech Con 2011 API Workshop
PBS Tech Con 2011 API Workshop
 
Fall2010 producer summit_openpbs_final
Fall2010 producer summit_openpbs_finalFall2010 producer summit_openpbs_final
Fall2010 producer summit_openpbs_final
 

Recently uploaded

How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 

Recently uploaded (20)

How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 

Architecture at PBS

  • 1. Architecture of PBS.org DCPython - June 7, 2011
  • 2. PBS is… • PBS is a national federation of independently owned and operated public television stations and producers – Each with their own management and development resources • 1500+ highly trafficked websites: – http://www.pbs.org/ – http://www.pbs.org/nova/ – http://pbskids.org/ – http://pbskids.org/sesame/ – http://video.pbs.org/ • Enterprise services/APIs
  • 3. PBS is not! • Radio is easy… We do television! • Or any of the other ~200 local stations.
  • 4. What we do • Technology leadership within public broadcasting community • Distribution of national programming content • Services to local stations • Core application development. Yeah!!!
  • 5. A few of our sites
  • 6. History of PBS.org Early 1990’s: Hand rolled static html Late 1990’s: Hand crafted static html + CGI! Most of 2000’s: Zope/Plone CMS generated static html 2008-10: Django generated static html Launched Oct 2010: Django all the way
  • 7. COVE API • Contains the metadata for all PBS videos online including pointers to streaming video • Needed to be: – Secure – Fast – Scalable
  • 8. COVE API – Technology Stack • Amazon Elastic Cluster Computing (EC2) • Amazon Relational Database Service (RDS) • Linux • Python • Django • Piston for REST API
  • 9. COVE API - Architecture Internet Elastic Load Balancer Auto Scale Array App Server 1 … App Server N HA Proxy S3 RDS Master RDS Slave 11 RDS Slave RDS Slave 1 Backups App Sync Server
  • 10. COVE API – Management Tools • Amazon Web Service Console • RightScale • Splunk
  • 11. COVE API – Interesting Stuff • Easy to load test – Duplicate environment for several days • Easy to scale – Autoscale array grows automatically • Easy to upgrade – Each server built from vanilla base
  • 12. COVE API – Lessons learned • Use normalized data for administration and de- normalized data for API
  • 13. COVE API – Lessons learned • Piston is fine, but lacks flexibility without significant customization – TastyPie? • JSON is probably good enough • Don’t get fancy with your endpoints • Stick to REST principles • Don’t get fancy with your authentication – Use OAuth2 or simple token
  • 14. PBS.org and Merlin API • PBS.org – Slim, fast layer – Pulls data from Merlin API – Uses memcache extensively – Currently Django, but could be anything (Flask?) • Merlin API – Aggregate content from distributed CMSes – Expose via standardized API – Power PBS.org and more
  • 15. Merlin API – Technology stack • Python • Amazon Web Services (“cloud”) • Django – EC2 • MySQL – RDS - Relational Database Service – ELB - Elastic Load Balancing • Piston – Cloudfront CDN • Solr – S3Storage • Celery • RabbitMQ
  • 16. Data flow RSS Feed Standardized Ingestor API
  • 17. Merlin API architecture API Endpoint – Django Piston Search service Indexing service Django-haystack Solr Data layer – MySQL (RDS) Administration Feed ingestion Django admin Celery
  • 18. Merlin API server topology Internet Elastic Load Balancer Autoscaling App #N App #N Solr Master array App #N Celery App #n Index DB RDS S3 backups
  • 19. Merlin API – Management Tools • Amazon Web Service Console • RightScale • Splunk
  • 20. API - Piston/Haystack/Solr class WebObjectIndexHandler(BaseHandler): ... def get_queryset(self): ... return PistonSearchQuerySet().models(*models) from haystack.query import SearchQuerySet class PistonSearchQuerySet(SearchQuerySet): ... def __getitem__(self, k): ... return [IndexSerializer(i) for i in super(PistonSearchQuerySet, self).__getitem__(k)]
  • 21. Feed ingestor - Celery from celery.decorators import task, periodic_task @periodic_task(run_every=timedelta(seconds=300)) def update_webobject_states(): ... solr_visible = WebObject.children.filter(visible=True) solr_visible = solr_visible.exclude( flag__api_visible=True, available__isnull=True) ... updated = solr_visible.update(visible=False, is_indexed = False) ... signals.bulk_update.send('tasks.update_webobject_states')
  • 22. Merlin API - Lessons learned • Memcached was not necessary • Denormalized search data via Solr index is much faster than querying database • Asynchronous task delegation is awesome • Celery prone to memory leaks • App server array for easy horizontal scaling – Even if not autoscaling, increase min servers • Never trust data you don’t control (validate!)
  • 23. Resources • http://lucene.apache.org/solr/ • http://haystacksearch.org/ • http://celeryproject.org/ • http://celeryproject.org/docs/django-celery/ • http://aws.amazon.com/
  • 24. PBS Developer Community • Dedicated to making open.PBS the industry standard in open development communities. http://open.pbs.org/ https://github.com/pbs open@pbs.org
  • 25. Questions? Drew Engelson drew@engelson.net http://tomatohater.com Edgar Roman emroman@pbs.org