Platform 3: To Infinity and Beyond January, 2009 Summit XI
Gartner’s Hype Cycle
Overview: Architecture, Video, Reporting
Architecture: What’s that?
  The structures of the system
  The externally visible parts and the relationships between them
Architecture: Goals
  Performance
    Every page needs to yield a response within 5 seconds
  Availability/Reliability
    Always there!
  Scalability
    Dynamically add RAM/CPU
    Dynamically add more servers
  Agile/Flexible
    Can easily be adapted
    Follow best practices
  Accuracy
    No response left behind
  Quality Assurance
Architecture: Performance
How do we achieve great performance?
  Using the right software
    Ruby on Rails
      Twitter, LinkedIn, Hulu
  Good application design
    Reporting has different needs than Authoring/Runtime
  Testing / Benchmarking / Tuning
    Rails has lots of good built-in utilities to make these easy
    We’re writing test code, right?
  Dedicating time for maintenance / new features
    As data grows
    As more complexity is brought into the application environment
    As we get smarter
Architecture: Performance
Good Application Design: Separation of Concerns
  Separating databases for Runtime and Reporting is a Good thing!
  Runtime is OLTP
    OLTP refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing. It has also been used to refer to processing in which the system responds immediately to user requests. - Wikipedia
  Reporting is OLAP
    OLAP is an approach to quickly provide answers to analytical queries that are multi-dimensional in nature. Databases configured for OLAP employ a multidimensional data model, allowing for complex analytical and ad-hoc queries with a rapid execution time. - Wikipedia
  Analytical processing on Reporting doesn’t impact performance on Runtime (i.e. active surveys in the field) because they are physically different systems.
Architecture: Availability/Reliability
  Co-location uptime
    eApps: 99.98% over past 1000 days
    Colo4Dallas: guarantees 100%; reality? 99%+
    Amazon Web Services: 99.95%
  Redundancy
    Servers have different profiles for different services
      Databases
      Web / Application servers
      Proxy / Load balancing
    Server profiles are duplicated and online for…
      Hardware failures
      Load balancing during peak demand
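To put the uptime figures above in perspective, here is a small worked calculation of how much downtime per year each percentage allows. It is only arithmetic on the numbers quoted on the slide; the 365-day year and the use of 99% as the Colo4Dallas floor are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Return the hours per year a service may be down at the given uptime."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


# Percentages as quoted on the slide; 99.0 stands in for the "99%+" figure.
uptimes = {
    "eApps": 99.98,
    "Amazon Web Services": 99.95,
    "Colo4Dallas (observed floor)": 99.0,
}

for name, percent in uptimes.items():
    hours = downtime_hours_per_year(percent)
    print(f"{name}: {percent}% uptime allows about {hours:.2f} hours down per year")
```

Small differences in the percentage matter: the gap between 99.95% and 99% is the gap between a few hours and several days of downtime per year.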
Architecture: Scalability
  Reporting
    www.eqrtools.com hosted at eApps
      Runs on a $70/month plan (1.2 GB RAM Virtual Private Server)
      Pre-packaged with Java, Rails, MySQL, mail server, etc.
      Can upgrade package in minutes and add servers via web interface
      Cancel anytime
    Amazon Web Services
      S3 = Simple Storage Service
      EC2 = Elastic Compute Cloud
      CloudFront = Content Delivery Network
  Authoring/Runtime
    Hosted at Colo4Dallas
      n Front End Web/Application servers
      n Database servers
    Wowza Streaming Video Service via Amazon EC2
Architecture: Amazon Web Services
  Simple Storage Service (S3)
    In use at Equation with JTS for 2+ years
    Expanding use for storing more stuff
      Images: plain, rollover, etc.
      Documents: PDF reports
      Videos
      EC2 Machine Images
  Elastic Compute Cloud (EC2)
    Provides ability to add servers (Linux/Windows flavors) for specific services
      e.g. Wowza Video Streaming
        Grabs content from S3
    Can be expanded to other uses: Rails application hosting/database
  CloudFront
    Provides a Content Delivery Network (CDN) to push content that we move into S3 out to the edge
    Moves content closer to clients, reducing network latency
Architecture: EC2 Simplified
  Virtual Machines/Servers
  Scalability in two dimensions
    Use as many machines as you need
    Various machine sizes available
  High availability
  High bandwidth
Architecture: EC2 Instance Types
EC2 supports different instance types
  Small Instance
    1.7 GB memory, 32-bit platform, I/O Performance: Moderate
    1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
    160 GB instance storage (150 GB plus 10 GB root partition)
    Price: $0.10 per instance hour
  Large Instance
    7.5 GB memory, 64-bit platform, I/O Performance: High
    4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
    850 GB instance storage (2 x 420 GB plus 10 GB root partition)
    Price: $0.40 per instance hour
  Extra Large Instance
    15 GB memory, 64-bit platform, I/O Performance: High
    8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
    1,690 GB instance storage (4 x 420 GB plus 10 GB root partition)
    Price: $0.80 per instance hour
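As a rough guide to what the hourly prices above mean in practice, the sketch below converts them into a monthly bill. The prices come from the instance list; the 30-day, always-on month and the function name are assumptions for illustration, and storage and bandwidth charges are not included.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month running around the clock

# Hourly prices in US cents, taken from the instance list above.
PRICE_CENTS_PER_HOUR = {"small": 10, "large": 40, "extra_large": 80}


def monthly_cost_dollars(instance_type: str, count: int = 1,
                         hours: int = HOURS_PER_MONTH) -> float:
    """Return the compute cost in dollars for `count` instances over `hours`."""
    return PRICE_CENTS_PER_HOUR[instance_type] * count * hours / 100


for instance_type in PRICE_CENTS_PER_HOUR:
    cost = monthly_cost_dollars(instance_type)
    print(f"{instance_type}: ${cost:.2f} per month if left running")

# Pay for what we use: three small instances for an 8-hour peak.
print(f"peak burst: ${monthly_cost_dollars('small', count=3, hours=8):.2f}")
```

Because billing is per instance hour, a short burst of extra machines costs far less than keeping them online all month.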
CloudFront: Content Delivery Network and how it works…
Amazon CloudFront CDN
  Copies of files in an S3 bucket are accessed/cached from edge servers around the world.
Architecture: Amazon Benefits
  No upfront investments
    No contract
    No hardware to purchase, install/fit, maintain
  Pay for what we use
  Offers a variety of uses: content hosting, machine hosting, streaming video
    Competitors often charge upfront and monthly fees and don’t offer one-stop service
  We can dynamically add/remove machines as we need them
  Additional applications built on EC2 are also available…
    Wowza Video Streaming
    Jungle Disk (backup/recovery)
    GigaVox Media (Podcast hosting)
    Morph (Application hosting)
    RightScale (Application hosting/monitoring)
    Scalr (Load Balancing/farm)
Architecture: Quality Assurance
  Code Coverage
Architecture: Quality Assurance
  Example: Question controller
Rich Media: Audio, Images and Video
Video: The learning curve…
  grok: "to understand intuitively or by empathy; to establish rapport with" and "to empathize or communicate sympathetically (with); also, to experience enjoyment." (source: Oxford English Dictionary)
Serving Video is like… TV
  Content (e.g. the ad)
  Delivery (e.g. cable, satellite, rabbit ears)
  Viewer (i.e. the television box)
The Content: Preparation
  There are many source formats for video
    AVI (early Windows format), QuickTime (.mov), Windows Media, MPEG, Flash
  Files are large and not optimized for web delivery
    Encoded for other mediums
Content conversion
  The Old Way
    Sorenson Squeeze
      A desktop tool where we manually took a file and converted it into multiple Flash files of varying bitrate
      Uploaded file(s) to a third-party hosted Flash Video service
  The New Way
    File uploader
    ffmpeg (under the covers)
      An open source utility that has been wrapped with Ruby packages to provide compression in the P3 Application
      Media is compressed for optimal playback experience
      Media is still formatted to Flash
        Most commonly served format on Internet (> 92%)
    Converted file uploaded to Amazon
      File resides in S3 folder
      Streamed via Wowza server hosted on EC2 instance
Video: ffmpeg
  Still a bit of magic involved…
  Reduce this, increase that…
Video: ffmpeg conversion
  But at least we’ve built tools!
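To make the "reduce this, increase that" tuning more concrete, here is a minimal sketch of how an ffmpeg command for Flash video output might be assembled. It is written in Python rather than the Ruby wrapper used in P3, it only builds the argument list without running ffmpeg, and the function names, bitrates, frame size and file names are all hypothetical placeholders, not the settings actually used.

```python
from typing import List


def flv_command(source: str, target: str, video_kbps: int,
                audio_kbps: int = 64, size: str = "320x240") -> List[str]:
    """Build (but do not run) an ffmpeg argument list for one FLV rendition."""
    return [
        "ffmpeg", "-y",              # -y: overwrite the target if it exists
        "-i", source,                # input file in any format ffmpeg can read
        "-b:v", f"{video_kbps}k",    # video bitrate: the main size/quality knob
        "-b:a", f"{audio_kbps}k",    # audio bitrate
        "-ar", "22050",              # audio sample rate accepted by FLV
        "-s", size,                  # output frame size
        "-f", "flv",                 # force the Flash Video container
        target,
    ]


def renditions(source: str, bitrates_kbps: List[int]) -> List[List[str]]:
    """Return one command per bitrate, named after the source file."""
    stem = source.rsplit(".", 1)[0]
    return [flv_command(source, f"{stem}_{kbps}k.flv", kbps)
            for kbps in bitrates_kbps]


for command in renditions("ad.mov", [300, 700]):
    print(" ".join(command))
```

Each list could be handed to `subprocess.run` on a machine with ffmpeg installed; producing several bitrates from one upload mirrors what was previously done by hand.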
Video: Delivery
  Progressive Download
    Copy of video is made on your local temp drive and then buffered back through the player as it downloads
    Lacks IP protection
    ESPN
    Video is sent to player over HTTP from file system on host server
    Some companies will block content by MIME type
    Video over HTTP on port 80 is the easiest way to get past security
  Streaming
    Video is streamed in real time from streaming video server
    No local copy made
    Near instantaneous playback
    Uses RTMP protocol
  Important to size/compress correctly for intended audience
Video: Delivery
  Factors impacting client reception
    Other programs running
      How much available CPU/RAM does the respondent’s web-enabled device have?
    Bandwidth
      DSL, cable, dialup?
      Bandwidth varies during a video session (e.g. a 30-second ad)
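One way to act on the bandwidth factor above is to pick an encoded bitrate that fits the respondent's connection with some room to spare. The sketch below is an illustrative assumption, not the P3 player's logic: the function name, the 80% headroom rule and the example bitrates are all hypothetical.

```python
from typing import List


def pick_bitrate(available_kbps: List[int], bandwidth_kbps: float,
                 headroom: float = 0.8) -> int:
    """Pick the highest encoded bitrate that fits within the bandwidth budget.

    `headroom` leaves a margin because bandwidth varies during playback.
    Falls back to the lowest bitrate when nothing fits.
    """
    budget = bandwidth_kbps * headroom
    fitting = [kbps for kbps in available_kbps if kbps <= budget]
    return max(fitting) if fitting else min(available_kbps)


encoded = [300, 700, 1500]  # hypothetical renditions, in kbps
print(pick_bitrate(encoded, bandwidth_kbps=1000))  # a modest DSL line
print(pick_bitrate(encoded, bandwidth_kbps=56))    # dialup: lowest rendition
```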
Video: The Player
  The swf file
    Hosted on server, embedded in page
    Skinnable
      Remove controls
    Plays either progressive or streaming
  JW Player is the most ubiquitous
P3 Reporting
Reporting: Online Analytical Processing (OLAP)
Reporting: The Update Algorithm
  Scheduled Batch
    Go update all the surveys every x minutes…
      Open and recently closed
  On Demand
    Update this survey now
  Real-time
    Asynchronously, grab queued responses from a message queue (MQ) with updates from the Runtime
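The three update modes above can be sketched in a few lines. This is a simplified, in-memory illustration under stated assumptions: the dictionaries stand in for the Runtime and Reporting databases, Python's `queue.Queue` stands in for the message queue, and every name here is hypothetical rather than taken from the P3 code.

```python
import queue
from collections import Counter
from typing import Dict, Iterable, List

# Hypothetical stand-ins: Runtime maps survey id -> raw answers,
# Reporting maps survey id -> aggregated answer counts.
runtime_responses: Dict[int, List[str]] = {1: ["yes", "no", "yes"], 2: ["no"]}
reporting_counts: Dict[int, Counter] = {}


def update_survey(survey_id: int) -> None:
    """On demand: rebuild the reporting counts for one survey right now."""
    reporting_counts[survey_id] = Counter(runtime_responses.get(survey_id, []))


def update_batch(survey_ids: Iterable[int]) -> None:
    """Scheduled batch: rebuild every open or recently closed survey."""
    for survey_id in survey_ids:
        update_survey(survey_id)


def drain_queue(pending: queue.Queue) -> int:
    """Real time: apply queued (survey_id, answer) messages incrementally."""
    applied = 0
    while True:
        try:
            survey_id, answer = pending.get_nowait()
        except queue.Empty:
            return applied
        reporting_counts.setdefault(survey_id, Counter())[answer] += 1
        applied += 1


update_batch([1, 2])                  # what a scheduler would run every x minutes
incoming: queue.Queue = queue.Queue()
incoming.put((2, "yes"))              # a response pushed by the Runtime
applied_count = drain_queue(incoming)
print(applied_count, dict(reporting_counts[2]))
```

The batch and on-demand paths recompute from the Runtime data, while the real-time path only applies the deltas it receives, which is why it can run asynchronously without touching the Runtime database.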
Reporting: On demand
Reporting: Key features
  View results by question
  Filtering
    By status
    Compound filters based on question/choice sets
  Crosstabs
    Question v Question crosstabs
    Filter by status
  Quotas / Segments
    View current / total counts
  Monitor survey progress
    Total, Last day, Last hour…
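A Question v Question crosstab with a status filter, as listed above, boils down to counting pairs of answers. The sketch below shows the idea on a handful of made-up records; the field names, sample data and function name are hypothetical, and a real OLAP store would do this aggregation in the database.

```python
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical response records; the field names are illustrative only.
responses: List[Dict[str, str]] = [
    {"status": "complete", "q1": "male", "q2": "yes"},
    {"status": "complete", "q1": "female", "q2": "yes"},
    {"status": "complete", "q1": "female", "q2": "no"},
    {"status": "partial", "q1": "male", "q2": "no"},
]


def crosstab(rows: List[Dict[str, str]], row_question: str,
             col_question: str, status: Optional[str] = None) -> Counter:
    """Count (row answer, column answer) pairs, optionally filtered by status."""
    cells: Counter = Counter()
    for row in rows:
        if status is not None and row["status"] != status:
            continue  # skip responses outside the requested status
        cells[(row[row_question], row[col_question])] += 1
    return cells


table = crosstab(responses, "q1", "q2", status="complete")
for (row_answer, col_answer), count in sorted(table.items()):
    print(f"{row_answer:>6} x {col_answer:<3} = {count}")
```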
Reporting: What’s left?
  More testing…
  Report generation
    PDF
    Other formats
  Email notification
  More slicing/dicing tools
  Migration to Scalr???
  Beta with select clients
    User feedback
    Incorporate into future releases

EQR Reporting: Rails + Amazon EC2
