Tweaking Open-Source
    a case-study




 Nelson Gomes (nelson.gomes@telecom.pt)
  Team Leader
  11th of November 2011
Talk Index

Introduction
OpenX Components
Improvements Introduced
Overall Architecture
Report Server
Problems Found
Links
Q&A
Introduction – Online Advertisement

Online advertising deals with delivering ads;
Placing ads in sites is a complex process:
  Obtain all ads eligible for a placeholder;
  Exclude ads with business limitations like capping;
  Ensure that the ads are being presented to the target
    audience;
  Ensure the advertiser goals are being met;
  Account for the ads delivered;
  Deliver the right ad format;
Introduction – Online Advertisement
Examples
Introduction – OpenX

Open Source advertising server;
Licensed under GNU General Public License;
Project forked from phpAds developed by Tobias Ratschiller
  in 1998;
Previously called phpAdsNew, then OpenAds and finally OpenX;
Features:
  Has a web-based GUI;
  Extensible plugin architecture;
  Serves ads mainly through JS and iframe calls;
Introduction – OpenX

Supporting technologies:
  PHP;
  MySQL;
  Web Server (Apache, Nginx);
  Optional memcached usage;
  Filesystem to serve ad content;
Introduction – OpenDisplay

The SAPO OpenDisplay project started from OpenX version
  1.8.5;
A four-person team started in April 2010 to analyse and
  improve OpenX's capabilities so that it could support
  SAPO's entire ad serving network;
In August 2010, OpenDisplay started to serve a major
  website while development was still under way;
In February 2011, SAPO began migrating its ad serving
  network, a process that took about 3 months to complete;
Today OpenDisplay serves SAPO's entire ad network;
Introduction – OpenDisplay

OpenDisplay serves ads for several media:
  Internet;
  Mobile Internet;
  Mobile Applications;
  TV set-top boxes;
  Connected TVs;
In the near future we'll be serving bulk campaigns for other
  media;
In this presentation I'll walk through the steps and quirks
  of this endeavour;
OpenDisplay Components – Frontend

This component is responsible for serving all ad formats;
For performance reasons, no data processing is done here
  besides ad serving itself;
Ad serving is done using munged PHP scripts for
  performance;
Plugins are included on demand;
Database queries are cached;
So it's all about ad serving decision making;
OpenDisplay Components – Backend

Comprises the data processing features;
Web based GUI for campaign and ad management;
Ad serving statistics;
Reporting;
Batch processing of ad delivery data for use by the
  frontends;
OpenDisplay Components – Tasks

Maintenance Priority Engine (MPE)
  Determines which campaigns to serve given their
    priorities;
  Calculates ad serving probabilities and corrects them
    when campaigns are underperforming or overperforming;
Maintenance Statistics Engine (MSE)
  Processes ad serving numbers;
  Starts and stops campaigns;
Improvements Introduced - General

Added reusable segmentation rules;
  This way a rule can be reused in several campaigns;
  Added compound segmentation rules;
  The segmentation rules engine was rewritten, because the
    previous segmentation system was inadequate;
Added the concept of Orders;
  Sometimes a customer has several goals for different sites;
  An order lets us place several campaigns with different
    goals in a single customer order;
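As a rough illustration of reusable and compound segmentation rules, the sketch below defines each rule once and lets several campaigns reference it. The Rule class, the all_of/any_of helpers, the request attributes and the campaign names are all hypothetical; the slides do not show how OpenDisplay's rules engine is actually built.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Request = Dict[str, str]  # hypothetical request attributes, e.g. {"country": "PT"}


@dataclass(frozen=True)
class Rule:
    """A named, reusable predicate over request attributes."""
    name: str
    test: Callable[[Request], bool]

    def matches(self, request: Request) -> bool:
        return self.test(request)


def all_of(name: str, *rules: Rule) -> Rule:
    """Compound rule that matches only when every sub-rule matches."""
    return Rule(name, lambda req: all(r.matches(req) for r in rules))


def any_of(name: str, *rules: Rule) -> Rule:
    """Compound rule that matches when at least one sub-rule matches."""
    return Rule(name, lambda req: any(r.matches(req) for r in rules))


# Rules are defined once...
in_portugal = Rule("in_portugal", lambda req: req.get("country") == "PT")
on_mobile = Rule("on_mobile", lambda req: req.get("device") == "mobile")
mobile_in_portugal = all_of("mobile_in_portugal", in_portugal, on_mobile)

# ...and referenced by any number of campaigns (campaign names are made up).
campaign_rules = {
    "campaign_a": mobile_in_portugal,
    "campaign_b": in_portugal,
}

request = {"country": "PT", "device": "desktop"}
eligible = [c for c, rule in campaign_rules.items() if rule.matches(request)]
```

Because compound rules are themselves rules, they can be nested or shared between campaigns just like simple ones.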
Improvements Introduced - General

Added Zone Groups;
  Instead of selecting placeholders one at a time we can
    associate several at once;
  Without them, a Run of Network (RON) campaign for all
    MREC (300x250) placeholders would need to be
    associated with each placeholder one by one;
Added revenue-share accounting;
  For ads served on pages with third-party content;
  This way, revenue can be shared with third-party content
    providers;
Improvements Introduced - General

OpenDisplay went through a security audit by SAPO's
  security team and several issues were solved;
Backoffice:
  UI session cookies are now only delivered over SSL;
  The session id generation function wasn't good enough and
    ids could be easily guessed. Fixing it reduced the risk
    of session hijacking;
  New user profiles were added, and entity access was
    reviewed;
  Some user profiles were changed to read-only, like
    advertisers and sites;
Improvements Introduced - General

Ads uploaded into the ad server are stored in a folder and
  served from there;
  At first glance there is no problem with this, but over
    time in some systems this can cause inode exhaustion;
  To prevent this, and to speed up file retrieval, we
    improved the upload component to distribute the files
    in a two-level folder hierarchy;
OpenX can use a content farm to deliver ads, so we used this
  feature from the start;
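A minimal sketch of a two-level folder hierarchy for uploaded files follows. Deriving the two folder names from an MD5 digest of the file name is an assumption made for illustration; the slides do not say how OpenDisplay chooses the folders, and the root path and file name are made up.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root: str, filename: str) -> PurePosixPath:
    """Place a file under root/<aa>/<bb>/filename, where aa and bb are
    the first two character pairs of the file name's MD5 hex digest."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return PurePosixPath(root) / digest[:2] / digest[2:4] / filename


# Hypothetical upload root and creative name.
path = sharded_path("/var/ads", "banner_300x250.swf")
```

With two hex characters per level there are 256 x 256 possible leaf folders, so no single directory has to hold every creative, and the path can be recomputed from the file name alone.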
Improvements Introduced - General

Traffic forecast:
  OpenX doesn't have a traffic forecast engine, instead it
    uses an average of ads served;
  We developed two alternative forecast algorithms using
    Python;
  This forecast is critical for a couple of reasons:
    Inventory selling;
    Correct impression allocation for campaigns,
      especially given targeting rules;
Improvements Introduced - General

Traffic forecast example:
Improvements Introduced - General

Added data logging and analysis:
  We started to summarize delivery properties to allow us
    to calculate precise segmentation delivery probabilities;
  Using these numbers in combination with the traffic
    forecast we can estimate the inventory for each campaign
    and its overall probability of delivery;
  This information is also useful for commercial purposes:
    Knowing the market is very valuable information;
  We are currently migrating some of this data to HBase,
    which reduces the data, making it usable;
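The way logged delivery data and the traffic forecast combine into an inventory estimate can be sketched with simple arithmetic. The function names and all numbers below are made up for illustration, and multiplying the forecast by the observed share is an assumed simplification, not the formula OpenDisplay necessarily uses.

```python
def segment_share(matching_impressions: int, total_impressions: int) -> float:
    """Observed fraction of logged impressions that satisfied a targeting rule."""
    if total_impressions == 0:
        return 0.0
    return matching_impressions / total_impressions


def estimated_inventory(forecast_impressions: int, share: float) -> float:
    """Forecast traffic scaled by the share of it the campaign can target."""
    return forecast_impressions * share


# Made-up numbers: 250,000 of 1,000,000 logged impressions matched the rule,
# and the forecast for the campaign period is 8,000,000 impressions.
share = segment_share(250_000, 1_000_000)
inventory = estimated_inventory(8_000_000, share)
```

Comparing such an estimate with a campaign's impression goal gives a first indication of whether the goal is reachable under its targeting rules.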
Improvements Introduced - General

Restructured the VAST 1.0 system and upgraded it to 2.0;
  Video Ad Serving Template (VAST) is a standard from the
    Interactive Advertising Bureau;
  Delivers video ads (pre-, mid- and post-rolls);
  Delivers overlays;
We also added a new type of ad that allows us to serve
  SAPO text ads as images;
  This virtual ad type works as a proxy to a different ad
    system, combining two different ad systems;
  Probably the first time an ad system combined them;
Improvements Introduced - General

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<VAST version="2.0" (...)>
 <Ad id="30324">
  <InLine>
   <AdSystem>OpenDisplay</AdSystem>
   <AdTitle><![CDATA[Teste Vast Video]]></AdTitle>
   <Description><![CDATA[VAST Ad]]></Description>
   <Impression id="OpenDisplay"><![CDATA[http://pub.sapo.pt/lg.php?(...)]]></Impression>
   <Creatives>
    <Creative id="30324">
     <Linear>
      <TrackingEvents>
       <Tracking event="creativeView"><![CDATA[http://pub.sapo.pt/(...)&vast_event=creativeview]]></Tracking>
(...)
      </TrackingEvents>
      <MediaFiles><MediaFile (...) type="video/x-flv">http://(...)/video.flv</MediaFile></MediaFiles>
     </Linear>
    </Creative>
   </Creatives>
  </InLine>
 </Ad>
</VAST>
Improvements Introduced - General

Flash ads are a major problem on systems that don't
  support Flash;
  iPhones and iPads for example;
To make sure these ads are always visible, we added
  automatic image generation for Flash ads uploaded via the
  Backend;
This way, even if a Flash ad doesn't have a fallback image,
  we generate one automatically;
  This was accomplished using GNU Gnash in combination
    with xvfb-run, which provides a virtual X Window System
    for Gnash to run in;
Improvements Introduced - General

Future developments will include bulk campaigns;
  These campaigns differ from regular campaigns because we
    know the characteristics of the audience in advance;
  By splitting the audience into sets with the same features
    we can process an entire set within the LP solver at
    once, minimizing the number of variables;
So we can optimize the revenue using linear programming
  solutions;
  We will use GLPK (GNU Linear Programming Kit) as a
    solver to obtain an optimal solution;
  This way we can provide a solution that maximizes a
    campaign's revenue;
Improvements Introduced - General

GLPK sample problem:
# Giapetto's problem, maximizing Giapetto's profit
var x1 >= 0;  /* soldiers; each is worth 3€ */
var x2 >= 0;  /* trains; each is worth 2€ */

/* Objective function */
maximize z: 3*x1 + 2*x2;  /* maximize Giapetto's profit */

/* Constraints */
s.t. Finishing : 2*x1 + x2 <= 100;  /* only 100 hours per week */
s.t. Carpentry : x1 + x2 <= 80;     /* only 80 hours per week */
s.t. Demand    : x1 <= 40;          /* demand of soldiers per week */
end;
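To check what a solver should return for the model above, the sketch below solves the same two-variable problem in plain Python by enumerating the corner points of the feasible region. This is a brute-force illustration that only works for tiny problems; it is not how GLPK solves linear programs.

```python
from itertools import combinations

# Each constraint is (a1, a2, b), meaning a1*x1 + a2*x2 <= b.
constraints = [
    (2, 1, 100),   # Finishing
    (1, 1, 80),    # Carpentry
    (1, 0, 40),    # Demand
    (-1, 0, 0),    # x1 >= 0
    (0, -1, 0),    # x2 >= 0
]


def profit(x1, x2):
    return 3 * x1 + 2 * x2


def feasible(x1, x2, eps=1e-9):
    return all(a1 * x1 + a2 * x2 <= b + eps for a1, a2, b in constraints)


best = None
for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
    det = a1 * c2 - a2 * c1
    if det == 0:
        continue  # parallel boundaries never intersect
    # Cramer's rule for the intersection of the two boundary lines.
    x1 = (b * c2 - a2 * d) / det
    x2 = (a1 * d - b * c1) / det
    if feasible(x1, x2) and (best is None or profit(x1, x2) > profit(*best)):
        best = (x1, x2)

print(best, profit(*best))
```

The best corner is 20 soldiers and 60 trains, for a profit of 180, which is where the Finishing and Carpentry constraints meet.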
Improvements Introduced - Frontend

Database write operations were removed. Database access
  is now read-only;
Delivery scripts were analysed using xdebug, and major
  performance issues were tuned:
  User agent regexps used by PHPSniff were taking 25% of
    the entire request time. Using memcache as a user agent
    cache we saved 97% of this time!
  All ad serving counters are kept in memcache and
    persisted every minute; soon we'll migrate this to
    broker queues;
  Improved the ad caching system to store and retrieve
    EVERYTHING in a single operation;
Improvements Introduced - Frontend

Using xdebug output as input to KCachegrind it is very
  easy to analyse any PHP script: just run it!
Files generated by xdebug are read and analysed by
  KCachegrind, which shows for instance:
  How many times a function has been called;
  Total time each function used;
  Where request time is spent;
This makes it very easy to detect and improve any
  long-running script;
Improvements Introduced - Frontend

KCachegrind screenshot
Improvements Introduced - Frontend

Instead of using an Apache web server we decided to use
  Nginx with PHP-FPM:
  Nginx scales almost linearly;
  PHP-FPM performed very well in our tests;
PHP-FPM is a FastCGI implementation, now bundled with
  PHP 5.3.3;
Instead of using PHP output compression, we used Nginx
  compression, which is faster;
Of course, we used a PHP accelerator: eAccelerator with
  shared memory, which suits the PHP-FPM multi-process
  architecture;
Improvements Introduced - Frontend

Even while adding new features, we were still able to reduce
  delivery times:
Improvements Introduced - Frontend

Introduced a cookie abstraction API to allow storing all
  cookie and session information server-side:
  OpenX by default stored session information in cookies,
    which was insufficient to keep an entire ad network
    running due to the cookie size limit (~4k);
  This was a critical issue for long-serving campaigns that
    used capping or conversion data;
  Fewer cookies mean less bandwidth usage and faster
    responses;
Improvements Introduced - Frontend

The new session storage mechanism introduced new issues;
  The requests had to be sequential to allow correct session
    retrieval and storage;
  This required a lock mechanism to obtain session info in
    an ordered fashion;
  This was accomplished using memcache atomic
    increments to lock session access;
  All sessions are stored in memcache and the complete
    process of locking, retrieving, storing and unlocking
    the session is done in a few ms (<3ms), from remote
    servers!
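Locking with atomic increments can be sketched as follows: whoever moves a per-session counter from 0 to 1 holds the lock, and everyone else undoes their increment and retries. FakeMemcache is an in-process stand-in written for this example, and the key names and retry delay are made up. Unlike this stand-in, a real memcached incr fails on a missing key, so the counter would have to be created first; lock expiry is also left out.

```python
import threading
import time


class FakeMemcache:
    """In-process stand-in for a memcache client; incr/decr are atomic."""

    def __init__(self):
        self._data = {}
        self._guard = threading.Lock()

    def incr(self, key):
        with self._guard:
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]

    def decr(self, key):
        with self._guard:
            self._data[key] = self._data.get(key, 0) - 1
            return self._data[key]

    def get(self, key, default=None):
        with self._guard:
            return self._data.get(key, default)

    def set(self, key, value):
        with self._guard:
            self._data[key] = value


def acquire(cache, session_id, retry_delay=0.0005):
    """Spin until our increment is the one that moved the counter from 0 to 1."""
    lock_key = "lock:" + session_id
    while cache.incr(lock_key) != 1:
        cache.decr(lock_key)      # undo our failed attempt
        time.sleep(retry_delay)


def release(cache, session_id):
    cache.decr("lock:" + session_id)


def record_impression(cache, session_id):
    """Locked read-modify-write of the session's impression count."""
    acquire(cache, session_id)
    try:
        session = cache.get("session:" + session_id, {"impressions": 0})
        updated = {"impressions": session["impressions"] + 1}
        cache.set("session:" + session_id, updated)
    finally:
        release(cache, session_id)


cache = FakeMemcache()
threads = [threading.Thread(target=record_impression, args=(cache, "abc"))
           for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_count = cache.get("session:abc")["impressions"]
```

Without the lock, two concurrent requests could read the same session, each add one impression, and overwrite each other's update.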
Improvements Introduced - Frontend

This chart shows that outbound traffic dropped
  significantly:
Improvements Introduced - Frontend

We introduced zone capping, a feature that wasn't available
  in OpenX;
  This feature is very useful with video ads, to avoid
    flooding users with them;
  Using zone capping we can say that a user will see one or
    more ads and then will not see any more ads during a
    given period of time;
  This feature is managed per placeholder, independently of
    the campaign settings;
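A minimal sketch of zone capping follows: a user may receive at most a given number of ads in a placeholder during a period, whichever campaign the ads belong to. The ZoneCap class, its in-process dictionary, the identifiers and the limits are assumptions for illustration; in OpenDisplay this state would live in the server-side session storage.

```python
from typing import Dict, List, Tuple


class ZoneCap:
    """Allow at most `limit` ads per user in a zone within `period` seconds."""

    def __init__(self, limit: int, period: float):
        self.limit = limit
        self.period = period
        # (user, zone) -> timestamps of the ads already delivered
        self._seen: Dict[Tuple[str, str], List[float]] = {}

    def allow(self, user_id: str, zone_id: str, now: float) -> bool:
        key = (user_id, zone_id)
        # Keep only the deliveries that still fall inside the capping window.
        recent = [t for t in self._seen.get(key, []) if now - t < self.period]
        if len(recent) >= self.limit:
            self._seen[key] = recent
            return False
        recent.append(now)
        self._seen[key] = recent
        return True


cap = ZoneCap(limit=1, period=600)          # one video ad per 10 minutes
decisions = [cap.allow("user-1", "preroll", now) for now in (0, 60, 599, 600)]
```

Here the first request is served, the next two fall inside the 10-minute window and are refused, and the request at 600 seconds is served again.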
Improvements Introduced - Frontend

Added new delivery endpoints to accommodate new formats:
  Mobile:
    JSON
    XML
    iPhonePlist
  TV
  VAST
We also developed an SDK to help with mobile ads integration:
  Mobile ads are placed server-side, so client information
    has to be passed to the ad server (client IP, session id,
    user-agent);
Improvements Introduced - Frontend

Frontend delivery algorithm was changed to support:
  New segmentation rules system;
  Changed election algorithm;
  Zone capping;
  Server-side storage of information instead of cookies;
  Increased performance;
  New endpoints to provide new types of ads;
  No write operations into database;
  Gather user properties for analysis;
Improvements Introduced - Frontend

Some eye-opening numbers:
  More than 4,000,000,000 web requests per month;
  9 frontend servers using 36 Mbit/s outbound and 25 Mbit/s
    inbound, for a total of 61 Mbit/s throughput!
  Approximately 2,200 ad requests per second and twice as
    many web requests (4,400/s);
  95% of web requests are answered in under 18ms;
  PHP power at work... :-)
Improvements Introduced - Backend

The statistics component was changed to read information
  from a database replica due to the large number of accesses;
The backoffice was changed to support some filters and
  results paging;
All user-generated delete operations were removed. Why?
  Removing a user could, due to table relations, delete all
    their campaigns and statistics, and compromise forecast
    results;
  Deleting a campaign could lose all campaign data,
    which is required for billing;
  So all delete operations are done in maintenance tasks;
Improvements Introduced - Backend

We also added new targeting rules and improved others:
  Geographical: country, district;
  Mobile device model, OS and version;
  Browser Family;
  Internet Service Provider;
  Organization;
  Day of week;
Improvements Introduced - Backend

MPE was changed for a couple of reasons:
  Become faster;
  Decrease memory usage;
  Changes in algorithm;
  Optimizations;
MPE was reading ALL campaigns from the database, even
  finished ones, so memory consumption was increasing
  linearly;
All services are now redundant;
Improvements Introduced - Backend
Overall Architecture
Report Server

OpenX only generates CSV reports;
A more reliable product required more reliable,
  commercial-style reports;
This need led us to try out JasperReports, an open-source
  Java report generator;
With iReport for Jasper, a Crystal Reports-style report
  designer, reports can be easily edited and tested;
Report Server

iReport for Jasper
Report Server

So, starting with JasperReports we built a cloud-style
  report generation farm. How?
By combining it with SAPO Broker, a message passing system,
  and a flexible layered architecture;
Given this, a report request is a simple message delivered to
  a SAPO Broker queue;
Every server generating reports can consume a report
  request, allowing this architecture to scale almost linearly;
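The farm idea can be sketched with workers that each pull the next request from a shared queue. In this sketch Python's queue.Queue stands in for a SAPO Broker queue and threads stand in for report servers; the message fields and report names are made up, and no real SAPO Broker client API is shown.

```python
import queue
import threading

report_queue = queue.Queue()   # stands in for a SAPO Broker queue
completed = []
completed_guard = threading.Lock()


def worker(name):
    """Consume report requests until a None sentinel arrives."""
    while True:
        message = report_queue.get()
        if message is None:
            break
        # A real worker would run the report here; this one only records it.
        with completed_guard:
            completed.append((name, message["report"]))


workers = [threading.Thread(target=worker, args=(f"server-{i}",))
           for i in range(3)]
for w in workers:
    w.start()

for report in ("campaign_summary", "zone_delivery", "billing", "revenue_share"):
    report_queue.put({"report": report, "params": {"month": "2011-10"}})
for _ in workers:
    report_queue.put(None)     # one sentinel per worker
for w in workers:
    w.join()
```

Adding capacity means starting another consumer on the same queue; producers do not need to know how many servers exist.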
Report Server

We developed this report server in a layered style:
  What report to generate;
  Report parameters;
  Datasource to use;
  Output formats (HTML, XLS, Word, PDF,...);
  Delivery channels (Email, FTP, SSH, …);
  Report completion notification (HTTP, DB);
This layered architecture allows us to extend any of the
  layers with new features;
It will become available as open source soon...
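One way to picture the layered style is a request that names one entry per layer, with each layer backed by its own registry. All names below (the registries, ReportRequest, process, the sample data) are hypothetical and chosen for this sketch; the real report server is Java-based and its design is not shown in the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Rows = List[dict]

# One registry per layer; extending a layer means registering one more entry.
DATASOURCES: Dict[str, Callable[[dict], Rows]] = {
    "sample": lambda params: [{"campaign": "demo", "impressions": 1200}],
}
FORMATS: Dict[str, Callable[[Rows], str]] = {
    "csv": lambda rows: "\n".join(
        ",".join(str(value) for value in row.values()) for row in rows
    ),
}
delivered: List[tuple] = []
CHANNELS: Dict[str, Callable[[str], None]] = {
    "memory": lambda document: delivered.append(("memory", document)),
}
notified: List[str] = []
NOTIFIERS: Dict[str, Callable[[str], None]] = {
    "log": lambda report: notified.append(report),
}


@dataclass
class ReportRequest:
    report: str        # what to generate
    params: dict       # report parameters
    datasource: str    # where the data comes from
    output: str        # output format
    channel: str       # delivery channel
    notify: str        # completion notification


def process(request: ReportRequest) -> None:
    """Run one request through each layer in turn."""
    rows = DATASOURCES[request.datasource](request.params)
    document = FORMATS[request.output](rows)
    CHANNELS[request.channel](document)
    NOTIFIERS[request.notify](request.report)


process(ReportRequest("campaign_summary", {"month": "2011-10"},
                      "sample", "csv", "memory", "log"))
```

Supporting a new output format or delivery channel then touches only that layer's registry, which is the kind of extensibility the layered style is meant to give.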
Report Server
            Layer 1: what to generate
              Report & parameters


               Layer 2: data source
               Data to use on report


              Layer 3: output formats
                Xls, pdf, doc, rtf...


            Layer 4: delivery channels
                 Http, db, email


          Layer 5: completion notification
                      Url, db
Problems Found

Unable to scale;
  Some queries would read an entire database table if
    there were long-running campaigns;
  We changed this to keep accumulated totals for each
    banner, which are easier to sum;
  Some internal data is still passed on using temporary
    tables, but not for long...
Not fast enough: OpenX is good enough for small-site
  advertising, but not for an entire ad network;
Some entities our business requirements called for were
  missing or not working properly;
Problems Found

But in retrospect OpenX gave us a good starting point...
Tweaking open-source code allowed us to:
  Obtain a good base from an existing open-source solution
    on which to develop a better one;
  Save costs compared with starting from scratch;
  Gain knowledge about advertisement concepts;
  Customize new features according to specific needs;
So tweaking open source is a great idea as a base to create
  good solutions!!!
Q&A




      Thank You
Links

http://www.openx.com
http://php-fpm.org
http://jasperforge.org/projects/jasperreports
http://jasperforge.org/project/ireport
http://softwarelivre.sapo.pt/broker
http://www.php.net
http://nginx.net
http://www.gnu.org/s/gnash
http://www.gnu.org/s/glpk
