EPAM BI - Near Real-time Marketing Support System


A presentation on how to build near-real-time marketing support systems.

Published in: Data & Analytics
  1. EPAM BI Competency Center. Near Real-time Marketing Support System: Implementation Details, by Kiryl Sultanau, Yauheni Yushyn & Dzmitry Maskayeu
  2. Preamble
  3. Bidding Support. Improve Ad Campaigns. NRT Data Visualization. NRT Data Visualization: ● Near-real-time visualization of bids matched with clicks, leads, etc. ● Detect the best ad type/place/size/position... for different users/devices/regions... ● React quickly and estimate newly started ad campaigns better. Improve Ad Campaigns: ● Improve keyword campaigns with more relevant keywords and a better specialized target group, region, etc. ● Create new short-term campaigns for special events or occasions ● Collect specific users and information about them
  4. Prerequisites: ● External bidding info (impressions, clicks) ● Ad campaign info ● Publisher log streams (impressions, clicks, searches...) ● Other dictionaries
  5. Architectural Overview
  6. Pipeline: source data, ETL layer, presentation layer
  7. Stream Processing. Input: server log [cookies, user_agent, city_id, log_type_id …]. Ad Exchange: Google DoubleClick AdX, TANX Alibaba, Baidu, Google Mobile ... JOIN with dictionaries: ● City: US city names ● Log Type: bid-impression, bid-click, site-open, site-search, site-impression, site-click ● Site Pages: owner URL & google tag ● User Tag: external URL & user search keyword ● State: US state names ● Keywords: user keywords as the union of google tags and user search keywords. Diagram labels: Spark Cache, Kafka, RDD, DataFrame, Apply schema
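The dictionary JOIN on slide 7 can be pictured with a minimal sketch. It uses plain Python dictionaries as a stand-in for Spark DataFrames, and the ids, city names and sample records are invented for illustration; only the column names and log type names come from the slide.

```python
# Hypothetical dictionary tables: the ids and city values are assumptions.
CITIES = {1: "Los Angeles", 2: "New York"}
LOG_TYPES = {
    1: "bid-impression", 2: "bid-click", 3: "site-open",
    4: "site-search", 5: "site-impression", 6: "site-click",
}

def enrich(record, cities=CITIES, log_types=LOG_TYPES):
    """Left-join one log record against the dictionaries by id.

    Unknown ids yield None, which mirrors a left outer join.
    """
    joined = dict(record)
    joined["city"] = cities.get(record.get("city_id"))
    joined["log_type"] = log_types.get(record.get("log_type_id"))
    return joined

# Invented sample records using the column names shown on the slide.
log = [
    {"cookies": "c-1", "user_agent": "ua-1", "city_id": 1, "log_type_id": 2},
    {"cookies": "c-2", "user_agent": "ua-2", "city_id": 9, "log_type_id": 4},
]
joined_log = [enrich(r) for r in log]
```

Keeping the small dictionaries in memory and looking them up per record is roughly what a broadcast join does; whether the authors joined this way is not stated in the slides.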
  8. Stream Processing. Joined server log (DataFrame), then Parse User Agent String, giving a DataFrame. Parsed fields: ● Browser: group, manufacturer, rendering engine, version (major, minor), name ● OS: name, platform, device, manufacturer
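As a rough illustration of the user agent parsing step on slide 8, the sketch below extracts a browser name, version and OS name with a handful of hand-written rules. The rule lists and the output field names are assumptions; the slides do not say which parser was used, and a production parser relies on a much larger rule set.

```python
import re

# Order matters: Edge UAs contain "Chrome", and Chrome UAs contain "Safari".
BROWSER_RULES = [
    ("Edge", r"Edg/(\d+)\.(\d+)"),
    ("Chrome", r"Chrome/(\d+)\.(\d+)"),
    ("Firefox", r"Firefox/(\d+)\.(\d+)"),
    ("Safari", r"Version/(\d+)\.(\d+).*Safari/"),
]
# Android UAs contain "Linux" and iPhone UAs contain "Mac OS X",
# so the more specific markers are checked first.
OS_RULES = [
    ("Android", "Android"),
    ("iOS", "iPhone"),
    ("Windows", "Windows NT"),
    ("macOS", "Mac OS X"),
    ("Linux", "Linux"),
]

def parse_user_agent(ua):
    """Return a small dict of browser and OS fields; None when nothing matches."""
    result = {"browser_name": None, "version_major": None,
              "version_minor": None, "os_name": None}
    for name, pattern in BROWSER_RULES:
        match = re.search(pattern, ua)
        if match:
            result.update(browser_name=name,
                          version_major=int(match.group(1)),
                          version_minor=int(match.group(2)))
            break
    for name, marker in OS_RULES:
        if marker in ua:
            result["os_name"] = name
            break
    return result

chrome_on_windows = parse_user_agent(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
```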
  9. Stream Processing. Joined server log + user agent (DataFrame) JOIN Cassandra table (UNPIVOT, DataFrame) with columns id, bid_click_kw, site_open_kw, site_click_kw, site_lead_kw, site_search_kw, giving joined server log + user agent + previous user behavior (DataFrame)
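The UNPIVOT step on slide 9 turns one wide row per user into one row per (user, action, keyword). A minimal sketch in plain Python follows; the column names are taken from the slide, while the assumption that each cell holds a single keyword string, the output field names `action` and `keyword`, and the sample row are illustrative only.

```python
# Keyword columns listed on the slide.
KEYWORD_COLUMNS = [
    "bid_click_kw", "site_open_kw", "site_click_kw",
    "site_lead_kw", "site_search_kw",
]

def unpivot(row, columns=KEYWORD_COLUMNS):
    """Turn one wide row into long rows, skipping empty or missing cells."""
    return [
        {"id": row["id"], "action": col, "keyword": row[col]}
        for col in columns
        if row.get(col)
    ]

# Invented example row.
wide_row = {
    "id": "u1",
    "bid_click_kw": "suv",
    "site_open_kw": None,
    "site_search_kw": "crossover",
}
long_rows = unpivot(wide_row)
```

The long form makes it straightforward to join previous behavior onto the log stream by `id` and filter by action type.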
  10. Stream Processing. Joined server log + user agent + previous user behavior (DataFrame), giving joined server log + user agent + previous user behavior + target group marker
  11. Saving data: Users Dimension, Analytics Service API
  12. Saving data
  13. Visualisation: Discover, Visualize, Dashboard
  14. Tags Analyser Tool ● real-time data ● slices by any collected metric (time, geo-location, action type, make, model, user behavior …) ● apply filters on the fly, easy as a piece of cake ● combine and manage filters ● share dashboards ● add new visualisations on the fly ● serve all this stuff from the UI
  15. NRT Data Visualization
  16. Lead Prediction Using Machine Learning. Question: How to recognize users who will potentially bring profit to the provider? Input data: Logs of searches and clicks on the site, and logs from partner sites. The data will be merged and split into parts: 60% training, 20% test, 20% validation. Features: The variables used for model training are region, city, and user actions and searches on the site. Algorithm: the Deep Learning algorithm from the H2O package. Evaluation: The model will be evaluated on the number of predicted clicks + N * the number of predicted conversions. Model usage: Once trained, the model will receive data on a user and their actions on the site and will return the probability that this user will click on an ad.
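The 60/20/20 split and the evaluation formula from slide 16 can be written down directly. This is a sketch in plain Python, not the H2O workflow itself; the function names, the fixed shuffle seed and the sample numbers are assumptions, and the slide does not give a value for N.

```python
import random

def split_60_20_20(rows, seed=0):
    """Shuffle rows and split them into training, test and validation parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = len(shuffled) * 60 // 100
    n_test = len(shuffled) * 20 // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation

def evaluation_score(predicted_clicks, predicted_conversions, n):
    """Score = predicted clicks + N * predicted conversions, as on the slide."""
    return predicted_clicks + n * predicted_conversions

train, test, validation = split_60_20_20(range(100))
score = evaluation_score(predicted_clicks=10, predicted_conversions=3, n=5)
```

A larger N weights conversions more heavily than clicks, so the choice of N encodes how much more a conversion is worth to the provider.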
  17. NRT Bidding. region: LA, CA; sex: male; age: 31; stream: > > search: SUV. region: CA; tags: top, SUV, 2015; price: 90$ CPM; limit: 200$ day. region: CA, NY; tags: SUV, crossover; price: 70$ CPM; limit: 300$ day
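One possible reading of slide 17 is that a user event is matched against campaign rules by region and tags, subject to a daily budget. The sketch below follows that reading. The matching rule, the ranking by CPM, the campaign names "A" and "B" and the `spent_today` bookkeeping are all assumptions; only the regions, tags, prices and limits come from the slide.

```python
# Campaign values from the slide; the names "A" and "B" are hypothetical.
CAMPAIGNS = [
    {"name": "A", "regions": {"CA"}, "tags": {"top", "SUV", "2015"},
     "cpm": 90.0, "daily_limit": 200.0},
    {"name": "B", "regions": {"CA", "NY"}, "tags": {"SUV", "crossover"},
     "cpm": 70.0, "daily_limit": 300.0},
]

def eligible_campaigns(user, campaigns, spent_today):
    """Return campaigns matching the user's state and search, highest CPM first."""
    matches = []
    for campaign in campaigns:
        in_region = user["state"] in campaign["regions"]
        tags = {t.lower() for t in campaign["tags"]}
        tag_hit = user["search"].lower() in tags
        # CPM is the price per 1000 impressions, so one impression costs cpm / 1000.
        cost = campaign["cpm"] / 1000.0
        spent = spent_today.get(campaign["name"], 0.0)
        within_limit = spent + cost <= campaign["daily_limit"]
        if in_region and tag_hit and within_limit:
            matches.append(campaign)
    return sorted(matches, key=lambda c: c["cpm"], reverse=True)

# User profile from the slide: region LA, CA; search: SUV.
user = {"city": "LA", "state": "CA", "sex": "male", "age": 31, "search": "SUV"}
ranked = eligible_campaigns(user, CAMPAIGNS, spent_today={})
ranked_after_budget = eligible_campaigns(user, CAMPAIGNS, spent_today={"A": 200.0})
```

With no spend recorded both campaigns match and "A" ranks first on price; once "A" has used its daily limit only "B" remains.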
  18. Crawl Social Networks (events, places, posts, feeds...)
  19. Crawl Social Networks (attendees, followers, likers...)
  20. Confidential