Log Analytics with Wyng


by Tian Chu, Wyng

Learn how Wyng uses log analytics and AWS.

  1. Campaign Debrief
     User-Generated Content Stream Built on Serverless
     May 17, 2018
     Tian Chu, Director of Engineering
  2. About us
     Wyng builds technology that powers compelling digital campaigns and promotions for agencies and brands. Our culture is rooted in technology and marketing, spanning diverse disciplines and decades of experience across mar-tech, ad-tech, CX, UX, data, and core mobile and web technologies. In 2011, Wyng powered the first ever hashtag campaign in connection with a Super Bowl ad, and it continues to evolve its platform to align with shifts in consumer behavior. We believe great products are defined by intelligent architecture and a passion for innovation. Wyng is headquartered in the NoMad section of New York City.
     • Wyng is a digital marketing campaign platform
     • Enables marketers to create online trivia quizzes, photo contests, and sweepstakes by drag and drop
     • Founded in 2009
     • Headquarters in New York City
     • Customers: Nestlé, Dove, L'Oréal, Audible, TripAdvisor, and agencies such as Ogilvy
     • Big believer in serverless architecture
  3. User-Generated Content (UGC)
     • 70% of people trust images taken from “people like them” over brand-created images
     • 61% of people would be more likely to purchase through an advertisement containing user-generated content
  4. Ingest
     • Ingest user-generated content from Facebook, Twitter, and Instagram by #hashtag and @mention
     • ~500K / day on average
     • > 100K / hour at peak
     • Content from different sources needs to be transformed into ONE format
  5. Curate
     • Content appears in near real-time
     • Filter and search over thousands of UGCs with response time < 1 sec
     • Create collections of content
     • Reject inappropriate content
     • Reach out for digital rights
  6. Use
     • Build influence on social networks
     • Engage & Inspire
     • Drive businesses
  7. Analyze (Kibana)
     • Visualize content streams
     • Run analytics on marketing trends on social networks
     • Generate weekly marketing reports
     • Monitor content stream ingestion for abnormal activity
  8. Serverless Architecture
     [Architecture diagram] Components, by role:
     • Reception: Webhook, API Gateway, Lambda; PowerTrack API, ECS Fargate (containers)
     • Buffer: Kinesis Stream
     • Transform: Lambda
     • Metadata: DynamoDB
     • Persist: S3
     • Archive: Glacier
     • Index: Lambda
     • Search and analytics: Elasticsearch Service, Kibana, Content API
  9. Reception and Buffer
     [Diagram] Reception: Webhook, API Gateway, Lambda; PowerTrack API, ECS Fargate (containers). Buffer: Kinesis Stream.
     Requirements and solutions:
     • Incoming data volume is unpredictable: Lambda scales to meet demand
     • Cannot afford to miss any data: Lambda is highly available
     • Cannot afford to lose any data: Kinesis persists data for 7 days
     • PowerTrack needs long-lived connections, but not much CPU: Docker container provisioned with 0.25 vCPU on ECS Fargate
  10. Transform, Persist and Search
      [Diagram] Lambda (transform), DynamoDB (metadata), S3 (persist), Glacier (archive), Lambda (index), Elasticsearch Service, Kibana, Content API
      • DynamoDB makes Lambda stateful
      • S3 and Glacier provide affordable persistence of huge amounts of data
      • Elasticsearch makes data ready for analytics and search
  11. Pros & Cons
      Pros
      • No server management
      • Easier and quicker to scale
      • High availability out-of-box
      • Cost effective (>50% reduction for us)
      Cons
      • Locked in to AWS :)
      • Technical limitations
      • Hard to mimic AWS infrastructure on dev build
      • Enterprise readiness?
  12. Lessons Learned - Lambda
      General
      • Use a framework
      • Familiarize yourself with technical limitations
      • Load test everything
      • Dedupe is a must
      Security
      • Run everything in VPC
      • Use AWS Secrets Manager
      • Use WAF to protect your endpoints
  13. Lessons Learned - Elasticsearch
      Availability
      • Lots of memory!
      • Use dedicated masters
      Security
      • Run Elasticsearch as non-privileged user
      • Do NOT directly expose Elasticsearch to users
      • Sanity check queries
      Index
      • Define a mapping
      • Define _id for document
      • Avoid having too many dynamic fields
      • Use time-based indices for “rollover”
  14. Wish List
      Elasticsearch Service
      • Auto scaling
      • Continuous backup and point-in-time restore
      • Support more custom plugins
      • In-place upgrade
      Lambda & Kinesis Stream
      • Lambda max duration > 5 mins
      • Kinesis max retention > 7 days
  15. Thank you! (We are hiring!)
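
Slides 4, 12, and 13 call for transforming content from different sources into one format, deduping, defining a document _id, and using time-based indices. The Python sketch below shows one way those pieces could fit together. It is not Wyng's implementation: the field names in FIELD_MAP, the "ugc" index prefix, the hashing scheme, and every function name are assumptions made for illustration, and the real payload shapes are not given in the slides.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical per-source field names; the real payloads are not shown in the slides.
FIELD_MAP = {
    "twitter": {"id": "id_str", "text": "text", "created": "created"},
    "instagram": {"id": "id", "text": "caption", "created": "created"},
    "facebook": {"id": "id", "text": "message", "created": "created"},
}


def normalize(source: str, raw: dict) -> dict:
    """Map a source-specific payload onto one common format.

    Assumes the (hypothetical) 'created' field holds epoch seconds.
    """
    fields = FIELD_MAP[source]
    created = datetime.fromtimestamp(int(raw[fields["created"]]), tz=timezone.utc)
    return {
        "source": source,
        "source_id": str(raw[fields["id"]]),
        "text": raw[fields["text"]],
        "created_at": created.isoformat(),
    }


def document_id(doc: dict) -> str:
    """Derive a deterministic _id so the same item always maps to the same document."""
    key = f'{doc["source"]}:{doc["source_id"]}'
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def index_name(doc: dict, prefix: str = "ugc") -> str:
    """Build a daily, time-based index name (the prefix is an assumption)."""
    created = datetime.fromisoformat(doc["created_at"])
    return f"{prefix}-{created:%Y.%m.%d}"


def dedupe(docs: list) -> list:
    """Drop repeated items within a batch, keeping the first occurrence of each _id."""
    seen = {}
    for doc in docs:
        seen.setdefault(document_id(doc), doc)
    return list(seen.values())


if __name__ == "__main__":
    # Invented sample payloads, for illustration only.
    batch = [
        normalize("twitter", {"id_str": "1", "text": "hi #wyng", "created": 1526515200}),
        normalize("twitter", {"id_str": "1", "text": "hi #wyng", "created": 1526515200}),
        normalize("instagram", {"id": 1, "caption": "hi", "created": 1526515200}),
    ]
    for doc in dedupe(batch):
        print(index_name(doc), document_id(doc)[:12], doc["text"])
```

Deriving the _id from the source and its native ID means an item that is delivered twice maps to the same document, so a repeated index request would overwrite the existing document rather than create a duplicate.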