Automating crowdsourced video creation with Google Cloud Platform and Machine Learning APIs



Hear how Seenit combines Couchbase and Google Cloud Platform to make crowdsourced video editing quick and affordable. In this joint session, Seenit and Google will discuss Seenit’s cloud requirements, why they chose GCP, and how they deployed and manage the environment. Developers take note: you’ll also hear how Seenit uses Google Cloud Machine Learning APIs to sort clips as they’re being uploaded, based on the objects in the shot, the gender of the speakers, sentiment, and speech.
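As a rough illustration of the clip sorting described above, the sketch below groups and ranks clips using annotations stored as JSON documents. It is a minimal sketch under stated assumptions, not Seenit's implementation: the document shape and the field names (`clip_id`, `labels`, `sentiment`, `transcript`) are hypothetical, and the sample values are made up for the example. In a real pipeline the annotations would come from the ML APIs and the documents would live in Couchbase.

```python
import json
from collections import defaultdict

# Hypothetical clip documents. Field names and values are illustrative
# assumptions, not taken from the talk.
CLIPS_JSON = """
[
  {"clip_id": "clip-001", "labels": ["car", "road"], "sentiment": 0.6,
   "transcript": "what a great race"},
  {"clip_id": "clip-002", "labels": ["crowd"], "sentiment": -0.2,
   "transcript": "the queue was very long"},
  {"clip_id": "clip-003", "labels": ["car", "crowd"], "sentiment": 0.9,
   "transcript": "best day ever"}
]
"""


def group_by_label(clips):
    """Map each detected label to the ids of the clips that contain it."""
    groups = defaultdict(list)
    for clip in clips:
        for label in clip.get("labels", []):
            groups[label].append(clip["clip_id"])
    return dict(groups)


def sort_by_sentiment(clips):
    """Return clips ordered from most positive to most negative sentiment."""
    return sorted(clips, key=lambda c: c.get("sentiment", 0.0), reverse=True)


clips = json.loads(CLIPS_JSON)
by_label = group_by_label(clips)
ranked_ids = [c["clip_id"] for c in sort_by_sentiment(clips)]

print(by_label)
print(ranked_ids)
```

Keeping the annotations as plain JSON fields means the same documents could be filtered by a query language or a full-text index without a separate schema, which is the appeal of the "Talk JSON, Store JSON, Search JSON" approach mentioned in the slides.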

Published in: Software


  1. 1. Couchbase & Google Cloud Platform. Dave Starling, CTO @ Seenit; Anil Dhawan, Product Manager, Google Cloud Platform
  2. 2. About Seenit: • Founded in 2014 • Provides a new production model for internal comms, broadcasting, marketing, and fan engagement • Works with companies such as Rolls-Royce, HSBC, Red Bull F1 Racing, BT Sport, BBC, Unilever, and more • Python, CherryPy, RabbitMQ on Google Cloud Platform • Built to exploit Couchbase features • Currently running Couchbase Enterprise 4.6.1 in production
  3. 3. Our Database and Cloud Requirements: ● Global Scale ● Schema Agility ● JSON ● Performance ● Resilience ● Support ● Cost ● Security ● Aggressive Feature Development
  4. 4. Why Couchbase? ● Global Scale ● Schema Agility ● JSON ● Performance ● Resilience ● Support ● Cost ● Security ● Aggressive Feature Development
  5. 5. Why Google Cloud Platform? ● Global Scale ● Schema Agility ● JSON ● Performance ● Resilience ● Support ● Cost ● Security ● Aggressive Feature Development
  6. 6. How They Work Together: ● 8-node production cluster on GCE ● Uses Couchbase Multi-Dimensional Scaling for performance efficiency ● 4x data nodes (custom machine type, high-mem, SSD attached disk) ● 3x index/query nodes (custom machine type, high-cpu, SSD attached disk) ● 1x FTS node ● Servers are imaged for standardised, hardened deployments ● Attached disks are snapshotted nightly, with cbbackup to GCS ● DR cluster in an alternate GCE data center
  7. 7. Architecture: Seenit > Media > Analysis [diagram; labels shown: Standard Devices, HTTPS, Ingest, Task Queues, Storage, Cloud Storage, Couchbase, Full Text Search Indexer, Pipeline Workers, Transcoders, Compute Engine (autoscaling), Analytics, Vision, Speech, Natural Language, Machine Learning]
  8. 8. Scale Over Time: ● 2014: 3 nodes, with no N1QL at that point! ● 2015-2016: gradually added more data nodes, then multiple query + index nodes ● 2017: added a dedicated FTS node
  9. 9. Challenges: ● GCE: lack of internal/external IP DNS ● Couchbase: Debian 8 compatibility ● Test & development clusters ● Rolling upgrades
  10. 10. Google Machine Learning & Couchbase: ● Talk JSON, Store JSON, Search JSON ● Computer vision using Google’s ML platform, powered by TensorFlow ● Includes several ready-to-use video, image, and audio processing tools: Video, Vision, Speech, and Natural Language ● Fully integrated with Couchbase FTS and N1QL for search and intelligence extraction ● Fully managed service
  11. 11. Thank You
  12. 12. Google’s Mission: Organize the world’s information and make it universally accessible and useful. Seven cloud products with ONE BILLION users.
  13. 13. Along the way, we created the world’s best infrastructure across hardware, software, network, and operations
  14. 14. Compute Regions [map: current regions and planned regions for 2017, each with its number of zones; regions shown: Oregon, Iowa, S Carolina, N Virginia, São Paulo, London, Belgium, Frankfurt, Finland, Mumbai, Singapore, Taiwan, Tokyo, Sydney]
  15. 15. Network Backbone Google network
  16. 16. Global Network: ● World’s largest software-defined network ● Edge locations in virtually every country ● More than 100 peering locations ● Global Content Delivery Network ● Global load balancing with a single IP ● Seamless autoscale to over 1M QPS with no pre-warming
  17. 17. $29.4 Billion 3 Year Trailing CAPEX Investment
  18. 18. Machine Learning
  19. 19. Unstructured data accounts for 90% of enterprise data*. Machine Learning can help you make sense of it. (*Source: IDC)
  20. 20. Keys to Successful Machine Learning: ● Large datasets ● Good models ● Lots of computation
  21. 21. Machine Learning is made for Cloud
  22. 22. Machine Learning with Google Cloud Platform. Use Our Models: ● Fully pre-trained ● Leverage Google’s domain expertise ● No tools or expertise required. Train Your Own Models: ● Build on your own specialized domain expertise ● Use Google tools for building and training models
  23. 23. Cloud Vision: Derive insight from images with a powerful API ● Face Detection ● Logo Identification ● Label Identification ● Explicit Content Detection ● OCR ● Landmark Identification [demo]
  24. 24. Video Intelligence: Label Identification
  25. 25. Cloud Speech: Speech-to-text conversion powered by Machine Learning: “Hello”, “Bonjour”, “Hola”, “Здравствуйте”, “안녕하세요”, “こんにちは”
  26. 26. Natural Language: Derive insights from unstructured text. Entity types shown: Event, Consumer Good, Person, Organization, Place
  27. 27. Use Your Own Data: Cloud Storage, BigQuery, Cloud Datalab, Cloud Machine Learning. Develop, Model, Train, Test
  28. 28. Built on Open Source: ● Created by the Google Brain team ● Most popular ML project on GitHub: over 480 contributors, 10,000 commits in 12 months ● Multiple deployment options: Mobile, Desktop, Server, Cloud; CPU, GPU
  29. 29. Hardware Accelerated: ● NVIDIA K80 GPU available ● Unique Tensor Processing Unit (TPU): custom ASIC built and optimized for TensorFlow, used in production at Google for over 16 months
  30. 30. Snow or Clouds? Machine Learning helped reduce error rates from 11% to 3% in the critical process of correcting satellite image maps.
  31. 31. Provided evidence to Kiribati for the first prosecution of illegal fishing in PIPA. Led to a $2.2M fine (~1% of the country’s GDP).
  32. 32. Try Them Yourself language/
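
For readers who want to try the Cloud Vision features listed on slide 23, the sketch below builds the JSON body for a request to the Vision REST endpoint (`images:annotate`). It only constructs and prints the request and does not send it; authentication (an API key or OAuth token) and the HTTP call are left out. The helper name `build_annotate_request`, the short feature keys in `FEATURE_TYPES`, and the placeholder image bytes are assumptions made for this example.

```python
import base64
import json

# REST endpoint for image annotation in the Cloud Vision API (v1).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"

# Short, hypothetical names for the slide's feature list, mapped to the
# feature type strings used by the Vision API.
FEATURE_TYPES = {
    "faces": "FACE_DETECTION",
    "logos": "LOGO_DETECTION",
    "labels": "LABEL_DETECTION",
    "explicit_content": "SAFE_SEARCH_DETECTION",
    "ocr": "TEXT_DETECTION",
    "landmarks": "LANDMARK_DETECTION",
}


def build_annotate_request(image_bytes, features, max_results=10):
    """Build the JSON-serializable body for one images:annotate request."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [
                    {"type": FEATURE_TYPES[name], "maxResults": max_results}
                    for name in features
                ],
            }
        ]
    }


# Placeholder bytes stand in for a real image file read from disk.
body = build_annotate_request(b"not a real image", ["labels", "faces"])
print("POST", VISION_ENDPOINT)
print(json.dumps(body, indent=2))
```

To actually call the service, the body would be sent as JSON in an authenticated POST to the endpoint, and the label annotations in the response could then be stored alongside the clip document.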