NoSQL matters in Catchoom Recognition Service
David Arcos                     http://catchoom.com
                                   catchoom.com | @catchoom
NoSQL matters in             Recognition Service




         1) Introduction
         2) What did we need?
         3) How we built it
         4) Advantages of NoSQL
         5) Cool uses of NoSQL
         6) Limits
         7) Conclusion




David Arcos | @DZPM                  Catchoom | http://catchoom.com | @catchoom


     Hi! I'm David Arcos

     - Python/Django developer (>4yr)

     - Web backend, distributed systems,
     databases, scalability, security

     - Team leader at Catchoom

     - You can follow me at @DZPM







      Catchoom technology recognizes an
      object by searching through a large
      collection of images in a fraction of a
      second.

      Catchoom targets application
      developers and integrators.





       Our customers are leaders in Augmented Reality







      Visual Recognition:

      “Identify an object in front of the camera by comparing it
      to a huge collection of reference images”







      Examples of recognized objects:

      - CD/DVD and book covers
      - Newspapers and magazines
      - Logos and brands
      - Posters
      - Packaged goods
      - Monuments and places







       Catchoom Recognition Service:

       - Cloud-based Visual Recognition (SaaS)
       - RESTful API to integrate
       - “Add VR features to your app/platform”







       - Small team of 4 developers, doing Scrum








         2) What did we need?







      Minimum requirements:

      - a public API for end users to perform Visual
      Recognition

      - a private API for the customer to manage the
      Collections and get statistics

      - a nice website for the customer, providing the
      functionality of both APIs




      Must be flexible:

      - A customer who does Augmented Reality, and
      needs a 3D model (binary format) in the item

      - Another one who needs just the item id


      - Our data model needs to allow everything
      (structured and unstructured data)




      Must be reliable:

      - Images or data should never be lost

      - Avoid single points of failure

      - We need redundancy







      Must be very fast:

      “Layar has been using Catchoom’s Visual Search technology since the
      launch of Layar Vision, allowing users to quickly view the AR content placed
      on top of images by just pointing their camera to the image.

      We’ve benchmarked Catchoom’s technology in 2011 against 3 of their main
      competitors and found they had the best results both on speed and on
      successful matches (including lowest false positives)”

      Dirk Groten – CTO of Layar








         3) How we built it







       Technology stack:
       - Development: Python, Django, Tornado, Gevent
       - Deployed using: Supervisord, Nginx, gunicorn, Fabric
       - AWS: EC2, S3, ELB







      The Panel:

      - typical customer portal:

          - manage your Collections, run Visual Recognition

          - get usage statistics

          - and configure the payment method :)





      Mobile apps:

      - for Android, iOS

      - use the Visual Recognition API

      - the code will be published







      Data models:

      - Collection: a set of items. Has at least one token.

      - Item: has at least one Image. Has metadata.

      - Image: you want several images if the item has different
      sides, logos, flavours...

      - Token: for authenticating the requests.
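
The four models above can be sketched as plain Python classes. This is only an illustration of the stated rules (a Collection has at least one Token, an Item has at least one Image); the class layout and field names such as `image_id` and `value` are assumptions, not the service's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Image:
    """One reference picture of an item (a side, a logo, a flavour...)."""
    image_id: str


@dataclass
class Token:
    """Credential used to authenticate API requests."""
    value: str


@dataclass
class Item:
    """A recognizable object: one or more images plus optional metadata."""
    item_id: str
    images: List[Image]
    metadata: Optional[Any] = None  # free-form: JSON-like data, bytes, or nothing

    def __post_init__(self) -> None:
        if not self.images:
            raise ValueError("an Item needs at least one Image")


@dataclass
class Collection:
    """A set of items, reachable through at least one token."""
    name: str
    tokens: List[Token]
    items: List[Item] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.tokens:
            raise ValueError("a Collection needs at least one Token")
```

Leaving `metadata` untyped mirrors the flexibility requirement: one customer may attach a binary 3D model, another nothing at all.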




      Components:

      - the platform is highly modular

      - “Do one thing, and do it well”

      - they exchange JSON messages

      - optimized hardware settings





      - Frontend:
      gets the API request

      - Extractor:
      extracts the visual points

      - Collector:
      message exchange

      - Searcher:
      looks for matches



      Required NoSQL features:

      - key-value storage
      - cache
      - message lists
      - message pub/sub
      - real-time analysis

      What servers have we chosen?


      - and Filesystem:





         4) Advantages of NoSQL







      Performance:

      - Can't afford writing to disk, or querying slow databases

      - Using Redis, everything stays in memory

      - One V.R. query takes just 300 ms







      Scalability:

      - Need to scale different components, separately

      - Load balancing using Redis Lists:
         BLPOP: Remove and get
         the first element in a list,
         or block until one is available


      - But focus on the bottlenecks!
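
A minimal sketch of load balancing with a Redis list, assuming a redis-py style client object (anything exposing `rpush` and `blpop`). The function names, the queue name and the JSON payload are hypothetical. Producers push to the tail of the list, and every idle worker blocks on BLPOP at the head.

```python
import json


def enqueue(client, queue, message):
    """Producer side: append a JSON message to the tail of a Redis list."""
    client.rpush(queue, json.dumps(message))


def work_once(client, queue, handler, timeout=1):
    """Consumer side: take one message from the head of the list.

    BLPOP blocks until an element is available or the timeout (in seconds)
    expires. Each element is handed to exactly one blocked client, so
    idle workers share the load between them.
    Returns the handler result, or None if the wait timed out.
    """
    reply = client.blpop(queue, timeout=timeout)
    if reply is None:
        return None
    _key, raw = reply  # redis-py returns a (list name, value) pair
    return handler(json.loads(raw))
```

With redis-py installed, `client` could be `redis.Redis()`. Scaling a component then means starting more processes that call `work_once` in a loop on the same list.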




      Unstructured data: query

      - A query object has many optional parameters
         - each component can add/remove fields dynamically
         - the schema changes between versions

      - Can't fit in a SQL table

      - We model the query in Redis as a JSON document
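
One way to keep a schemaless query in Redis is to store it as a single JSON string, sketched below. It assumes a redis-py style client with `get` and `set`; the `query:` key prefix and the helper names are hypothetical.

```python
import json


def save_query(client, query_id, query):
    """Serialize the whole query as one JSON string under a single key."""
    client.set("query:" + query_id, json.dumps(query))


def load_query(client, query_id):
    """Return the query as a dict, or None if the key does not exist."""
    raw = client.get("query:" + query_id)
    return None if raw is None else json.loads(raw)


def update_query(client, query_id, add=None, remove=()):
    """Let a component add or drop fields without any schema migration.

    Note: this read-modify-write is not atomic; it assumes only one
    component works on a given query at a time.
    """
    query = load_query(client, query_id) or {}
    query.update(add or {})
    for name in remove:
        query.pop(name, None)
    save_query(client, query_id, query)
    return query
```

Because the value is opaque to Redis, a new optional parameter only affects the components that read it.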





      Unstructured data: metadata

      - Metadata is optional and unstructured: anything from JSON to a
      binary blob

      - Can't fit in a SQL table, and would be too slow

      - Serve the data from Redis, and use S3 as a backup

      - Warning: in the future, if we have huge metadata files,
      Redis will run out of memory. We'll need to improve this approach
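
"Serve from Redis, use S3 as a backup" could look roughly like the sketch below. The Redis side assumes a redis-py style client; `fetch_backup` and `store_backup` are hypothetical callables standing in for whatever S3 wrapper is used, since the slides do not show that code.

```python
def get_metadata(client, item_id, fetch_backup):
    """Serve metadata from Redis; on a miss, restore it from the backup.

    `fetch_backup` is a hypothetical callable (for instance a thin wrapper
    around an S3 download) that returns the raw bytes or None.
    """
    key = "metadata:" + item_id
    blob = client.get(key)
    if blob is None:
        blob = fetch_backup(item_id)
        if blob is not None:
            client.set(key, blob)  # warm Redis again for the next request
    return blob


def put_metadata(client, item_id, blob, store_backup):
    """Write to the durable backup first, then to Redis."""
    store_backup(item_id, blob)
    client.set("metadata:" + item_id, blob)
```

Writing the backup first means a crash between the two steps loses speed, not data.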



      Availability:

      - Avoid single points of failure. Replicate everything!

      - Replicating a SQL server is painful

      - Redis instances configured as Master/Slave
         - When the master dies:
            - promote a slave to be the new master
            - reconfigure the other slaves to use this new master
         - Redis Sentinel does this (beta)
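
The two manual failover steps can be sketched as follows. This assumes one redis-py style client per instance and relies on redis-py's `slaveof()` method, which promotes the instance to master when called without arguments. The function name and parameters are hypothetical, and Redis Sentinel exists precisely so that nobody has to run this by hand.

```python
def fail_over(new_master, other_replicas, host, port):
    """Manual failover, given redis-py style clients for each instance.

    `host` and `port` are the address of the replica being promoted.
    """
    # Step 1: SLAVEOF NO ONE turns the chosen replica into a master.
    new_master.slaveof()
    # Step 2: point every remaining replica at the new master.
    for replica in other_replicas:
        replica.slaveof(host, port)
```

Recent Redis versions name the same command REPLICAOF; SLAVEOF is kept as an alias.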




         5) Cool uses of NoSQL







      Do real-time calculations:

      - Usage statistics
         - total, monthly, daily, hourly
         - per image, item or collection

      - Metric monitoring for internal use
         - response times, queue size, etc

      - QoS: enforce rate limiting
         - max hits per minute
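
A "max hits per minute" limit can be built from two Redis commands, INCR and EXPIRE. The sketch below is a simple fixed-window limiter, which is an assumption: the slides do not say which algorithm is used. It expects a redis-py style client, and the key layout is hypothetical.

```python
import time


def allow_request(client, token, max_hits_per_minute, now=None):
    """Fixed-window rate limiter: one counter per token and per minute."""
    now = time.time() if now is None else now
    window = int(now // 60)
    key = "hits:%s:%d" % (token, window)
    hits = client.incr(key)  # INCR creates the key at 1 if it is missing
    if hits == 1:
        client.expire(key, 120)  # old windows clean themselves up
    return hits <= max_hits_per_minute
```

The same INCR pattern, with the hour, day or month in the key instead of the minute, gives usage statistics at each granularity.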



      Sorted Sets:

      - To create indexes and filters

      - For example, “Most recognized images” (sorted by hits)

      - Updating the Sorted Set needs no reconsolidation:
           ZADD Add one or more members to a sorted set,
           or update its score if it already exists
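
A "most recognized images" ranking might be kept like this. The sketch uses ZINCRBY rather than the ZADD quoted above, because incrementing a hit counter is the more direct fit; that choice, the key names and the helper names are assumptions. The client is assumed to follow the redis-py 3.0 or later signatures `zincrby(name, amount, value)` and `zrevrange(name, start, end, withscores=...)`.

```python
def record_hit(client, collection_id, image_id):
    """Add one hit to an image; ZINCRBY creates the member if needed."""
    return client.zincrby("hits:" + collection_id, 1, image_id)


def most_recognized(client, collection_id, count=10):
    """Return the top `count` images as (member, score) pairs, best first."""
    return client.zrevrange("hits:" + collection_id, 0, count - 1,
                            withscores=True)
```

Redis keeps the set ordered on every update, so reading the top entries never requires a separate sort or batch job.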







      Cache:

      - Redis can replace memcached as the cache (through a Redis
      cache backend; it does not speak the memcached protocol)

      - Cache everything:
         - Sessions, metadata, etc

      - ...although the website is internal: no bottleneck here
          - Better focus on optimizing other stuff!





      Volatile data:

      - Redis can set an expiration time for a value

      - Very easy for:
         - implementing timeouts
         - removing old queries
         - adding temporary capping
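
Expiration makes timeouts nearly free to implement, as in this sketch. It assumes a redis-py style client whose `set` accepts `ex=` (a time to live in seconds); the key prefix, function names and default TTL are hypothetical.

```python
import json


def store_temporary_query(client, query_id, query, ttl_seconds=300):
    """Store a query that Redis deletes by itself after `ttl_seconds`."""
    client.set("query:" + query_id, json.dumps(query), ex=ttl_seconds)


def is_timed_out(client, query_id):
    """A query whose key has expired is treated as timed out."""
    return client.get("query:" + query_id) is None
```

No cleanup job is needed: old queries disappear when their time to live runs out.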







      Messages:

      - Redis implements pub/sub and lists.

          - Publish/Subscribe to a channel
             - all components get the message
             - use it for monitoring

          - List: push/pop messages
             - only one component gets the message
             - use the blocking versions for load balancing
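
The publish/subscribe half, used here for monitoring, can be sketched as below. It assumes a redis-py style client (`publish`, `pubsub()`, then `subscribe` and `listen()` on the returned object); the channel name and event fields are hypothetical.

```python
import json


def publish_event(client, channel, event):
    """Broadcast a JSON event; every current subscriber receives a copy."""
    return client.publish(channel, json.dumps(event))


def watch(client, channel, handler):
    """Subscribe to a channel and pass each decoded event to `handler`."""
    pubsub = client.pubsub()
    pubsub.subscribe(channel)
    for message in pubsub.listen():
        # listen() also yields subscription confirmations; skip those.
        if message["type"] == "message":
            handler(json.loads(message["data"]))
```

Against a real server `watch` loops until the connection closes, so it belongs in its own process or thread. Unlike a list, a channel stores nothing: a subscriber that is offline misses the message.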




         6) Limits







      Django apps compatibility:

      - we use Django and several contrib and external apps.
         - (“Standing on the shoulders of giants”)

      - but no support for NoSQL in the Django ORM

      - dropping SQL is not an option!

      - we use MySQL, with South for schema migrations
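
Keeping SQL for the ORM while adding Redis next to it can be expressed in a Django settings fragment like the one below. This is an illustration in present-day Django, not the project's real configuration: database name, hosts and ports are made up, and the built-in Redis cache backend shown only exists in Django 4.0 and later.

```python
# settings.py fragment (hypothetical names and addresses throughout)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # the ORM keeps using SQL
        "NAME": "panel",
        "HOST": "127.0.0.1",
    }
}

CACHES = {
    "default": {
        # Built-in backend in Django 4.0 and later; older projects used a
        # third-party Redis cache package instead.
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379",
    }
}

# Keep sessions in the cache rather than in a SQL table.
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
```

Contrib and third-party apps keep working against MySQL, while the application code talks to Redis directly for everything else.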





         7) Conclusion







      Summary:

      - We use a combination of SQL and NoSQL

      - Using NoSQL was necessary to meet the requirements

      - There are a lot of different uses for NoSQL







      Recommendations:

      - There is no silver bullet

      - Use the best tool for each task

      - But avoid unneeded complexity!

      - Try Redis. Don't do a migration, just add it to your stack






      Thanks for attending!

      - Our beta will be ready soon.
      Get a free trial at http://catchoom.com

      - Contact me at
      david.arcos@catchoom.com

      - Questions?

David Arcos | @DZPM                      Catchoom | http://catchoom.com | @catchoom
catchoom.com | @catchoom
NoSQL matters in                Recognition Service




      Thanks for attending!

      - Our beta will be ready soon.
      Get a free trial at http://catchoom.com

      - Contact me at
      david.arcos@catchoom.com

      - Questions?

David Arcos | @DZPM                      Catchoom | http://catchoom.com | @catchoom

More Related Content

Viewers also liked

Overview of Testing Talks at Pycon
Overview of Testing Talks at PyconOverview of Testing Talks at Pycon
Overview of Testing Talks at PyconJacqueline Kazil
 
Super Advanced Python –act1
Super Advanced Python –act1Super Advanced Python –act1
Super Advanced Python –act1Ke Wei Louis
 
라이트닝 토크 2015 파이콘
라이트닝 토크 2015 파이콘라이트닝 토크 2015 파이콘
라이트닝 토크 2015 파이콘Jiho Lee
 
2016 py con2016_lightingtalk_php to python
2016 py con2016_lightingtalk_php to python2016 py con2016_lightingtalk_php to python
2016 py con2016_lightingtalk_php to pythonJiho Lee
 
Authentication & Authorization in ASPdotNet MVC
Authentication & Authorization in ASPdotNet MVCAuthentication & Authorization in ASPdotNet MVC
Authentication & Authorization in ASPdotNet MVCMindfire Solutions
 
2007 - 应用系统脆弱性概论
2007 - 应用系统脆弱性概论 2007 - 应用系统脆弱性概论
2007 - 应用系统脆弱性概论 Na Lee
 
Django e il Rap Elia Contini
Django e il Rap Elia ContiniDjango e il Rap Elia Contini
Django e il Rap Elia ContiniWEBdeBS
 
The Django Book Chapter 9 - Django Workshop - Taipei.py
The Django Book Chapter 9 - Django Workshop - Taipei.pyThe Django Book Chapter 9 - Django Workshop - Taipei.py
The Django Book Chapter 9 - Django Workshop - Taipei.pyTzu-ping Chung
 

Viewers also liked (19)

Overview of Testing Talks at Pycon
Overview of Testing Talks at PyconOverview of Testing Talks at Pycon
Overview of Testing Talks at Pycon
 
Super Advanced Python –act1
Super Advanced Python –act1Super Advanced Python –act1
Super Advanced Python –act1
 
Html5 History-API
Html5 History-APIHtml5 History-API
Html5 History-API
 
Django-Queryset
Django-QuerysetDjango-Queryset
Django-Queryset
 
2 × 3 = 6
2 × 3 = 62 × 3 = 6
2 × 3 = 6
 
Bottle - Python Web Microframework
Bottle - Python Web MicroframeworkBottle - Python Web Microframework
Bottle - Python Web Microframework
 
PythonBrasil[8] closing
PythonBrasil[8] closingPythonBrasil[8] closing
PythonBrasil[8] closing
 
Website optimization
Website optimizationWebsite optimization
Website optimization
 
라이트닝 토크 2015 파이콘
라이트닝 토크 2015 파이콘라이트닝 토크 2015 파이콘
라이트닝 토크 2015 파이콘
 
2016 py con2016_lightingtalk_php to python
2016 py con2016_lightingtalk_php to python2016 py con2016_lightingtalk_php to python
2016 py con2016_lightingtalk_php to python
 
Authentication & Authorization in ASPdotNet MVC
Authentication & Authorization in ASPdotNet MVCAuthentication & Authorization in ASPdotNet MVC
Authentication & Authorization in ASPdotNet MVC
 
User-centered open source
User-centered open sourceUser-centered open source
User-centered open source
 
PyClab.__init__(self)
PyClab.__init__(self)PyClab.__init__(self)
PyClab.__init__(self)
 
2007 - 应用系统脆弱性概论
2007 - 应用系统脆弱性概论 2007 - 应用系统脆弱性概论
2007 - 应用系统脆弱性概论
 
Load testing
Load testingLoad testing
Load testing
 
Vim for Mere Mortals
Vim for Mere MortalsVim for Mere Mortals
Vim for Mere Mortals
 
Django e il Rap Elia Contini
Django e il Rap Elia ContiniDjango e il Rap Elia Contini
Django e il Rap Elia Contini
 
The Django Book Chapter 9 - Django Workshop - Taipei.py
The Django Book Chapter 9 - Django Workshop - Taipei.pyThe Django Book Chapter 9 - Django Workshop - Taipei.py
The Django Book Chapter 9 - Django Workshop - Taipei.py
 
EuroDjangoCon 2009 - Ein Rückblick
EuroDjangoCon 2009 - Ein RückblickEuroDjangoCon 2009 - Ein Rückblick
EuroDjangoCon 2009 - Ein Rückblick
 

More from David Arcos

Definiendo Estrategia: Mapas de Wardley
Definiendo Estrategia: Mapas de WardleyDefiniendo Estrategia: Mapas de Wardley
Definiendo Estrategia: Mapas de WardleyDavid Arcos
 
Does Django scale? An introduction to Scalability
Does Django scale? An introduction to ScalabilityDoes Django scale? An introduction to Scalability
Does Django scale? An introduction to ScalabilityDavid Arcos
 
Security for Data Scientists
Security for Data ScientistsSecurity for Data Scientists
Security for Data ScientistsDavid Arcos
 
Efficient Django
Efficient DjangoEfficient Django
Efficient DjangoDavid Arcos
 
The Zen of Python
The Zen of PythonThe Zen of Python
The Zen of PythonDavid Arcos
 
12 tips on Django Best Practices
12 tips on Django Best Practices12 tips on Django Best Practices
12 tips on Django Best PracticesDavid Arcos
 

More from David Arcos (6)

Definiendo Estrategia: Mapas de Wardley
Definiendo Estrategia: Mapas de WardleyDefiniendo Estrategia: Mapas de Wardley
Definiendo Estrategia: Mapas de Wardley
 
Does Django scale? An introduction to Scalability
Does Django scale? An introduction to ScalabilityDoes Django scale? An introduction to Scalability
Does Django scale? An introduction to Scalability
 
Security for Data Scientists
Security for Data ScientistsSecurity for Data Scientists
Security for Data Scientists
 
Efficient Django
Efficient DjangoEfficient Django
Efficient Django
 
The Zen of Python
The Zen of PythonThe Zen of Python
The Zen of Python
 
12 tips on Django Best Practices
12 tips on Django Best Practices12 tips on Django Best Practices
12 tips on Django Best Practices
 

Recently uploaded

Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 

Recently uploaded (20)

Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 

NoSQL matters in Catchoom Recognition Service

  • 1. NoSQL matters in Catchoom Recognition Service David Arcos http://catchoom.com catchoom.com | @catchoom
  • 2. NoSQL matters in Recognition Service 1) Introduction 2) What did we need? 3) How we build it 4) Advantages of NoSQL 5) Cool uses of NoSQL 6) Limits 7) Conclusion David Arcos | @DZPM Catchoom | http://catchoom.com | @catchoom
  • 3. NoSQL matters in Recognition Service Hi! I'm David Arcos - Python/Django developer (>4yr) - Web backend, distributed systems, databases, scalability, security - Team leader at Catchoom - You can follow me at @DZPM David Arcos | @DZPM Catchoom | http://catchoom.com | @catchoom
  • 4. NoSQL matters in Recognition Service Catchoom technology recognizes an object by searching through a large collection of images in a fraction of a second. Catchoom targets application developers and integrators. David Arcos | @DZPM Catchoom | http://catchoom.com | @catchoom
  • 5. NoSQL matters in Recognition Service Our customers are leaders in Augmented Reality David Arcos | @DZPM Catchoom | http://catchoom.com | @catchoom
  • 6. NoSQL matters in Recognition Service Visual Recognition: “Identify an object in front of the camera by comparing it to a huge collection of reference images” David Arcos | @DZPM Catchoom | http://catchoom.com | @catchoom
  • 7. NoSQL matters in Recognition Service Examples of recognized objects: - CD/DVD and book covers - Newspapers and magazines - Logos and brands - Posters - Packaged goods - Monuments and places David Arcos | @DZPM Catchoom | http://catchoom.com | @catchoom
  • 8. NoSQL matters in Recognition Service Catchoom Recognition Service: - Cloud-based Visual Recognition (SaaS) - RESTful API to integrate - “Add VR features to your app/platform” David Arcos | @DZPM Catchoom | http://catchoom.com | @catchoom
  • 9. NoSQL matters in Recognition Service - Small team of 4 developers, doing SCRUM David Arcos | @DZPM Catchoom | http://catchoom.com | @catchoom
  • 10. NoSQL matters in Recognition Service 1) Introduction 2) What did we need? 3) How we build it 4) Advantages of NoSQL 5) Cool uses of NoSQL 6) Limits 7) Conclusion David Arcos | @DZPM Catchoom | http://catchoom.com | @catchoom
  • 11. Minimum requirements:
      - a public API for the end users to perform Visual Recognition
      - a private API for the customer to manage the Collections and get statistics
      - a nice website for the customer, providing the functionality of both APIs
  • 12. Must be flexible:
      - One customer does Augmented Reality and needs a 3D model (binary format) in the item
      - Another one needs just the item id
      - Our data model needs to allow everything (structured and unstructured data)
  • 13. Must be reliable:
      - Images or data should never be lost
      - Avoid single points of failure
      - We need redundancy
  • 14. Must be very fast:
      “Layar has been using Catchoom’s Visual Search technology since the launch of Layar Vision, allowing users to quickly view the AR content placed on top of images by just pointing their camera to the image. We’ve benchmarked Catchoom’s technology in 2011 against 3 of their main competitors and found they had the best results both on speed and on successful matches (including lowest false positives)”
      Dirk Groten – CTO of Layar
  • 15. Agenda, section 3) How we built it
  • 16. Technology stack:
      - Development: Python, Django, Tornado, Gevent
      - Deployed using: Supervisord, Nginx, gunicorn, Fabric
      - AWS: EC2, S3, ELB
  • 17. The Panel:
      - typical customer portal:
          - manage your Collections, run Visual Recognition
          - get usage statistics
          - and configure the payment method :)
  • 20. Mobile apps:
      - for Android, iOS
      - use the Visual Recognition API
      - the code will be published
  • 21. Data models:
      - Collection: a set of items. Has at least one token.
      - Item: has at least one Image. Has metadata.
      - Image: you want several images if the item has different sides, logos, flavours...
      - Token: for authenticating the requests.
  • 22. Components:
      - the platform is highly modular
      - “Do one thing, and do it well”
      - they pass JSON messages
      - optimized hardware settings
  • 23. Components (continued):
      - Frontend: gets the API request
      - Extractor: extracts the visual points
      - Collector: message exchange
      - Searcher: looks for matches
  • 24. Required NoSQL features:
      - key-value storage
      - cache
      - message lists
      - message pub/sub
      - real-time analysis
      What servers have we chosen?
  • 25. Required NoSQL features:
      - key-value storage
      - cache
      - message lists
      - message pub/sub
      - real-time analysis
  • 26. Required NoSQL features:
      - key-value storage
      - cache
      - message lists
      - message pub/sub
      - real-time analysis
      - and Filesystem:
  • 27. Agenda, section 4) Advantages of NoSQL
  • 28. Performance:
      - Can't afford writing to disk, or querying slow databases
      - Using Redis, everything stays in memory
      - One Visual Recognition query takes just 300 ms
  • 29. Scalability:
      - Need to scale different components separately
      - Load balancing using Redis Lists:
          BLPOP: remove and get the first element in a list, or block until one is available
      - But focus on the bottlenecks!
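The BLPOP-based load balancing on slide 29 can be sketched as follows. This is only an illustration under assumptions, not the Catchoom code: `client` is assumed to be a redis-py `redis.Redis` instance (or anything exposing the same `rpush`/`blpop` methods), and the list name `queries:pending` and both function names are made up for the example.

```python
import json

QUEUE = "queries:pending"  # hypothetical list name, not from the slides


def enqueue(client, query):
    """Producer side: append a JSON-encoded job to the shared list."""
    client.rpush(QUEUE, json.dumps(query))


def work_once(client, handle, timeout=5):
    """Worker side: block until a job arrives, then process it.

    BLPOP delivers each list element to exactly one waiting client,
    so starting more workers spreads the load without a coordinator.
    Returns False if the timeout expired with nothing to do.
    """
    popped = client.blpop(QUEUE, timeout=timeout)
    if popped is None:
        return False
    _key, raw = popped  # blpop returns a (list name, value) pair
    handle(json.loads(raw))
    return True
```

Each worker process would call `work_once` in a loop; scaling a component then means starting more copies of that loop.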
  • 30. Unstructured data: query
      - A query object has many optional parameters
          - each component can add/remove fields dynamically
          - the schema changes between versions
      - Can't fit in a SQL table
      - We model the query in Redis as JSON
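A rough sketch of what "the query as JSON in Redis" from slide 30 might look like. The key pattern `query:<id>`, the field names, and the helper functions are assumptions for illustration; `client` is assumed to be a redis-py `redis.Redis` instance.

```python
import json


def save_query(client, query_id, query):
    """Store the whole query document under one key."""
    client.set(f"query:{query_id}", json.dumps(query))


def load_query(client, query_id):
    """Return the query as a dict, or None if the key does not exist."""
    raw = client.get(f"query:{query_id}")
    return None if raw is None else json.loads(raw)


def annotate_query(client, query_id, add=None, remove=()):
    """Let a component add or drop fields without any schema change.

    Note: this read-modify-write is not atomic; concurrent writers
    would need WATCH/MULTI or a similar guard.
    """
    query = load_query(client, query_id)
    if query is None:
        return None
    query.update(add or {})
    for field in remove:
        query.pop(field, None)
    save_query(client, query_id, query)
    return query
```

Because the document is schemaless, a new version of a component can start writing an extra field without touching the others.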
  • 31. Unstructured data: metadata
      - Metadata is optional and unstructured; it can be anything from JSON to a binary blob
      - Can't fit in a SQL table, and would be too slow
      - Serve the data from Redis, and use S3 as a backup
      - Warning: in the future, if we have huge metadata files, Redis will run out of memory. We'll improve this approach
  • 32. Availability:
      - Avoid single points of failure. Replicate everything!
      - Replicating a SQL server is painful
      - Redis instances configured as Master/Slave
      - When the master dies:
          - promote a slave to be the new master
          - reconfigure the other slaves to use this new master
      - Redis Sentinel does this (beta)
  • 33. Agenda, section 5) Cool uses of NoSQL
  • 34. Do real-time calculations:
      - Usage statistics
          - total, monthly, daily, hourly
          - per image, item or collection
      - Metric monitoring for internal use
          - response times, queue size, etc.
      - QoS: enforce rate limiting
          - max hits per minute
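The "max hits per minute" rate limit on slide 34 could be built on Redis counters roughly like this. It is a fixed-window counter, chosen here for brevity; the slides do not say which scheme Catchoom uses. The key pattern `rate:<token>:<minute>` and the function name are hypothetical, and `client` is assumed to be a redis-py `redis.Redis` instance.

```python
import time


def allow_request(client, token, max_hits_per_minute, now=None):
    """Count one hit for this token; return True if it is within the limit.

    Each minute gets its own counter key. INCR is atomic, so several
    frontends can share the same counters safely.
    """
    now = time.time() if now is None else now
    window = int(now // 60)  # index of the current minute
    key = f"rate:{token}:{window}"
    hits = client.incr(key)
    if hits == 1:
        # First hit in this window: let Redis drop the key later.
        client.expire(key, 120)
    return hits <= max_hits_per_minute
```

The same INCR-per-period pattern, with hourly or daily keys, would also cover the usage statistics listed on the slide.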
  • 35. Sorted Sets:
      - To create indexes and filters
      - For example, “Most recognized images” (sorted by hits)
      - Updating the Sorted Set, no need to reconsolidate:
          ZADD: add one or more members to a sorted set, or update its score if it already exists
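A possible shape for the “Most recognized images” index from slide 35. The slide mentions ZADD; this sketch uses ZINCRBY instead, which increments a member's score in place and so suits a hit counter. The key pattern `hits:<collection>` and the function names are assumptions, and `client` is assumed to be a redis-py `redis.Redis` instance.

```python
def record_hit(client, collection_id, image_id):
    """Add one hit to an image; the sorted set stays ordered by score."""
    return client.zincrby(f"hits:{collection_id}", 1, image_id)


def most_recognized(client, collection_id, count=10):
    """Return up to `count` (image_id, hits) pairs, highest score first."""
    return client.zrevrange(
        f"hits:{collection_id}", 0, count - 1, withscores=True
    )
```

Since Redis keeps the set sorted on every update, reading the ranking is a single range query with no separate consolidation step.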
  • 36. Cache:
      - Redis can take the place of memcached as a cache
      - Cache everything:
          - Sessions, metadata, etc.
      - ...although the website is internal: no bottleneck here
      - Better to focus on optimizing other stuff!
  • 37. Volatile data:
      - Redis can set an expiration time for a value
      - Very easy for:
          - implementing timeouts
          - removing old queries
          - adding temporary capping
  • 38. Messages:
      - Redis implements pub/sub and lists.
      - Publish/Subscribe to a channel
          - all components get the message
          - use it for monitoring
      - List: push/pop messages
          - only one component gets the message
          - use the blocking versions for load balancing
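To illustrate the pub/sub half of slide 38 (every subscriber sees every message, which suits monitoring), here is a minimal sketch. The channel name `monitoring`, the message fields, and the function names are invented for the example; `client` is assumed to be a redis-py `redis.Redis` instance.

```python
import json

CHANNEL = "monitoring"  # hypothetical channel name


def announce(client, component, event):
    """Broadcast an event; returns how many subscribers received it."""
    payload = json.dumps({"component": component, "event": event})
    return client.publish(CHANNEL, payload)


def listen(client, on_event, max_messages):
    """Subscribe and pass decoded events to `on_event`.

    Stops after `max_messages` events so the sketch terminates;
    a real monitor would normally loop forever.
    """
    pubsub = client.pubsub()
    pubsub.subscribe(CHANNEL)
    handled = 0
    try:
        for message in pubsub.listen():
            if message["type"] != "message":
                continue  # skip subscribe confirmations
            on_event(json.loads(message["data"]))
            handled += 1
            if handled >= max_messages:
                break
    finally:
        pubsub.close()
    return handled
```

Unlike a list, a channel does not queue anything: a subscriber only sees messages published while it is connected.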
  • 39. Agenda, section 6) Limits
  • 40. Django apps compatibility:
      - we use Django and several contrib and external apps
      - (“Standing on the shoulders of giants”)
      - but no support for NoSQL in Django ORM
      - dropping SQL is not an option!
      - we use MySQL, with South for migrations
  • 41. Agenda, section 7) Conclusion
  • 42. Summary:
      - We use a combination of SQL and NoSQL
      - Using NoSQL was necessary to meet the requirements
      - There are a lot of different uses for NoSQL
  • 43. Recommendations:
      - There is no silver bullet
      - Use the best tool for each task
      - But avoid unneeded complexity!
      - Try Redis. Don't do a migration, just add it to your stack
  • 44. Thanks for attending!
      - Our beta will be ready soon. Get a free trial at http://catchoom.com
      - Contact me at david.arcos@catchoom.com
      - Questions?

Editor's Notes

  1. Looks easy?
  2. (timestamps, the image index, debug info...)
  3. Efficiency. Totals, per month, per day, per image, per item, per collection. Response times, queue size. Redis can take the place of memcached as a cache. Avoid hitting the db.
  4. Efficiency. No need to consolidate.