Open-Source Analytics on MongoDB, with Schema
By OCTO & The Refiners
Pierre-Alain Jachiet - Aurélien Gervasi
Pierre-Alain Jachiet, Aurélien Gervasi
DATA SCIENTIST
Data strategist, applied mathematician
Analysts, with developer skills
DATA PROCESSOR
Data strategist, applied mathematician
Analysts, with developer skills
“The major activity in the data science process is identifying, accessing and preparing data for analysis”
From MongoDB data … to Superset Colors
OCTO TECHNOLOGY > THERE IS A BETTER WAY
So, what's the point of MongoDB?>
MongoDB - The Leading NoSQL Database
[Chart: MongoDB compared with Cassandra, Redis and HBase]
MongoDB - A NoSQL database in the big leagues of RDBMS
[Chart: popularity score by db-engines.com, 2013 to 2017]
https://db-engines.com/en/ranking
Why MongoDB?
Semi-structured data? Performance? Scalability? Yes!
And more generally because it is natural for developers:
“a pleasure to use from the developer perspective”
“MongoDB is fast to get started”
Developers speak JSON …
… the modern data exchange format …
[Chart: XML vs JSON, 2008 to 2017, vertical scale up to 100; annotation: (= document with schema)]
… and MongoDB eats JSON
MongoDB, a common technology to store data
So far, so good>
So far, so good.
But time goes on and data comes in.
And, one day, someone has a dream… AI
Please! An analyst for this data!
Hey!
But NoSQL / JSON data is not natural for analysts
Analysts use SQL
MongoDB: aggregation framework
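To make the contrast concrete, here is a minimal sketch of the same question ("which cuisines are most common?") written both ways. The table, collection and field names (`restaurants`, `cuisine`) are illustrative assumptions borrowed from the demo dataset, and the helper `stage_names` is hypothetical. The pipeline is only built as a Python structure; with PyMongo it would typically be passed to `collection.aggregate(pipeline)`.

```python
# The analyst's version: plain SQL.
SQL_QUERY = """
SELECT cuisine, COUNT(*) AS n
FROM restaurants
GROUP BY cuisine
ORDER BY n DESC
LIMIT 5;
"""

# The MongoDB version: an aggregation pipeline, i.e. an ordered list of stages.
PIPELINE = [
    {"$group": {"_id": "$cuisine", "n": {"$sum": 1}}},  # GROUP BY + COUNT(*)
    {"$sort": {"n": -1}},                               # ORDER BY n DESC
    {"$limit": 5},                                      # LIMIT 5
]


def stage_names(pipeline):
    """Return the operator of each stage, in order (illustrative helper)."""
    return [next(iter(stage)) for stage in pipeline]


print(stage_names(PIPELINE))
```

Both forms express the same computation, but the second one is a nested JSON-like structure rather than a declarative sentence, which is part of why it feels foreign to SQL users.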
Analyst land: analysts work with tables … and relations
Developer land: developers like JSON … and nesting
Relational database
Developer land: developers have a data model in the code
Analyst land: analysts work with a data schema
[Diagram labels: Code, Model layer, Application = API to data, Map + Contract, Data schema]
MongoDB
Developer land: developers have a data model in the code
Analyst land: analysts work with a data schema
> But MongoDB is schema-less
[Diagram labels: Code, Model layer, Application = API to data, Map + Contract, Data schema]
The usual reaction…
Hack a pipeline to flatten the MongoDB data
[Diagram: MongoDB → PyMongo + scripts / Python notebooks → CSV file → Excel, Access, SAS]
Difficulties
☉ Hard job for the analyst
☉ Batch / no real time
☉ Not robust to changes
=> Difficult to industrialize
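A minimal sketch of such a hand-rolled pipeline, assuming the documents have already been fetched (for instance with PyMongo's `collection.find()`); here they are plain Python dicts. The function names `flatten` and `to_csv` and the sample documents are hypothetical. It illustrates the difficulties listed above: arrays are dumped as strings, and any new field silently changes the CSV layout.

```python
import csv
import io
import json


def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted column names; dump lists as JSON strings."""
    row = {}
    for key, value in doc.items():
        name = prefix + key
        if isinstance(value, dict):
            row.update(flatten(value, name + "."))
        elif isinstance(value, list):
            row[name] = json.dumps(value)  # lossy: the analyst only gets a string
        else:
            row[name] = value
    return row


def to_csv(docs):
    """Write flattened documents to CSV text, one column per field ever seen."""
    rows = [flatten(d) for d in docs]
    columns = sorted({col for row in rows for col in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)  # missing fields are written as empty cells
    return buffer.getvalue()


docs = [
    {"_id": "1", "name": "A", "address": {"zip": "10001"}, "grades": ["A", "B"]},
    {"_id": "2", "name": "B", "opened": 2017},  # a new field appears
]
print(to_csv(docs))
```

Every run is a full batch export, and the set of columns depends on whichever documents happened to be read, which is why this approach is hard to industrialize.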
MongoDB enterprise solution>
Mongo BI Connector
Developed for integration with SQL-based BI tools
An SQL compatibility layer to MongoDB
[Diagram: Tableau ↔ MySQL wire protocol (data as tables) ↔ Mongo SQLD (SQL translator, post-processor) ↔ MongoDB (data as JSON); Mongo DRDL* provides the data model]
* DRDL = Document-Relational Definition Language
Mongo BI Connector - Pros & Cons
Pros
☉ Official & supported
☉ Install it and go
Cons
☉ Commercial → MongoDB Enterprise license
☉ Closed-source → black box
☉ Limited performance?
☉ Mandatory use of the SQL wire protocol
An open-source solution?>
Open-source bricks put together!
Streaming data from MongoDB to PostgreSQL
[Diagram: MongoDB → Mongo Connector → Doc-manager (PostgreSQL) → PostgreSQL, with Pymongo-Schema alongside]
Mongo Connector: Connect them all!
Synchronize a MongoDB database with another database
☉ MongoDB
☉ Solr
☉ Elasticsearch
Developed by MongoDB Labs
Python 2.6, 2.7, 3.3+
MongoDB 2.4, 2.6, 3.0, 3.2, and 3.4
Apache License 2.0
https://github.com/mongodb-labs/mongo-connector
Mongo Connector: Connect them all!
[Diagram: changes in the DB write new events (differential) to the oplog file on the primary; replication carries them to the secondaries; Mongo Connector reads the oplog and propagates the changes to the other DB]
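The idea of replaying a change log onto another store can be sketched as follows. This is not Mongo Connector's code: the entries are a simplified form of oplog documents (`op` is `i`, `u` or `d`; `o` and `o2` carry the payload), only `$set` updates are handled, and the target is a plain dict standing in for the other database. The function names and sample data are assumptions.

```python
def apply_oplog_entry(target, entry):
    """Apply one simplified oplog entry to `target`, a dict of {_id: document}."""
    op = entry["op"]
    if op == "i":    # insert: "o" holds the new document
        doc = entry["o"]
        target[doc["_id"]] = dict(doc)
    elif op == "u":  # update: "o2" identifies the document, "o" describes the change
        doc_id = entry["o2"]["_id"]
        doc = target.setdefault(doc_id, {"_id": doc_id})
        doc.update(entry["o"].get("$set", {}))  # assumption: $set-style updates only
    elif op == "d":  # delete: "o" holds the _id of the removed document
        target.pop(entry["o"]["_id"], None)


def replay(entries):
    """Replay entries in order and return the resulting target state."""
    target = {}
    for entry in entries:
        apply_oplog_entry(target, entry)
    return target


oplog = [
    {"op": "i", "ns": "demo.restaurants",
     "o": {"_id": 1, "name": "Chez Py", "cuisine": "French"}},
    {"op": "i", "ns": "demo.restaurants",
     "o": {"_id": 2, "name": "Fu Bar", "cuisine": "Bar"}},
    {"op": "u", "ns": "demo.restaurants",
     "o2": {"_id": 1}, "o": {"$set": {"cuisine": "Bistro"}}},
    {"op": "d", "ns": "demo.restaurants", "o": {"_id": 2}},
]
print(replay(oplog))
```

Because only the differences are propagated, the target stays in sync continuously instead of being rebuilt by a batch export.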
Doc-manager: Do you speak PostgreSQL?
☉ Translate a modification request from Mongo Connector to the target database
☉ Speak the target database language
Developed by Hopwork
Python 2.7, 3.4+
PostgreSQL 9.5
Apache License 2.0
https://github.com/Hopwork/mongo-connector-postgresql
Doc-manager: Do you speak PostgreSQL?
MongoDB world:
{
  _id: "12",
  f1: "fu",
  f2: true,
  f3: 42,
  f4: {
    sf1: "pyparis",
    sf2: 2017
  },
  f5: [
    "fu",
    "bar",
    "fubar"
  ]
}
SQL world:
_id | f1   | f2   | f3
12  | 'fu' | true | 42

f4.sf1    | f4.sf2
'pyparis' | 2017

_id | value   | id_parent
1   | 'fu'    | 12
2   | 'bar'   | 12
3   | 'fubar' | 12
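The transformation shown on this slide can be sketched in a few lines. This is an illustration, not the doc-manager's implementation: the function name `to_rows` is hypothetical, nested objects are assumed to become dotted columns of the parent row, arrays are assumed to contain only scalars, and the child table name (parent name, double underscore, field name) is modeled on the `restaurants__address__coord` naming that appears later in the deck.

```python
def to_rows(doc, table="main"):
    """Split one document into {table_name: [rows]}: a parent row plus child rows."""
    tables = {table: [{}]}
    parent = tables[table][0]

    def visit(value, prefix):
        for key, item in value.items():
            name = prefix + key
            if isinstance(item, dict):
                visit(item, name + ".")          # nested object: dotted columns
            elif isinstance(item, list):
                child = table + "__" + name.replace(".", "__")
                tables[child] = [                # array: one child row per element
                    {"_id": i, "value": element, "id_parent": doc["_id"]}
                    for i, element in enumerate(item, start=1)
                ]
            else:
                parent[name] = item              # scalar: plain column

    visit(doc, "")
    return tables


doc = {
    "_id": "12", "f1": "fu", "f2": True, "f3": 42,
    "f4": {"sf1": "pyparis", "sf2": 2017},
    "f5": ["fu", "bar", "fubar"],
}
for name, rows in to_rows(doc).items():
    print(name, rows)
```

The `id_parent` column plays the role of a foreign key, so an analyst can join the child table back to its parent row.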
Pymongo-Schema: A mapping to rule them all
☉ Scan the entire database to define its data model schema
☉ Generate a mapping file flattening the MongoDB schema into an SQL-compatible schema
“Homemade”
Python 2.7
Apache License 2.0
https://github.com/pajachiet/pymongo-schema
Demo>
☉ MongoDB example dataset: restaurants in New York
> Address & coordinates
> Cuisine type
> List of grades
☉ Nested data structure
EXTRACT
Read the entire database to extract its data model schema
Returns:
☉ Field name and field nesting
☉ Field completion (frequency and ratio)
☉ Field type
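As a rough illustration of what such an extraction computes, the sketch below walks a list of documents and records, for each field path, the observed types, how many documents contain it, and the corresponding ratio. It is not pymongo-schema's implementation or output format; the function name `extract_schema` and the sample documents are assumptions.

```python
from collections import defaultdict


def extract_schema(docs):
    """Return {field_path: {"types": {type_name: n}, "count": n, "ratio": r}}."""
    stats = defaultdict(lambda: {"types": defaultdict(int), "count": 0})

    def visit(value, prefix):
        for key, item in value.items():
            path = prefix + key
            stats[path]["count"] += 1
            stats[path]["types"][type(item).__name__] += 1
            if isinstance(item, dict):   # follow nesting with dotted paths
                visit(item, path + ".")

    for doc in docs:
        visit(doc, "")
    total = len(docs)
    return {
        path: {"types": dict(s["types"]), "count": s["count"],
               "ratio": s["count"] / total}
        for path, s in stats.items()
    }


docs = [
    {"_id": 1, "cuisine": "Bakery", "address": {"zipcode": "10462"}},
    {"_id": 2, "cuisine": "Pizza"},
    {"_id": 3, "address": {"zipcode": 11224}},
    {"_id": 4, "cuisine": "Thai", "address": {"zipcode": "10019"}},
]
for path, info in sorted(extract_schema(docs).items()):
    print(path, info)
```

A field seen with several types (here `address.zipcode`) or with a ratio below 1 is exactly the kind of irregularity an analyst needs to know about before loading the data into typed SQL columns.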
TOSQL
Read a schema to generate a MongoDB/SQL mapping.
Returns:
☉ Mapping file used by the doc-manager
Same table: column “cuisine”
New table: “restaurants__address__coord”
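A hedged sketch of how a mapping could be derived from a schema: scalar fields stay in the collection's own table, while array fields get a new table named after their path. The input encoding (`"list:float"` for an array of floats), the output layout and the function name `to_mapping` are all hypothetical; the actual mapping file format expected by the doc-manager is not reproduced here.

```python
SQL_TYPES = {
    "str": "TEXT",
    "int": "INTEGER",
    "float": "DOUBLE PRECISION",
    "bool": "BOOLEAN",
}


def to_mapping(collection, schema):
    """Build {table: {column: sql_type}} from {field_path: type_name}."""
    mapping = {collection: {}}
    for path, type_name in schema.items():
        if type_name.startswith("list:"):
            # Array of scalars: new table, linked to its parent by id_parent.
            table = collection + "__" + path.replace(".", "__")
            mapping[table] = {
                "id_parent": SQL_TYPES[schema["_id"]],
                "value": SQL_TYPES[type_name[len("list:"):]],
            }
        else:
            # Scalar (possibly nested): column in the same table.
            mapping[collection][path.replace(".", "_")] = SQL_TYPES[type_name]
    return mapping


schema = {
    "_id": "str",
    "cuisine": "str",
    "address.zipcode": "str",
    "address.coord": "list:float",
}
mapping = to_mapping("restaurants", schema)
for table, columns in mapping.items():
    print(table, columns)
```

Generating this file from the extracted schema, rather than writing it by hand, is what keeps the mapping manageable when the MongoDB model evolves.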
☉ Check for updates in the oplog file
☉ Send update commands with data
☉ Translate commands and make SQL requests
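The last step, turning a command into an SQL request, might look like the sketch below. Python's built-in `sqlite3` is used as a stand-in for PostgreSQL so the example runs anywhere, and `INSERT OR REPLACE` is SQLite syntax (PostgreSQL would use `INSERT ... ON CONFLICT`). The command names, helper functions and sample rows are assumptions, not the doc-manager's API.

```python
import sqlite3


def upsert_sql(table, row):
    """Build a parameterized upsert for one flattened row.

    Table and column names are interpolated, so they must come from a
    trusted mapping; values are always passed as parameters.
    """
    columns = list(row)
    placeholders = ", ".join("?" for _ in columns)
    sql = "INSERT OR REPLACE INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders)
    return sql, [row[c] for c in columns]


def delete_sql(table, doc_id):
    """Build a parameterized delete for one document id."""
    return "DELETE FROM {} WHERE _id = ?".format(table), [doc_id]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE restaurants (_id TEXT PRIMARY KEY, name TEXT, cuisine TEXT)")

commands = [
    ("upsert", {"_id": "1", "name": "Chez Py", "cuisine": "French"}),
    ("upsert", {"_id": "2", "name": "Fu Bar", "cuisine": "Bar"}),
    ("upsert", {"_id": "1", "name": "Chez Py", "cuisine": "Bistro"}),
    ("remove", "2"),
]
for kind, payload in commands:
    if kind == "upsert":
        sql, params = upsert_sql("restaurants", payload)
    else:
        sql, params = delete_sql("restaurants", payload)
    conn.execute(sql, params)

rows = conn.execute(
    "SELECT _id, name, cuisine FROM restaurants ORDER BY _id").fetchall()
print(rows)
```

Using upserts keeps the translation idempotent: replaying the same command twice leaves the table in the same state.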
Time to play with your
analytics tools>
Adding an open-source BI tool...
Now, in Superset colors!
“Superset is a data exploration platform designed to be visual, intuitive and interactive.”
Developed by Airbnb
Python 2.7, 3.4, 3.5
Apache License 2.0
https://github.com/airbnb/superset
SQL Lab
Wrap up>
Take-home message
☉ Issues for analysts with NoSQL frameworks
> Developer-oriented languages
> Nested data structure
> Schema-less
☉ An open-source stack to unlock analysis of MongoDB data
> Extract a MongoDB schema
> Normalize the data model
> Real-time synchronization to PostgreSQL
☉ Currently running in production environments
Come, use and contribute! :)
pajachiet@octo.com
agervasi@octo.com
https://github.com/mongodb-labs/mongo-connector
https://github.com/Hopwork/mongo-connector-postgresql
https://github.com/pajachiet/pymongo-schema
https://github.com/airbnb/superset
Remember to stress that this is an open-source stack
☉ Collaborative
☉ Free of charge
But hey! It's open source!
“I analyze my data to understand myself”
“I automatically learn to perform complex tasks from data”
“I equip myself with advanced tools that enable complex and interactive analyses”
Dataviz
Search
Statistics
Learning
Data-driven organization
MongoDB popularity
https://db-engines.com/en/ranking
Analysts use SQL
The Mongo way: aggregation framework
Superset
Visualization architecture
[Diagram: PostgreSQL tables → Datasources (SQLA tables) → Visualizations → Dashboard]
Please! An analyst for this data!
But analysts don't speak JSON…
Should we call the developer?

Open-Source Analytics Stack on MongoDB, with Schema, Pierre-Alain Jachiet and Aurélien Gervasi


Editor's Notes

  • #2 TODO: add the Python logo; add the PyParis logo or a mention of it.
  • #3 Do we introduce each other? It would be good for both of us to speak from the very start. TODO: remove the question mark. We love Python. Easy enough for data scientists.
  • #4 What do they do? Master analyst who controls the world? Crazy mathematician who builds artificial intelligence? TODO: add an animated red line striking through the image. “What our moms think we do”, “What we think we do”. Add / reuse the image of a data scientist facing a screen and speaking SQL?
  • #5 What do they do? Master analyst who controls the world? Crazy mathematician who builds artificial intelligence? TODO: add an animated red line striking through the image. “What our moms think we do”, “What we think we do”. Add / reuse the image of a data scientist facing a screen and speaking SQL?
  • #6 But in fact, this is the main challenge. Always… From 80 to 95% of the time (add to the slide?). TODO: pie chart image? The proportion between the 3 phases may vary a lot. Data may be lost and difficult to know about. Accessing data might be technically difficult. But it’s often a technical nightmare but even more often
  • #7 Subject of the talk: how to identify, access and prepare MongoDB data for analysis… from a data scientist's point of view. MongoDB data ~= JSON: nested and flexible => fractal. Prepare with open-source Python. Advanced analysis: first level, interactive dashboards. TODO: add the open-source logo; add the Python logo; animate the slide (cabbage for fractal, rainbow, Superset, etc.).
  • #9 TODO: improve the chart / its legibility.
  • #11 Open-source, and “free”. Great APIs & documentation. Works well with object-oriented programming. Schema-less database: adapt your model as you go.
  • #20 model layer http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern
  • #21 model layer http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern
  • #22 model layer http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern
  • #23 Several usual reactions. Prepare the data for the analysis tool you know… with the transform tool you know. Traditional analysts: mixed graphical / code, SAS stream. Data scientists: custom Python scripts. TODO: animate.
  • #28 Stream from MongoDB to Postgres. A STREAM procedure, not a BATCH one.
  • #29 Mongo Connector synchronizes a MongoDB database to another DB. Developed by MongoDB Labs, open-source license. Duplicates a MongoDB database to another database (originally NoSQL). Reads the oplog (drawing) and propagates modifications to another DB.
  • #31 Translates a modification request from Mongo Connector to the target database. Implements some general methods: Upsert, Bulk Upsert, Update, Remove, etc. Doc-managers already exist for MongoDB, Solr, Elasticsearch (NoSQL DBs). Open-source, contributions welcome. Unlike the other doc-managers, it requires a mapping file, which is used to flatten the NoSQL DB schema.
  • #33 By us, inspired by Variety, + license. Contributions welcome. Goal: generate the mapping for the doc-manager. It could have been written by hand, but that is very error-prone. Simplifies schema evolutions.
  • #35 Story to present: a startup that grades restaurants in NY.
  • #37 Story to present: a startup that grades restaurants in NY. Source of the dataset. Presentation of one JSON element.
  • #39 Story to present: a startup that grades restaurants in NY. Source of the dataset. Presentation of one JSON element. Usually done by hand.
  • #41 Story to present: a startup that grades restaurants in NY. Source of the dataset. Presentation of one JSON element.
  • #44 Stream from MongoDB to Postgres. A STREAM procedure, not a BATCH one.
  • #45 Not fully settled yet: frequent releases, user and permission management, documentation. Web interface based on Flask.
  • #55 Ranking of interest in the main databases.