SlideShare a Scribd company logo
1 of 14
Download to read offline
The data processing journey at
Presented by Thuc Nguyen - Lead
Operation Engineer
On-demand logistics service
Official launch on August 10th 2015
Our engineering team (in launch day)
1) Metrics: fulfillment rate, supplier&user growth, intraday dashboard, supplier performance
2) Reporting: multiple teams such as Business Development, Finance and Accounting,
Partners, Driver Relationship Management...
Data problem 1: Metrics & Reporting
Solution on early days
mostly on front-end to visualize data on web pages
data export (on general data collections like orders, users, transactions, etc...)
incremental calculated cache (in-memory and physical) when data load get bigger
specific parameterized metric pages
Scale-up stage
Exposure of:
customized data requests
data sources
technical and resource limitations
=> A solution should:
allows our staff to query and get their desire data by themselves.
be simple for non-tech persons.
be easy to maintain.
Scale-up stage (cont.)
We chose MetaBase - an open source BI tool
- 2 query modes: builder and native query
- visualize data
- saved query and dashboard
- rich apis and utilities
- alternatives: SaaS (chartio, tableau),
OSS (slamdata)
However, MetaBase is not quite good with NoSql, especially for native query (such as
relationship matching and transformative functions supporting)
Scale-up stage (cont.)
Our mongodb status:
- Mongodb 3.2 with WiredTiger storage engine and replication in replica set mode
Pros Cons
- Flexible data schema
- Strong query language with geo support
- Strong indexing (sparse/partial, expire)
- Oplog tailing
- Ineffective relationship query
- Poor utilities function support
- Unfriendly to sql geeks
Scale-up stage (cont.)
MongoDB Mosql Postgresql Metabase
docker image
11.01.XX
docker image
a replication tool
sync via mongodb oplog
We designed a data pipeline to transform mongodb data into postgresql data:
Result: fulfill all reporting requirements, automate 80% reporting works, resolve the bottleneck in data pipeline.
Results
Our productivity is booming on data-related works: 6 staff
can write queries now (none of them knows sql before),
every staff can access their defined metrics.
We reduced the client report preparation from 30mins to
5mins via Google Data Studio with state-of-art beauty (on
the right)
We have 2-way integration with Google Sheets, so that other
teams like Fund Accounting can easily get the data they
want via IMPORTDATA functions.
Geospatial analytics help us answer some common questions such as:
- Which areas have low, high demands in a specific time frame?
- Which areas have low, high supplies at a given time?
- Can we ask our drivers to move from low demanding areas to high demanding ones?
- How we present such kind of data: administrative areas (districts, wards), heatmap,
hexagon, pin map?
Data problem 2: Geospatial Analytics
Geospatial analytics stack
MongoDB CartoDb CartoJs Leaflet
interactive base-map
11.01.XX
visualization layer
a geo-database
The main technologies we are using are Carto (SaaS) and Leaflet (front-end library).
Geospatial analytic showcases
- We’re making use of open source softwares as collective
intelligence to minimize our maintenance effort.
- We’re still exploring new ways to process and present data, and
we think chat app is such a potential channel for this.
- Our team motto: ‘Keep things simple’
Summary
THANK YOU FOR YOUR LISTENINGAHAMOVE
YOUR PRIVATE MOVER

More Related Content

What's hot

Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodRadical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodDatabricks
 
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...HostedbyConfluent
 
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...Designing and Implementing Information Systems with Event Modeling, Bobby Cal...
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...confluent
 
Continuous delivery for machine learning
Continuous delivery for machine learningContinuous delivery for machine learning
Continuous delivery for machine learningRajesh Muppalla
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafkaconfluent
 
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Databricks
 
Zipline - A Declarative Feature Engineering Framework
Zipline - A Declarative Feature Engineering FrameworkZipline - A Declarative Feature Engineering Framework
Zipline - A Declarative Feature Engineering FrameworkDatabricks
 
Accelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks RuntimeAccelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks RuntimeDatabricks
 
Building End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPBuilding End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPDatabricks
 
Memory Optimization and Reliable Metrics in ML Pipelines at Netflix
Memory Optimization and Reliable Metrics in ML Pipelines at NetflixMemory Optimization and Reliable Metrics in ML Pipelines at Netflix
Memory Optimization and Reliable Metrics in ML Pipelines at NetflixDatabricks
 
When Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaWhen Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaDatabricks
 
On Improving Broadcast Joins in Apache Spark SQL
On Improving Broadcast Joins in Apache Spark SQLOn Improving Broadcast Joins in Apache Spark SQL
On Improving Broadcast Joins in Apache Spark SQLDatabricks
 
Using Apache Spark to Predict Installer Retention from Messy Clickstream Data...
Using Apache Spark to Predict Installer Retention from Messy Clickstream Data...Using Apache Spark to Predict Installer Retention from Messy Clickstream Data...
Using Apache Spark to Predict Installer Retention from Messy Clickstream Data...Databricks
 
Scoring at Scale: Generating Follow Recommendations for Over 690 Million Link...
Scoring at Scale: Generating Follow Recommendations for Over 690 Million Link...Scoring at Scale: Generating Follow Recommendations for Over 690 Million Link...
Scoring at Scale: Generating Follow Recommendations for Over 690 Million Link...Databricks
 
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)KafkaZone
 
Tangram: Distributed Scheduling Framework for Apache Spark at Facebook
Tangram: Distributed Scheduling Framework for Apache Spark at FacebookTangram: Distributed Scheduling Framework for Apache Spark at Facebook
Tangram: Distributed Scheduling Framework for Apache Spark at FacebookDatabricks
 
Flink Case Study: Capital One
Flink Case Study: Capital OneFlink Case Study: Capital One
Flink Case Study: Capital OneFlink Forward
 
All Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZconfluent
 
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...Khai Tran
 
Choosing the Right Transformer for Your Data Challenge
Choosing the Right Transformer for Your Data ChallengeChoosing the Right Transformer for Your Data Challenge
Choosing the Right Transformer for Your Data ChallengeSafe Software
 

What's hot (20)

Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodRadical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
 
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
 
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...Designing and Implementing Information Systems with Event Modeling, Bobby Cal...
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...
 
Continuous delivery for machine learning
Continuous delivery for machine learningContinuous delivery for machine learning
Continuous delivery for machine learning
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafka
 
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
 
Zipline - A Declarative Feature Engineering Framework
Zipline - A Declarative Feature Engineering FrameworkZipline - A Declarative Feature Engineering Framework
Zipline - A Declarative Feature Engineering Framework
 
Accelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks RuntimeAccelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks Runtime
 
Building End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPBuilding End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCP
 
Memory Optimization and Reliable Metrics in ML Pipelines at Netflix
Memory Optimization and Reliable Metrics in ML Pipelines at NetflixMemory Optimization and Reliable Metrics in ML Pipelines at Netflix
Memory Optimization and Reliable Metrics in ML Pipelines at Netflix
 
When Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaWhen Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu Ma
 
On Improving Broadcast Joins in Apache Spark SQL
On Improving Broadcast Joins in Apache Spark SQLOn Improving Broadcast Joins in Apache Spark SQL
On Improving Broadcast Joins in Apache Spark SQL
 
Using Apache Spark to Predict Installer Retention from Messy Clickstream Data...
Using Apache Spark to Predict Installer Retention from Messy Clickstream Data...Using Apache Spark to Predict Installer Retention from Messy Clickstream Data...
Using Apache Spark to Predict Installer Retention from Messy Clickstream Data...
 
Scoring at Scale: Generating Follow Recommendations for Over 690 Million Link...
Scoring at Scale: Generating Follow Recommendations for Over 690 Million Link...Scoring at Scale: Generating Follow Recommendations for Over 690 Million Link...
Scoring at Scale: Generating Follow Recommendations for Over 690 Million Link...
 
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
 
Tangram: Distributed Scheduling Framework for Apache Spark at Facebook
Tangram: Distributed Scheduling Framework for Apache Spark at FacebookTangram: Distributed Scheduling Framework for Apache Spark at Facebook
Tangram: Distributed Scheduling Framework for Apache Spark at Facebook
 
Flink Case Study: Capital One
Flink Case Study: Capital OneFlink Case Study: Capital One
Flink Case Study: Capital One
 
All Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZ
 
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
 
Choosing the Right Transformer for Your Data Challenge
Choosing the Right Transformer for Your Data ChallengeChoosing the Right Transformer for Your Data Challenge
Choosing the Right Transformer for Your Data Challenge
 

Viewers also liked

TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...
TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...
TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...Grokking VN
 
Grokking TechTalk #16: F**k bad CSS
Grokking TechTalk #16: F**k bad CSSGrokking TechTalk #16: F**k bad CSS
Grokking TechTalk #16: F**k bad CSSGrokking VN
 
TechTalk #14 Grokking: Couchbase - NoSQL + Memcached + Real-time + Offline!
TechTalk #14 Grokking:  Couchbase - NoSQL + Memcached + Real-time + Offline!TechTalk #14 Grokking:  Couchbase - NoSQL + Memcached + Real-time + Offline!
TechTalk #14 Grokking: Couchbase - NoSQL + Memcached + Real-time + Offline!Grokking VN
 
Running Natural Language Queries on MongoDB
Running Natural Language Queries on MongoDBRunning Natural Language Queries on MongoDB
Running Natural Language Queries on MongoDBMongoDB
 
Grokking: Data Engineering Course
Grokking: Data Engineering CourseGrokking: Data Engineering Course
Grokking: Data Engineering CourseGrokking VN
 
Grokking TechTalk #16: Maybe functor in javascript
Grokking TechTalk #16: Maybe functor in javascriptGrokking TechTalk #16: Maybe functor in javascript
Grokking TechTalk #16: Maybe functor in javascriptGrokking VN
 
NLIDB(Natural Language Interface to DataBases)
NLIDB(Natural Language Interface to DataBases)NLIDB(Natural Language Interface to DataBases)
NLIDB(Natural Language Interface to DataBases)Swetha Pallati
 
Natural language search using Neo4j
Natural language search using Neo4jNatural language search using Neo4j
Natural language search using Neo4jKenny Bastani
 
Natural Language Processing for the Semantic Web
Natural Language Processing for the Semantic WebNatural Language Processing for the Semantic Web
Natural Language Processing for the Semantic WebIsabelle Augenstein
 
Grokking TechTalk #16: React stack at lozi
Grokking TechTalk #16: React stack at loziGrokking TechTalk #16: React stack at lozi
Grokking TechTalk #16: React stack at loziGrokking VN
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jWilliam Lyon
 
Natural Language Processing with Neo4j
Natural Language Processing with Neo4jNatural Language Processing with Neo4j
Natural Language Processing with Neo4jKenny Bastani
 
Querying your database in natural language by Daniel Moisset PyData SV 2014
Querying your database in natural language by Daniel Moisset PyData SV 2014Querying your database in natural language by Daniel Moisset PyData SV 2014
Querying your database in natural language by Daniel Moisset PyData SV 2014PyData
 
Uber vs Grab SWOT Anaysis
Uber vs Grab SWOT AnaysisUber vs Grab SWOT Anaysis
Uber vs Grab SWOT AnaysisShooger
 
Hi-Media Couchbase meetup Paris Nb #1
Hi-Media Couchbase meetup Paris Nb #1Hi-Media Couchbase meetup Paris Nb #1
Hi-Media Couchbase meetup Paris Nb #1Mickaël Le Baillif
 

Viewers also liked (18)

TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...
TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...
TechTalk #13 Grokking: Marrying Elasticsearch with NLP to solve real-world se...
 
Grokking TechTalk #16: F**k bad CSS
Grokking TechTalk #16: F**k bad CSSGrokking TechTalk #16: F**k bad CSS
Grokking TechTalk #16: F**k bad CSS
 
TechTalk #14 Grokking: Couchbase - NoSQL + Memcached + Real-time + Offline!
TechTalk #14 Grokking:  Couchbase - NoSQL + Memcached + Real-time + Offline!TechTalk #14 Grokking:  Couchbase - NoSQL + Memcached + Real-time + Offline!
TechTalk #14 Grokking: Couchbase - NoSQL + Memcached + Real-time + Offline!
 
Running Natural Language Queries on MongoDB
Running Natural Language Queries on MongoDBRunning Natural Language Queries on MongoDB
Running Natural Language Queries on MongoDB
 
Grokking: Data Engineering Course
Grokking: Data Engineering CourseGrokking: Data Engineering Course
Grokking: Data Engineering Course
 
Grokking TechTalk #16: Maybe functor in javascript
Grokking TechTalk #16: Maybe functor in javascriptGrokking TechTalk #16: Maybe functor in javascript
Grokking TechTalk #16: Maybe functor in javascript
 
NLIDB(Natural Language Interface to DataBases)
NLIDB(Natural Language Interface to DataBases)NLIDB(Natural Language Interface to DataBases)
NLIDB(Natural Language Interface to DataBases)
 
Natural language search using Neo4j
Natural language search using Neo4jNatural language search using Neo4j
Natural language search using Neo4j
 
Natural Language Processing for the Semantic Web
Natural Language Processing for the Semantic WebNatural Language Processing for the Semantic Web
Natural Language Processing for the Semantic Web
 
Grokking TechTalk #16: React stack at lozi
Grokking TechTalk #16: React stack at loziGrokking TechTalk #16: React stack at lozi
Grokking TechTalk #16: React stack at lozi
 
Young marketers 3 nhóm m.e
Young marketers 3   nhóm m.eYoung marketers 3   nhóm m.e
Young marketers 3 nhóm m.e
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4j
 
Natural Language Processing with Neo4j
Natural Language Processing with Neo4jNatural Language Processing with Neo4j
Natural Language Processing with Neo4j
 
Querying your database in natural language by Daniel Moisset PyData SV 2014
Querying your database in natural language by Daniel Moisset PyData SV 2014Querying your database in natural language by Daniel Moisset PyData SV 2014
Querying your database in natural language by Daniel Moisset PyData SV 2014
 
Grabbike
GrabbikeGrabbike
Grabbike
 
GrabBike
GrabBikeGrabBike
GrabBike
 
Uber vs Grab SWOT Anaysis
Uber vs Grab SWOT AnaysisUber vs Grab SWOT Anaysis
Uber vs Grab SWOT Anaysis
 
Hi-Media Couchbase meetup Paris Nb #1
Hi-Media Couchbase meetup Paris Nb #1Hi-Media Couchbase meetup Paris Nb #1
Hi-Media Couchbase meetup Paris Nb #1
 

Similar to TechTalk #15 Grokking: The data processing journey at AhaMove

Lean product management for web2.0 by Sujoy Bhatacharjee, April
Lean product management for web2.0 by Sujoy Bhatacharjee, April Lean product management for web2.0 by Sujoy Bhatacharjee, April
Lean product management for web2.0 by Sujoy Bhatacharjee, April Triggr In
 
Become BI Architect with 1KEY Agile BI Suite - Architecture
Become BI Architect with 1KEY Agile BI Suite  - ArchitectureBecome BI Architect with 1KEY Agile BI Suite  - Architecture
Become BI Architect with 1KEY Agile BI Suite - ArchitectureDhiren Gala
 
Resume_ XuZhiyuan
Resume_ XuZhiyuanResume_ XuZhiyuan
Resume_ XuZhiyuanzhiyuan xu
 
Mammothdb - Public VC Pitchdeck!
Mammothdb - Public VC Pitchdeck!Mammothdb - Public VC Pitchdeck!
Mammothdb - Public VC Pitchdeck!Steve Keil
 
Resume quaish abuzer
Resume quaish abuzerResume quaish abuzer
Resume quaish abuzerquaish abuzer
 
B4UConference_Design Big Data System
B4UConference_Design Big Data SystemB4UConference_Design Big Data System
B4UConference_Design Big Data SystemHoa Le
 
The Evolution of a Scrappy Startup to a Successful Web Service
The Evolution of a Scrappy Startup to a Successful Web ServiceThe Evolution of a Scrappy Startup to a Successful Web Service
The Evolution of a Scrappy Startup to a Successful Web ServicePoornima Vijayashanker
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futuremarkgrover
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesKarthik Murugesan
 
DEEPAK_Ver1.3
DEEPAK_Ver1.3DEEPAK_Ver1.3
DEEPAK_Ver1.3Deepak CG
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineTrieu Nguyen
 
sap hana|sap hana database| Introduction to sap hana
sap hana|sap hana database| Introduction to sap hanasap hana|sap hana database| Introduction to sap hana
sap hana|sap hana database| Introduction to sap hanaJames L. Lee
 

Similar to TechTalk #15 Grokking: The data processing journey at AhaMove (20)

Lean product management for web2.0 by Sujoy Bhatacharjee, April
Lean product management for web2.0 by Sujoy Bhatacharjee, April Lean product management for web2.0 by Sujoy Bhatacharjee, April
Lean product management for web2.0 by Sujoy Bhatacharjee, April
 
Resume Pallavi Mishra as of 2017 Feb
Resume Pallavi Mishra as of 2017 FebResume Pallavi Mishra as of 2017 Feb
Resume Pallavi Mishra as of 2017 Feb
 
Big Data przt.pptx
Big Data przt.pptxBig Data przt.pptx
Big Data przt.pptx
 
Become BI Architect with 1KEY Agile BI Suite - Architecture
Become BI Architect with 1KEY Agile BI Suite  - ArchitectureBecome BI Architect with 1KEY Agile BI Suite  - Architecture
Become BI Architect with 1KEY Agile BI Suite - Architecture
 
Resume_ XuZhiyuan
Resume_ XuZhiyuanResume_ XuZhiyuan
Resume_ XuZhiyuan
 
Mammothdb - Public VC Pitchdeck!
Mammothdb - Public VC Pitchdeck!Mammothdb - Public VC Pitchdeck!
Mammothdb - Public VC Pitchdeck!
 
Sabyasachee_Kar_cv
Sabyasachee_Kar_cvSabyasachee_Kar_cv
Sabyasachee_Kar_cv
 
Resume quaish abuzer
Resume quaish abuzerResume quaish abuzer
Resume quaish abuzer
 
B4UConference_Design Big Data System
B4UConference_Design Big Data SystemB4UConference_Design Big Data System
B4UConference_Design Big Data System
 
The Evolution of a Scrappy Startup to a Successful Web Service
The Evolution of a Scrappy Startup to a Successful Web ServiceThe Evolution of a Scrappy Startup to a Successful Web Service
The Evolution of a Scrappy Startup to a Successful Web Service
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the future
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slides
 
DEEPAK_Ver1.3
DEEPAK_Ver1.3DEEPAK_Ver1.3
DEEPAK_Ver1.3
 
Resume (1)
Resume (1)Resume (1)
Resume (1)
 
Resume (1)
Resume (1)Resume (1)
Resume (1)
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data Pipeline
 
sap hana|sap hana database| Introduction to sap hana
sap hana|sap hana database| Introduction to sap hanasap hana|sap hana database| Introduction to sap hana
sap hana|sap hana database| Introduction to sap hana
 
BizViz - CA PPM Analytics
BizViz - CA PPM AnalyticsBizViz - CA PPM Analytics
BizViz - CA PPM Analytics
 
Kashif guffar
Kashif guffarKashif guffar
Kashif guffar
 
160 mark walter
160 mark walter160 mark walter
160 mark walter
 

More from Grokking VN

Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking VN
 
Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking VN
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking VN
 
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database clusterGrokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database clusterGrokking VN
 
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking VN
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking VN
 
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 Grokking Techtalk #39: How to build an event driven architecture with Kafka ... Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...Grokking VN
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compilerGrokking VN
 
Grokking Techtalk #37: Software design and refactoring
 Grokking Techtalk #37: Software design and refactoring Grokking Techtalk #37: Software design and refactoring
Grokking Techtalk #37: Software design and refactoringGrokking VN
 
Grokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking VN
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer... Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...Grokking VN
 
Grokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKIGrokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKIGrokking VN
 
SOLID & Design Patterns
SOLID & Design PatternsSOLID & Design Patterns
SOLID & Design PatternsGrokking VN
 
Grokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking VN
 
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking VN
 
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking VN
 
Grokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking VN
 
Grokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking VN
 
Grokking TechTalk #26: Compare ios and android platform
Grokking TechTalk #26: Compare ios and android platformGrokking TechTalk #26: Compare ios and android platform
Grokking TechTalk #26: Compare ios and android platformGrokking VN
 
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...Grokking VN
 

More from Grokking VN (20)

Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
 
Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles Thinking
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystified
 
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database clusterGrokking Techtalk #40: Consistency and Availability tradeoff in database cluster
Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster
 
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applications
 
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 Grokking Techtalk #39: How to build an event driven architecture with Kafka ... Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compiler
 
Grokking Techtalk #37: Software design and refactoring
 Grokking Techtalk #37: Software design and refactoring Grokking Techtalk #37: Software design and refactoring
Grokking Techtalk #37: Software design and refactoring
 
Grokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellchecking
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer... Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 
Grokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKIGrokking TechTalk #33: High Concurrency Architecture at TIKI
Grokking TechTalk #33: High Concurrency Architecture at TIKI
 
SOLID & Design Patterns
SOLID & Design PatternsSOLID & Design Patterns
SOLID & Design Patterns
 
Grokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous Communications
 
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
 
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
 
Grokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search Tree
 
Grokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the Magic
 
Grokking TechTalk #26: Compare ios and android platform
Grokking TechTalk #26: Compare ios and android platformGrokking TechTalk #26: Compare ios and android platform
Grokking TechTalk #26: Compare ios and android platform
 
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
 

Recently uploaded

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 

Recently uploaded (20)

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 

TechTalk #15 Grokking: The data processing journey at AhaMove

  • 1. The data processing journey at Presented by Thuc Nguyen - Lead Operation Engineer
  • 2. On-demand logistics service Official launch on August 10th 2015 Our engineering team (in launch day)
  • 3. 1) Metrics: fulfillment rate, supplier&user growth, intraday dashboard, supplier performance 2) Reporting: multiple teams such as Business Development, Finance and Accounting, Partners, Driver Relationship Management... Data problem 1: Metrics & Reporting
  • 4. Solution on early days mostly on front-end to visualize data on web pages data export (on general data collections like orders, users, transactions, etc...) incremental calculated cache (in-memory and physical) when data load get bigger specific parameterized metric pages
  • 5. Scale-up stage Exposure of: customized data requests data sources technical and resource limitations => A solution should: allows our staff to query and get their desire data by themselves. be simple for non-tech persons. be easy to maintain.
  • 6. Scale-up stage (cont.) We chose MetaBase - an open source BI tool - 2 query modes: builder and native query - visualize data - saved query and dashboard - rich apis and utilities - alternatives: SaaS (chartio, tableau), OSS (slamdata) However, MetaBase is not quite good with NoSql, especially for native query (such as relationship matching and transformative functions supporting)
  • 7. Scale-up stage (cont.) Our mongodb status: - Mongodb 3.2 with WiredTiger storage engine and replication in replica set mode Pros Cons - Flexible data schema - Strong query language with geo support - Strong indexing (sparse/partial, expire) - Oplog tailing - Ineffective relationship query - Poor utilities function support - Unfriendly to sql geeks
  • 8. Scale-up stage (cont.) MongoDB Mosql Postgresql Metabase docker image 11.01.XX docker image a replication tool sync via mongodb oplog We designed a data pipeline to transform mongodb data into postgresql data: Result: fulfill all reporting requirements, automate 80% reporting works, resolve the bottleneck in data pipeline.
  • 9. Results Our productivity is booming on data-related works: 6 staff can write queries now (none of them knows sql before), every staff can access their defined metrics. We reduced the client report preparation from 30mins to 5mins via Google Data Studio with state-of-art beauty (on the right) We have 2-way integration with Google Sheets, so that other teams like Fund Accounting can easily get the data they want via IMPORTDATA functions.
  • 10. Geospatial analytics help us answer some common questions such as: - Which areas have low, high demands in a specific time frame? - Which areas have low, high supplies at a given time? - Can we ask our drivers to move from low demanding areas to high demanding ones? - How we present such kind of data: administrative areas (districts, wards), heatmap, hexagon, pin map? Data problem 2: Geospatial Analytics
  • 11. Geospatial analytics stack MongoDB CartoDb CartoJs Leaflet interactive base-map 11.01.XX visualization layer a geo-database The main technologies we are using are Carto (SaaS) and Leaflet (front-end library).
  • 13. - We’re making use of open source softwares as collective intelligence to minimize our maintenance effort. - We’re still exploring new ways to process and present data, and we think chat app is such a potential channel for this. - Our team motto: ‘Keep things simple’ Summary
  • 14. THANK YOU FOR YOUR LISTENINGAHAMOVE YOUR PRIVATE MOVER