
TechTalk #15 Grokking: The data processing journey at AhaMove



Thuc Nguyen - Lead Operation Engineer, AhaMove
Thuc Nguyen works at AhaMove, a 1.5-year-old instant delivery startup, as Lead Operation Engineer.

The talk shares the data problems AhaMove had to deal with in its early days, and how our engineering culture and philosophy shaped our decisions and actions in handling them, with detailed showcases on geospatial analytics and the data pipeline.

Published in: Technology


  1. The data processing journey at AhaMove
     Presented by Thuc Nguyen - Lead Operation Engineer
  2. On-demand logistics service
     Official launch on August 10th, 2015
     Our engineering team (on launch day)
  3. Data problem 1: Metrics & Reporting
     1) Metrics: fulfillment rate, supplier and user growth, intraday dashboard, supplier performance
     2) Reporting: multiple teams such as Business Development, Finance and Accounting, Partners, Driver Relationship Management...
  4. Solution in the early days
     - mostly front-end work to visualize data on web pages
     - data export (on general data collections like orders, users, transactions, etc.)
     - incrementally calculated cache (in-memory and physical) when the data load got bigger
     - specific parameterized metric pages
  5. Scale-up stage
     This stage exposed:
     - customized data requests
     - data sources
     - technical and resource limitations
     => A solution should:
     - allow our staff to query and get the data they want by themselves
     - be simple for non-technical people
     - be easy to maintain
  6. Scale-up stage (cont.)
     We chose Metabase, an open source BI tool:
     - 2 query modes: builder and native query
     - data visualization
     - saved queries and dashboards
     - rich APIs and utilities
     - alternatives: SaaS (Chartio, Tableau), OSS (SlamData)
     However, Metabase does not work well with NoSQL, especially for native queries (for example, relationship matching and support for transformative functions).
  7. Scale-up stage (cont.)
     Our MongoDB status: MongoDB 3.2 with the WiredTiger storage engine, replicated in replica set mode
     Pros:
     - Flexible data schema
     - Strong query language with geo support
     - Strong indexing (sparse/partial, expire)
     - Oplog tailing
     Cons:
     - Ineffective relationship queries
     - Poor utility function support
     - Unfriendly to SQL geeks
  8. Scale-up stage (cont.)
     We designed a data pipeline to transform MongoDB data into PostgreSQL data:
     MongoDB -> MoSQL (a replication tool, syncing via the MongoDB oplog) -> PostgreSQL -> Metabase
     (two components in the diagram are labeled "docker image")
     Result: fulfilled all reporting requirements, automated 80% of reporting work, and resolved the bottleneck in the data pipeline.
  9. Results
     - Our productivity on data-related work is booming: 6 staff members can write queries now (none of them knew SQL before), and every staff member can access their defined metrics.
     - We reduced client report preparation from 30 minutes to 5 minutes via Google Data Studio, with state-of-the-art visuals (on the right).
     - We have 2-way integration with Google Sheets, so that other teams like Fund Accounting can easily get the data they want via IMPORTDATA functions.
  10. Data problem 2: Geospatial Analytics
      Geospatial analytics helps us answer some common questions, such as:
      - Which areas have low or high demand in a specific time frame?
      - Which areas have low or high supply at a given time?
      - Can we ask our drivers to move from low-demand areas to high-demand ones?
      - How should we present this kind of data: administrative areas (districts, wards), heatmap, hexagon, pin map?
  11. Geospatial analytics stack
      - MongoDB
      - CartoDB: a geo-database
      - CartoJs: visualization layer
      - Leaflet: interactive base-map
      The main technologies we are using are Carto (SaaS) and Leaflet (front-end library).
  12. Geospatial analytics showcases
  13. Summary
      - We're making use of open source software as collective intelligence to minimize our maintenance effort.
      - We're still exploring new ways to process and present data, and we think chat apps are a potential channel for this.
      - Our team motto: 'Keep things simple'
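
The MongoDB-to-PostgreSQL pipeline on slide 8 relies on replaying oplog entries into relational tables. The slides do not show how the replication tool was configured, so the following is only a minimal sketch of the idea, under several assumptions: the `app.orders` namespace, the `orders` columns, the status values, and the definition of fulfillment rate as completed orders over all orders are all hypothetical, the oplog entries are simplified to the `op`, `ns`, `o` and `o2` fields, and an in-memory SQLite database stands in for PostgreSQL so the example runs without a server.

```python
import sqlite3

# Hypothetical column layout for the replicated collection (an assumption
# for illustration, not the actual AhaMove schema or MoSQL configuration).
ORDER_COLUMNS = ["_id", "status", "district"]


def apply_oplog_entry(conn, entry):
    """Apply one simplified oplog entry to the relational `orders` table."""
    if entry["ns"] != "app.orders":  # hypothetical namespace; skip others
        return
    op = entry["op"]
    if op == "i":  # insert: "o" holds the full document
        doc = entry["o"]
        conn.execute(
            "INSERT OR REPLACE INTO orders (_id, status, district) VALUES (?, ?, ?)",
            [doc.get(col) for col in ORDER_COLUMNS],
        )
    elif op == "u":  # update: "o2" identifies the document, "o" carries a $set
        for col, value in entry["o"].get("$set", {}).items():
            # Only whitelisted column names are interpolated into the SQL.
            if col in ORDER_COLUMNS and col != "_id":
                conn.execute(
                    f"UPDATE orders SET {col} = ? WHERE _id = ?",
                    (value, entry["o2"]["_id"]),
                )
    elif op == "d":  # delete: "o" holds the _id of the removed document
        conn.execute("DELETE FROM orders WHERE _id = ?", (entry["o"]["_id"],))


def build_replica(entries):
    """Create the stand-in relational table and replay the entries in order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (_id TEXT PRIMARY KEY, status TEXT, district TEXT)"
    )
    for entry in entries:
        apply_oplog_entry(conn, entry)
    return conn


def fulfillment_rate(conn):
    """Assumed definition: completed orders divided by all orders."""
    total, completed = conn.execute(
        "SELECT COUNT(*), "
        "SUM(CASE WHEN status = 'COMPLETED' THEN 1 ELSE 0 END) FROM orders"
    ).fetchone()
    return completed / total if total else 0.0


# Made-up sample entries, not real AhaMove data.
SAMPLE_OPLOG = [
    {"op": "i", "ns": "app.orders", "o": {"_id": "o1", "status": "ASSIGNING", "district": "D1"}},
    {"op": "i", "ns": "app.orders", "o": {"_id": "o2", "status": "ASSIGNING", "district": "D3"}},
    {"op": "u", "ns": "app.orders", "o2": {"_id": "o1"}, "o": {"$set": {"status": "COMPLETED"}}},
    {"op": "i", "ns": "app.orders", "o": {"_id": "o3", "status": "CANCELLED", "district": "D1"}},
    {"op": "d", "ns": "app.orders", "o": {"_id": "o3"}},
]

if __name__ == "__main__":
    replica = build_replica(SAMPLE_OPLOG)
    print(fulfillment_rate(replica))
```

Once the data sits in relational tables, a metric such as the fulfillment rate becomes a single SQL aggregate, which is the kind of query a BI tool with a native SQL mode can save and chart.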