Uber System Architecture
We are all familiar with Uber's service: a user requests a ride through
the application, and within a few minutes a driver arrives near their
location to take them to their destination.
Before 2014, the total amount of data stored at Uber was small enough
to fit into a few traditional OLTP databases. There was no global view
of the data, and data access was fast since each database was queried
directly.
With Uber’s business growing exponentially (both in the number of
cities and countries served and in the number of riders and drivers), the
amount of incoming data also increased, and the need to access and analyze
all the data in one place arose.
Uber chose Vertica as its data warehouse software because of its fast,
scalable, column-oriented design. The team also developed multiple ad hoc
ETL (Extract, Transform, and Load) jobs that copied data from different
sources (e.g. AWS S3, OLTP databases, service logs) into Vertica.
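The shape of one such ad hoc ETL job can be sketched as follows. This is an illustrative sketch only, not Uber's code: the record fields (`trip_id`, `city`, `fare`) and the `load_rows` callback standing in for a batched load into Vertica are all assumptions.

```python
import json

def transform(record: dict) -> tuple:
    # "Transform" step: pick and flatten the fields the warehouse
    # table expects from the raw JSON record.
    return (record["trip_id"], record["city"], record["fare"])

def run_etl(raw_objects, load_rows):
    # "Extract" step: each raw object is a newline-delimited JSON
    # blob, as it might arrive from an object store like S3.
    rows = []
    for blob in raw_objects:
        for line in blob.splitlines():
            rows.append(transform(json.loads(line)))
    # "Load" step: hand the batch to the warehouse writer,
    # e.g. a batched INSERT or COPY into Vertica.
    load_rows(rows)
    return len(rows)
```

Because each team wrote jobs like this independently, the transformation logic lived inside the job itself, a point the limitations below return to.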
To make this data accessible, they standardized on SQL as the solution’s
interface and built an online query service that accepted user queries and
submitted them to the underlying query engine.
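The core idea, SQL as the single user-facing interface with a thin service routing statements to whatever engine sits underneath, can be sketched like this. The `query_service` function is hypothetical, and `sqlite3` stands in for Vertica only so the sketch is runnable.

```python
import sqlite3

def query_service(sql: str, conn) -> list:
    # In a real deployment this layer would authenticate users, queue
    # work, and route to the warehouse engine; here it simply
    # executes the SQL and returns the rows.
    return conn.execute(sql).fetchall()

# Stand-in warehouse with a toy trips table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (city TEXT, fare REAL)")
conn.executemany("INSERT INTO trips VALUES (?, ?)",
                 [("SF", 12.5), ("NYC", 30.0), ("SF", 7.5)])

# Users only ever see SQL, never the engine behind it.
rows = query_service("SELECT city, SUM(fare) FROM trips GROUP BY city", conn)
```

The design choice is that swapping the engine (Vertica then, Presto or Hive later) does not change what users write.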
Limitations:
Data reliability became a concern, as data was ingested through ad hoc ETL
jobs and there was no formal schema communication mechanism.
Most of the source data was in JSON format, and the ingestion jobs were not
resilient to changes in the producer code.
Scaling the data warehouse became increasingly expensive. To cut down on
costs, the team started deleting older, obsolete data to free up space for
new data.
The same data could be ingested multiple times if different users performed
different transformations during ingestion.
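The schema-fragility limitation can be made concrete with a small sketch (illustrative only; the field names are assumptions). Without a schema contract, the ingestion job hard-codes field names, so a rename on the producer side breaks it, and only at run time, record by record:

```python
import json

def ingest(line: str) -> tuple:
    rec = json.loads(line)
    # The job silently assumes this exact schema.
    return (rec["trip_id"], rec["fare_usd"])

old = '{"trip_id": 1, "fare_usd": 12.5}'
new = '{"trip_id": 2, "fare": 12.5}'   # producer renamed the field

ingest(old)  # works
try:
    ingest(new)
except KeyError as exc:
    # Nothing flagged the change upstream; the job just fails here.
    failure = str(exc)
```

A formal schema communication mechanism would surface such a change before it reached the ingestion jobs.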
The arrival of Hadoop:
To address these limitations, Uber re-architected its Big Data platform
around the Hadoop ecosystem.
More specifically, they introduced a Hadoop data lake where all raw data was
ingested from the different online data stores only once, with no
transformation during ingestion.
To let users access the data in Hadoop, they introduced:
Presto, to enable interactive ad hoc user queries;
Apache Spark, to facilitate programmatic access to raw data; and
Apache Hive, to serve as the workhorse for extremely large queries.
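The ingest-raw-once, shape-on-read pattern behind this architecture can be sketched as follows. Plain Python stands in for the lake and for Spark-style programmatic access; the function names and file layout are assumptions, not Uber's implementation.

```python
import json
import tempfile
from pathlib import Path

def ingest_raw(lake: Path, source: str, lines: list) -> Path:
    # Raw JSON lands in the lake exactly once, byte-for-byte:
    # no transformation on write.
    path = lake / f"{source}.jsonl"
    path.write_text("\n".join(lines))
    return path

def scan(lake: Path, predicate):
    # Programmatic access in the spirit of Spark: structure is
    # applied on read, so every consumer sees the same raw data.
    for f in sorted(lake.glob("*.jsonl")):
        for line in f.read_text().splitlines():
            rec = json.loads(line)
            if predicate(rec):
                yield rec

lake = Path(tempfile.mkdtemp())
ingest_raw(lake, "trips",
           ['{"city": "SF", "fare": 12.5}', '{"city": "NYC", "fare": 30.0}'])
expensive = list(scan(lake, lambda r: r["fare"] > 20))
```

Because transformation happens on read rather than on write, the duplicate-ingestion problem of the warehouse era disappears: every user transforms the same single raw copy.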