Gimel Data Platform is an analytics platform developed by PayPal that aims to simplify data access and analysis. The presentation provides an overview of Gimel, including PayPal's analytics ecosystem, the challenges Gimel addresses in data access and application lifecycle management, a demo of a sample flights cancelled use case using Gimel, and PayPal's plans to open source Gimel.
5. PayPal Big Data Platform
5
160+ PB Data
75,000+ YARN
jobs/day
One of the largest
Aerospike,
Teradata,
Hortonworks and
Oracle installations
Compute supported:
MR, Pig, Hive, Spark,
Beam
13 prod clusters, 12 non-
prod clusters
GPU co-located with
Hadoop
6. 6
Developer Data scientist Analyst Operator
Gimel SDK Notebooks
PCatalog Data API
Infrastructure services leveraged for elasticity and redundancy
Multi-DC Public cloudPredictive resource allocation
Logging
Monitoring
Alerting
Security
Application
Lifecycle
Management
Compute
Frameworkand
APIs
GimelData
Platform
User
Experience
andAccess
R Studio BI tools
22. Q&A
G i t h u b : h t t p : / / g i m e l . i o
Tr y i t y o u r s e l f : h t t p : / / t r y. g i m e l . i o
S l a c k : h t t p s : / / g i m e l - d e v. s l a c k . c o m
G o o g l e G r o u p s : h t t p s : / / g r o u p s . g o o g l e . c o m / d / f o r u m / g i m e l - d e v
22