Akash Mishra
Real Time API
delivering Data @ Scale
Agenda
API Overview
Key System Requirements
Big Data System Vs RDBMS
Architecture
Data Flow
Questions?
API Overview
API details
REST based API
Partners can request various types of reports
Each report draws on data on the order of terabytes (TB)
Sample Request
?start-date=2012-10-01&end-date=2012-10-29&partner=1&aggregate-by=state,city
Response
Zip file [size on the order of 10-30 MB]
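The sample request above can be read as four query parameters. A minimal sketch of parsing and validating them in Java follows; the class name `ReportRequest`, its fields, and the validation rules are assumptions made for illustration and do not come from the slides. URL decoding is left out for brevity.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical value object for one report request. Field names mirror the
// query parameters of the sample request; everything else is an assumption.
final class ReportRequest {
    final LocalDate startDate;
    final LocalDate endDate;
    final int partner;
    final List<String> aggregateBy;

    private ReportRequest(LocalDate startDate, LocalDate endDate,
                          int partner, List<String> aggregateBy) {
        this.startDate = startDate;
        this.endDate = endDate;
        this.partner = partner;
        this.aggregateBy = Collections.unmodifiableList(aggregateBy);
    }

    // Parses a raw query string such as "?start-date=...&end-date=...".
    static ReportRequest parse(String query) {
        String q = query.startsWith("?") ? query.substring(1) : query;
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : q.split("&")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Malformed parameter: " + pair);
            }
            params.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        // ISO dates (yyyy-MM-dd) are what LocalDate.parse expects by default.
        LocalDate start = LocalDate.parse(require(params, "start-date"));
        LocalDate end = LocalDate.parse(require(params, "end-date"));
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end-date precedes start-date");
        }
        int partner = Integer.parseInt(require(params, "partner"));
        List<String> aggregateBy =
                Arrays.asList(require(params, "aggregate-by").split(","));
        return new ReportRequest(start, end, partner, aggregateBy);
    }

    private static String require(Map<String, String> params, String name) {
        String value = params.get(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("Missing parameter: " + name);
        }
        return value;
    }
}
```

Rejecting malformed input before any query runs keeps bad requests from consuming part of the response-time budget.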
Key System Requirements
Interactive Filtering Query
– Partners can filter data on various parameters.
Real Time Response
– SLA of 1-3 min.
Security
Extremely private and confidential data.
Must pass an audit by an external vendor.
Scalability
Serving more customers should require only adding more machines.
Big Data System Vs Relational Data System
Large amount of data [on the order of TB]
Hadoop/Hive
RDBMS
Real Time Interactive Filtering/Querying
Hadoop/Hive
RDBMS
Joins between large tables [millions x millions x millions]
– Hadoop/Hive
– RDBMS
Big Data System Vs Relational Data System
Access/Security Control
Hadoop/Hive
RDBMS
Resilient to Hardware failure and Auto Scaling
Hadoop/Hive
RDBMS
Fast read operations
– Hadoop/Hive
– RDBMS
Architecture
Data Flow
De-normalization on Hadoop/Hive
Time: 3hrs
#Records: 230m
Data Flow
Dynamic partitioning on Hadoop/Hive
#Buckets: 15
#Records: 230m
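Bucketing generally assigns each row to a bucket by hashing a chosen column and taking the result modulo the bucket count. A minimal Java sketch of that idea for 15 buckets follows. The slides do not say which column is the key, so a string key is assumed here, and Java's `String.hashCode()` is used purely for illustration; it is not necessarily the hash Hive applies.

```java
// Illustrative only: spreads keys over a fixed number of buckets by
// hash-modulo. The key type and the hash function are assumptions.
final class Bucketing {
    static final int NUM_BUCKETS = 15; // bucket count stated on the slide

    static int bucketFor(String key) {
        // Masking the sign bit keeps the result non-negative even when
        // hashCode() is negative.
        return (key.hashCode() & Integer.MAX_VALUE) % NUM_BUCKETS;
    }
}
```

With a reasonably uniform hash, each of the 15 buckets would hold roughly 230m / 15 records, about 15 million rows per bucket.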
Data Flow
Sqoop Export
#Records: 230m
Size: 1 TB
Data Flow
Security Control in RDBMS
Strong User authentication mechanism.
Restricted access to each user on database and table level
Each partner has specific user and associated tables
No cross-referencing of data across partners [tables].
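The slides place these controls in the RDBMS itself (per-partner users and table-level access). An API layer could mirror the same rule so that a request never reaches another partner's table. A minimal sketch follows; the class `PartnerAccess`, its methods, and the user and table names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application-side mirror of the database rules: one database
// user and one table per partner, and no lookups across partners.
final class PartnerAccess {
    // partner id -> {database user, table name}; values are illustrative.
    private final Map<Integer, String[]> grants = new HashMap<>();

    void register(int partnerId, String dbUser, String table) {
        grants.put(partnerId, new String[] {dbUser, table});
    }

    // Database user the API should connect as for this partner.
    String dbUserFor(int partnerId) {
        return lookup(partnerId)[0];
    }

    // Returns the partner's table, refusing any cross-partner request.
    String tableFor(int authenticatedPartnerId, int requestedPartnerId) {
        if (authenticatedPartnerId != requestedPartnerId) {
            throw new SecurityException("Cross-partner access denied");
        }
        return lookup(requestedPartnerId)[1];
    }

    private String[] lookup(int partnerId) {
        String[] grant = grants.get(partnerId);
        if (grant == null) {
            throw new IllegalArgumentException("Unknown partner: " + partnerId);
        }
        return grant;
    }
}
```

This check is only a second line of defense; the database grants remain the control that an audit would examine.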
Data Flow
Java API
Common Pattern [Streaming]
• Read a bunch of records from DB.
• Process records.
• Stream back to client.
Avoid creating unnecessary objects
• Java heap memory exception caused by using String in place of a char array.
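The read, process, stream pattern above can be sketched as follows. This is an illustration under stated assumptions, not the presenter's code: an `Iterator<String[]>` stands in for rows fetched from the database, the entry name `report.csv` and the batch size are made up, and the output is a single-entry zip to match the response format described earlier. One `StringBuilder` and one `char[]` are reused for every row, so no `String` is allocated per line.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class ReportStreamer {
    // Hypothetical number of rows written between flushes to the client.
    static final int BATCH_SIZE = 1000;

    // Writes rows as CSV lines into a one-entry zip and returns the row count.
    // The caller owns 'out' and is responsible for closing it.
    static long stream(Iterator<String[]> rows, OutputStream out) throws IOException {
        ZipOutputStream zip = new ZipOutputStream(out);
        zip.putNextEntry(new ZipEntry("report.csv"));
        Writer writer = new OutputStreamWriter(zip, StandardCharsets.UTF_8);

        StringBuilder line = new StringBuilder(256); // reused for every row
        char[] buffer = new char[8192];              // reused for every row
        long count = 0;

        while (rows.hasNext()) {
            String[] row = rows.next();
            line.setLength(0);
            for (int i = 0; i < row.length; i++) {
                if (i > 0) {
                    line.append(',');
                }
                line.append(row[i]);
            }
            line.append('\n');

            // Copy into the char buffer rather than building a new String.
            int length = line.length();
            if (length > buffer.length) {
                buffer = new char[length];
            }
            line.getChars(0, length, buffer, 0);
            writer.write(buffer, 0, length);

            if (++count % BATCH_SIZE == 0) {
                writer.flush(); // push the finished batch toward the client
            }
        }
        writer.flush();
        zip.closeEntry();
        zip.finish();
        return count;
    }
}
```

Because each row is written and released before the next is read, heap use stays roughly constant regardless of how many rows the report contains.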
Questions?
