OLAP QUERIES IN SECONDS ON A PETABYTE DATASET
Distributing Petabucket data using CephFS
Milosz Tanski, CTO @AdFin
milosz@adfin.com 
October 2014
Outline
• Who/what is AdFin?
• What is Petabucket?
• Petabucket on CephFS
• Contributing FSCache support to CephFS
2 ©AdFin. All Rights Reserved
About AdFin
• = Ad-Tech + Finance-Tech
• Creating tools that bring buying intelligence to programmatic media.
• Advertising is bought and sold in real time via RTB (since 2008).
• Bringing transparency to the ad markets.
• The Bloomberg, S&P, Markit… for ad markets.
We Deliver… Pretty Analytics
[Slides 4-7: images only]
What’s the problem?
• Market is ~500 billion impressions a day, and it’s growing.
• Each impression is unique.
• Each is worth a small fraction of a penny.
• An order of magnitude more than the number of trades in the financial markets.
• There’s an order of magnitude more bids for those impressions.
• That’s a lot of data to process, store, and analyze.
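To put the first bullet in perspective, the daily figure can be converted to an average per-second rate. This is a minimal sketch: only the ~500 billion impressions a day comes from the slide, the helper name is illustrative, and the result assumes impressions are spread evenly over the day (real traffic will peak well above the average).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(events_per_day: float) -> float:
    """Average events per second, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


impressions_per_day = 500e9  # ~500 billion, the figure quoted on the slide
rate = per_second(impressions_per_day)
print(f"{rate / 1e6:.2f} million impressions per second on average")
# prints "5.79 million impressions per second on average"
```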
Petabucket
• Distributed, time-series, relational OLAP database
• Relational query language (but not SQL)
• Each query is broken up into many smaller chunks
• Great single-node performance: tens of millions of rows a second
• Vectorized query processing, vectorized compressed bitmap indexes
• Responses in real time. Goal is low single-digit seconds (uncached)
• Why? Because we’re a bit crazy.
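The two ideas on this slide, splitting a query into chunks and filtering with bitmap indexes, can be sketched in a few lines. This is a toy illustration and not Petabucket's implementation: the column names, the `Chunk` class, and `run_query` are hypothetical, the bitmaps are plain Python integers (uncompressed and not vectorized), and a thread pool stands in for a cluster of nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

INDEXED_COLUMNS = ("exchange", "country")  # hypothetical dimension columns


class Chunk:
    """One time slice of a table, with a bitmap per (column, value) pair."""

    def __init__(self, rows):
        self.rows = rows
        self.index = defaultdict(int)  # (column, value) -> bitmap; bit i = row i
        for i, row in enumerate(rows):
            for col in INDEXED_COLUMNS:
                self.index[(col, row[col])] |= 1 << i

    def sum_where(self, filters, measure):
        """AND the bitmaps of all filters, then sum the measure over set bits."""
        bitmap = (1 << len(self.rows)) - 1  # start with every row selected
        for col, value in filters.items():
            bitmap &= self.index.get((col, value), 0)
        total, i = 0, 0
        while bitmap:
            if bitmap & 1:
                total += self.rows[i][measure]
            bitmap >>= 1
            i += 1
        return total


def run_query(chunks, filters, measure):
    """Scatter the query over all chunks, then merge the partial sums."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda c: c.sum_where(filters, measure), chunks)
        return sum(partials)


day1 = Chunk([
    {"exchange": "A", "country": "US", "bids": 3},
    {"exchange": "B", "country": "US", "bids": 5},
])
day2 = Chunk([
    {"exchange": "A", "country": "US", "bids": 7},
    {"exchange": "A", "country": "DE", "bids": 11},
])

total = run_query([day1, day2], {"exchange": "A", "country": "US"}, "bids")
print(total)  # prints 10
```

Because each chunk returns a partial sum, the merge step is a plain addition. The same shape works for counts, minimums, and maximums, while averages need a (sum, count) pair per chunk.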
Queries easy for humans / machines
[Slide 10: image only]
High Level System Diagram
[Slide 11: image only]
Time series bulk import
[Slide 12: image only]
Petabucket and CephFS
• CephFS as single-namespace storage for nodes
• Why?
  ◦ Scalable storage (speed / size)
  ◦ Separate storage from computation
  ◦ No SPOF
  ◦ DFS performance
  ◦ Client (kernel) performance
High Level System Diagram, part 2
[Slide 14: image only]
CephFS is not production ready?
• Again, we’re a bit crazy?
• Started in early 2013.
• When we started, the client and MDS were not ready.
• We found and reported a lot of bugs.
• Yan Zheng fixed a lot of bugs. Thanks, Yan.
• Today we’re happy and in production.
• Processed multiple PB of data since then.
FSCache for kclient
• We decided to add local persistent caching support to the kclient.
• Our access pattern:
  ◦ Working set larger than node memory (page cache)
  ◦ Append-only data (time series)
  ◦ Most recent month or quarter of data is accessed 100x more often
• Benefits:
  ◦ Reduces the latency / speed lost by moving to a non-local filesystem
  ◦ Reduces Ceph network traffic and OSD utilization
  ◦ Cheap local SSD drives get 500 MB/s read performance
  ◦ Not reinventing the wheel
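The access pattern above suggests why a local cache of recent data pays off. The following is a back-of-the-envelope sketch: only the 100x skew comes from the slide. The one-year retention window, uniform access within the hot and cold sets, and the function name are assumptions made for illustration.

```python
def hot_access_share(hot_fraction: float, skew: float = 100.0) -> float:
    """Share of all accesses that land on the hot (recent) data.

    hot_fraction: fraction of the dataset that counts as recent (0..1).
    skew: how many times more often a hot byte is read than a cold byte.
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    hot_weight = skew * hot_fraction
    cold_weight = 1.0 - hot_fraction
    return hot_weight / (hot_weight + cold_weight)


# Assumed: one year of data is retained, so a month is 1/12 and a quarter 1/4.
for label, fraction in (("month", 1 / 12), ("quarter", 1 / 4)):
    print(f"most recent {label}: {hot_access_share(fraction):.1%} of accesses")
# prints:
# most recent month: 90.1% of accesses
# most recent quarter: 97.1% of accesses
```

Under these assumptions, a local cache large enough to hold the most recent month would serve roughly nine out of ten reads without touching the network.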
Kernel programming is hard
• Have to understand Ceph, the kernel, and concurrency.
• An error in the kernel hangs or oopses your machine.
• Bugs in other parts of the kernel? (CacheFS)
• Prototype working in two weeks.
• First submission 2 months later.
• In the kernel 5 months later.
• Number one problem: concurrency.
Ceph with FSCache Status
• In since: 3.13
• … Works well since: 3.15
• … All bugs fixed: 3.17
• Speed… as fast as your caching disk
• Tested single-client performance: 1200 MB/s
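The version milestones above can be turned into a quick check for a given kernel. This is a small sketch: the version numbers and status wording are taken from the slide, while the function names and the example release strings are illustrative.

```python
import re

# Kernel milestones as stated on the slide, newest first.
MILESTONES = [
    ((3, 17), "all bugs fixed"),
    ((3, 15), "works well"),
    ((3, 13), "in, but early"),
]


def parse_release(release: str) -> tuple:
    """Extract (major, minor) from a release string such as '3.16.0-4-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def fscache_status(release: str) -> str:
    """Map a kernel release to the Ceph FSCache status listed on the slide."""
    version = parse_release(release)
    for minimum, status in MILESTONES:
        if version >= minimum:
            return status
    return "not available"


for release in ("3.12.1", "3.13.0", "3.16.0-4-amd64", "4.4.0"):
    print(release, "->", fscache_status(release))
```

On a Linux machine, `platform.release()` from the standard library returns the running kernel's release string in the format this parser expects.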
Next steps…
• Contributing to Ceph & the kernel is addictive:
  ◦ Ceph performance work: improving latency / IOPS.
  ◦ Kernel work: readv2() syscall (file-serving applications).
  ◦ http://lwn.net/Articles/612483/
Thank You!
Let’s Get in Touch 
Milosz Tanski 
CTO 
milosz@adfin.com 
16 E. 34th Street, 15th Floor 
New York, New York 10016 
linkedin.com/company/AdFin 
twitter.com/AdFin

Ceph Day New York 2014: Distributed OLAP queries in seconds using CephFS

  • 1. OLAP ON QERIES IN SECONDS ON PETABYTE DATASET Distributing Petabucket data using CephFS Milosz Tanski, CTO @Adfin milosz@adfin.com October 2014
  • 2. Outline  Who/what is AdFin?  What is PetaBucket?  Petabucket on CephFS  Contributing FSCache support to CephFS 2 ©AdFin. All Rights Reserved
  • 3. About Adfin
      - = Ad-Tech + Finance-Tech
      - Creating tools that bring buying intelligence to programmatic media.
      - Advertising is bought and sold in real time via RTB (since 2008)
      - Bringing transparency to the Ad markets.
      - The Bloomberg, S&P, Markit… for Ad markets.
  • 4. We Deliver… Pretty Analytics
  • 5. We Deliver… Pretty Analytics
  • 6. We Deliver… Pretty Analytics
  • 7. We Deliver… Pretty Analytics
  • 8. What’s the problem?
      - Market is ~500 billion impressions a day, and it’s growing.
      - Each impression is unique.
      - Each is worth a small fraction of a penny.
      - Orders of magnitude more than the number of trades in the financial markets.
      - There’s an order of magnitude more bids for those impressions.
      - That’s a lot of data to process, store, and analyze.
  • 9. Petabucket
      - Distributed, time series, relational, OLAP database
      - Relational query language (but not SQL)
      - Query is broken up into many smaller chunks
      - Great single-node performance: tens of millions of rows per second.
      - Vectorized query processing, vectorized compressed bitmap indexes.
      - Responses in real time. Goal is low single-digit seconds (uncached)
      - Why? Because we’re a bit crazy.
  • 10. Queries easy for humans / machines
  • 11. High Level System Diagram
  • 12. Time series bulk import
  • 13. Petabucket and CephFS
      - CephFS as a single namespace storage for nodes
      - Why?
          - Scalable storage (speed / size)
          - Separate storage from computation
          - No SPOF
          - DFS performance
          - Client (kernel) performance
  • 14. High Level System Diagram, part 2
  • 15. CephFS is not production ready?
      - Again, we’re a bit crazy?
      - Started in early 2013.
      - When we started, the client and MDS were not ready.
      - We found and reported a lot of bugs.
      - Yan Zheng fixed a lot of bugs. Thanks Yan.
      - Today we’re happy and in production.
      - Processed multiple PB of data since then.
  • 16. FSCache for kclient
      - We decided to add local persistent caching support to the kclient.
      - Our access pattern:
          - Working set larger than node memory (page cache)
          - Append-only data (time series)
          - Most recent month or quarter of data is accessed 100x more often
      - Benefits:
          - Reducing latency / speed lost by moving to a non-local filesystem
          - Reduce Ceph network traffic and OSD utilization
          - Cheap local SSD drives get 500 MB/s read performance
          - Not re-inventing the wheel
  • 17. Kernel programming is hard
      - Have to understand Ceph, kernel, concurrency.
      - An error in the kernel hangs or oopses your machine.
      - Bugs in other parts of the kernel? (CacheFS).
      - Prototype working in two weeks.
      - First submission 2 months later.
      - In kernel 5 months later.
      - Number one problem: concurrency.
  • 18. Ceph with FSCache Status
      - In since: 3.13
      - … Works well since: 3.15
      - … All bugs fixed: 3.17
      - Speed… as fast as your caching disk
      - Tested single-client performance: 1200 MB/s
  • 19. Next steps…
      - Contributing to Ceph & kernel is addicting:
          - Ceph performance work. Improving latency / IOPS.
          - Kernel work: readv2() syscall. File serving applications
          - http://lwn.net/Articles/612483/
  • 21. Let’s Get in Touch
      Milosz Tanski, CTO
      milosz@adfin.com
      16 E. 34th Street, 15th Floor, New York, New York 10016
      linkedin.com/company/AdFin
      twitter.com/AdFin
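
Slide 9 says a query is broken up into many smaller chunks, and the speaker notes add that data is sharded into partitions by time and that most queries are sums or group-bys. The following is a minimal Python sketch of that idea, not Petabucket's actual implementation: the `PARTITIONS` table, the row layout, and the function names are all hypothetical, and a thread pool stands in for the many nodes a real deployment would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from datetime import date

# Hypothetical time-partitioned table: one list of (key, value) rows per day.
PARTITIONS = {
    date(2014, 10, 1): [("x1", 10), ("x2", 5)],
    date(2014, 10, 2): [("x1", 7), ("x3", 2)],
    date(2014, 10, 3): [("x2", 4)],
}


def partial_sum(rows):
    """Group-by sum over a single partition (one query chunk)."""
    acc = Counter()
    for key, value in rows:
        acc[key] += value
    return acc


def query_sum_by_key(partitions, start, end):
    """Run one chunk per partition in [start, end], then merge the partial sums."""
    chunks = [rows for day, rows in partitions.items() if start <= day <= end]
    total = Counter()
    with ThreadPoolExecutor() as pool:
        for partial in pool.map(partial_sum, chunks):
            total.update(partial)  # Counter.update adds counts per key
    return dict(total)
```

Because sums are associative, each partition can be aggregated independently and the partial results merged in any order, which is what makes this kind of query easy to spread across nodes.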

Editor's Notes

  1. Who is Adfin? What special sauce did we build … a very large OLAP DB. Goals: have you take a look at CephFS … I might be one of the few people talking about it. Realize that it’s possible for your organization to develop some expertise in-house… and contribute.
  2. Name implies a combination of Advertising + Finance Markets. Two hometown industries (Madison Ave and Wall St). Using tools and knowledge pioneered by the financial industry. Most media (by volume) is bought and sold programmatically, à la HFT. It’s an opaque marketplace. Bloomberg … information platform; S&P … indices; Markit … aggregating market data (CDS). I am going to keep butchering these analogies.
  3-6. Pictures of some of the tools we’ve built. Real-time analysis of your own data and market data. Run a query, get a result… lots of variables. Forecasting.
  7. The advertising market is larger than the financial market… in terms of volume of transactions. Each impression is worth a tiny fraction of a penny. When I looked at the number of transactions for an exchange like the NASDAQ… it’s like 50 million, NYSE 100 million. A lot of duct tape, but also a lot of efficiency. This number is not getting smaller. All advertising is going to be digitally bought and sold, and that day is coming.
  8-9. Distributed, relational database for running real-time analytics queries on very large time series data. KDB on many, many nodes. Some fun things. It’s a relational model, but not SQL. 90% of queries are sums or group-bys. Data is sharded into partitions by time and spread across many nodes. We get pretty amazing single-node performance: 100s of millions of rows a second per partition. There’s been a lot of research into this stuff. Based on research into compression, indexing, and query processing, all from the last 3 to 4 years. For large datasets our goal is to answer in under 10 seconds for really large queries. Reality is most things we do answer in under 1 second. Why? Because the dataset is huge. Also, we’re a bit crazy.
  10. Before, we were storing it all on local disks. A couple of problems: Redundancy? Can’t grow computation without storage, and vice versa. Looked into Ceph: scalable storage, just throw more machines at it… don’t worry about topology too much. We could separate storage from computation. No SPOF, redundancy everywhere. Pretty good speed for a DFS. We can leverage the kernel: the kernel client versus doing it directly. Page cache etc… Common theme.
  11. “Beta company, okay using a beta product.” We can get under the hood. Early start was a bit rough. There were lots of bugs. We found lots of bugs. Community was great, esp. Yan. Yan fixed our last bug around the end of 2013… haven’t had a single problem since. We’re not storing multi-PB yet, but we’ve processed multi-PB and haven’t had a problem.
  12. We lost some performance as a result of this. Network latency, overhead, Ceph overhead. We can also go even cheaper without Ceph nodes / network. Our access pattern: write once, read many (mostly true). Most recent data is most often used (working set larger than RAM, smaller than the full DFS). The Linux kernel people really put hundreds of man-years into scalability.
  13. I don’t want to discourage anybody … we did something not smart, picked the hardest problem. It required us to know a lot of things about Ceph, kernel, concurrency. I would pick something simpler next time. There are bugs in the other parts of the kernel? So one of the reasons we wanted to do this work in the kernel was concurrency, so our benefit was also our PITA.
  14. We got it up to the Ceph code base around 3.13. Bunch of bug fixes from external folks. We’ve exposed issues with FSCache code. We’ve fixed a bunch of concurrency bugs that only happen in the error path of FSCache under VMA pressure. A lot of filesystems benefit. We’re really happy with performance… we’ve made a good bet on the kernel. We’re able to really push the fscache up to the speed of the disks we have.
  15. So despite the initial learning curve … we want to contribute work. Where we can leverage our knowledge … performance. We’ve built a lot of things in our system for improving latency. Learned what to do, what not to do, where to apply lockless algorithms. Readv2 syscall… Helps all applications that do both IO- and CPU-bound work.
  16. Thanks for listening to me. Hopefully it was a good story of what we’re up to… how we’re leveraging Ceph. Motivating to help and contribute. It’s nice to have a vendor you can call up and yell at when things are not working, but it’s even better to be able to guide the tool to do what you want. The Ceph community is great, there are so many people contributing to so many different projects.
  17. Contact info
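
Slide 19 and note 15 mention a proposed readv2() syscall aimed at file-serving applications that mix IO-bound and CPU-bound work. As a rough illustration of that pattern, the sketch below uses the closest interface available from Python today, `os.preadv` with the `os.RWF_NOWAIT` flag (Linux 4.14+, Python 3.7+), which is an assumption standing in for the proposal and not the readv2() interface from the talk. The function name `read_cached_or_defer` is hypothetical. It tries a read that only succeeds when the data is already in the page cache and otherwise hands the blocking read to a worker thread.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# os.RWF_NOWAIT exists only on Linux builds of Python 3.7+; 0 disables the fast path.
NOWAIT = getattr(os, "RWF_NOWAIT", 0)


def read_cached_or_defer(fd, length, offset, pool):
    """Return up to `length` bytes at `offset`, preferring a non-blocking cached read."""
    if NOWAIT and hasattr(os, "preadv"):
        buf = bytearray(length)
        try:
            # Raises (EAGAIN) instead of sleeping when the data is not in the page cache.
            n = os.preadv(fd, [buf], offset, NOWAIT)
            if n == length:
                return bytes(buf)
        except (OSError, NotImplementedError):
            pass  # not cached, or the kernel/filesystem lacks support
    # Slow path: ordinary blocking read on a worker thread.
    return pool.submit(os.pread, fd, length, offset).result()
```

A real server would not wait on the future inline as this sketch does; it would keep serving other requests from its main loop and pick up the result when the worker finishes.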