PG-Strom is an extension of PostgreSQL that utilizes GPUs and NVMe SSDs to enable terabyte-scale data processing and in-database analytics. It features SSD-to-GPU Direct SQL, which loads data directly from NVMe SSDs to GPUs using RDMA, bypassing CPU and RAM. This improves query performance by reducing I/O traffic over the PCIe bus. PG-Strom also uses Apache Arrow columnar storage format to further boost performance by transferring only referenced columns and enabling vector processing on GPUs. Benchmark results show PG-Strom can process over a billion rows per second on a simple 1U server configuration with an NVIDIA GPU and multiple NVMe SSDs.
KaiGai's talk at PGconf.EU 2018, Lisbon.
It shows how PG-Strom's SSD-to-GPU Direct SQL accelerates I/O-intensive big-data queries with a GPU, contrary to common sense.
20201006_PGconf_Online_Large_Data_Processing
1. Data processing more than billion rows per second
~unleash full capability of GPU and NVME~
HeteroDB,Inc
Chief Architect & CEO
KaiGai Kohei <kaigai@heterodb.com>
2. Self-Introduction
KaiGai Kohei (海外浩平)
Chief Architect & CEO of HeteroDB
Contributor of PostgreSQL (2006-)
Primary Developer of PG-Strom (2012-)
Interested in: Big-data, GPU, NVME/PMEM, ...
about myself
about our company
Established: 4th-Jul-2017
Location: Shinagawa, Tokyo, Japan
Businesses:
✓ Sales & development of high-performance data-processing
software on top of heterogeneous architecture.
✓ Technology consulting service on GPU&DB area. PG-Strom
PostgreSQL Webinar Series 2020 - Data processing more than billion rows per second
3. What is PG-Strom?
Transparent GPU acceleration for analytics and reporting workloads
Binary code generation by JIT from SQL statements
PCIe-bus level optimization by SSD-to-GPU Direct SQL technology
Columnar-storage for efficient I/O and vector processors
PG-Strom is an extension of PostgreSQL for terabyte-scale data processing
and in-database analytics, utilizing GPUs and NVME-SSDs.
App
GPU
off-loading
for IoT/Big-Data
for ML/Analytics
➢ SSD-to-GPU Direct SQL
➢ Columnar Store (Arrow_Fdw)
➢ Asymmetric Partition-wise
JOIN/GROUP BY
➢ BRIN-Index Support
➢ NVME-over-Fabric Support
➢ PostGIS + GiST Support (WIP)
➢ GPU Memory Store(WIP)
➢ Procedural Language for
GPU native code.
➢ IPC on GPU device memory
over cuPy data frame.
4. PG-Strom as an open source project
Official documentation in English / Japanese
http://heterodb.github.io/pg-strom/
Many stars ☺
Distributed under GPL-v2.0
Has been developed since 2012
▌Supported PostgreSQL versions
PostgreSQL v12, v11, and v10. WIP for v13 support
▌Related contributions to PostgreSQL
Writable FDW support (9.3)
Custom-Scan interface (9.5)
FDW and Custom-Scan JOIN pushdown (9.5)
▌Talks at PostgreSQL community
PGconf.EU 2012 (Prague), 2015 (Vienna), 2018 (Lisbon)
PGconf.ASIA 2018 (Tokyo), 2019 (Bali, Indonesia)
PGconf.SV 2016 (San Francisco)
PGconf.China 2015 (Beijing)
6. Target: IoT/M2M Log-data processing
Manufacturing Logistics Mobile Home electronics
All kinds of devices generate various forms of log-data.
People want/try to find out insight from the data.
Data importing and processing must be rapid.
System administration should be simple and easy.
JBoF: Just Bunch of Flash
NVME over
Fabric
(RDMA)
DB Admin
BI Tools
Machine-learning
applications
e.g, anomaly detection
shared
data-frame
PG-Strom
7. How much data size we expect? (1/3)
▌This is a factory of an international top-brand company.
▌Most users do not have more data than this.
Case: A semiconductor factory managed up to 100TB with PostgreSQL
Total 20TB
Per Year
Kept for
5 years
Defect
Investigation
8. How much data size we expect? (2/3)
Nowadays, a 2U server is capable of more than 100TB of capacity
model Supermicro 2029U-TN24R4T Qty
CPU Intel Xeon Gold 6226 (12C, 2.7GHz) 2
RAM 32GB RDIMM (DDR4-2933, ECC) 12
GPU NVIDIA Tesla P40 (3840C, 24GB) 2
HDD Seagate 1.0TB SATA (7.2krpm) 1
NVME Intel DC P4510 (8.0TB, U.2) 24
N/W built-in 10GBase-T 4
8.0TB x 24 = 192TB
How much is it?
60,932USD
by thinkmate.com
(31st-Aug-2019)
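As a sanity check on the slide's own numbers (the thinkmate.com price as quoted), the raw capacity and the resulting cost per terabyte work out as follows:

```python
# Raw capacity and cost per TB of the 2U configuration above.
drives = 24
tb_per_drive = 8.0            # Intel DC P4510, 8.0TB U.2
total_tb = drives * tb_per_drive
price_usd = 60_932            # thinkmate.com quote, 31-Aug-2019

print(total_tb)                          # 192.0, matching "8.0TB x 24 = 192TB"
print(round(price_usd / total_tb, 2))    # ~317 USD per TB of raw NVME capacity
```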
9. How much data size we expect? (3/3)
On the other hand, it is not a small amount of data, either.
10. Weapons to tackle terabytes scale data
• SSD-to-GPU Direct SQL
Efficient
Storage
• Arrow_Fdw
Efficient
Data Structure
• Table Partitioning
Efficient
Data
Deployment
11. PG-Strom: SSD-to-GPU Direct SQL
Efficient Storage
12. GPU’s characteristics - mostly as a computing accelerator
Over 10 years of history in HPC, then massive popularization in machine learning
NVIDIA Tesla V100
Super Computer
(TITEC; TSUBAME3.0) Computer Graphics Machine-Learning
How can I/O workloads be accelerated by the GPU, which is a computing accelerator?
Simulation
13. A usual composition of x86_64 server
GPUSSD
CPU
RAM
HDD
N/W
14. Data flow to process a massive amount of data
CPU RAM
SSD GPU
PCIe
PostgreSQL
Data Blocks
Normal Data Flow
All the records, including junk, must be loaded
onto RAM once, because software cannot check
the necessity of the rows prior to data loading.
So the amount of I/O traffic over the PCIe bus
tends to be large.
Unless records are loaded to CPU/RAM once, over the PCIe bus,
software cannot check their necessity, even if they are "junk" records.
15. SSD-to-GPU Direct SQL (1/4) – Overview
CPU RAM
SSD GPU
PCIe
PostgreSQL
Data Blocks
NVIDIA GPUDirect RDMA
It allows data blocks on NVME-SSD to be loaded
to the GPU using peer-to-peer DMA over the PCIe bus,
bypassing CPU/RAM.
WHERE-clause
JOIN
GROUP BY
Run SQL on the GPU
to reduce the data size
Data Size: Small
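To make the traffic reduction concrete, here is a minimal sketch comparing the bytes that must reach the host in the normal flow versus when the filtering runs on the GPU first. The row size, table size, and selectivity are hypothetical illustration values, not figures from the slides:

```python
# Hypothetical illustration of host-bound data volume with and without
# filter pushdown; all three parameters below are assumptions.
row_size = 100          # bytes per record
n_rows = 1_000_000      # records in the table
selectivity = 0.01      # fraction of rows the WHERE clause keeps

# Normal flow: every block crosses the PCIe bus into RAM before any
# row can be checked, junk records included.
bytes_to_host_normal = n_rows * row_size

# SSD-to-GPU Direct SQL: blocks go SSD -> GPU by peer-to-peer DMA;
# only rows surviving WHERE/JOIN/GROUP BY on the GPU travel to the host.
bytes_to_host_direct = int(n_rows * selectivity) * row_size

print(bytes_to_host_normal)   # 100000000
print(bytes_to_host_direct)   # 1000000
```

With a 1% selectivity, the host-bound traffic shrinks by two orders of magnitude; the SSD-to-GPU leg still reads all blocks, but that leg no longer contends with CPU/RAM.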
16. Background - GPUDirect RDMA by NVIDIA
Physical
address space
PCIe BAR1 Area
GPU
device
memory
RAM
NVMe-SSD Infiniband
HBA
PCIe devices
GPUDirect RDMA
It enables mapping GPU device memory
into the physical address space
of the host system.
Once a physical address of GPU device memory
appears, we can use it as the source or destination
address of DMA with PCIe devices.
0xf0000000
0xe0000000
DMA Request
SRC: 1200th sector
LEN: 40 sectors
DST: 0xe0200000
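The DMA request in the diagram is expressed in sectors; assuming the conventional 512-byte logical block size (an assumption, not stated on the slide), it translates into byte addresses like this:

```python
SECTOR = 512                        # assumed NVME logical block size
src_sector, n_sectors = 1200, 40    # from the diagram's DMA request
dst = 0xE0200000                    # GPU memory mapped into the PCIe BAR1 area

src_byte_offset = src_sector * SECTOR   # byte position on the drive
length = n_sectors * SECTOR             # bytes moved by the peer-to-peer DMA
dst_end = dst + length                  # end of the written GPU region

print(src_byte_offset, length, hex(dst_end))
```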
17. SSD-to-GPU Direct SQL (2/4) – System configuration and workloads
Supermicro SYS-1019GP-TT
CPU Xeon Gold 6126T (2.6GHz, 12C) x1
RAM 192GB (32GB DDR4-2666 x 6)
GPU NVIDIA Tesla V100 (5120C, 16GB) x1
SSD
Intel SSD DC P4600 (HHHL; 2.0TB) x3
(striping configuration by md-raid0)
HDD 2.0TB (SATA; 7.2krpm) x6
Network 10Gb Ethernet 2ports
OS
Red Hat Enterprise Linux 7.6
CUDA 10.1 + NVIDIA Driver 418.40.04
DB
PostgreSQL v11.2
PG-Strom v2.2devel
■ Query Example (Q2_3)
SELECT sum(lo_revenue), d_year, p_brand1
FROM lineorder, date1, part, supplier
WHERE lo_orderdate = d_datekey
AND lo_partkey = p_partkey
AND lo_suppkey = s_suppkey
AND p_category = 'MFGR#12'
AND s_region = 'AMERICA'
GROUP BY d_year, p_brand1
ORDER BY d_year, p_brand1;
customer
12M rows
(1.6GB)
date1
2.5K rows
(400KB)
part
1.8M rows
(206MB)
supplier
4.0M rows
(528MB)
lineorder
2.4B rows
(351GB)
Summarizing queries for typical Star-Schema structure on simple 1U server
18. SSD-to-GPU Direct SQL (3/4) – Benchmark results
Query Execution Throughput = (353GB; DB-size) / (Query response time [sec])
SSD-to-GPU Direct SQL runs the workloads close to the hardware limitation (8.5GB/s)
about 3x faster than the filesystem- and CPU-based mechanism
Star Schema Benchmark on 1x NVIDIA Tesla V100 + 3x Intel DC P4600
Query Execution Throughput [MB/s]:

Query   PostgreSQL 11.2   PG-Strom 2.2devel
Q1_1    2343.7            7880.4
Q1_2    2320.6            7924.7
Q1_3    2315.8            7912.5
Q2_1    2233.1            7725.8
Q2_2    2223.9            7712.6
Q2_3    2216.8            7827.2
Q3_1    2167.8            7398.1
Q3_2    2281.9            7258.0
Q3_3    2277.6            7638.5
Q3_4    2275.9            7750.4
Q4_1    2172.0            7402.8
Q4_2    2198.8            7419.6
Q4_3    2233.3            7639.8
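From the throughput figures above, the per-query speedup of SSD-to-GPU Direct SQL over the filesystem/CPU path can be recomputed directly:

```python
# Throughput [MB/s] per Star Schema Benchmark query, Q1_1 .. Q4_3,
# copied from the benchmark results above.
pg    = [2343.7, 2320.6, 2315.8, 2233.1, 2223.9, 2216.8, 2167.8,
         2281.9, 2277.6, 2275.9, 2172.0, 2198.8, 2233.3]
strom = [7880.4, 7924.7, 7912.5, 7725.8, 7712.6, 7827.2, 7398.1,
         7258.0, 7638.5, 7750.4, 7402.8, 7419.6, 7639.8]

speedups = [s / p for s, p in zip(strom, pg)]
print(round(min(speedups), 2), round(max(speedups), 2))  # roughly 3.2x-3.5x
```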
19. SSD-to-GPU Direct SQL (4/4) – Software Stack
Filesystem
(ext4)
nvme_strom
kernel module
NVMe SSD
PostgreSQL
pg_strom
extension
read(2) ioctl(2)
Operating
System
Software
Layer
Database
Software
Layer
blk-mq
nvme pcie nvme rdma
Network HBA
NVMe
Request
Network HBANVMe SSD
NVME-over-Fabric Target
RDMA over
Converged
Ethernet
GPUDirect RDMA
Translation from logical location
to physical location on the drive
■ Other software
■ Software by HeteroDB
■ Hardware
Special NVME READ command that
loads the source blocks on NVME-
SSD to GPU’s device memory.
External
Storage
Server
Local
Storage
Layer
20. PG-Strom: Arrow_Fdw (columnar store)
Efficient Data Structure
21. Background: Apache Arrow (1/2)
▌Characteristics
Column-oriented data format designed for analytics workloads
A common data format for inter-application exchange
Various primitive data types like integer, floating-point, date/time and so on
PostgreSQL / PG-Strom
NVIDIA GPU
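The analytics benefit of a column-oriented layout can be sketched in plain Python. The table shape below is hypothetical; the point is that a query referencing one column scans only that column's contiguous buffer, while a row-oriented layout touches every row's bytes:

```python
import struct

# A tiny 4-column table of 8-byte values, considered in two layouts
# (the shape and values are assumptions for illustration).
rows = [(i, i * 2.0, i * 3.0, i * 4.0) for i in range(1000)]

# Row-oriented: reading one column still drags along all four fields.
row_bytes = len(rows) * 4 * 8

# Column-oriented (Arrow-style): each column is one contiguous buffer,
# so a query on the second column reads only that buffer.
col = struct.pack(f"{len(rows)}d", *(r[1] for r in rows))
print(row_bytes, len(col))   # 32000 vs 8000 bytes scanned
```

This 4x reduction also matches GPU vector processing well, since each column buffer is a dense, uniformly typed array.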
22. Background: Apache Arrow (2/2)
Apache Arrow Data Types | PostgreSQL Data Types  | extra description
Int                     | int2, int4, int8       |
FloatingPoint           | float2, float4, float8 | float2 is an enhancement by PG-Strom
Binary                  | bytea                  |
Utf8                    | text                   |
Bool                    | bool                   |
Decimal                 | numeric                | Decimal = fixed-point 128bit value
Date                    | date                   | adjusted to unit = Day
Time                    | time                   | adjusted to unit = MicroSecond
Timestamp               | timestamp              | adjusted to unit = MicroSecond
Interval                | interval               |
List                    | array types            | only 1-dimensional arrays are supported
Struct                  | composite types        |
Union                   | (not supported)        |
FixedSizeBinary         | char(n)                |
FixedSizeList           | (not supported)        |
Map                     | (not supported)        |
Most data types are convertible between Apache Arrow and PostgreSQL.
23. Log-data characteristics (1/2) – where is it generated?
(Diagram: Traditional OLTP & OLAP – data is generated inside the database system: OLTP, then ETL, then OLAP, queried by BI tools.
IoT/M2M use case – data is generated outside the database system: many devices create data, a gateway server performs log processing, and the data must be imported into the database before BI tools can query it.)
Data importing becomes a heavy, time-consuming operation for big-data processing.
26. Arrow_Fdw - maps Apache Arrow files as a foreign table
(Diagram: pg2arrow writes query results out as Apache Arrow files; Arrow_Fdw maps those files as a foreign table alongside ordinary PostgreSQL tables.)
Arrow_Fdw maps Apache Arrow files as PostgreSQL foreign tables.
No importing: Apache Arrow files are immediately visible to queries.
pg2arrow generates an Apache Arrow file from the query results of PostgreSQL.
Arrow_Fdw can be writable, but only batched INSERT is supported.
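As a minimal sketch, mapping an Arrow file could look like the following. The table, column and file names here are hypothetical; the `file` FDW option matches the one shown in the benchmark validation slide, and `arrow_fdw` is assumed to be the foreign server installed by the extension.

```sql
-- Hypothetical schema; 'file' is the FDW option shown later in this deck.
CREATE EXTENSION IF NOT EXISTS pg_strom;

CREATE FOREIGN TABLE flogdata (
    ts        timestamp,
    device_id int,
    signal    float8
) SERVER arrow_fdw
  OPTIONS (file '/opt/nvme/logdata.arrow');

-- No import step: the Arrow file is queried in place.
SELECT device_id, avg(signal)
  FROM flogdata
 WHERE ts >= '2020-01-23'
 GROUP BY device_id;
```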
27. SSD-to-GPU Direct SQL on Arrow_Fdw (1/3)
▌Why is Apache Arrow beneficial?
Less I/O to load: only the referenced columns are read
Higher utilization of GPU cores, thanks to vector processing and the wide memory bus
Read-only structure: no MVCC checks are required at run time
Only the referenced columns are transferred over the SSD-to-GPU Direct SQL mechanism
(Diagram: over the PCIe bus, SSD-to-GPU P2P DMA transfers only the referenced columns from the NVMe SSD to the GPU. The GPU code supports the Apache Arrow format as a data source, runs the WHERE-clause / JOIN / GROUP BY workloads in parallel on thousands of cores, and writes back the small results, built as PostgreSQL heap tuples, together with metadata.)
28. SSD-to-GPU Direct SQL on Arrow_Fdw (2/3) - Benchmark Results
Query Execution Throughput = (6.0B rows) / (Query Response Time [sec])
Combined use of SSD-to-GPU Direct SQL and the columnar store delivered extreme
performance: 130-500M rows per second
Server configuration is identical to the one shown on p.17 (1U, 1CPU, 1GPU, 3SSD)
(Chart: Star Schema Benchmark (s=1000, DB size: 879GB) on Tesla V100 x1 + DC P4600 x3, Q1_1..Q4_3; number of rows processed per second [million rows/sec].
PostgreSQL 11.5: ~15-16M rows/s; PG-Strom 2.2 (row data): ~53-55M rows/s; PG-Strom 2.2 + Arrow_Fdw: ~138-500M rows/s depending on the query.)
29. SSD-to-GPU Direct SQL on Arrow_Fdw (3/3) - Benchmark Validation
Foreign table "public.flineorder"
Column | Type | Size
--------------------+---------------+--------------------------------
lo_orderkey | numeric | 89.42GB
lo_linenumber | integer | 22.37GB
lo_custkey | numeric | 89.42GB
lo_partkey | integer | 22.37GB <-- ★Referenced by Q2_1
lo_suppkey | numeric | 89.42GB <-- ★Referenced by Q2_1
lo_orderdate | integer | 22.37GB <-- ★Referenced by Q2_1
lo_orderpriority | character(15) | 83.82GB
lo_shippriority | character(1) | 5.7GB
lo_quantity | integer | 22.37GB
lo_extendedprice | integer | 22.37GB
lo_ordertotalprice | integer | 22.37GB
lo_discount | integer | 22.37GB
lo_revenue | integer | 22.37GB <-- ★Referenced by Q2_1
lo_supplycost | integer | 22.37GB
lo_tax | integer | 22.37GB
lo_commit_date | character(8) | 44.71GB
lo_shipmode | character(10) | 55.88GB
FDW options: (file '/opt/nvme/lineorder_s401.arrow') ... file size = 310GB
▌Q2_1 actually reads only 157GB of 681GB (23.0%) from the NVME-SSD.
▌Execution time of Q2_1 is 24.3s, so 157GB / 24.3s = 6.46GB/s
➔ Reasonable performance for 3x Intel DC P4600 on single CPU configuration.
Almost equivalent physical data transfer, but no need to load unreferenced columns
31. Log-data characteristics (2/2) – INSERT-only
✓ MVCC visibility checks are (relatively) insignificant.
✓ Rows with old timestamps will never be inserted.
(Diagram: transactional data sees INSERT / UPDATE / DELETE at any time; log data with a timestamp only appends to the current period – 2018, 2019, current.)
32. Configuration of PostgreSQL partition for log-data (1/2)
▌Mixture of PostgreSQL tables and Arrow foreign tables in one partition declaration
▌Log data should carry a timestamp, and is never updated
➔ Old data can be moved to an Arrow foreign table for more efficient I/O
(Diagram: logdata table, PARTITION BY timestamp. Incoming log data with a timestamp – e.g. 2020-03-21 12:34:56, dev_id: 2345, signal: 12.4 – lands in logdata_current, a PostgreSQL table: row data store, read-writable but slow. Older partitions logdata_201912, logdata_202001 and logdata_202002 are Arrow foreign tables: column data store, read-only but fast.)
Unrelated child tables are skipped if the query contains a WHERE-clause on the timestamp,
e.g.) WHERE timestamp > ‘2020-01-23’ AND device_id = 456
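The mixed layout described above can be sketched as plain PostgreSQL DDL; the table, column and file names are hypothetical, and `arrow_fdw` is assumed to be the foreign server provided by the extension.

```sql
-- Sketch of a partitioned log table mixing row and Arrow partitions.
CREATE TABLE logdata (
    ts        timestamp NOT NULL,
    device_id int,
    signal    float8
) PARTITION BY RANGE (ts);

-- Current month: ordinary PostgreSQL table – read-writable but slow to scan.
CREATE TABLE logdata_current
    PARTITION OF logdata
    FOR VALUES FROM ('2020-03-01') TO ('2020-04-01');

-- An older month: Arrow foreign table – read-only but fast to scan.
CREATE FOREIGN TABLE logdata_202002
    PARTITION OF logdata
    FOR VALUES FROM ('2020-02-01') TO ('2020-03-01')
    SERVER arrow_fdw
    OPTIONS (file '/opt/nvme/logdata_202002.arrow');
```

With a WHERE-clause on `ts`, the planner prunes the unrelated partitions, so a query on recent data never touches the Arrow files at all.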
33. Configuration of PostgreSQL partition for log-data (2/2)
(Diagram: a month later, the data of 2020-03 is extracted – e.g. a row 2019-03-21 12:34:56, dev_id: 2345, signal: 12.4 – from logdata_current into a new Arrow foreign table logdata_202003, which is attached to the logdata table (PARTITION BY timestamp) alongside logdata_201912 .. logdata_202002.)
34. Asymmetric Partition-wise JOIN (1/4)
(Diagram: lineorder is hash-partitioned into lineorder_p0 / lineorder_p1 / lineorder_p2 – remainder = 0 / 1 / 2 – on tablespaces nvme0 / nvme1 / nvme2, and joined with the customer, date, supplier and parts tables. Plan shape: Scan on each partition ➔ Append ➔ Join ➔ Agg ➔ query results.)
Records from the partition leafs must first be brought back to the CPU and processed there – a massive number of records.
35. Asymmetric Partition-wise JOIN (2/4)
postgres=# explain select * from ptable p, t1 where p.a = t1.aid;
QUERY PLAN
----------------------------------------------------------------------------------
Hash Join (cost=2.12..24658.62 rows=49950 width=49)
Hash Cond: (p.a = t1.aid)
-> Append (cost=0.00..20407.00 rows=1000000 width=12)
-> Seq Scan on ptable_p0 p (cost=0.00..5134.63 rows=333263 width=12)
-> Seq Scan on ptable_p1 p_1 (cost=0.00..5137.97 rows=333497 width=12)
-> Seq Scan on ptable_p2 p_2 (cost=0.00..5134.40 rows=333240 width=12)
-> Hash (cost=1.50..1.50 rows=50 width=37)
-> Seq Scan on t1 (cost=0.00..1.50 rows=50 width=37)
(8 rows)
36. Asymmetric Partition-wise JOIN (3/4)
(Diagram: the same hash-partitioned lineorder_p0 / p1 / p2 – remainder = 0 / 1 / 2 – on tablespaces nvme0 / nvme1 / nvme2; now each partition runs Scan ➔ Join with customer / date / supplier / parts ➔ PreAgg, and only Append ➔ Agg runs afterwards.)
Push down JOIN/GROUP BY, even if the smaller half is not a partitioned table.
Only a very small number of partial JOIN/GROUP BY results need to be gathered on the CPU.
37. Asymmetric Partition-wise JOIN (4/4)
postgres=# set enable_partitionwise_join = on;
SET
postgres=# explain select * from ptable p, t1 where p.a = t1.aid;
QUERY PLAN
----------------------------------------------------------------------------------
Append (cost=2.12..19912.62 rows=49950 width=49)
-> Hash Join (cost=2.12..6552.96 rows=16647 width=49)
Hash Cond: (p.a = t1.aid)
-> Seq Scan on ptable_p0 p (cost=0.00..5134.63 rows=333263 width=12)
-> Hash (cost=1.50..1.50 rows=50 width=37)
-> Seq Scan on t1 (cost=0.00..1.50 rows=50 width=37)
-> Hash Join (cost=2.12..6557.29 rows=16658 width=49)
Hash Cond: (p_1.a = t1.aid)
-> Seq Scan on ptable_p1 p_1 (cost=0.00..5137.97 rows=333497 width=12)
-> Hash (cost=1.50..1.50 rows=50 width=37)
-> Seq Scan on t1 (cost=0.00..1.50 rows=50 width=37)
-> Hash Join (cost=2.12..6552.62 rows=16645 width=49)
Hash Cond: (p_2.a = t1.aid)
-> Seq Scan on ptable_p2 p_2 (cost=0.00..5134.40 rows=333240 width=12)
-> Hash (cost=1.50..1.50 rows=50 width=37)
-> Seq Scan on t1 (cost=0.00..1.50 rows=50 width=37)
(16 rows)
38. PCIe-bus level optimization (1/3)
▌HPC server – optimization for GPUDirect RDMA
(Diagram: Supermicro SYS-4029TRT2 – CPU1 and CPU2 linked by QPI, each attached via Gen3 x16 to a 96-lane PCIe switch providing Gen3 x16 for each slot.)
▌I/O expansion box
(Diagram: NEC ExpEther 40G, 4-slot edition – the host CPU's NIC connects over 40Gb Ethernet through a network switch to extra I/O boxes, each with a PCIe switch and 4 slots of PCIe Gen3 x8.)
39. PCIe-bus level optimization (2/3)
With P2P DMA over the PCIe switches, the major data traffic bypasses the CPU.
(Diagram: four SSD + GPU pairs, each under its own PCIe switch beneath two CPUs; SCAN, JOIN and GROUP BY run on each GPU, and only the pre-processed data (very small) is gathered by the CPUs.)
42. Large-scale Benchmark (1/4)
Capable of processing 100TB of log data on a single node in reasonable time.
All of SSD-to-GPU Direct SQL, Apache Arrow and PCIe-bus optimization are used.
(Diagram: CPU1/CPU2 with RAM, linked by QPI; four PCIe switches connect HBA0-3 and GPU0-3 to JBOF unit-0 and unit-1 holding NVME0-NVME15.
 • 8.0-10GB/s of physical data transfer capability per GPU + 4x SSD unit
 • 25-30GB/s of effective data transfer capability per unit, thanks to the columnar data structure
 • 100-120GB/s of effective data transfer capability with 4 units in parallel)
43. Large-scale Benchmark (2/4)
Build an HPC 4U server + 4x GPU + 16x NVME-SSD configuration:
 • Supermicro SYS-4029GP-TRT: CPU1/CPU2 with four PCIe switches, Gen3 x16 for each slot
 • NVIDIA Tesla V100 x4
 • SerialCables PCI-AD-x16HE x4, via SFF-8644 based PCIe x4 cables (3.2GB/s x 16 = max 51.2GB/s)
 • SerialCables dual PCI-ENC8G-08A (U.2 NVME JBOF; 8 slots) x2
 • Intel SSD DC P4510 (1.0TB) x16
SPECIAL THANKS
44. Large-scale Benchmark (3/4)
45. Large-scale Benchmark (4/4)
Distributed 3.5TB of test data into 4 hash partitions (N=4), 879GB each.
(Diagram: CPU1/CPU2 with RAM, linked by UPI; four PCIe switches connect HBA0-3 and GPU0-3 to NVME0-NVME15; lineorder_p0 .. lineorder_p3 (879GB each) form the hash-partitioned lineorder table.)
46. Benchmark Results (1/2)
Billion rows per second on a single-node PostgreSQL!!
(Chart: Star Schema Benchmark [SF=4,000 (3.5TB; 24B rows), 4x GPU, 16x NVME-SSD], Q1_1..Q4_3; number of rows processed per second [million rows/sec].
PostgreSQL 11.5: ~44-67M rows/s; PG-Strom 2.2: ~248-258M rows/s; PG-Strom 2.2 + Arrow_Fdw: ~658-1,719M rows/s depending on the query.)
47. Benchmark Results (2/2)
40GB/s read from the NVME-SSDs during SQL execution (95% of the H/W limit)
(Chart: NVME-SSD read throughput [MB/s] over elapsed time [sec] for /dev/md0 (GPU0), /dev/md1 (GPU1), /dev/md2 (GPU2), /dev/md3 (GPU3).)
48. Future Direction
49. NVIDIA GPUDirect Storage (cuFile API) support
▌NVIDIA GPUDirect Storage?
API & driver stack for direct reads from NVME to GPU
✓ Features are almost equivalent to NVME-Strom
✓ Ubuntu 18.04 & RHEL8/CentOS8 shall be supported at the initial release
NVIDIA will officially release the software in 2020.
▌Why beneficial?
Linux kernel driver that is tested/evaluated by NVIDIA’s QA process.
Involvement of a broader ecosystem with third-party solutions,
like distributed filesystems, block storage, and software-defined storage.
Broader OS support; Ubuntu 18.04.
Ref: GPUDIRECT STORAGE: A DIRECT GPU-STORAGE DATA PATH
https://on-demand.gputechconf.com/supercomputing/2019/pdf/sc1922-gpudirect-storage-transfer-data-directly-to-gpu-memory-alleviating-io-bottlenecks.pdf
WIP
51. PostGIS Functions & Operator support (2/2)
▌Current status
geometry st_makepoint(float8,float8,...)
float8 st_distance(geometry,geometry)
bool st_dwithin(geometry,geometry,float8)
bool st_contains(geometry,geometry)
bool st_crosses(geometry,geometry)
▌Next: GiST (R-tree) index support
Index-based nested-loop joins, up to millions of polygons x billions of points.
Runs the index search on the GPU's thousands of threads in parallel.
➔ Because of the GiST internal structure, GPU parallelism is likely to be efficient.
WIP
(Diagram: a GiST (R-tree) index over the polygon definitions is searched by thousands of threads in parallel against a table with location data.)
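A query of the shape this targets could be sketched as follows; the table and column names are hypothetical, while the functions are among those listed above as GPU-executable.

```sql
-- Hypothetical tables: 'areas' holds polygon definitions (geom),
-- 'points' holds location data as x/y coordinates.
SELECT a.name, count(*)
  FROM areas a, points p
 WHERE st_contains(a.geom, st_makepoint(p.x, p.y))
 GROUP BY a.name;
```

With a GiST index on `areas.geom`, the point-in-polygon test is exactly the million-polygons x billion-points nested loop the slide describes.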
52. Resources
▌Repository
https://github.com/heterodb/pg-strom
https://heterodb.github.io/swdc/
▌Document
http://heterodb.github.io/pg-strom/
▌Contact
kaigai@heterodb.com
Tw: @kkaigai / @heterodb
55. Apache Arrow Internal (1/3)
▌Apache Arrow Internal
Header
• Fixed byte string: “ARROW1\0\0”
Schema Definition
• # of columns and columns definitions.
• Data type/precision, column name, column number, ...
Record Batch
• Internal data block that contains a certain # of rows.
• E.g., if N=1M rows with (Int32, Float64), the record batch
begins with an array of 1M Int32 values, followed by an array
of 1M Float64 values.
Dictionary Batch
• Optional area for dictionary compression.
• E.g) 1 = “Tokyo”, 2 = “Osaka”, 3 = “Kyoto”
➔ Int32 representation for frequently appearing words
Footer
• Metadata - offset/length of RecordBatches and
DictionaryBatches
Header “ARROW1\0\0”
Schema Definition
DictionaryBatch-0
RecordBatch-0
RecordBatch-k
Footer
• DictionaryBatch[0] (offset, size)
• RecordBatch[0] (offset, size)
:
• RecordBatch[k] (offset, size)
Apache Arrow file
56. Apache Arrow Internal (2/3) - writable Arrow_Fdw
(Diagram: before writes, the file ends with RecordBatch-k, followed by a Footer listing DictionaryBatch[0] and RecordBatch[0]..RecordBatch[k]. The rows written by a single INSERT command are appended as RecordBatch-(k+1), overwriting the original Footer area; a new-revision Footer listing RecordBatch[0]..RecordBatch[k+1] is then written at the end of the file.)
57. Apache Arrow Internal (3/3) - writable Arrow_Fdw
(Diagram: the same append sequence as the previous slide, with the superseded Footer shown alongside the new one.)
Arrow_Fdw keeps the original Footer image until commit, to support rollback.
58. Pg2Arrow / MySQL2Arrow
$ pg2arrow --help
Usage:
pg2arrow [OPTION] [database] [username]
General options:
-d, --dbname=DBNAME Database name to connect to
-c, --command=COMMAND SQL command to run
-t, --table=TABLENAME Table name to be dumped
(-c and -t are exclusive, either of them must be given)
-o, --output=FILENAME result file in Apache Arrow format
--append=FILENAME result Apache Arrow file to be appended
(--output and --append are exclusive. If neither of them
are given, it creates a temporary file.)
Arrow format options:
-s, --segment-size=SIZE size of record batch
Connection options:
-h, --host=HOSTNAME database server host
-p, --port=PORT database server port
-u, --user=USERNAME database user name
-w, --no-password never prompt for password
-W, --password force password prompt
Other options:
--dump=FILENAME dump information of arrow file
--progress shows progress of the job
--set=NAME:VALUE config option to set before SQL execution
--help shows this message
✓ mysql2arrow offers almost equivalent capability
Pg2Arrow / MySQL2Arrow can dump SQL results as Apache Arrow files, which Arrow_Fdw can then map directly.