PG-Strom - A FDW module utilizing GPU device

Slides from the lightning talk (LT) session at PGCon 2012

  1. PG-Strom: A FDW module utilizing GPU device. NEC Europe, SAP Global Competence Center. KaiGai Kohei <kohei.kaigai@emea.nec.com>
  2. FDW is fun (PostgreSQL Conference 2012)
     • Diagram: a SELECT * FROM … query enters the Executor, which runs on a single thread.
     • The Executor scans regular tables directly, and foreign tables through an FDW: MySQL FDW, Oracle FDW, and PG-Strom FDW.
     • PG-Strom: utilizing an external computing resource!
  3. Idea of Asynchronous Execution using CPU and GPU
     • Vanilla PostgreSQL (CPU only): iteration of scanning tuples and evaluating qualifiers.
     • PostgreSQL with PG-Strom (CPU and GPU): a larger “chunk” of the database is scanned at once; memory transfer and execution are asynchronous, followed by synchronization. The scan finishes earlier than the “only CPU” scan.
     • Legend: red means scanning tuples from the database; green means execution of the qualifiers.
  4. Architecture of PG-Strom
     • World of CPU: regular tables and shadow tables; shared buffer and shared chunks; Postmaster and backend processes, each with Exec, Preload, and the PG-Strom Executor; the PG-Strom GPU Calculation Server.
     • World of GPU: GPU device memory; GPU kernel functions; massively parallel execution.
     • Data exchange between the two worlds goes via shared chunks, using DMA transfer over PCI-E x16 Gen2 (16GB/sec).
  5. Data Density and Column-based Structure
     • Foreign table FT1 (a, b, c, d) is backed by shadow tables: a rowid map plus one column store each for A, B, C, and D.
     • Table my_schema.ft1.c.cs:
         10100 {‘2010-10-21’, …}
         10200 {‘2011-01-23’, …}
         10300 {‘2011-08-17’, …}
     • Table my_schema.ft1.b.cs:
         10100 {2.4, 5.6, 4.95, … }
         10300 {10.23, 7.54, 5.43, … }
     • Chunk buffer of FT1 (row store of FT1): rowmap, value a[] <not used>, value b[], value c[], value d[] <not used>.
     • Steps: ① Transfer, ② Calculation, ③ Write-Back.
  6. Benchmark Result
     postgres=# SELECT COUNT(*) FROM rtbl WHERE sqrt((x-256)^2 + (y-128)^2) < 40;
      count
     -------
      25069
     (1 row)
     Time: 3739.492 ms
     postgres=# SELECT COUNT(*) FROM ftbl WHERE sqrt((x-256)^2 + (y-128)^2) < 40;
      count
     -------
      25069
     (1 row)
     Time: 227.023 ms
     GPU accelerated! x10 faster.
     • CPU: Intel Xeon E5504 (2.0GHz/4core), GPU: Nvidia GeForce GTS450 (128 cuda core)
     • rtbl and ftbl each contain 5 million tuples, with the same values.
     • All the tuples are already in the shared buffers, so disk I/O seldom happens.
  7. Future Development
     • Git URL: https://github.com/kaigai/pg_strom
     • v9.3 development:
         - Writable foreign tables
         - Sort / aggregate acceleration using GPU
         - Inheritance between regular and foreign tables
     • Need your help:
         - Folks who can review the patches
         - Folks who can provide real-life big data
         - Folks who know the typical workload of analytic queries
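
The transfer, calculation, and write-back steps from the column-store slide, applied to the qualifier used in the benchmark query, can be sketched in plain Python. This is only an illustration of the data flow under stated assumptions: the `columns` dictionary, the function names, and the extra column `z` are hypothetical, the loop runs on the CPU, and none of this is PG-Strom's actual code or storage format.

```python
import math

# Hypothetical column store: one list per column, loosely mirroring the
# per-column shadow tables. The data and layout are illustrative only.
columns = {
    "x": [256.0, 300.0, 250.0, 0.0],
    "y": [128.0, 128.0, 100.0, 0.0],
    "z": [1.0, 2.0, 3.0, 4.0],  # not referenced by the qualifier
}

def load_chunk(store, used, start, size):
    """Step 1 (transfer): copy only the referenced columns for one chunk."""
    return {name: store[name][start:start + size] for name in used}

def eval_qual(chunk):
    """Step 2 (calculation): evaluate sqrt((x-256)^2 + (y-128)^2) < 40 per row.

    A GPU would evaluate the rows in parallel; this sketch uses a plain loop.
    """
    return [
        math.sqrt((x - 256) ** 2 + (y - 128) ** 2) < 40
        for x, y in zip(chunk["x"], chunk["y"])
    ]

def count_matches(store, used, chunk_size):
    """Step 3 (write-back): take the per-row result map and count the hits."""
    nrows = len(store[used[0]])
    total = 0
    for start in range(0, nrows, chunk_size):
        rowmap = eval_qual(load_chunk(store, used, start, chunk_size))
        total += sum(rowmap)
    return total

if __name__ == "__main__":
    print(count_matches(columns, ["x", "y"], chunk_size=2))
```

Because only `x` and `y` are copied into each chunk, the unused column never crosses the bus, which is the point of the "<not used>" entries in the chunk buffer on the slide.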

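The two timings on the benchmark slide can be turned into a speedup ratio and a scan throughput with a few lines of arithmetic. The sketch below uses only the figures printed on the slide (5 million tuples, 3739.492 ms, 227.023 ms); the function names are illustrative. The ratio works out to roughly 16x, so the slide's "x10" label appears to be a conservative rounding.

```python
# Figures copied from the benchmark slide.
ROWS = 5_000_000
CPU_MS = 3739.492  # regular table rtbl
GPU_MS = 227.023   # foreign table ftbl, scanned through PG-Strom

def speedup(baseline_ms, accelerated_ms):
    """Ratio of the baseline elapsed time to the accelerated elapsed time."""
    return baseline_ms / accelerated_ms

def rows_per_second(rows, elapsed_ms):
    """Scan throughput in rows per second for a given elapsed time."""
    return rows / (elapsed_ms / 1000.0)

if __name__ == "__main__":
    print(f"speedup: {speedup(CPU_MS, GPU_MS):.1f}x")
    print(f"CPU scan: {rows_per_second(ROWS, CPU_MS):,.0f} rows/s")
    print(f"GPU scan: {rows_per_second(ROWS, GPU_MS):,.0f} rows/s")
```

Note that both runs had all tuples in shared buffers, so these numbers describe qualifier evaluation rather than disk I/O.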