
SciQL, Bridging the Gap between Science and Relational DBMS


Presented at the 15th International Database Engineering & Applications Symposium (IDEAS 2011), September 21-23, 2011, Lisbon, Portugal.


  1. 1. SciQL: Bridging the Gap Between Science and Relational DBMS. Martin Kersten, Ying Zhang, Milena Ivanova, Niels Nes (CWI Amsterdam). IDEAS 2011, Sep. 21-23, 2011, Lisbon, Portugal.
  2. 2. Who needs arrays anyway? Seismology: 1-D waveforms, 3-D spatial data. Astronomy: temporally ordered rasters. Climate simulation: temporally ordered grids. Remote sensing: images of 2-D or higher. Genomics: ordered DNA strings. Scientists love arrays (HDF5, NetCDF, FITS, MSEED, ...) but also use lists, tables, XML, ... [Slide footer: 2011-09-22, IDEAS 2011]
  3. 3. Arrays in DBMS. Research issues already in the 80's: OODB, multi-dimensional DBMS, sequence DBMS, ... Algebraic frameworks: (S)RAM, AQL, AML, ... SQL language extensions: RasQL, AQuery, SRQL, ... (a notion of order). SQL:1999 and SQL:2003: collection type, C-style arrays, aggregation functions over arrays. Systems: the Longhorn Array Database; RasDaMan (stores large arrays in chunks as BLOBs, array query (RasQL) optimisation on top of a DBMS, known to work up to 12 TB); PostgreSQL 8.1; SciDB (array DBMS built from scratch, overlapping chunks for parallel execution).
  4. 4. What is the problem with RDBMS? Appropriate array denotations? Functionally complete operation set? Size limitations (due to BLOB representations)? Existing foreign files? Scale? ...
  5. 5. SciQL: an array query language based on SQL:2003, pronounced 'cycle'. Distinguishing features: arrays and tables as first-class citizens of the DBMS; seamless integration of the relational and array paradigms; named dimensions with constraints; flexible structure-based grouping. Use case: seismology.
  6. 6. Array Definitions: dimensions and cell values. Dimension range: [(start|*) : (step|*) : (stop|*)]; shortcut for integer-typed dimensions: [size]. Dimension data type: scalar data types. Cells: ≥ 0 value(s) per cell, of any data type allowed for normal table columns.
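As a rough illustration of the dimension range notation, the Python sketch below enumerates the index values of an integer dimension [start:step:stop]. It assumes the stop bound is exclusive, which is how the fixed-array slide draws DIMENSION[0:1:4] (indices 0 to 3), and it assumes the [size] shortcut means [0:1:size]. The helper names `dimension_values` and `dimension_from_size` are hypothetical and not part of SciQL.

```python
def dimension_values(start, step, stop):
    """Enumerate the indices of an integer dimension [start:step:stop].

    Assumption: the stop bound is exclusive, as suggested by the slides
    (DIMENSION[0:1:4] is drawn with indices 0, 1, 2, 3).
    """
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    return list(range(start, stop, step))


def dimension_from_size(size):
    """The [size] shortcut, assumed here to be equivalent to [0:1:size]."""
    return dimension_values(0, 1, size)


print(dimension_values(0, 1, 4))   # [0, 1, 2, 3]
print(dimension_values(-1, 1, 5))  # [-1, 0, 1, 2, 3, 4]
print(dimension_from_size(4))      # [0, 1, 2, 3]
```

Unbounded parts of a range (the `*` placeholders) are left out of the sketch because their bounds are only known once cells have been inserted.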
  7. 7. Array Definitions: fixed array.
      CREATE ARRAY A1 (
        x INT DIMENSION[0:1:4],
        y INT DIMENSION[0:1:4],
        v FLOAT DEFAULT 0.0);
      [Figure: 4x4 grid for x, y in 0..3 with every cell 0.0; positions outside the dimension ranges are null.]
  8. 8. Array Definitions: unbounded array.
      CREATE ARRAY A2 (
        x INT DIMENSION,
        y INT DIMENSION,
        v FLOAT DEFAULT 0.0);
      [Figure: x/y grid with axes 0..3; every position is null.]
  9. 9. Array Definitions: unbounded array (A2 as defined above).
      INSERT INTO A2 VALUES (1,0,5.5), (1,1,0.4), (2,2,4.5);
      [Figure: grid before the insert is applied; every position is still null.]
  10. 10. Array Definitions: unbounded array (A2 as defined above).
      INSERT INTO A2 VALUES (1,0,5.5), (1,1,0.4), (2,2,4.5);
      [Figure: cells (1,0) = 5.5, (1,1) = 0.4, (2,2) = 4.5; the remaining cells with x in 1..2 and y in 0..2 hold the default 0.0; everything else is null.]
  11. 11. Array Definitions: unbounded array (A2 as defined above).
      INSERT INTO A2 VALUES (1,0,5.5), (1,1,0.4), (2,2,4.5);
      [Figure: same grid, with the area x in 1..2, y in 0..2 labelled "current range".]
  12. 12. Array & Table Coercions: array to table.
      CREATE ARRAY A1 (
        x INT DIMENSION[0:1:4],
        y INT DIMENSION[0:1:4],
        v FLOAT DEFAULT 0.0);
      SELECT x, y, v FROM A1;
      Full materialisation!
      [Figure: array A1 (all cells 0.0) next to the result table with columns x, y, v: 16 rows, one per cell for x, y in 0..3, each with v = 0.0.]
  13. 13. Array & Table Coercions: table to array. Dimension qualifiers: '[' and ']'.
      CREATE TABLE T2 (
        x INT,
        y INT,
        v FLOAT DEFAULT 0.0);
      INSERT INTO T2 VALUES (1,0,5.5), (1,1,0.4), (2,2,4.5), (1,1,1.3);
      SELECT [x], [y], v FROM T2;
      The result is an unbounded array: dimension ranges are derived from the minimal bounding box; cell values come from the table or the column default; duplicates are overwritten arbitrarily.
      [Figure: table T2 with rows (1,0,5.5), (1,1,0.4), (2,2,4.5), (1,1,1.3), next to the resulting array with cells (1,0) = 5.5, (1,1) = 0.4, (2,2) = 4.5 and 0.0 elsewhere inside x in 1..2, y in 0..2.]
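The coercion rules on this slide can be sketched in Python as follows. This is a minimal model under stated assumptions, not the SciQL implementation: the array is a dictionary keyed by (x, y), and for duplicate coordinates the last row wins. The slide only says duplicates are overwritten arbitrarily, so "last wins" is a choice made here, and the function name `table_to_array` is hypothetical.

```python
def table_to_array(rows, default=0.0):
    """Coerce (x, y, v) rows into a dict-based 2-D array.

    Assumptions: the array spans the minimal bounding box of the (x, y)
    pairs, cells without a row take the column default, and for duplicate
    (x, y) pairs the last row seen wins (an arbitrary choice of this sketch).
    """
    if not rows:
        return {}
    xs = [x for x, _, _ in rows]
    ys = [y for _, y, _ in rows]
    # Fill the whole bounding box with the default value first.
    cells = {
        (x, y): default
        for x in range(min(xs), max(xs) + 1)
        for y in range(min(ys), max(ys) + 1)
    }
    # Then overwrite with the values found in the table.
    for x, y, v in rows:
        cells[(x, y)] = v
    return cells


t2 = [(1, 0, 5.5), (1, 1, 0.4), (2, 2, 4.5), (1, 1, 1.3)]
array = table_to_array(t2)
for coord in sorted(array):
    print(coord, array[coord])
```

With the rows of T2 the bounding box covers x in 1..2 and y in 0..2, so six cells are produced; the cell (1,1) ends up as either 0.4 or 1.3 depending on which duplicate is applied last.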
  14. 14. Array Modifications: DELETE creates holes in the array.
      DELETE FROM A1 WHERE x = 1;
      [Figure: 4x4 grid of 0.0 in which the column x = 1 is null.]
  15. 15. Array Modifications: UPDATE sets or changes cell values; INSERT overwrites existing values.
      UPDATE A1 SET v = 0.5 WHERE y = 1;
      INSERT INTO A1 VALUES (0,1,0.5), (1,1,0.5), (2,1,0.5), (3,1,0.5);
      [Figure: 4x4 grid with the row y = 1 set to 0.5 and 0.0 elsewhere.]
  16. 16. Array Views.
      CREATE ARRAY VIEW A2 (
        x INT DIMENSION [-1:1:5],
        y INT DIMENSION [-1:1:5],
        w FLOAT DEFAULT 0.0)
      AS SELECT x-1, y, v FROM A1 WHERE x > 1
      UNION
      SELECT x, y, 1.0 FROM A1 WHERE x = 3;
      [Figure: source array A1 (x, y in 0..3; all cells -1.0 except (1,1), (2,1), (3,1) = 0.5) next to the view A2 (x, y in -1..4), initially all 0.0.]
  18. 18. Array Views (same CREATE ARRAY VIEW A2 statement). [Figure: A1 and A2 as before; the cells selected by the first SELECT are copied into A2, shifted one position to the left along x.]
  21. 21. Array Views (same CREATE ARRAY VIEW A2 statement). [Figure: A1 and A2 as before; in addition, the cells selected by the second SELECT (x = 3) are set to 1.0 in A2.]
  22. 22. Array Tiling.
      SELECT [x], [y], AVG(v) FROM A1
      GROUP BY A1[x:x+2][y:y+2];
      [Figure: A1 over x, y in 0..3; all cells 0.0 except (1,1), (2,1), (3,1) = 0.5.]
  23. 23. Array Tiling. Anchor point: A1[x][y].
      SELECT [x], [y], AVG(v) FROM A1
      GROUP BY A1[x:x+2][y:y+2];
      [Figure: A1 as before, with the tile belonging to one anchor point highlighted.]
  28. 28. Array Tiling: result.
      SELECT [x], [y], AVG(v) FROM A1
      GROUP BY A1[x:x+2][y:y+2];
      [Figure: result array. Rows y = 3 and y = 2: 0.0, 0.0, 0.0, 0.0. Rows y = 1 and y = 0: 0.125, 0.25, 0.25, 0.25 (for x = 0..3).]
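The tiling query can be mimicked with a short Python sketch. It assumes that the slice A1[x:x+2][y:y+2] has exclusive upper bounds (a 2x2 tile anchored at each cell) and that positions outside the array count as null and are skipped by AVG, as in standard SQL. The names `make_a1` and `tile_avg` are hypothetical helpers, not SciQL functions.

```python
def make_a1():
    """A1 as drawn on the tiling slides: 0.0 everywhere on the 4x4 grid,
    except (1,1), (2,1), (3,1), which hold 0.5."""
    cells = {(x, y): 0.0 for x in range(4) for y in range(4)}
    for x in (1, 2, 3):
        cells[(x, 1)] = 0.5
    return cells


def tile_avg(cells, width=2, height=2):
    """For every anchor (x, y), average the cells of the tile
    [x, x+width) x [y, y+height). Positions outside the array are
    treated as null and skipped, the way AVG skips nulls in SQL."""
    result = {}
    for (x, y) in cells:
        values = [
            cells[(i, j)]
            for i in range(x, x + width)
            for j in range(y, y + height)
            if (i, j) in cells
        ]
        result[(x, y)] = sum(values) / len(values)
    return result


avg = tile_avg(make_a1())
for y in (3, 2, 1, 0):
    print(y, [avg[(x, y)] for x in range(4)])
# 3 [0.0, 0.0, 0.0, 0.0]
# 2 [0.0, 0.0, 0.0, 0.0]
# 1 [0.125, 0.25, 0.25, 0.25]
# 0 [0.125, 0.25, 0.25, 0.25]
```

Under these assumptions the output matches the result grid on the slide, including the border anchors at x = 3 where only two cells of the tile exist.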
  29. 29. Array Tiling: grouping by an explicit list of neighbouring cells.
      SELECT [x], [y], AVG(v) FROM A1
      GROUP BY A1[x-1][y], A1[x][y-1], A1[x][y], A1[x+1][y], A1[x][y+1];
      [Figure: A1 over x, y in 0..3; all cells 0.0 except (1,1), (2,1), (3,1) = 0.5.]
  33. 33. Array Tiling: result of grouping by neighbouring cells.
      SELECT [x], [y], AVG(v) FROM A1
      GROUP BY A1[x-1][y], A1[x][y-1], A1[x][y], A1[x+1][y], A1[x][y+1];
      [Figure: result array for x = 0..3. Row y = 3: 0.0, 0.0, 0.0, 0.0. Row y = 2: 0.0, 0.1, 0.1, 0.0. Row y = 1: 0.125, 0.2, 0.3, 0.25. Row y = 0: 0.0, 0.125, 0.125, 0.167.]
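The cross-shaped grouping can be sketched in the same way. The Python below is one possible reading, assuming each anchor averages itself and its four direct neighbours and that neighbours outside the array are null and ignored. The names `make_a1`, `OFFSETS` and `neighbourhood_avg` are hypothetical. Under this reading the anchor (3,2) evaluates to 0.125, whereas the figure shows 0.0 there, so the sketch should be treated as an interpretation rather than the reference semantics.

```python
# Offsets of the five cells named in the GROUP BY clause, relative to the anchor.
OFFSETS = [(-1, 0), (0, -1), (0, 0), (1, 0), (0, 1)]


def make_a1():
    """A1 as drawn on the slides: 0.0 everywhere on the 4x4 grid,
    except (1,1), (2,1), (3,1), which hold 0.5."""
    cells = {(x, y): 0.0 for x in range(4) for y in range(4)}
    for x in (1, 2, 3):
        cells[(x, 1)] = 0.5
    return cells


def neighbourhood_avg(cells, offsets=OFFSETS):
    """Average each anchor cell together with the listed neighbours.
    Neighbours that fall outside the array are treated as null and skipped."""
    result = {}
    for (x, y) in cells:
        values = [
            cells[(x + dx, y + dy)]
            for dx, dy in offsets
            if (x + dx, y + dy) in cells
        ]
        result[(x, y)] = sum(values) / len(values)
    return result


avg = neighbourhood_avg(make_a1())
for y in (3, 2, 1, 0):
    print(y, [round(avg[(x, y)], 3) for x in range(4)])
```

Listing the offsets explicitly mirrors the query, where the group is given as individual cell references instead of a rectangular slice.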
  34. 34. Seismology Use Case: recent aftershocks in Chile. 2 TB of waveform data at 100 Hz; detecting seismic events using STA/LTA (e.g., 2 sec / 15 sec); removing false positives; window-based 3 min. cuts; further analysis with digital signal processing operations. Current problems: accessing waveform files is too slow; unpacking and positioning MSEED data every time takes too long.
  35. 35. Seismology Use Case: storing the waveforms as an array.
      CREATE ARRAY MSeed (
        station VARCHAR(5) DIMENSION ['0':*:'ZZZZZ'],
        time TIMESTAMP DIMENSION,
        data DECIMAL(8,6));
      [Figure: stations (abc, bcd, bce, efg) on the vertical axis against time on the horizontal axis.]
  36. 36. Seismology Use Case: windowed averages.
      -- avg of 2 sec. windows:
      SELECT M.station, M.time, AVG(M.data)
      FROM MSeed AS M
      GROUP BY M[station][time - INTERVAL '2' SECOND : time];
  37. 37. Seismology Use Case: detecting events with the STA/LTA ratio.
      CREATE TABLE Event(
        station VARCHAR(5),
        time TIMESTAMP,
        ratio FLOAT,
        PRIMARY KEY (station, time));
      INSERT INTO Event
      SELECT M1.station, M1.time, AVG(M1.data)/AVG(M2.data) AS ratio
      FROM MSeed AS M1, MSeed AS M2
      WHERE M1.station = M2.station AND M1.time = M2.time
      GROUP BY M1[station][time - INTERVAL '2' SECOND : time],
               M2[station][time - INTERVAL '15' SECOND : time]
      HAVING AVG(M1.data)/AVG(M2.data) > ?delta;
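To make the STA/LTA step concrete, here is a minimal Python sketch for a single station. It follows the query above in averaging the raw sample values over a short and a long trailing window and flagging positions where the ratio exceeds a threshold. Several details are assumptions: the window ends just before the current sample (exclusive upper bound), windows are clipped at the start of the series, samples are non-negative amplitudes, and positions with a zero long-term average are skipped. The function names are hypothetical.

```python
def trailing_avg(samples, i, window):
    """Average of up to `window` samples before index i (i itself excluded).
    The window is clipped at the start of the series; returns None if empty."""
    chunk = samples[max(0, i - window):i]
    return sum(chunk) / len(chunk) if chunk else None


def detect_events(samples, rate_hz, sta_sec, lta_sec, delta):
    """Return (index, ratio) pairs where the short-term average divided by
    the long-term average exceeds delta."""
    sta_n = int(sta_sec * rate_hz)
    lta_n = int(lta_sec * rate_hz)
    events = []
    for i in range(len(samples)):
        sta = trailing_avg(samples, i, sta_n)
        lta = trailing_avg(samples, i, lta_n)
        if sta is None or not lta:  # skip empty windows and a zero LTA
            continue
        ratio = sta / lta
        if ratio > delta:
            events.append((i, ratio))
    return events


# Toy series sampled at 1 Hz: quiet background with a short burst.
samples = [1.0] * 30 + [10.0] * 4 + [1.0] * 10
events = detect_events(samples, rate_hz=1, sta_sec=2, lta_sec=15, delta=3.0)
print([i for i, _ in events])  # [31, 32, 33]
```

The toy series uses 1 Hz to keep the numbers small; with the 100 Hz data from the slides the same call would use `rate_hz=100`, giving windows of 200 and 1500 samples.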
  38. 38. Seismology Use Case: removing false positives.
      -- detect isolated errors by direct environment
      -- using wave propagation statistics
      CREATE TABLE Neighbors(
        station1 VARCHAR(5),
        station2 VARCHAR(5),
        mindelay INTERVAL SECOND,
        maxdelay INTERVAL SECOND,
        weight FLOAT);
      -- remove the false positives from Event
      DELETE FROM Event WHERE (station, time) NOT IN (
        SELECT E1.station, E1.time
        FROM Event AS E1, Event AS E2, Neighbors AS N
        WHERE E1.station = N.station1
          AND E2.station = N.station2
          AND E2.time BETWEEN E1.time + N.mindelay AND E1.time + N.maxdelay
          AND E1.ratio > E2.ratio * N.weight);
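The filtering logic of the DELETE statement can be sketched in Python as follows. An event survives only if a neighbouring station reports an event inside the expected delay window and the ratio condition holds. The tuple layouts, the use of seconds as plain numbers, and the function name `remove_false_positives` are assumptions made for illustration.

```python
def remove_false_positives(events, neighbors):
    """Keep an event only if some neighbouring station confirms it.

    events:    list of (station, time_sec, ratio)
    neighbors: list of (station1, station2, mindelay, maxdelay, weight)
    """
    confirmed = []
    for s1, t1, r1 in events:
        for n1, n2, dmin, dmax, weight in neighbors:
            if n1 != s1:
                continue
            # Look for an event at the neighbouring station within the delay
            # window whose weighted ratio stays below the ratio of this event.
            if any(
                s2 == n2 and t1 + dmin <= t2 <= t1 + dmax and r1 > r2 * weight
                for s2, t2, r2 in events
            ):
                confirmed.append((s1, t1, r1))
                break  # one confirming neighbour is enough
    return confirmed


events = [("abc", 100.0, 5.0), ("bcd", 102.0, 4.0), ("efg", 500.0, 6.0)]
neighbors = [("abc", "bcd", 1.0, 3.0, 1.0)]
print(remove_false_positives(events, neighbors))  # [('abc', 100.0, 5.0)]
```

As in the query, the relation is directional: the event at "bcd" is dropped in this toy example because no row lists "bcd" as `station1`.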
  39. 39. Seismology Use Case: cutting 3-minute windows around events for further analysis.
      -- pass time series to a UDF, written in, e.g., C:
      SELECT myfunction(M[station].*)
      FROM MSeed AS M, Event AS E
      WHERE M.station = E.station AND M.time = E.time
      GROUP BY DISTINCT M[station][time - INTERVAL '1' MINUTE : time + INTERVAL '2' MINUTE];
  40. 40. Conclusion. SciQL: a first step towards a tailored scientific DBMS; a symbiosis of the relational and array paradigms; under active implementation. Open issues: appropriate array denotations; functionally complete operation set; size limitations (due to BLOB representations); existing foreign files; scale.
