SciQL, A Query Language for Science Applications

The talk was delivered by Ying Zhang at the First International Array Databases Workshop, co-located with the EDBT/ICDT 2011 Joint Conference, on March 25, 2011 in Uppsala, Sweden.

Publication: http://bit.ly/zyQPBq

Abstract:
Scientific applications are still poorly served by contemporary relational database systems. At best, the system provides a bridge towards an external library using user-defined functions, explicit import/export facilities or linked-in Java/C# interpreters. The time has come to rectify this with SciQL, a SQL query language for scientific applications with arrays as first-class citizens. It provides a seamless symbiosis of array, set, and sequence interpretations using a clear separation of the mathematical object from its underlying implementation. A key innovation is to extend value-based grouping in SQL:2003 with structural grouping, i.e., fixed-sized and unbounded groups based on explicit relationships between their dimension attributes. It leads to a generalization of window-based query processing with wide applicability in science domains. This paper is focused on the language features, extensively illustrated with examples of their intended use.

  1. SciQL: A Query Language for Science Applications
     M. Kersten, Y. Zhang, M. Ivanova, N. Nes
     CWI Amsterdam
     Array Database Workshop, March 25th, 2011
  2. Who needs arrays anyway?
     - Seismology: 1-D time series, 3-D spatial data
     - Astronomy: temporally ordered rasters
     - Climate simulation: temporally ordered grids
     - Remote sensing: images of 2-D or higher
     - Genomics: ordered DNA strings
     Scientists love arrays (HDF5, NetCDF, FITS, MSEED, ...) but also use lists, tables, XML, ...
  3. Arrays in DBMS
     Research issues already in the 1980s:
     - SQL language extensions (adding a notion of order): RasQL, AQuery, SRQL, ...
     - SQL:1999 and SQL:2003 collection types, C-style arrays
     - Algebraic frameworks: (S)RAM, AQL, AML, ...
  4. Arrays in DBMS
     DBMS support: OODB, multi-dimensional DBMS, sequence DBMS, ..., the Longhorn Array Database
     - RasDaMan: arrays stored in chunks as BLOBs; array query optimisation on top of a DBMS (PostgreSQL 8.1); known to work up to 12 TB
     - SciDB: an array DBMS built from scratch; overlapping chunks for parallel execution
  5. What is the problem with RDBMS?
     - Appropriate array denotations?
     - Functionally complete operation set?
     - Size limitations due to (BLOB) representations?
     - Existing foreign files?
     - Scale?
     - ...
  6. SciQL
     An extension of SQL:2003 (pronounced 'cycle')
     - Arrays as first-class citizens of the DBMS
     - Seamless integration of tables and arrays
     - Named dimensions with constraints
     - Flexible structure-based grouping
     - Seismology use case
  10. Array Definitions
      Fixed array:
          CREATE ARRAY A1 (
              x INT DIMENSION[0:4:1],
              y INT DIMENSION[0:4:1],
              v FLOAT DEFAULT 0.0
          );
      Result: a 4 x 4 array over x, y = 0..3 in which every cell holds 0.0; positions outside the bounds are null.
      Unbounded array:
          CREATE ARRAY A2 (
              x INT DIMENSION,
              y INT DIMENSION,
              v FLOAT DEFAULT 0.0
          );
          INSERT INTO A2 VALUES (1,0,5.5), (1,1,0.4), (2,2,4.5);
      Result: the array has an implicit size given by the inserted cells; everything outside it is null.
          y
          2 | 0.0  4.5
          1 | 0.4  0.0
          0 | 5.5  0.0
            +---------
              1    2    x
  11. Array Dimensions
      Fixed dimensions: [start:final:step], e.g. x INT DIMENSION[0:4:1] in A1
      INT dimension: [size]
      Unbounded dimensions: [(start|*) : (final|*) : (step|*)], e.g. x INT DIMENSION in A2
      Dimension data type: scalar data types
      Time series:
          CREATE ARRAY Experiment (
              time TIMESTAMP DIMENSION [TIMESTAMP '2011-03-25' : * : INTERVAL '1' MINUTE],
              data FLOAT
          );
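The [start:final:step] notation can be illustrated with a small Python sketch. It is not part of SciQL; the helper name `dimension_values` is hypothetical, and the sketch assumes that `final` is exclusive (as the slides suggest, where DIMENSION[0:4:1] yields the indices 0 to 3) and that `step` is positive.

```python
from datetime import datetime, timedelta
from itertools import islice


def dimension_values(start, final, step):
    """Yield the index values of a [start:final:step] dimension.

    Assumptions (not confirmed by the slides): `final` is exclusive and
    `step` is positive. Passing final=None models an unbounded end ('*').
    """
    value = start
    while final is None or value < final:
        yield value
        value += step


# Fixed integer dimension, as in x INT DIMENSION[0:4:1]
fixed = list(dimension_values(0, 4, 1))

# Unbounded time dimension with a one-minute step; take the first three ticks
ticks = list(islice(
    dimension_values(datetime(2011, 3, 25), None, timedelta(minutes=1)), 3))
```

Because the unbounded case is a generator, the index values are produced lazily and never have to be materialised in full.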
  12. Array versus Table
          CREATE ARRAY A1 (
              x INT DIMENSION[0:4:1],
              y INT DIMENSION[0:4:1],
              v FLOAT DEFAULT 0.0
          );
          CREATE TABLE T1 (
              x INT,
              y INT,
              PRIMARY KEY (x,y),
              v FLOAT DEFAULT 0.0
          );
      SELECT * FROM A1; returns all 16 cells (x, y = 0..3), each with v = 0.0.
      SELECT * FROM T1; returns only the header x, y, v, since no rows have been inserted.
  14. Array versus Table
      Array (A1)                                                  | Table (T1)
      A collection of a priori defined tuples                     | A collection of tuples
      To be updated with INSERT/DELETE (and UPDATE)               | Tuples explicitly created/removed with INSERT/DELETE
      Indexed by dimension expressions                            | Indexed by a (primary) key
      Default value for non-dimensional attributes (i.e., cells)  | Default value for each column
  16. Array & Table Coercions
      Array to table:
          SELECT x, y, v FROM A1;
      The result is a table with one row per cell: 16 rows (x, y = 0..3), each with v = 0.0. Full materialisation!
  18. Array & Table Coercions
      Table to array:
          CREATE TABLE T2 (
              x INT, y INT, v FLOAT
          );
          INSERT INTO T2 VALUES (1,0,5.5), (1,1,0.4), (2,2,4.5), (1,1,1.3);
          SELECT [x], [y], v FROM T2;
      The result is an unbounded array:
      - min/max of the dimensions are derived from the minimal bounding rectangle
      - non-dimensional attributes inherit default column values
      - duplicates are overwritten
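The three coercion rules on this slide can be sketched in Python as follows. This is an illustration only, not how SciQL is implemented: `table_to_array` is a hypothetical helper, the array is modelled as a dictionary keyed by (x, y), the default is passed in explicitly, and the choice that the last inserted duplicate wins is an assumption (the slide only says duplicates are overwritten).

```python
def table_to_array(rows, default=None):
    """Coerce (x, y, v) rows into a dense 2-D array stored as a dict.

    The bounds come from the minimal bounding rectangle of the rows,
    cells without a row get `default`, and a later duplicate overwrites
    an earlier one (assumed order: insertion order).
    """
    rows = list(rows)
    xs = [x for x, _, _ in rows]
    ys = [y for _, y, _ in rows]
    cells = {(x, y): default
             for x in range(min(xs), max(xs) + 1)
             for y in range(min(ys), max(ys) + 1)}
    for x, y, v in rows:
        cells[(x, y)] = v  # duplicates are overwritten
    return cells


t2 = [(1, 0, 5.5), (1, 1, 0.4), (2, 2, 4.5), (1, 1, 1.3)]
a = table_to_array(t2, default=0.0)
```

With the T2 rows from the slide, the bounding rectangle spans x = 1..2 and y = 0..2, so the sketch produces six cells.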
  20. Array Modifications
          DELETE FROM A1 WHERE x = 1;
      Creates holes in the array: every cell with x = 1 becomes null, while all other cells keep 0.0.
  22. Array Modifications
          INSERT INTO A1 VALUES (1,1,0.5), (2,1,0.5), (3,1,0.5);
      Sets (changes) the values of cells. After the earlier DELETE, A1 looks like this:
          y
          3 | 0.0  null  0.0  0.0
          2 | 0.0  null  0.0  0.0
          1 | 0.0  0.5   0.5  0.5
          0 | 0.0  null  0.0  0.0
            +--------------------
              0    1     2    3    x
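The DELETE and INSERT behaviour on these slides can be mimicked with a dictionary-based sketch in Python. The helper names are hypothetical, holes are modelled as `None`, and rejecting an insert outside the fixed bounds is an assumption; the slides do not say what SciQL does in that case.

```python
def create_array(x_range, y_range, default):
    """Fixed array: every cell exists up front and holds the default."""
    return {(x, y): default for x in x_range for y in y_range}


def delete_where(array, predicate):
    """DELETE leaves the bounds intact and turns matching cells into holes."""
    for (x, y) in array:
        if predicate(x, y):
            array[(x, y)] = None


def insert_values(array, rows):
    """INSERT sets (changes) the value of existing cells."""
    for x, y, v in rows:
        if (x, y) not in array:
            # Assumption: out-of-bounds inserts are rejected for fixed arrays.
            raise KeyError(f"cell ({x}, {y}) is outside the array bounds")
        array[(x, y)] = v


a1 = create_array(range(4), range(4), 0.0)
delete_where(a1, lambda x, y: x == 1)
insert_values(a1, [(1, 1, 0.5), (2, 1, 0.5), (3, 1, 0.5)])
```

Unlike a table, the number of cells never changes here: DELETE and INSERT only alter cell contents.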
  29. Array Views
          CREATE ARRAY A1 (
              x INT DIMENSION[0:4:1],
              y INT DIMENSION[0:4:1],
              v FLOAT DEFAULT -1.0
          );
          INSERT INTO A1 VALUES (1,1,0.5), (2,1,0.5), (3,1,0.5);
          CREATE ARRAY VIEW A2 (
              x INT DIMENSION [-1:5:1],
              y INT DIMENSION [-1:5:1],
              w FLOAT DEFAULT 0.0
          ) AS
              SELECT x-1, y, v FROM A1 WHERE x > 1
              UNION
              SELECT x, y, 1.0 FROM A1 WHERE x = 3;
      A1: every cell holds -1.0 except (1,1), (2,1) and (3,1), which hold 0.5.
      A2: a 6 x 6 view over x, y = -1..4. Cells not produced by the query take the default 0.0, the first SELECT shifts the selected A1 cells one position to the left along x, and the second SELECT sets the cells with x = 3 to 1.0.
  38. Array Tiling
          CREATE ARRAY A1 (
              x INT DIMENSION[0:4:1],
              y INT DIMENSION[0:4:1],
              v FLOAT DEFAULT 0.0
          );
          INSERT INTO A1 VALUES (1,1,0.5), (2,1,0.5), (3,1,0.5);
          SELECT [x], [y], AVG(v) FROM A1
          GROUP BY A1[x:x+2][y:y+2];
      Each cell (x, y) in turn serves as the anchor point of a tile A1[x:x+2][y:y+2], so neighbouring tiles overlap.
      Tiling ≠ windowing.
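A rough Python sketch of the overlapping tiling in the GROUP BY A1[x:x+2][y:y+2] query is given below. It is an illustration under assumptions, not SciQL's evaluation strategy: the upper slice bound is taken as exclusive (consistent with DIMENSION[0:4:1] yielding 0..3), and cells that fall outside the array or are holes are skipped, in the same way SQL's AVG ignores NULLs. The name `tile_avg` is hypothetical.

```python
def tile_avg(array, width, height):
    """Average over the tile [x, x+width) x [y, y+height) for every anchor.

    `array` maps (x, y) to a value or None. Every cell is an anchor point,
    so tiles overlap. Missing cells and holes are skipped (assumption).
    """
    result = {}
    for (x, y) in array:
        values = [array[(i, j)]
                  for i in range(x, x + width)
                  for j in range(y, y + height)
                  if array.get((i, j)) is not None]
        result[(x, y)] = sum(values) / len(values) if values else None
    return result


# A1 from the slide: 4 x 4 zeros with three cells set to 0.5
a1 = {(x, y): 0.0 for x in range(4) for y in range(4)}
for cell in [(1, 1), (2, 1), (3, 1)]:
    a1[cell] = 0.5

averages = tile_avg(a1, 2, 2)
```

The result has one average per anchor point, that is, as many output cells as the input array has cells.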
  47. Array Tiling
          SELECT [x], [y], AVG(v) FROM A1
          GROUP BY DISTINCT A1[x:x+2][y:y+2];
      With DISTINCT, the anchor point moves in steps of the tile size, so the tiles do not overlap.
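The effect of GROUP BY DISTINCT can be sketched in Python as a variant that only accepts anchor points lying on a grid with the tile's width and height. The helper name `distinct_tile_avg` is hypothetical, and two points are assumptions rather than facts from the slides: anchors start at the lowest index of each dimension, and out-of-bounds cells and holes are skipped.

```python
def distinct_tile_avg(array, width, height):
    """Average over non-overlapping width x height tiles.

    Anchors are the cells whose offset from the lowest index is a
    multiple of the tile size (assumed starting corner).
    """
    x0 = min(x for x, _ in array)
    y0 = min(y for _, y in array)
    result = {}
    for (x, y) in sorted(array):
        if (x - x0) % width != 0 or (y - y0) % height != 0:
            continue  # not an anchor point of a distinct tile
        values = [array[(i, j)]
                  for i in range(x, x + width)
                  for j in range(y, y + height)
                  if array.get((i, j)) is not None]
        result[(x, y)] = sum(values) / len(values) if values else None
    return result


# A1 from the slide: 4 x 4 zeros with three cells set to 0.5
a1 = {(x, y): 0.0 for x in range(4) for y in range(4)}
for cell in [(1, 1), (2, 1), (3, 1)]:
    a1[cell] = 0.5

distinct = distinct_tile_avg(a1, 2, 2)
```

For the 4 x 4 example this yields four tiles, anchored at (0,0), (0,2), (2,0) and (2,2).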
  49. Array Tiling
          SELECT [x], [y], AVG(v) FROM A1[1:*][1:*]
          GROUP BY DISTINCT A1[x-1][y], A1[x][y-1],
                            A1[x][y], A1[x+1][y], A1[x][y+1];
      A tile can also be given by enumerating cells relative to the anchor point (x, y): here the anchor cell and its four direct neighbours.
  50. Seismology Use Case
      Recent aftershock in Chile:
      - 2 TB of waveform data at 100 Hz
      - detecting seismic events using STA/LTA (e.g., 2 sec / 15 sec)
      - remove false positives
      - window-based 3 min. cuts
      - heuristic tests
      Current problems:
      - accessing waveform files is too slow
      - unpacking and positioning MSEED data takes too long
  51. Seismology Use Case
          CREATE TABLE MSeed (
              station VARCHAR(10),
              ts ARRAY (
                  tick TIMESTAMP DIMENSION [* : * : INTERVAL '0.01' SECOND],
                  data DECIMAL(8,6)
              )
          );
  52. Seismology Use Case
          -- avg of 2 sec. windows:
          SELECT A.station, A.ts.tick, AVG(A.ts.data)
          FROM MSeed AS A
          GROUP BY A.ts[tick - INTERVAL '2' SECOND : tick];
  53. Seismology Use Case
          CREATE TABLE Event (
              station STRING,
              tick TIMESTAMP,
              ratio FLOAT
          ) AS
          SELECT A.station, A.ts.tick,
                 AVG(A.ts.data) / AVG(B.ts.data) AS ratio
          FROM MSeed AS A, MSeed AS B
          WHERE A.station = B.station AND A.ts.tick = B.ts.tick
          GROUP BY A.ts[tick - INTERVAL '2' SECOND : tick],
                   B.ts[tick - INTERVAL '15' SECOND : tick]
          HAVING AVG(A.ts.data) / AVG(B.ts.data) > ?delta
          WITH DATA;
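The STA/LTA detection expressed by this query can be sketched in plain Python for a single station. This is an illustration, not the authors' implementation: the function names are hypothetical, windows are counted in samples rather than seconds (at 100 Hz, 2 sec and 15 sec would be 200 and 1500 samples), both window ends are treated as inclusive, and the raw sample average is used because that is what the query computes.

```python
def trailing_mean(samples, end, length):
    """Mean of up to `length` samples ending at index `end` (inclusive)."""
    window = samples[max(0, end - length + 1): end + 1]
    return sum(window) / len(window)


def sta_lta_events(samples, sta_len, lta_len, delta):
    """Return (index, ratio) pairs where short-term / long-term mean > delta."""
    events = []
    for i in range(len(samples)):
        lta = trailing_mean(samples, i, lta_len)
        if lta == 0:
            continue  # ratio undefined; skip this tick
        ratio = trailing_mean(samples, i, sta_len) / lta
        if ratio > delta:
            events.append((i, ratio))
    return events


# Toy signal: quiet background with a two-sample burst
signal = [1.0] * 20 + [10.0] * 2 + [1.0] * 5
events = sta_lta_events(signal, sta_len=2, lta_len=15, delta=3.0)
event_ticks = [i for i, _ in events]
```

On the toy signal only the two burst samples exceed the threshold; once the burst has entered the long-term window, the ratio drops back below `delta`.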
  54. Seismology Use Case
          -- detect isolated errors by direct environment
          -- using wave propagation statics
          CREATE TABLE Neighbors (
              head STRING,
              tail STRING,
              delay TIMESTAMP,
              weight FLOAT
          );
  55. Seismology Use Case
          -- detect false positives:
          SELECT A.station, A.tick
          FROM Event AS A, Event AS B, Neighbors AS N
          WHERE A.station = N.head
            AND B.station = N.tail
            AND B.tick = A.tick + N.delay
            AND A.ratio > B.ratio * N.weight;
          -- remove the false positives from Event
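The join in this query can be followed step by step in a small Python sketch. Everything about the data layout is an assumption made for illustration: events are (station, tick, ratio) tuples, neighbours are (head, tail, delay, weight) tuples, ticks and delays are plain integers, and there is at most one event per station and tick. The function name `false_positives` is hypothetical.

```python
def false_positives(events, neighbors):
    """Return (station, tick) pairs selected by the join condition.

    An event at station `head` is selected when the neighbouring station
    `tail` has an event `delay` ticks later and the head's ratio exceeds
    the neighbour's ratio times `weight`.
    """
    ratio_at = {(station, tick): ratio for station, tick, ratio in events}
    found = []
    for station, tick, ratio in events:
        for head, tail, delay, weight in neighbors:
            if head != station:
                continue
            other = ratio_at.get((tail, tick + delay))
            if other is not None and ratio > other * weight:
                found.append((station, tick))
    return found


events = [("A", 100, 5.0), ("B", 103, 1.0), ("C", 200, 4.0), ("D", 202, 4.5)]
neighbors = [("A", "B", 3, 2.0), ("C", "D", 2, 1.0)]
suspects = false_positives(events, neighbors)
```

In the toy data, the event at station A is selected because 5.0 > 1.0 * 2.0, while the event at station C is kept because 4.0 does not exceed 4.5 * 1.0.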
  56. Seismology Use Case
          -- pass time series to a UDF, written in, e.g., C:
          SELECT A.station, myfunction(A.ts)
          FROM MSeed A, Event B
          WHERE A.station = B.station
            AND A.ts.tick = B.tick
          GROUP BY DISTINCT A.ts[tick - INTERVAL '3' MINUTE : tick];
  57. Conclusion
      - Appropriate array denotations
      - Functionally complete operation set
      - Size limitations due to (BLOB) representations
      - Existing foreign files?
      - Scale?
      An array DBMS for the sciences: a symbiosis of the relational and array paradigms.
