SciQL, A Query Language for Science Applications

The talk was delivered by Ying Zhang at the First International Array Databases Workshop, co-located with the EDBT/ICDT 2011 Joint Conference, on March 25, 2011 in Uppsala, Sweden.

Publication: http://bit.ly/zyQPBq

Abstract:
Scientific applications are still poorly served by contemporary relational database systems. At best, the system provides a bridge towards an external library using user-defined functions, explicit import/export facilities or linked-in Java/C# interpreters. The time has come to rectify this with SciQL, a SQL query language for scientific applications with arrays as first-class citizens. It provides a seamless symbiosis of array-, set-, and sequence-interpretation using a clear separation of the mathematical object from its underlying implementation. A key innovation is to extend value-based grouping in SQL:2003 with structural grouping, i.e., fixed-sized and unbounded groups based on explicit relationships between their dimension attributes. It leads to a generalization of window-based query processing with wide applicability in science domains. This paper is focused on the language features, extensively illustrated with examples of its intended use.

Published in: Technology


Transcript

  • 1. SciQL: A Query Language for Science Applications
      M. Kersten, Y. Zhang, M. Ivanova, N. Nes
      CWI Amsterdam
      Array Database Workshop, March 25th, 2011
  • 2. Who needs arrays anyway?
      Seismology: 1-D time series, 3-D spatial data
      Astronomy: temporally ordered rasters
      Climate simulation: temporally ordered grids
      Remote sensing: images of 2-D or higher
      Genomics: ordered DNA strings
      Scientists love arrays: HDF5, NetCDF, FITS, MSEED, ...
      but they also use lists, tables, XML, ...
      2011-03-25 Array Database workshop
  • 3. Arrays in DBMS
      Research issues already in the 80's
      SQL language extensions (add a notion of order): RasQL, AQuery, SRQL, ...
      SQL:1999, SQL:2003: collection type, C-style arrays
      Algebraic frameworks: (S)RAM, AQL, AML, ...
  • 4. Arrays in DBMS
      DBMS support: OODB, multi-dimensional DBMS, sequence DBMS, ..., the Longhorn Array Database
      RasDaMan
        Arrays stored in chunks as BLOBs
        Array query optimisation on top of a DBMS
        Known to work up to 12 TB!
        PostgreSQL 8.1
      SciDB
        Array DBMS built from scratch
        Overlapping chunks for parallel execution
  • 5. What is the problem with RDBMS?
      Appropriate array denotations?
      Functionally complete operation set?
      Size limitations due to (BLOB) representations?
      Existing foreign files?
      Scale?
      ...
  • 6. SciQL (pronounced 'cycle')
      An extension of SQL:2003
      Arrays as first-class citizens of the DBMS
      Seamless integration of tables and arrays
      Named dimensions with constraints
      Flexible structure-based grouping
      Seismology use case
  • 10. Array Definitions
      Fixed array:
        CREATE ARRAY A1 (
          x INT DIMENSION[0:4:1],
          y INT DIMENSION[0:4:1],
          v FLOAT DEFAULT 0.0
        );
      [Figure: A1 as a 4x4 grid, x and y from 0 to 3, every cell 0.0, null outside the bounds]
      Unbounded array (implicit size):
        CREATE ARRAY A2 (
          x INT DIMENSION,
          y INT DIMENSION,
          v FLOAT DEFAULT 0.0
        );
        INSERT INTO A2 VALUES (1,0,5.5), (1,1,0.4), (2,2,4.5);
      [Figure: A2 after the insert: the inserted cells hold 5.5, 0.4 and 4.5, the other cells of their bounding rectangle hold 0.0, and everything outside is null]
  • 11. Array Dimensions
        CREATE ARRAY A1 (
          x INT DIMENSION[0:4:1],
          y INT DIMENSION[0:4:1],
          v FLOAT DEFAULT 0.0
        );
        CREATE ARRAY A2 (
          x INT DIMENSION,
          y INT DIMENSION,
          v FLOAT DEFAULT 0.0
        );
      Fixed dimensions: [start:final:step]
      INT dimension: [size]
      Unbounded dimensions: [(start|*) : (final|*) : (step|*)]
      Dimension data type: scalar data types
      Time series:
        CREATE ARRAY Experiment (
          time TIMESTAMP DIMENSION [TIMESTAMP '2011-03-25' : * : INTERVAL '1' MINUTE],
          data FLOAT
        );
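A rough Python sketch of how the dimension ranges on this slide enumerate index values. The helper names are hypothetical, and treating `final` as exclusive is an assumption read off the figures, where DIMENSION[0:4:1] covers the indices 0 to 3.

```python
from datetime import datetime, timedelta
from itertools import islice


def fixed_dimension(start, final, step):
    """Index values of DIMENSION[start:final:step]; `final` is assumed exclusive."""
    return list(range(start, final, step))


def unbounded_time_dimension(start, step):
    """Lazily yield start, start + step, ... for a dimension whose final bound is '*'."""
    tick = start
    while True:
        yield tick
        tick += step


# DIMENSION[0:4:1] from array A1
x_indices = fixed_dimension(0, 4, 1)

# First three ticks of [TIMESTAMP '2011-03-25' : * : INTERVAL '1' MINUTE]
first_ticks = list(
    islice(unbounded_time_dimension(datetime(2011, 3, 25), timedelta(minutes=1)), 3)
)
```

The unbounded dimension is modelled as a generator because it has no upper bound to materialise.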
  • 12. Array versus Table
        CREATE ARRAY A1 (
          x INT DIMENSION[0:4:1],
          y INT DIMENSION[0:4:1],
          v FLOAT DEFAULT 0.0
        );
        CREATE TABLE T1 (
          x INT,
          y INT,
          PRIMARY KEY (x,y),
          v FLOAT DEFAULT 0.0
        );
      SELECT * FROM A1; returns all 16 cells as (x, y, v):
        (0,0,0.0) (0,1,0.0) (0,2,0.0) (0,3,0.0)
        (1,0,0.0) (1,1,0.0) (1,2,0.0) (1,3,0.0)
        (2,0,0.0) (2,1,0.0) (2,2,0.0) (2,3,0.0)
        (3,0,0.0) (3,1,0.0) (3,2,0.0) (3,3,0.0)
      SELECT * FROM T1; returns only the header x y v, since no tuples have been inserted
  • 14. Array versus Table
        CREATE ARRAY A1 (x INT DIMENSION[0:4:1], y INT DIMENSION[0:4:1], v FLOAT DEFAULT 0.0);
        CREATE TABLE T1 (x INT, y INT, PRIMARY KEY (x,y), v FLOAT DEFAULT 0.0);
      Array                                                      | Table
      A collection of a priori defined tuples                    | A collection of tuples
      To be updated with INSERT/DELETE (and UPDATE)              | Explicitly create/remove with INSERT/DELETE
      Indexed by dimension expressions                           | Indexed by a (primary) key
      Default value for non-dimensional attributes (i.e., cells) | Default value for each column
  • 16. Array & Table Coercions
        CREATE ARRAY A1 (
          x INT DIMENSION[0:4:1],
          y INT DIMENSION[0:4:1],
          v FLOAT DEFAULT 0.0
        );
        SELECT x, y, v FROM A1;
      Result as (x, y, v), full materialisation!
        (0,0,0.0) (0,1,0.0) (0,2,0.0) (0,3,0.0)
        (1,0,0.0) (1,1,0.0) (1,2,0.0) (1,3,0.0)
        (2,0,0.0) (2,1,0.0) (2,2,0.0) (2,3,0.0)
        (3,0,0.0) (3,1,0.0) (3,2,0.0) (3,3,0.0)
      [Figure: A1 as a 4x4 grid, every cell 0.0, null outside the bounds]
  • 18. Array & Table Coercions
        CREATE TABLE T2 (x INT, y INT, v FLOAT);
        INSERT INTO T2 VALUES (1,0,5.5), (1,1,0.4), (2,2,4.5), (1,1,1.3);
      T2 as (x, y, v): (1,0,5.5) (1,1,0.4) (2,2,4.5) (1,1,1.3)
        SELECT [x], [y], v FROM T2;
      [Figure: the resulting array over x and y]
      The result is an unbounded array
      Min/max of the dimensions are derived from the minimal bounding rectangle
      Non-dimensional attributes inherit the default column values
      Duplicates are overwritten
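The coercion rules on this slide can be sketched in Python as follows. This is only an illustration: `table_to_array` is a hypothetical helper, the slide does not say which duplicate survives (the sketch assumes the last row in insertion order wins), and the default of 0.0 is passed in explicitly to mirror the figure.

```python
def table_to_array(rows, default=None):
    """Coerce (x, y, v) rows into a dense grid over their minimal bounding rectangle."""
    cells = {}
    for x, y, v in rows:
        # Assumption: a later duplicate (same x, y) overwrites the earlier one.
        cells[(x, y)] = v

    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    # Dimension bounds come from the minimal bounding rectangle of the data.
    x_range = range(min(xs), max(xs) + 1)
    y_range = range(min(ys), max(ys) + 1)

    # Cells without a row fall back to the default value.
    return {(x, y): cells.get((x, y), default) for x in x_range for y in y_range}


t2 = [(1, 0, 5.5), (1, 1, 0.4), (2, 2, 4.5), (1, 1, 1.3)]
array = table_to_array(t2, default=0.0)
```

With these rows the bounding rectangle spans x = 1..2 and y = 0..2, so the array has six cells.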
  • 20. Array Modifications
        CREATE ARRAY A1 (
          x INT DIMENSION[0:4:1],
          y INT DIMENSION[0:4:1],
          v FLOAT DEFAULT 0.0
        );
        DELETE FROM A1 WHERE x = 1;
      [Figure: A1 after the delete: column x = 1 is null, all other cells are 0.0]
      DELETE creates holes in the array
  • 22. Array Modifications
        CREATE ARRAY A1 (
          x INT DIMENSION[0:4:1],
          y INT DIMENSION[0:4:1],
          v FLOAT DEFAULT 0.0
        );
        INSERT INTO A1 VALUES (1,1,0.5), (2,1,0.5), (3,1,0.5);
      [Figure: A1 after the insert: cells (1,1), (2,1) and (3,1) hold 0.5; the other cells in column x = 1 remain null from the earlier DELETE]
      INSERT sets (changes) the values of cells
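A small Python model of the two modifications shown on the Array Modifications slides, assuming an array can be pictured as a dictionary keyed by (x, y) in which a deleted cell keeps its position and holds None. The function names are hypothetical.

```python
def make_array(size, default):
    """A size x size array whose cells all start at the default value."""
    return {(x, y): default for x in range(size) for y in range(size)}


def delete_where_x(array, x_value):
    """DELETE FROM A WHERE x = x_value: cells are not removed, they become holes (None)."""
    for (x, y) in array:
        if x == x_value:
            array[(x, y)] = None


def insert_values(array, rows):
    """INSERT INTO A VALUES ...: set (change) the value of existing cells."""
    for x, y, v in rows:
        array[(x, y)] = v


a1 = make_array(4, 0.0)
delete_where_x(a1, 1)
insert_values(a1, [(1, 1, 0.5), (2, 1, 0.5), (3, 1, 0.5)])
```

The number of cells never changes: the array shape is fixed by its dimensions, and only cell values are modified.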
  • 29. Array Views
        CREATE ARRAY A1 (
          x INT DIMENSION[0:4:1],
          y INT DIMENSION[0:4:1],
          v FLOAT DEFAULT -1.0
        );
        INSERT INTO A1 VALUES (1,1,0.5), (2,1,0.5), (3,1,0.5);
        CREATE ARRAY VIEW A2 (
          x INT DIMENSION [-1:5:1],
          y INT DIMENSION [-1:5:1],
          w FLOAT DEFAULT 0.0
        ) AS
          SELECT x-1, y, v FROM A1 WHERE x > 1
          UNION
          SELECT x, y, 1.0 FROM A1 WHERE x = 3;
      [Figure: A1 (4x4, default -1.0, with 0.5 at (1,1), (2,1) and (3,1)) next to the view A2 (6x6, x and y from -1 to 4, default 0.0)]
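One way to read the view definition, sketched in Python. The sketch follows the query text only (it does not try to reproduce the slide's figure), models arrays as dictionaries keyed by (x, y), and assumes cells not produced by either SELECT keep the view's default of 0.0.

```python
# A1: 4x4, default -1.0, then three cells set to 0.5
a1 = {(x, y): -1.0 for x in range(4) for y in range(4)}
for x in (1, 2, 3):
    a1[(x, 1)] = 0.5

# A2: dimensions [-1:5:1] on both axes, default 0.0
a2 = {(x, y): 0.0 for x in range(-1, 5) for y in range(-1, 5)}

# SELECT x-1, y, v FROM A1 WHERE x > 1  (shift columns 2 and 3 one step left)
for (x, y), v in a1.items():
    if x > 1:
        a2[(x - 1, y)] = v

# UNION SELECT x, y, 1.0 FROM A1 WHERE x = 3  (fill column 3 with 1.0)
for (x, y), v in a1.items():
    if x == 3:
        a2[(x, y)] = 1.0
```

The two branches write to different columns (1 and 2 versus 3), so their order does not matter here.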
  • 38. Array Tiling
        CREATE ARRAY A1 (
          x INT DIMENSION[0:4:1],
          y INT DIMENSION[0:4:1],
          v FLOAT DEFAULT 0.0
        );
        INSERT INTO A1 VALUES (1,1,0.5), (2,1,0.5), (3,1,0.5);
        SELECT [x], [y], AVG(v) FROM A1
        GROUP BY A1[x:x+2][y:y+2];
      [Figure: A1 (4x4, 0.0 everywhere except 0.5 at (1,1), (2,1) and (3,1)) with the tile's anchor point marked]
      tiling ≠ windowing
  • 47. Array Tiling
        CREATE ARRAY A1 (
          x INT DIMENSION[0:4:1],
          y INT DIMENSION[0:4:1],
          v FLOAT DEFAULT 0.0
        );
        INSERT INTO A1 VALUES (1,1,0.5), (2,1,0.5), (3,1,0.5);
        SELECT [x], [y], AVG(v) FROM A1
        GROUP BY DISTINCT A1[x:x+2][y:y+2];
      [Figure: A1 (4x4, 0.0 everywhere except 0.5 at (1,1), (2,1) and (3,1)) with the tile's anchor point marked]
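The difference between GROUP BY A1[x:x+2][y:y+2] and its DISTINCT variant can be sketched in Python. Two assumptions are made that the slides do not state: cells falling outside the array are simply skipped when averaging, and DISTINCT tiles are anchored on a regular grid starting at the dimension origin. All function names are hypothetical.

```python
def make_a1():
    """A1 from the slides: 4x4, default 0.0, with 0.5 at (1,1), (2,1), (3,1)."""
    a1 = {(x, y): 0.0 for x in range(4) for y in range(4)}
    for x in (1, 2, 3):
        a1[(x, 1)] = 0.5
    return a1


def tile_avg(array, anchor, width=2, height=2):
    """AVG(v) over the tile [x:x+width][y:y+height] anchored at `anchor`."""
    ax, ay = anchor
    values = [
        array[(x, y)]
        for x in range(ax, ax + width)
        for y in range(ay, ay + height)
        if (x, y) in array  # assumption: out-of-bounds cells are ignored
    ]
    return sum(values) / len(values)


def overlapping_tiles(array):
    """Every cell is an anchor point, so neighbouring tiles overlap."""
    return {anchor: tile_avg(array, anchor) for anchor in array}


def distinct_tiles(array, width=2, height=2):
    """Only anchors on a width x height grid are used, so tiles do not overlap."""
    return {
        (x, y): tile_avg(array, (x, y), width, height)
        for (x, y) in array
        if x % width == 0 and y % height == 0
    }


a1 = make_a1()
overlapping = overlapping_tiles(a1)
distinct = distinct_tiles(a1)
```

On the 4x4 example the overlapping form produces one average per cell (16 groups), while the DISTINCT form produces four.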
  • 49. Array Tiling
        CREATE ARRAY A1 (
          x INT DIMENSION[0:4:1],
          y INT DIMENSION[0:4:1],
          v FLOAT DEFAULT 0.0
        );
        INSERT INTO A1 VALUES (1,1,0.5), (2,1,0.5), (3,1,0.5);
        SELECT [x], [y], AVG(v) FROM A1[1:*][1:*]
        GROUP BY DISTINCT A1[x-1][y], A1[x][y-1], A1[x][y], A1[x+1][y], A1[x][y+1];
      [Figure: A1 (4x4, 0.0 everywhere except 0.5 at (1,1), (2,1) and (3,1)) with the tile's anchor point marked]
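The tile in this query is a cross-shaped neighbourhood rather than a rectangle. A minimal Python sketch of the average for a single anchor point follows; it assumes cells outside the array are skipped and deliberately does not model which anchors DISTINCT selects, since the slide does not spell that out. `STENCIL` and `stencil_avg` are hypothetical names.

```python
# Offsets of A1[x-1][y], A1[x][y-1], A1[x][y], A1[x+1][y], A1[x][y+1]
STENCIL = [(-1, 0), (0, -1), (0, 0), (1, 0), (0, 1)]


def stencil_avg(array, x, y):
    """AVG(v) over the cross-shaped tile anchored at (x, y)."""
    values = [
        array[(x + dx, y + dy)]
        for dx, dy in STENCIL
        if (x + dx, y + dy) in array  # assumption: out-of-bounds cells are ignored
    ]
    return sum(values) / len(values)


# A1 from the slide: 4x4, default 0.0, with 0.5 at (1,1), (2,1), (3,1)
a1 = {(x, y): 0.0 for x in range(4) for y in range(4)}
for x in (1, 2, 3):
    a1[(x, 1)] = 0.5

centre = stencil_avg(a1, 2, 1)   # three of the five cells hold 0.5
corner = stencil_avg(a1, 3, 3)   # only three cells exist, all 0.0
```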
  • 50. Seismology Use Case
      Recent aftershock in Chile
        2 TB of waveform data at 100 Hz
        Detecting seismic events using STA/LTA (e.g., 2 sec / 15 sec)
        Remove false positives
        Window-based 3 min. cuts
        Heuristic tests
      Current problems
        Accessing waveform files is too slow
        Unpacking and positioning MSEED data takes too long
  • 51. Seismology Use Case
        CREATE TABLE MSeed (
          station VARCHAR(10),
          ts ARRAY (
            tick TIMESTAMP DIMENSION [* : * : INTERVAL '0.01' SECOND],
            data DECIMAL(8,6)
          )
        );
  • 52. Seismology Use Case
        -- avg of 2 sec. windows:
        SELECT A.station, A.ts.tick, AVG(A.ts.data)
        FROM MSeed AS A
        GROUP BY A.ts[tick - INTERVAL '2' SECOND : tick];
  • 53. Seismology Use Case
        CREATE TABLE Event (
          station STRING,
          tick TIMESTAMP,
          ratio FLOAT
        ) AS
          SELECT A.station, A.ts.tick,
                 AVG(A.ts.data)/AVG(B.ts.data) AS ratio
          FROM MSeed AS A, MSeed AS B
          WHERE A.station = B.station AND A.ts.tick = B.ts.tick
          GROUP BY A.ts[tick - INTERVAL '2' SECOND : tick],
                   B.ts[tick - INTERVAL '15' SECOND : tick]
          HAVING AVG(A.ts.data)/AVG(B.ts.data) > ?delta
        WITH DATA;
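The STA/LTA test expressed by this query can be sketched in plain Python over a list of samples. This is an illustration under stated assumptions, not the authors' implementation: windows are given in samples rather than time intervals (at 100 Hz, 2 s and 15 s would be 200 and 1500 samples), each window includes the current sample, and windows are shortened at the start of the series. The function names are hypothetical.

```python
def window_avg(samples, end, length):
    """Mean of up to `length` samples ending at index `end` (inclusive)."""
    window = samples[max(0, end - length + 1): end + 1]
    return sum(window) / len(window)


def sta_lta_events(samples, sta_len, lta_len, delta):
    """Return (index, ratio) pairs where short-term avg / long-term avg exceeds delta."""
    events = []
    for i in range(len(samples)):
        lta = window_avg(samples, i, lta_len)
        if lta == 0:
            continue  # ratio undefined; skip rather than divide by zero
        ratio = window_avg(samples, i, sta_len) / lta
        if ratio > delta:
            events.append((i, ratio))
    return events


# Toy series: quiet background followed by a sudden jump in amplitude
samples = [1.0] * 10 + [5.0] * 2
events = sta_lta_events(samples, sta_len=2, lta_len=10, delta=2.0)
event_ticks = [i for i, _ in events]
```

The short window reacts to the jump faster than the long one, which is what pushes the ratio over the threshold.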
  • 54. Seismology Use Case
        -- detect isolated errors by direct environment
        -- using wave propagation statics
        CREATE TABLE Neighbors (
          head STRING,
          tail STRING,
          delay TIMESTAMP,
          weight FLOAT
        );
  • 55. Seismology Use Case
        -- detect false positives:
        SELECT A.station, A.tick
        FROM Event AS A, Event AS B, Neighbors AS N
        WHERE A.station = N.head
          AND B.station = N.tail
          AND B.tick = A.tick + N.delay
          AND A.ratio > B.ratio * N.weight;
        -- remove the false positives from Event
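The join in this query can be mimicked in Python with plain tuples. This sketch assumes integer ticks and delays for simplicity (the slides use TIMESTAMP values), and `false_positives` is a hypothetical helper name.

```python
def false_positives(events, neighbors):
    """Events whose ratio exceeds the weighted ratio seen at a neighbouring station.

    events:    list of (station, tick, ratio)
    neighbors: list of (head, tail, delay, weight)
    """
    ratio_at = {(station, tick): ratio for station, tick, ratio in events}
    flagged = []
    for station, tick, ratio in events:
        for head, tail, delay, weight in neighbors:
            if head != station:
                continue
            # B.tick = A.tick + N.delay: the neighbour's event after the propagation delay
            neighbour_ratio = ratio_at.get((tail, tick + delay))
            if neighbour_ratio is not None and ratio > neighbour_ratio * weight:
                flagged.append((station, tick))
    return flagged


events = [("A", 100, 5.0), ("B", 103, 1.0), ("C", 50, 4.0)]
neighbors = [("A", "B", 3, 2.0)]
flagged = false_positives(events, neighbors)
```

Station C has no entry in the neighbour list, so it is never compared and never flagged.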
  • 56. Seismology Use Case
        -- pass time series to a UDF, written in, e.g., C:
        SELECT A.station, myfunction(A.ts)
        FROM MSeed A, Event B
        WHERE A.station = B.station
          AND A.ts.tick = B.tick
        GROUP BY DISTINCT A.ts[tick - INTERVAL '3' MINUTE : tick];
  • 57. Conclusion
      Appropriate array denotations
      Functionally complete operation set
      Size limitations due to (BLOB) representations
      Existing foreign files?
      Scale?
      An array DBMS for the sciences
      Symbiosis of relational and array paradigms