SciQL
A Query Language for Unified Scientific Data Processing and Management

Javad Chamanara
University of Jena, Germany
javad.chamanara@uni-jena.de
At: CIKM 2012, Maui, HI, USA
Nov. 2, 2012
What is scientific data?




November 2, 2012                                         javad.chamanara@uni-jena.de                                                   2
SciQL: A Query Language for Unified Scientific Data Processing and Management, 5th Ph.D. Workshop (PIKM) at CIKM 2012, Maui, HI, USA
What is available?




What is proposed here?




What does it provide?




A Sample
Define Perspective p1 As
{
       Attribute Temp_Fahrenheit MapTo Function(1.8 * Temp_Celsius + 32)
       Attribute SN_mg MapTo Function(SN_g * 1000)
       Attribute Year MapTo Function(Year(Timestamp)) DataType=Integer
}
Connection d Adapter=Spreadsheet Source_URI="c:datadata1.xls"
Bind Perspective=p1 Connection=d Version=Latest As pdLatest
Var pdAll = Select From pdLatest
Draw Data=pdLatest GraphType=Scatter V-Axis=SN_mg H-Axis=Temp_Fahrenheit
Var pdGroupped = Select Average(Temp_Fahrenheit) As Avg From pdLatest Group By Year
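
The perspective in the sample can be read as a set of per-attribute mapping functions applied to each raw record. The following Python sketch illustrates that reading only; it is not SciQL's implementation. The raw field names (Temp_Celsius, SN_g, Timestamp) come from the sample, while PERSPECTIVE_P1, bind, and the two test records are hypothetical.

```python
from datetime import datetime

# Hypothetical stand-in for perspective p1: each attribute maps to a function
# over one raw record. Only the formulas are taken from the slide.
PERSPECTIVE_P1 = {
    "Temp_Fahrenheit": lambda r: 1.8 * r["Temp_Celsius"] + 32,
    "SN_mg": lambda r: r["SN_g"] * 1000,
    "Year": lambda r: int(r["Timestamp"].year),
}


def bind(perspective, records):
    """Apply every attribute mapping of the perspective to every raw record."""
    return [{name: fn(rec) for name, fn in perspective.items()} for rec in records]


# Made-up raw records standing in for the spreadsheet rows.
raw = [
    {"Temp_Celsius": 20.0, "SN_g": 0.5, "Timestamp": datetime(2002, 6, 1)},
    {"Temp_Celsius": 30.0, "SN_g": 0.25, "Timestamp": datetime(2003, 6, 1)},
]
pd_latest = bind(PERSPECTIVE_P1, raw)
```

Keeping the mappings separate from the data source is what would let the same perspective be bound to a different connection without rewriting queries.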

How does it work?
Var x = Select Average(Temp_Fahrenheit) As Avg From pdLatest Where Year > 2001 Group By Year




How does it work? (AST)

=
  VAR DEF
    Var
    x
  Select
    Project
      Avg
    Fetch
      pdLatest
    Filter
      >
        Year
        2001
    Aggregate
      Group
        Year

How does it work? (E-AST, CSV Adapter)
=
  VAR DEF
    Var
    x
  Select
    Project
      Avg
    Fetch [CSV]
      pdLatest
    Filter
      >
        Year
        2001
    Aggregate
      Group
        Year
How does it work? (E-AST, Excel Adapter)
=
  VAR DEF [Default]
    Var
    x
  Select [Default]
    Project [Excel]
      Avg
    Fetch [Excel]
      pdLatest
    Filter [Excel]
      >
        Year
        2001
    Aggregate [Default]
      Group
        Year

How does it work? (E-AST, Database Adapter)

=
  VAR DEF [Default]
    Var
    x
  Select [DB]
    Project [DB]
      Avg
    Fetch [DB]
      pdLatest
    Filter [DB]
      >
        Year
        2001
    Aggregate [DB]
      Group
        Year
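
The three E-AST slides differ only in which adapter each operation node is assigned to. One way to picture that step is capability matching: a node goes to the adapter if the adapter supports the operation, otherwise to the default implementation. The Python sketch below is a guess at the idea, not the engine's algorithm. The capability sets are assumptions chosen to reproduce the labels on the slides, and unlabeled nodes on the CSV slide are assumed to fall back to Default.

```python
# Hypothetical capability sets, picked so that the output matches the
# adapter labels drawn on the CSV, Excel, and Database slides.
CAPABILITIES = {
    "CSV": {"Fetch"},
    "Excel": {"Project", "Fetch", "Filter"},
    "DB": {"Select", "Project", "Fetch", "Filter", "Aggregate"},
}

# Operation nodes of the sample query, in the order drawn on the slides.
OPERATIONS = ["VAR DEF", "Select", "Project", "Fetch", "Filter", "Aggregate"]


def annotate(adapter):
    """Assign each operation to the adapter if supported, else to 'Default'."""
    supported = CAPABILITIES[adapter]
    return {op: (adapter if op in supported else "Default") for op in OPERATIONS}


excel_plan = annotate("Excel")
db_plan = annotate("DB")
csv_plan = annotate("CSV")
```

Under these assumptions a more capable data source takes over more of the tree, while the query text stays the same.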

Design
• Grammar
• Architecture
• Execution Engine




SciQL Language Constructs




The Grammar




General Architecture
[Component diagram, top to bottom]
Clients: Custom Application, Matlab, R Console, Declarative Console
SciQL
Adapters: Spreadsheet Adapter, RDBMS Adapter, Vendor Specific Adapter
Data sources: CSV, Spreadsheet, RDBMS, Other



Query Execution Engine
[Diagram] E-AST -> Query Engine (Query Execution Engine, Adapter, Data Source) -> Result set
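
The flow on this slide (E-AST in, result set out, with the engine delegating to an adapter) can be sketched for the sample query as follows. This is a minimal Python illustration under stated assumptions, not the proposed engine: CsvLikeAdapter, DefaultOps, and execute are hypothetical names, and the rows are made-up data. The adapter here can only fetch, so filtering and aggregation fall back to the engine's own default implementations.

```python
from collections import defaultdict


class CsvLikeAdapter:
    """Hypothetical adapter that can only fetch rows from its data source."""

    def __init__(self, rows):
        self.rows = rows

    def fetch(self):
        return list(self.rows)


class DefaultOps:
    """Fallback operations run by the engine when the adapter lacks them."""

    @staticmethod
    def filter(rows, predicate):
        return [r for r in rows if predicate(r)]

    @staticmethod
    def aggregate(rows, key, column):
        # Group rows by `key` and average `column` within each group.
        groups = defaultdict(list)
        for r in rows:
            groups[r[key]].append(r[column])
        return {k: sum(v) / len(v) for k, v in groups.items()}


def execute(adapter, predicate, key, column):
    """Fetch via the adapter, then use adapter operations when present,
    otherwise the defaults."""
    rows = adapter.fetch()
    do_filter = getattr(adapter, "filter", DefaultOps.filter)
    do_aggregate = getattr(adapter, "aggregate", DefaultOps.aggregate)
    return do_aggregate(do_filter(rows, predicate), key, column)


# Made-up rows already shaped by the perspective.
rows = [
    {"Year": 2001, "Temp_Fahrenheit": 50.0},
    {"Year": 2002, "Temp_Fahrenheit": 60.0},
    {"Year": 2002, "Temp_Fahrenheit": 70.0},
    {"Year": 2003, "Temp_Fahrenheit": 80.0},
]
# Mirrors: Select Average(Temp_Fahrenheit) ... Where Year > 2001 Group By Year
result = execute(CsvLikeAdapter(rows), lambda r: r["Year"] > 2001, "Year", "Temp_Fahrenheit")
```

An adapter backed by a database could define its own filter and aggregate methods with the same signatures, and execute would then call those instead of the defaults.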



Mapping
[Component diagram: two perspectives mapped onto the same data]
Perspective 1: Attribute 1, Attribute 2, Attribute 3
Data (Port1): Data Field 1, Data Field 2, Data Field 3, Data Field 4
Perspective 2: Attribute A, Attribute B, Attribute C




What would be the benefits?
• Scientists deal with just one language
• It has a data-source-independent instruction set
• It's easier to learn and share
• Integration with other tools is easy
• It reduces the need for computer knowledge


The Evaluation Plan
• To be used in the context of BExIS
        – Big and diverse user community
        – Various data
• Open source and free
        – Early feedback
        – Contribution



The Work Plan
• Define the grammar of the language
        – 6-9 months
• Compare with related work and revise
        – 3-6 months
• Compile the formal specification of the language
        – 3-6 months
• Develop the proof of concept implementation
        – 9-12 months
• Evaluation
        – 6 months

Thanks






Editor's Notes

  1. Describe that these slides show concept maps in which boxes are the concepts and labels on the relationships are the meanings/purposes/reasons. Arrows and lines are equal; it is a tool issue. Data-driven science. Here are just some attributes of scientific data.
  2. Related work. Tools have single focus/general on processing and visualization. Versioning/provenance issues. Data processing pipeline. Impedance mismatch: format, data type, unit, accuracy. Shaping data to work in workflows. Multi-tool integration:
  3. Is customized to work on scientific data. Considers versions. Produces provenance data. The differences from and similarities to the slide before.
  4. BKR: Again, a layout more similar to the previous one would make it easier for the listener to get the picture ;-)
  5. Describe the sample in brief. Point to the last select statement and say that you would like to investigate what happens to it.
  6. User input. State information.
  7. Input parsing. Tree construction.
  8. CSV adapter. Default adapter. Adapter capability matching. AST node selection based on the adapter's capabilities.
  9. Spreadsheet Adapter
  10. Database Adapter
  11. The designed grammar is implemented in a language design framework like ANTLR/Java. BKR: I think this won't be readable.
  12. QEE: optimization, caching, state management, AST node selection and delegation, result compilation. Adapter: E-AST node implementation; executes the received node against the actual data source. Data Source: is data + functionality. Data sources like spreadsheets, DBMSs, etc. have functions that the adapter may rely on.
  13. The sample is finished here.
  14. Unification, Decoupling, Adaptability, Integration, Extendibility, Collaboration
  15. The overall duration: 27-39 months