EdSkyQuery-G Overview Brian Hills, December 2004 www.edikt.org
Contents <ul><li>Edikt </li></ul><ul><li>Motivation & Aims </li></ul><ul><li>Architecture </li></ul><ul><li>Current Status...
Edikt <ul><li>“ e-Science Data, Information and Knowledge Transformation” </li></ul><ul><ul><li>Bridge the gap between  ap...
Astronomy Requirements <ul><li>Sky Surveys collect masses of data to be managed: </li></ul><ul><ul><li>For example, Sloan ...
EdSkyQuery-G: Motivation & Aims <ul><li>Support the “Open SkyQuery Initiative” </li></ul><ul><ul><li>Move from a .NET-spec...
SkyQuery: High Level Architecture 1. User submits query. 2. Client sends query to Portal. 3. Portal invokes SkyNode with a...
EdSkyQuery-G: Architecture <ul><li>Inspired by Greg Riccardi’s paper for DAIS-WG: </li></ul><ul><ul><li>http://www.cs.fsu....
EdSkyQuery-G: Architecture 6. Transfer results file. 1. User submits query. 2. Query Builder sends query to Service Manage...
EdSkyQuery-G: Service Manager Service Manager Call to Query service Client Uses Service Client Uses Service Call to Xmatch...
EdSkyQuery-G: SkyNode Architecture SkyNode Invoked via the Service Manager EldasSkyNode Grid Services Query XMatch Info Qu...
EdSkyQuery-G: Stored Procedures Export Export Transport Transport Transport Import Import Export Export EDR  Results:  .cs...
Current Status <ul><li>Pre-Alpha (internal release), November ‘04. </li></ul><ul><li>End-to-end invocation of all componen...
EdSkyQuery-G: Pre-Alpha Stored Procedures Export JDBC Query Results Import Import Export Export EDR SSA Results:  .csv .xm...
Results <ul><li>Tested with 3 queries from ROE cookbook: </li></ul><ul><ul><li>http://surveys.roe.ac.uk/ssa/sqlcookbook.ht...
Deliverables <ul><li>Software (Internal): </li></ul><ul><ul><li>Prototype client & GUI client. </li></ul></ul><ul><ul><li>...
Short Term Focus <ul><li>Enable science: </li></ul><ul><ul><li>Compare different cross match algorithms. </li></ul></ul><u...
Longer Term Focus: AstroGrid Client Registry MySpace Parser Query Builder Service Manager SkyNode 2 DB2 - Linux SkyNode 1 ...
Longer Term Focus <ul><li>Interaction with Astrogrid components: </li></ul><ul><ul><li>Clients, Registry, ADQL Parser, MyS...
Thank you Questions?
Upcoming SlideShare
Loading in …5
×

EdSkyQuery-G Overview Brian Hills, December 2004

290 views

Published on

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
290
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
2
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

EdSkyQuery-G Overview Brian Hills, December 2004

  1. 1. EdSkyQuery-G Overview Brian Hills, December 2004 www.edikt.org
  2. 2. Contents <ul><li>Edikt </li></ul><ul><li>Motivation & Aims </li></ul><ul><li>Architecture </li></ul><ul><li>Current Status </li></ul><ul><li>Results </li></ul><ul><li>Future Outlook </li></ul>
  3. 3. Edikt <ul><li>“ e-Science Data, Information and Knowledge Transformation” </li></ul><ul><ul><li>Bridge the gap between applications and computer science: </li></ul></ul><ul><ul><ul><li>Produce robust tools… </li></ul></ul></ul><ul><ul><ul><li>… for real application science problems… </li></ul></ul></ul><ul><ul><ul><li>… test them under extreme science conditions… </li></ul></ul></ul><ul><ul><ul><li>… and keep an eye on the commercial possibilities. </li></ul></ul></ul><ul><ul><li>Projects which may be of interest to astronomers: </li></ul></ul><ul><ul><ul><li>BinX , Eldas & EdSkyquery-G. </li></ul></ul></ul><ul><ul><li>Visit: www.edikt.org </li></ul></ul>
  4. 4. Astronomy Requirements <ul><li>Sky Surveys collect masses of data to be managed: </li></ul><ul><ul><li>For example, Sloan Digital Sky Survey: 15TB. </li></ul></ul><ul><ul><li>2 * 10GB databases to be used for the project. </li></ul></ul><ul><ul><li>Edikt will have access to a 155TB SAN. </li></ul></ul><ul><li>Further research by leveraging data from different surveys: </li></ul><ul><ul><li>Must identify same object from different catalogues. </li></ul></ul><ul><li>Require a “federated” view: </li></ul><ul><ul><li>Data is distributed, homogeneous, large scale. </li></ul></ul><ul><ul><li>Building one big data warehouse isn’t feasible. </li></ul></ul><ul><ul><li>Interoperable services to combine disparate data sources. </li></ul></ul>Is the middleware up to the task?
  5. 5. EdSkyQuery-G: Motivation & Aims <ul><li>Support the “Open SkyQuery Initiative” </li></ul><ul><ul><li>Move from a .NET-specific implementation. </li></ul></ul><ul><ul><li>Enable similar functionality on other platforms. </li></ul></ul><ul><li>Extensible framework for e-science: </li></ul><ul><ul><li>Handle heterogeneous archives. </li></ul></ul><ul><ul><li>‘ Plug in’ algorithms e.g. Nearest Neighbour. </li></ul></ul><ul><ul><li>Interact with Astrogrid components & VO. </li></ul></ul><ul><ul><li>Leveraged for the BRIDGES project to perform simple joins. </li></ul></ul><ul><li>Apply Eldas to large scale E-Science problems: </li></ul><ul><ul><li>Test: functionality, scalability & performance. </li></ul></ul><ul><li>Cross team collaboration: </li></ul><ul><ul><li>Dr. Bob Mann (UoE, ROE), Edikt, EPCC, NeSC. </li></ul></ul>
  6. 6. SkyQuery: High Level Architecture 1. User submits query. 2. Client sends query to Portal. 3. Portal invokes SkyNode with an execution plan. 4. Recursive call to next SkyNode in the execution plan. 7. Performs cross-match and returns results. 8. Performs cross-match and returns final set of results. 9. Returns results for display. 5. Recursive call to last SkyNode in the execution plan. 6. Runs query and returns results. Portal SkyNode 1 Client SkyNode 2 SkyNode 3
  7. 7. EdSkyQuery-G: Architecture <ul><li>Inspired by Greg Riccardi’s paper for DAIS-WG: </li></ul><ul><ul><li>http://www.cs.fsu.edu/~riccardi/grid/skyquery.pdf </li></ul></ul><ul><li>Discusses two approaches: </li></ul><ul><ul><li>Access recipes for service interactions. </li></ul></ul><ul><ul><li>Retained state for service interactions. </li></ul></ul><ul><li>Potential benefits of #2: </li></ul><ul><ul><li>Scalability </li></ul></ul><ul><ul><li>Robustness </li></ul></ul><ul><ul><li>Usability </li></ul></ul>
  8. 8. EdSkyQuery-G: Architecture 6. Transfer results file. 1. User submits query. 2. Query Builder sends query to Service Manager (SM). 4. Handle to results. 7. Handle to results. 5. SM invokes SkyNode 1 with an execution plan + results handle. 8. SM returns a handle to the results. 3. SM invokes SkyNode 1 with an execution plan. 9. Client may pull results back from SkyNode. Query Builder Service Manager SkyNode 2 DB2 - Linux SkyNode 1 DB2 - XP
  9. 9. EdSkyQuery-G: Service Manager Service Manager Call to Query service Client Uses Service Client Uses Service Call to Xmatch service Call to Info service JBOSS Grid Service Layer doQuery getMetadata Distributed Query Processor Split Optimize Execute Generic Parser
  10. 10. EdSkyQuery-G: SkyNode Architecture SkyNode Invoked via the Service Manager EldasSkyNode Grid Services Query XMatch Info Query & Stored Proc invocations DB2
  11. 11. EdSkyQuery-G: Stored Procedures Export Export Transport Transport Transport Import Import Export Export EDR Results: .csv .xml SSA Results: .csv .xml Results: .csv .xml
  12. 12. Current Status <ul><li>Pre-Alpha (internal release), November ‘04. </li></ul><ul><li>End-to-end invocation of all components: </li></ul><ul><ul><li>Client->ServiceManager->SkyNode->Database </li></ul></ul><ul><li>2 * 10GB DB2 test databases: </li></ul><ul><ul><li>SSA from SuperCosmos, hosted by NeSC. </li></ul></ul><ul><ul><li>EDR from SDSS, hosted by EPCC. </li></ul></ul><ul><li>Limitations: </li></ul><ul><ul><li>SM: No query parser/splitting. </li></ul></ul><ul><ul><li>Simple cross database join: not yet using XMatch. </li></ul></ul><ul><ul><li>No data transport, other than JDBC between databases. </li></ul></ul><ul><ul><li>Final results reside on database server. </li></ul></ul>
  13. 13. EdSkyQuery-G: Pre-Alpha Stored Procedures Export JDBC Query Results Import Import Export Export EDR SSA Results: .csv .xml Results: .csv .xml
  14. 14. Results <ul><li>Tested with 3 queries from ROE cookbook: </li></ul><ul><ul><li>http://surveys.roe.ac.uk/ssa/sqlcookbook.html </li></ul></ul><ul><ul><li>Queries #16, #17, #19. </li></ul></ul><ul><li>Results show: </li></ul><ul><ul><li>Exporting data is quick. </li></ul></ul><ul><ul><li>Importing data is >10 * slower: </li></ul></ul><ul><ul><ul><li>Should we use native DB calls rather than JDBC for import only? </li></ul></ul></ul><ul><ul><li>Queries slow: </li></ul></ul><ul><ul><ul><li>Need more indexes and database tuning? </li></ul></ul></ul>1474 1470 1110 Join Query Time (secs) 1571 2704 2825 Total Time (secs) 15 1138 1585 Import Time (secs) 82 4,667 19 96 383,672 17 130 488,718 16 Export Time (secs) #Rows selected Query No.
  15. 15. Deliverables <ul><li>Software (Internal): </li></ul><ul><ul><li>Prototype client & GUI client. </li></ul></ul><ul><ul><li>Service Manager. </li></ul></ul><ul><ul><li>Eldas + Skynode Interface. </li></ul></ul><ul><ul><li>Database stored procedures (Java). </li></ul></ul><ul><li>Documentation (some on NescForge): </li></ul><ul><ul><li>Use Cases, Requirements, Design. </li></ul></ul><ul><ul><li>Performance Testing, Installation. </li></ul></ul><ul><li>Papers </li></ul><ul><ul><li>AHM 04 Poster & Paper: </li></ul></ul><ul><ul><ul><li>http://www.allhands.org.uk/proceedings/papers/123.pdf </li></ul></ul></ul><ul><ul><li>ADASS 04 Paper. </li></ul></ul>
  16. 16. Short Term Focus <ul><li>Enable science: </li></ul><ul><ul><li>Compare different cross match algorithms. </li></ul></ul><ul><ul><li>Incorporate XMatch, CSIRO, ROE as stored procedures. </li></ul></ul><ul><li>Data Transfer – SCP: </li></ul><ul><ul><li>Between databases. </li></ul></ul><ul><ul><li>Pull results back to client. </li></ul></ul><ul><ul><li>Deliver to a third party. </li></ul></ul><ul><li>Client & Service Manager enhancements: </li></ul><ul><ul><li>Handle broader range of queries. </li></ul></ul><ul><li>Performance: </li></ul><ul><ul><li>Compare with pre-alpha benchmarks. </li></ul></ul><ul><li>Improve test infrastructure. </li></ul>
  17. 17. Longer Term Focus: AstroGrid Client Registry MySpace Parser Query Builder Service Manager SkyNode 2 DB2 - Linux SkyNode 1 DB2 - XP
  18. 18. Longer Term Focus <ul><li>Interaction with Astrogrid components: </li></ul><ul><ul><li>Clients, Registry, ADQL Parser, MySpace. </li></ul></ul><ul><li>OpenSkyQuery: </li></ul><ul><ul><li>Compliance with interfaces. </li></ul></ul><ul><ul><li>Test with other DBMS and catalogues. </li></ul></ul><ul><li>Query Builder GUI/Data Integration tool: </li></ul><ul><ul><li>Lead user through choosing fields from different datasets. </li></ul></ul><ul><li>GridFTP: </li></ul><ul><ul><li>Currently unsuitable…. </li></ul></ul><ul><ul><li>NeSC course in Jan 05 may reveal more. </li></ul></ul>
  19. 19. Thank you Questions?

×