DataFinder concepts and example: General (20100503)


Published in: Technology

  1. DataFinder: Concepts and Usage. German Aerospace Center (DLR), Cologne/Berlin/Braunschweig
  2. Outline
     • Introduction
     • Configuration and customization
         • Requirements Analysis
         • Installation
         • Configuration
         • Customization
         • Data Migration
  3. DataFinder Introduction. Background: Data Management Problem
     • Absent organizational structures
     • No central data management policy
     • Every employee organizes his/her data individually
     • Researchers spend about 30% of their time searching for data
     • Problem with data left behind by temporary staff
     • Increase of data because of growing size and regulations
     • Rapidly growing volume of simulation and experimental data
     • Legal requirements for long-term availability of data (up to 50 years!)
     • Situation is similar for every DLR institute, many research labs and agencies, and even for industry
  4. DataFinder Introduction. Basic Concept
     • Lightweight client-server solution
     • Based on open and stable standards, such as XML and WebDAV
     • Extensible through Python scripts to fit multiple scenarios
  5. DataFinder Introduction. Graphical User Interfaces of DataFinder 1.x
     • Screenshots: User Client and Administrator Client (the current version differs)
     • Implementation in Python with Qt/PyQt
  6. DataFinder Introduction. Data Store Concept
     • Diagram: the User Client presents a logical view on top of the storage locations
  7. DataFinder Configuration and Customization
  8. DataFinder Configuration and Customization. Preparing DataFinder for certain "use cases"
     • Requirements Analysis
         • Analyze data, working environment and user workflows
     • Configuration
         • Server and client setup
         • Define and configure data model
         • Configure distributed storage resources (Data Stores)
     • Customization
         • Write functional extensions with Python scripts (GUI)
         • Tool integration
     • Data Migration
         • Analyzing current data
         • Migration of the data into the new system
  9. DataFinder Configuration and Customization. Installation
     • Meta data server
         • Apache and Catacomb (based on the WebDAV protocol)
         • Apache and mod_dav (xampp)
     • Data server
         • Apache and Catacomb (based on the WebDAV protocol)
         • Apache and mod_dav (xampp)
     • Administrator and user client
         • Source and precompiled versions (for WinXP and SUSE64) available
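Because the meta data server speaks WebDAV, properties can in principle be attached to a resource with a PROPPATCH request. The following is a minimal sketch of building such a request body with the Python standard library. The namespace `http://example.org/datafinder` and the property names are hypothetical and are not taken from DataFinder; no request is actually sent.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
# Hypothetical namespace for custom properties; not DataFinder's real one.
META_NS = "http://example.org/datafinder"


def build_proppatch_body(properties):
    """Return a WebDAV PROPPATCH request body that sets the given properties."""
    root = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_element = ET.SubElement(root, f"{{{DAV_NS}}}set")
    prop_element = ET.SubElement(set_element, f"{{{DAV_NS}}}prop")
    for name, value in properties.items():
        child = ET.SubElement(prop_element, f"{{{META_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Illustrative property names and values only.
body = build_proppatch_body({"project": "Project A", "kind": "Simulation"})
print(body)
```

The resulting string would be sent as the body of a PROPPATCH request to the resource URL; the server stores each child of the `prop` element as a property of that resource.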
  10. DataFinder Configuration and Customization. Data Model: Mapping of Organizational Data Structures
     • Diagram legend: User, Object (directory), Object (file), Relation
     • Diagram nodes: Project A, Project B, Project C, Simulation I, Simulation II, Experiment, File 1, File 2
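One way to picture such a data model is as a table of relations that says which object types may be created inside which. The sketch below is an illustration only: the type names follow the diagram, but the `ALLOWED_CHILDREN` table and the `is_valid_path` helper are hypothetical and do not reflect DataFinder's actual configuration format.

```python
# Hypothetical relations: which object types may be created inside which.
ALLOWED_CHILDREN = {
    "Root": {"Project"},
    "Project": {"Simulation", "Experiment"},
    "Simulation": {"File"},
    "Experiment": {"File"},
}


def is_valid_path(type_path):
    """Check that each type in the path is an allowed child of its predecessor."""
    parent = "Root"
    for item_type in type_path:
        if item_type not in ALLOWED_CHILDREN.get(parent, set()):
            return False
        parent = item_type
    return True


print(is_valid_path(["Project", "Simulation", "File"]))  # True
print(is_valid_path(["Project", "File"]))  # False: no direct relation
```

Keeping the relations in one table means the permitted hierarchy can be changed by an administrator without touching the checking logic.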
  11. DataFinder Configuration and Customization. Excursus: Meta Data
     • Describe and annotate data ("files") and collections ("directories")
     • Different levels of meta data
         • Required meta data defined by administrator
         • User is free to choose additional ones
     • Different types of meta data
         • String
         • Numbers (float, double, …)
         • Lists
         • Dates
     • User can search in meta data
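The meta data types listed above map naturally onto plain Python values, and searching amounts to filtering items by a condition on those values. This is a minimal sketch under that assumption; the item paths, property names, and the `search` helper are hypothetical and are not DataFinder's search API.

```python
from datetime import date

# Hypothetical metadata records keyed by item path.
ITEMS = {
    "/Project A/Simulation I/File 1": {
        "author": "doe",                # string
        "mach_number": 0.8,             # number
        "tags": ["wing", "cfd"],        # list
        "created": date(2010, 5, 3),    # date
    },
    "/Project A/Experiment/File 2": {
        "author": "roe",
        "mach_number": 0.3,
        "tags": ["wind tunnel"],
        "created": date(2010, 4, 12),
    },
}


def search(items, predicate):
    """Return sorted paths of all items whose metadata satisfies the predicate."""
    return sorted(path for path, meta in items.items() if predicate(meta))


fast_runs = search(ITEMS, lambda meta: meta["mach_number"] > 0.5)
tagged_cfd = search(ITEMS, lambda meta: "cfd" in meta["tags"])
recent = search(ITEMS, lambda meta: meta["created"] >= date(2010, 5, 1))
print(fast_runs)
print(tagged_cfd)
print(recent)
```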
  12. DataFinder Configuration and Customization. Excursus: Meta Data and the User Impact
     • DataFinder restricts the rights of users!
     • Enforcement of "good behavior"
     • Users must comply with organizational standards
     • Data is stored in a defined (directory) hierarchy on the data server
     • Required meta data must be set prior to upload
     • Users have certain access rights within the hierarchy
     • Typical user reaction: "Damn! I'm a great scientist! I want freedom to have my own directory layout…"
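The rule that required meta data must be set before an upload can be illustrated with a small check that runs ahead of the transfer. The sketch below assumes a hypothetical `REQUIRED` table and a simulated `upload` function; it shows the idea of enforcement and is not DataFinder's implementation.

```python
# Hypothetical required properties and their expected types, as an
# administrator might define them for one data type.
REQUIRED = {"author": str, "mach_number": float}


def missing_or_invalid(properties, required=REQUIRED):
    """Return names of required properties that are absent or of the wrong type."""
    return [
        name
        for name, expected_type in required.items()
        if not isinstance(properties.get(name), expected_type)
    ]


def upload(path, properties):
    """Refuse the (simulated) upload unless all required metadata is valid."""
    problems = missing_or_invalid(properties)
    if problems:
        raise ValueError(f"cannot upload {path}: fix {', '.join(problems)}")
    return f"uploaded {path}"


print(upload("/Project A/Simulation I/File 1", {"author": "doe", "mach_number": 0.8}))
```

Additional, user-chosen properties pass through untouched, since only the names listed in `REQUIRED` are checked.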
  13. DataFinder Configuration and Customization. Customization: Python Scripting for Extension and Automation
     • Integration of DataFinder with environment
     • User, infrastructure, software, …
     • Extension of DataFinder by Python scripts
     • Actions for resources (i.e., files, directories)
     • User interface extensions
     • Typical automations and customizations
     • Data migration and data import
     • Start of external application (with downloaded data files)
     • Extraction of meta data from result files
     • Automation of recurring tasks ("workflows")
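As an illustration of the "extraction of meta data from result files" item, the sketch below reads `key = value` pairs from the comment header of a result file. The file layout and the `extract_metadata` helper are assumptions made for this example; real result files and DataFinder script extensions will differ.

```python
def extract_metadata(lines):
    """Collect 'key = value' pairs from '#'-prefixed header lines."""
    metadata = {}
    for line in lines:
        if not line.startswith("#"):
            break  # the header ends at the first data line
        key, separator, value = line[1:].partition("=")
        if separator:
            metadata[key.strip()] = value.strip()
    return metadata


# Hypothetical result file content: a commented header followed by data rows.
result_file = [
    "# solver = ExampleSolver",
    "# mach_number = 0.8",
    "0.0 1.25",
    "0.1 1.31",
]
print(extract_metadata(result_file))
```

A script extension could run such a function on a file before import and pass the resulting dictionary along as the item's properties.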
  14. DataFinder Configuration and Customization. Example: Downloading File and Starting Application

      # Create the file "/test.txt" using the data store "Data Store".
      from datafinder.gui.user import script_api as gui_api
      from datafinder.script_api.repository import setWorkingRepository
      from datafinder.script_api.item.item_support import createLeaf

      # Get a representation of the currently managed repository.
      mr = gui_api.managedRepositoryDescription()

      # Continue only if a managed repository is available,
      # and make it the working repository.
      if mr is not None:
          setWorkingRepository(mr)

          def _createLeaf():
              # Properties that describe the new file.
              properties = dict()
              properties["____dataformat____"] = "TEXT"
              properties["____datastorename____"] = "Data Store"
              …
              createLeaf("/test.txt", properties)

          # Run the creation while a progress dialog is shown.
          gui_api.performWithProgressDialog(_createLeaf)
  15. DataFinder Demo Example
     • Live demo of DataFinder
     • Server structure
     • Admin client: showing the meta model as an XML file and in the client
     • Admin client: setting up a Data Store for development files
     • Admin client: loading a script extension
     • User client: loading a script extension
     • User client: creating a structure
     • User client: uploading an experimental file into the store
     • User client: double-clicking the file to open it
     • User client: script extension: creating a file
  16. Availability
     • DataFinder core available as Open Source
         • Current stable release: DataFinder 2.0
         • Simplified BSD License
         • Open Source platforms
             • Launchpad
             • Sourceforge
             • Freshmeat
         • Precompiled versions for Windows XP and 64-bit SLED
     • Become a DataFinder fan on Facebook!
  17. Links
     • DataFinder Web site
     • DataFinder Open Source
     • DataFinder Wiki
     • Catacomb (recommended server)