Datacamp @ Transparency Camp 2010

Datacamp - Data management and publishing web application

http://github.com/Stiivi/datacamp

Published in: Technology

  1. Datacamp: data publishing and management for small and medium organizations. Stefan Urbanek, March 2010, stefan@knowerce.sk
  2. Introduction and context
  3. Datacamp, Datacamp ETL Manager, Brewery (slide carries the label "under development")
  4. (no text on slide)
  5. (no text on slide)
  6. Datacamp
  7. Fair-play Alliance
  8. = + (pictorial equation; images not captured)
  9. + manage data with the quality process of an enterprise
  10. publish data like documents in a CMS
  11. For Visitors: data catalogue, searching, sharing
  12. For Owners: data descriptions, import, quality management
  13. For Remixers: application programming interface
  14. Features
  15. Data Storage: hosted and managed internally
  16. Refillable Datasets: a dataset is a container with a predefined structure
  17. Metadata: localizable descriptions of datasets and fields
  18. Number of Datasets: on a scale of 10, 100, 1 000, 10 000+, a small to medium number of datasets
  19. (no text on slide)
  20. Dataset lifecycle (diagram): states new, active (published), suspended (hidden), deleted (closed); actions import, check/publish, publish, suspend, delete, undelete, destroy
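The lifecycle on slide 20 can be read as a small state machine. The sketch below is only an illustration in Python, not Datacamp's implementation: the state and action names come from the slide, but which action leads to which state is an assumption, because the diagram's arrows did not survive in the transcript. The `TRANSITIONS` table and `next_state` function are hypothetical names.

```python
# Hypothetical transition table for the dataset lifecycle.
# State and action names are from the slide; the edges are ASSUMED,
# since the arrows of the original diagram are not available.
TRANSITIONS = {
    ("new", "check/publish"): "active",       # active = published
    ("active", "suspend"): "suspended",       # suspended = hidden
    ("suspended", "publish"): "active",
    ("active", "delete"): "deleted",          # deleted = closed
    ("suspended", "delete"): "deleted",
    ("deleted", "undelete"): "suspended",     # assumed target state
    ("deleted", "destroy"): "destroyed",      # assumed terminal state
}


def next_state(state, action):
    """Return the state reached by applying action, or raise ValueError."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"action {action!r} is not allowed in state {state!r}"
        ) from None
```

Keeping the allowed moves in a single table makes it easy to correct an edge once the real diagram is at hand, without touching the lookup logic.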
  21. Application Programming Interface
  22. Dataset and field metadata (XML excerpt, cut off on the slide):
      <?xml version="1.0" encoding="UTF-8"?>
      <dataset-description>
        <category-id type="integer">1</category-id>
        <collection-mode></collection-mode>
        <created-at type="datetime">2009-09-13T09:59:51Z</created-at>
        <data-provider></data-provider>
        <data-source-type></data-source-type>
        <database nil="true"></database>
        <format-rule-id type="integer" nil="true"></format-rule-id>
        <granularity></granularity>
        <id type="integer">5</id>
        <identifier>ds_eurodonations</identifier>
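A remixer consuming such a dataset description might turn it into native values using the `type` and `nil` attributes. The following is a minimal Python sketch using only the standard library. The sample document is a shortened copy of the slide's excerpt with a closing tag added so that it parses (the slide cuts off before the end), and `cast` and `parse_description` are hypothetical helper names, not part of Datacamp.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Shortened copy of the slide's excerpt; the closing tag is an ASSUMPTION,
# because the original is truncated on the slide.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<dataset-description>
  <category-id type="integer">1</category-id>
  <created-at type="datetime">2009-09-13T09:59:51Z</created-at>
  <database nil="true"></database>
  <id type="integer">5</id>
  <identifier>ds_eurodonations</identifier>
</dataset-description>"""


def cast(elem):
    """Convert one element to a Python value based on its attributes."""
    if elem.get("nil") == "true":
        return None
    text = elem.text or ""
    kind = elem.get("type")
    if kind == "integer":
        return int(text)
    if kind == "datetime":
        return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return text  # untyped elements are kept as strings


def parse_description(xml_bytes):
    """Return a dict mapping each child tag to its converted value."""
    root = ET.fromstring(xml_bytes)
    return {child.tag: cast(child) for child in root}


description = parse_description(SAMPLE.encode("utf-8"))
```

Only the two `type` values visible on the slide (`integer`, `datetime`) are handled; anything else falls through as plain text.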
  23. raw data
  24. Command Line Tool
      $ datacamp -h
      Usage: tools/datacamp [-h] [OPTIONS] REQUEST [ARGUMENTS]

      Send REQUEST to a Datacamp application and return server reply.

      Options:
        -b url         specify base URL for Datacamp. Default: http://localhost:3000
        -k api_key     specify API key for accessing Datacamp data
        -f format      request different format, if available. Options are: xml
        -g get_method  method of accessing the datacamp: curl (default), wget

      Environment variables:
        DATACAMP_BASE_URL
        DATACAMP_API_KEY
        DATACAMP_FORMAT
        DATACAMP_GET_METHOD

      Example:
        datacamp version
        datacamp datasets
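The help text lists each setting both as an option and as an environment variable. A client script could resolve them as sketched below in Python. The variable names and the two defaults (`http://localhost:3000`, `curl`) are taken from the help text; the precedence order (option, then environment, then default) is an assumption, and `resolve_settings` is a hypothetical helper, not part of the `datacamp` tool.

```python
import os

# Defaults stated in the help text.
DEFAULTS = {"base_url": "http://localhost:3000", "get_method": "curl"}

# Environment variable names listed in the help text.
ENV_NAMES = {
    "base_url": "DATACAMP_BASE_URL",
    "api_key": "DATACAMP_API_KEY",
    "format": "DATACAMP_FORMAT",
    "get_method": "DATACAMP_GET_METHOD",
}


def resolve_settings(options, environ=None):
    """Merge explicit options, environment variables and defaults.

    ASSUMED precedence: explicit option > environment variable > default.
    Settings with no value anywhere resolve to None.
    """
    environ = os.environ if environ is None else environ
    settings = {}
    for name, env_name in ENV_NAMES.items():
        settings[name] = (
            options.get(name)
            or environ.get(env_name)
            or DEFAULTS.get(name)
        )
    return settings
```

For example, `resolve_settings({"get_method": "wget"}, {})` yields `wget` as the access method and the localhost base URL, with no API key or format set.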
  25. new applications
  26. The Application
  27. (no text on slide)
  28. (no text on slide)
  29. (no text on slide)
  30. (no text on slide)
  31. (no text on slide)
  32. (no text on slide)
  33. (no text on slide)
  34. Future
  35. Short-term
      ■ search engine improvement
      ■ expose predicates API
      ■ “known datacamps”
      ■ public quality rating
  36. Long Term
      ■ “business” rules
      ■ UI for change history
      ■ attachments
      ■ scanned invoices, reports, ...
  37. Datacamp ETL Manager: slightly more technical
  38. ETL (diagram): sources (files with data, web), staging, datasets, application; steps extraction, transformation, loading
  39. ETL manager (diagram): company register extraction, public procurement extraction, ... extraction; staging; loading; datasets; web manager; temporary or downloaded files
  40. (no text on slide)
  41. List of Jobs
      ■ job identification
      ■ enabled?
      ■ order in which jobs are run
      ■ schedule – days on which jobs are run
      ■ flag jobs to force run again despite scheduling
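The job attributes on slide 41 suggest a simple selection rule for each run. The Python sketch below illustrates one possible reading and is not the ETL Manager's code: the `Job` fields mirror the slide's bullets, while the field names, the weekday strings, and the rule that a disabled job never runs even when flagged are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """One row of the job list; field names are hypothetical."""
    name: str                        # job identification
    enabled: bool = True             # enabled?
    run_order: int = 0               # order in which jobs are run
    schedule: frozenset = frozenset()  # days on which the job runs
    force_run: bool = False          # run again despite scheduling


def jobs_to_run(jobs, weekday):
    """Return the jobs due on weekday, sorted by their run order.

    ASSUMPTION: force_run overrides the schedule but not the enabled flag.
    """
    due = [
        job for job in jobs
        if job.enabled and (job.force_run or weekday in job.schedule)
    ]
    return sorted(due, key=lambda job: job.run_order)
```

With a job scheduled for `"mon"`, a forced job scheduled for `"tue"`, and a disabled job, calling `jobs_to_run(jobs, "mon")` returns the first two in `run_order` sequence.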
  42. failed
  43. Current State
      ■ map one table to another
      ■ identity mapping for convenience
      ■ append
      ■ update by key
      ■ compare tables
      ■ automatically finalize loading on success
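Two of the loading operations named on slide 43, append and update by key, can be illustrated with plain SQL. The sketch below uses Python's built-in `sqlite3` module purely for demonstration. The function names, the identity mapping over an explicit column list, and the choice to update only rows whose key already exists in the target are assumptions; the slides do not show how the ETL Manager implements these steps.

```python
import sqlite3


def append(conn, source, target, columns):
    """Copy all rows from source to target (identity column mapping).

    Table and column names are interpolated directly, so they must come
    from trusted configuration, never from user input.
    """
    cols = ", ".join(columns)
    conn.execute(f"INSERT INTO {target} ({cols}) SELECT {cols} FROM {source}")


def update_by_key(conn, source, target, key, columns):
    """Overwrite target columns with source values where the keys match.

    ASSUMPTION: rows missing from the target are left out, not inserted.
    """
    for col in columns:
        conn.execute(
            f"UPDATE {target} SET {col} = ("
            f"SELECT s.{col} FROM {source} AS s WHERE s.{key} = {target}.{key}) "
            f"WHERE EXISTS ("
            f"SELECT 1 FROM {source} AS s WHERE s.{key} = {target}.{key})"
        )


# Usage with two in-memory tables standing in for staging and a dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE dataset (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO staging VALUES (?, ?)", [(1, "a"), (2, "b")])
append(conn, "staging", "dataset", ["id", "name"])
conn.execute("UPDATE staging SET name = 'z' WHERE id = 2")
update_by_key(conn, "staging", "dataset", "id", ["name"])
```

After these calls the `dataset` table holds two rows, and the row with `id` 2 carries the updated name from staging.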
  44. SQL
  45. Future
      ■ ETL jobs without programming
      ■ no SQL, no Ruby
      ■ covers most of the cases
      ■ parallelisation of jobs
      ■ finer scheduling
      ■ mail notification
  46. we have data, what now?
  47. (no text on slide)
  48. Thank you
  49. Copyrights and Credits
      ■ Silos by Noodle Snacks: http://commons.wikimedia.org/wiki/File:Maria_Cement_Silos.jpg, CC Attribution, Share Alike 3.0 Unported
      ■ Icons by Oxygen Team: http://www.iconfinder.net/search/1/?q=iconset:oxygen, GPL
      ■ Icons by Alessandro Rei, KDE, GPL
      ■ Angel Wings by *Spyrogs, Deviant Art: http://spirogs.deviantart.com/art/Angel-Wings-Tatoo-87089782
      ■ Coins by Mnemo, Wikimedia Commons, http://commons.wikimedia.org/wiki/File:Swedish_coins_20050924.jpg, CC Attribution, Share Alike 3.0
      ■ Folder icon: Benji Garner, Icon set: Rise, Free for commercial use
      ■ Application icon by Sergio Sanchez Lopez, GPL
      ■ Network icon by Everaldo Coelho, Icon set: Crystal Clear, LGPL
