
Antabif training


Introduction presentation for ANTABIF training.

Published in: Technology, Education

Transcript of "Antabif training"

  1. ANTABIF Training: getting your data online. Bruno Danis, Anton Van de Putte and Nabil Youdjou. Wednesday 26 October 2011
  2. Objectives
     • become familiar with ANTABIF
     • learn about the architecture, functionalities, tools and standards we offer
     • hands-on exercises with dummy and *real* data
     • collect feedback on the fitness for use for this community
  3. On the Menu Today
     • Background about ANTABIF
     • Technical overview
     • Standards, tools and resources
     • Functionalities
     • Future directions
     • Hands on
  4. Background
  5. Antarctic Treaty
     « In order to promote international cooperation in scientific investigation in Antarctica, […], the Contracting Parties agree that, to the greatest extent feasible and practicable: […] Scientific observations and results from Antarctica shall be exchanged and made freely available. »
  6. SCAR-MarBIN & ANTABIF
     • www.scarmarbin.be
     • www.antabif.be or www.biodiversity.aq
     • Core funding: BELSPO.be
     • International Polar Year 2007/08
     • Census of Antarctic Marine Life
     • Ocean Biogeographic Information System
     • Global Biodiversity Information Facility
  7. General Philosophy
     • Build an electronic ecosystem
     • Offer free and open access to data and technology
     • Expose all the (biodiversity) data and metadata, in multiple contexts
     • Remain community-driven and collaborative
     • Adopt strong standardization
     • Work for science, conservation, management
  9. Achievements
     • The first RAMS
     • Board of 60+ editors
     • Feeds WoRMS, CoL and EoL
     • 17,098 taxa (RAMS)
     • Building a dynamic RAS
     • 24,248 taxa (RAS)
  10. Achievements
      • 1,288,441 records
      • 198 datasets
      • 5,235 taxa
      • Feeds OBIS, GBIF
      • Downloadable
      • WebGIS
      • Webservices
  11. Achievements
      • Up since Oct 2005
      • open access
      • 909,915 visitors
      • 8,093,774 hits
      • 51,416,196 downloaded records
      • Citations: 183
      • Cited Publications: 38
  12. Achievements
      Records      SMB          ANTABIF      Progress (×)
      Metadata     198          7,200        36.4
      Occurrence   1,288,441    2,659,392    2.1
      Taxonomy     17,184       30,472       1.8
  13. Nuts and Bolts
  14. 100% Open Source
      • Language: Ruby
      • Framework: Rails (ActiveRecord) and YUI
      • (smart) Search engine: Full text (Elasticsearch-Lucene)
      • Database/GIS server/Spatial DB: PostgreSQL/GeoServer/PostGIS
      • Mapping client: OpenLayers
      • Web services: RESTish (all resources)
      • Protocols/Standards: DIF, DwC, DwC-A, TAPIR, etc.
      • GBIF tools: HIT, IPT
      • Hosting: BeBIF (ULB/VUB joint IT Center)
      • Metadata systems: GCMD API (DIF)
  15. Data flow (your point of view)
      Your data → (standardize) → DwC-A → (upload) → IPT → (publish) → ANTABIF
      IPT → (publish) → Data Paper
  16. Data flow (our point of view)
  17. Standards, tools, resources
  18. Metadata
      Information about datasets deteriorates over time!
  19. Metadata
      • preferred MD catalogue = Antarctic Master Directory (subset of GCMD)
      • standard = DIF (Data Interchange Format)
      • used by the whole SCAR community
      • crawled by Google, Scopus...
  20. DarwinCore
      "A vocabulary of words that biologists, hackers, and citizen scientists use to broadly describe the biodiversity of life on earth."
  21. DarwinCore Archive
      • Complete package of data
        – One file
        – Multiple files
      • Text files…
      • Self-documenting
      • Intended to be shared/distributed
  22. DarwinCore Archive
      Archives always have a ‘core’ data file: My_data.txt
      The core data file is a text file.
  24. DarwinCore Archive
      Darwin Core Archive (two files)
      meta.xml describes the mappings in the core data file (species.txt)
  25. DarwinCore Archive
      Multiple extensions are available
      Columns in extensions are mapped to Darwin Core using the meta.xml file
  26. DarwinCore Archive
      Many extensions are available
      http://rs.gbif.org/extension/
  27. Spreadsheet templates
      • Metadata: describe a database or other data resource.
      • Species Occurrence: store basic species collections or observational data.
      • Species Checklists: record and store simple annotated species checklists.
  33. Spreadsheet processor
      • web application: Excel spreadsheet to DwC-A.
      • Excel files contain data entry and GBIF metadata profile.
      • Worksheet supports publication of primary biodiversity data.
      • Processor performs data validation and transformation and returns a validated DwC-A.
  35. DwC-A validator
      • tests Darwin Core Archives
      • validates the content against the known extensions and terms registered within the GBIF network for sharing biodiversity data.
  37. IPT - Integrated Publishing Toolkit
      • Publishing primary biodiversity data
      • Resources
      • Metadata
      • Source Data (text, zip, SQL)
      • Source Mappings
      • Visibility
      • Published Release
  38. The Data Paper concept
      • A scholarly journal publication whose primary purpose is to describe a dataset or group of datasets, rather than to report a research investigation.
      • Benefits of the Data Paper
        – Scholarly credit to Data Publishers
        – Describe the data in structured human readable form
        – Bring the existence of the data to the attention of the scholarly community
  39. Data Paper: Incentivising Data Discovery
  40. Reward data publishing
      Metadata document → Data Paper
  41. Step-by-Step
      • Complete the metadata of a dataset using the metadata editor in IPT 2.0.2
      • Generate the ‘Data Paper’ manuscript (menu: Manage Resource – RTF Download)
      • Submit the manuscript for possible publication in one of the Pensoft journals (ZooKeys, PhytoKeys, BioRisk, NeoBiota).
      • Revision (if any) is carried out using the metadata editor in IPT 2.0.2 and the manuscript is re-submitted to the Pensoft Open Journal System
  42. Once paper is accepted
      • A Digital Object Identifier (DOI) is assigned to the Data Paper
      • Paper is published in (a) print format, (b) PDF format, (c) semantically enhanced HTML, and (d) XML, which is archived in PubMed Central
      • The DOI of the Data Paper is linked with the Persistent Identifier of the metadata document in the GBIF Registry
      • Data Paper is indexed by Web of Knowledge (ISI), PubMed Central, Scopus, Zoological Record, Google Scholar, CAB Abstracts, Directory of Open Access Journals (DOAJ), EBSCO.
  43. Important to consider
      • Metadata is complete in all respects
      • All claims are adequately substantiated
      • Data described in the ‘Data Paper’ is freely available at the time of submission of the manuscript
  44. ORC
      • GBIF’s Online Resource Center
      • Provides access to documents, best practices, tools and links
      • Wide thematic scope
      • Different ways of accessing resources
      • Enabling community contributions
      • Different levels of resource access
      • Multilanguage support
  46. Functionalities
  47. www.biodiversity.aq
      • general website
      • latest news
      • contact
      • sponsors
      • governance
      • links
  48. www.biodiversity.aq
  49. data.biodiversity.aq
      • find primary biodiversity data
      • visualize occurrence data on map
      • view taxonomic data
      • download data
      • view metrics
      • send feedback
      • access technical documentation
  50. data.biodiversity.aq
  51. ipt.biodiversity.aq
      • prepare and clean your data
      • publish primary biodiversity data
      • publish metadata
      • push data and metadata to ANTABIF & GBIF
      • get a Data Paper
  52. ipt.biodiversity.aq
  53. afg.biodiversity.aq
      • (nice-looking) Identification aid
      • Publication/sharing platform for customized Field Guides
      • High quality (useful) pictures
      • Expert Descriptions
      • Built dynamically from various sources
  54. afg.biodiversity.aq
  55. share.biodiversity.aq
      • download shared resources
      • reports, communication material
      • original datasets, tools, resources
  56. share.biodiversity.aq
  57. PIC
      • polarcommons.org
      • Emergency solution for orphan datasets
      • Setup of a commons
      • IT cloud
      • Set of norms
      • All polar data (IPY)
      • Simple procedure!
  58. www.polarcommons.org
  59. Future directions
  60. Architecture
      • A network of IPTs
      • Enhanced data flow
      • Community involved in data management
      • Enhanced interoperability
      • Optimization of research efforts/resources
      • Integrative, connected science
      • Factual, adaptive conservation
  61. Challenges ahead
      • Data intensive science
      • Data deluge
      • Digital divides
      • Other data types and integration
      • Orphan datasets
      • Cultural change
  62. Hands on now
  63. The rest of the day
      • Using the portals
      • Using data tools
      • templates
      • data validation
      • documentation
      • publishing
  64. http://share.biodiversity.aq/training/
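
The Darwin Core Archive slides describe a package made of a 'core' text data file plus a meta.xml that maps the columns of that file to Darwin Core terms. The Python sketch below illustrates that two-file layout using only the standard library. It is not the IPT, the spreadsheet processor or the DwC-A validator. The file name occurrence.txt, the helper names build_archive and read_archive, and the two sample rows are assumptions made up for the example, and the reader assumes a tab-delimited core file.

```python
import csv
import io
import zipfile
import xml.etree.ElementTree as ET

DWC_TEXT_NS = "http://rs.tdwg.org/dwc/text/"   # namespace of meta.xml
DWC_TERMS = "http://rs.tdwg.org/dwc/terms/"    # namespace of Darwin Core terms

# Hypothetical illustration rows, not real observations.
COLUMNS = ["occurrenceID", "scientificName", "decimalLatitude", "decimalLongitude"]
ROWS = [
    ["demo-1", "Euphausia superba", "-62.5", "-58.9"],
    ["demo-2", "Pygoscelis adeliae", "-66.7", "140.0"],
]


def q(tag):
    """Qualify a meta.xml tag name with the Darwin Core text namespace."""
    return "{%s}%s" % (DWC_TEXT_NS, tag)


def build_archive(core_name="occurrence.txt"):
    """Return the bytes of a two-file archive: meta.xml plus one core text file."""
    # Core data file: tab-delimited text with one header line.
    core = io.StringIO()
    writer = csv.writer(core, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(ROWS)

    # meta.xml: maps each column index of the core file to a Darwin Core term.
    ET.register_namespace("", DWC_TEXT_NS)
    archive = ET.Element(q("archive"))
    core_el = ET.SubElement(archive, q("core"), {
        "rowType": DWC_TERMS + "Occurrence",
        "fieldsTerminatedBy": "\\t",
        "linesTerminatedBy": "\\n",
        "ignoreHeaderLines": "1",
        "encoding": "UTF-8",
    })
    files = ET.SubElement(core_el, q("files"))
    ET.SubElement(files, q("location")).text = core_name
    ET.SubElement(core_el, q("id"), {"index": "0"})
    for index, name in enumerate(COLUMNS):
        ET.SubElement(core_el, q("field"),
                      {"index": str(index), "term": DWC_TERMS + name})

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("meta.xml", ET.tostring(archive, encoding="utf-8"))
        zf.writestr(core_name, core.getvalue())
    return buffer.getvalue()


def read_archive(data):
    """Use meta.xml to turn each core row into a {term name: value} dict."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        meta = ET.fromstring(zf.read("meta.xml"))
        core_el = meta.find(q("core"))
        location = core_el.find(q("files")).find(q("location")).text
        # Column index -> short term name (last segment of the term URI).
        terms = {int(f.get("index")): f.get("term").rsplit("/", 1)[-1]
                 for f in core_el.findall(q("field"))}
        skip = int(core_el.get("ignoreHeaderLines", "0"))
        text = zf.read(location).decode(core_el.get("encoding", "UTF-8"))
    # This sketch assumes tab-delimited rows rather than parsing fieldsTerminatedBy.
    rows = list(csv.reader(text.splitlines(), delimiter="\t"))[skip:]
    return [{terms[i]: value for i, value in enumerate(row) if i in terms}
            for row in rows]


if __name__ == "__main__":
    for record in read_archive(build_archive()):
        print(record)
```

Reading the archive back through meta.xml, rather than trusting the header line of the text file, mirrors the idea on the slides that the archive is self-documenting: the mapping file, not the column captions, says which Darwin Core term each column holds.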