Updates on Canadensys network activities, presented at the North America Regional Nodes Meeting of the Global Biodiversity Information Facility (GBIF), held at the Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario.
12. Contributions Made to GBIF
Darwin Core Archive Validator
Narwhal Processor
IPT Customization
Data Licensing
Digital Object Identifiers
Country Pages Requirements
Code Development
Consultations
17. https://github.com/gbif/dwca-validator
$ java -jar dwca-validator.jar -s http://ipt.iobis.org/obiscanada/archive.do?r=dfo_canadianais
CORE : 1698481
->WARNING,RECORD_CONTENT:The value 79:48 [sic] is not a numerical value
->WARNING,RECORD_CONTENT:The value 44:29 [sic] is not a numerical value
CORE : 1698482
->WARNING,RECORD_CONTENT:The value 79:31 [sic] is not a numerical value
->WARNING,RECORD_CONTENT:The value 44:35 [sic] is not a numerical value
CORE : 1002205
->WARNING,RECORD_CONTENT:The value -123.333º is not a numerical value
CORE : null
->ERROR,FIELD_UNIQUENESS:The value null was already used for term coreId
CORE : null
Darwin Core Archive Validator
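The numerical-value warnings in the output above can be illustrated with a minimal Python sketch. This is not the dwca-validator's actual code (the validator is a Java tool); the function names, the record layout, and the assumption that the flagged values sit in `decimalLongitude` and `decimalLatitude` are all hypothetical. Only the record ids and offending values are taken from the output shown.

```python
def is_numeric(value):
    """Return True if the string parses as a Python float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def check_numeric(records, fields):
    """Yield (record_id, message) for each non-blank, non-numeric value."""
    for record in records:
        for field in fields:
            value = record.get(field)
            if value in (None, ""):
                continue  # blank values are left to a separate check
            if not is_numeric(value):
                yield record["id"], f"The value {value} is not a numerical value"


# Sample records: ids and bad values come from the validator output above;
# the field names they are stored under are an assumption.
records = [
    {"id": "1698481", "decimalLongitude": "79:48", "decimalLatitude": "44:29"},
    {"id": "1002205", "decimalLongitude": "-123.333º"},
]

issues = list(check_numeric(records, ["decimalLongitude", "decimalLatitude"]))
for record_id, message in issues:
    print(f"CORE : {record_id}\n->WARNING: {message}")
```

Values such as `79:48` (degrees and minutes) or `-123.333º` (trailing degree sign) fail because they are not plain decimal numbers, which is the situation the warnings report.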
19. Collaborations
Developers & International Projects
17 Canadensys code projects
20 « forks »
Working collaborations with GBIF, France, Colombia, Brazil, French Guiana
25. « Developing the next generation of IPM tools – mobile applications for pest identification, monitoring and forecasting for sustainable and profitable crop production »
Dr. Barb Sharanowski
Mobile first
Occurrence observations
Real-time species distribution modelling
Identification keys
26. CFI Cyberinfrastructure Initiative
January 2015: Notice of intent (NOI). The CFI expects all institutions to identify collaboration with Compute Canada
June 2015: Full proposal due
27. Canadensys Wish List
1. Formal representation of NA research projects in GBIF Governance Model
2. Recognition of in-kind support
3. Shared development & hackathons
• accelerate delivery of generic solutions
• minimize duplication
Editor's Notes
What is Canadensys
History, funding, goals
DwC-A harvester: semi-automated from any IPT
-todo: ensure presence of field, not merely that a field has content or is empty
-todo: validate fields when they are split across fields (eg dates that are present in separate year, month, day fields)
-todo: evaluate a field based on content of other fields (eg is datum provided when there is a latitude and longitude)
-architecture is an evaluation chain, may have subchains
-500,000 records in 5s
-usable as an application, as a web site, or as a service that can be extended and reused
-core of the codebase is format agnostic, i.e. although designed for use with Darwin Core Archives as input, there's nothing in the code that precludes implementation elsewhere (e.g. integrated within the IPT to be used prior to data publication as a mechanism to ensure data quality)
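The "evaluation chain with subchains" idea and the cross-field todo items in the notes above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the project's implementation: `EvaluationChain`, `date_parts_valid`, and `datum_with_coordinates` are hypothetical names, and the record is modelled as a plain dictionary keyed by Darwin Core terms.

```python
import datetime
from dataclasses import dataclass, field
from typing import List


@dataclass
class EvaluationChain:
    """Runs rules and nested subchains in order, collecting messages."""
    name: str
    links: List[object] = field(default_factory=list)  # rules or subchains

    def add(self, link):
        self.links.append(link)
        return self  # allow fluent chaining of add() calls

    def evaluate(self, record):
        messages = []
        for link in self.links:
            if isinstance(link, EvaluationChain):
                messages.extend(link.evaluate(record))  # delegate to subchain
            else:
                message = link(record)  # a rule returns a message or None
                if message is not None:
                    messages.append(f"{self.name}: {message}")
        return messages


def date_parts_valid(record):
    """Cross-field rule: year, month and day must exist and form a real date."""
    try:
        datetime.date(int(record["year"]), int(record["month"]), int(record["day"]))
    except (KeyError, ValueError):
        return "year/month/day do not form a valid date"
    return None


def datum_with_coordinates(record):
    """Cross-field rule: coordinates should come with a geodetic datum."""
    has_coords = record.get("decimalLatitude") and record.get("decimalLongitude")
    if has_coords and not record.get("geodeticDatum"):
        return "coordinates given without geodeticDatum"
    return None


chain = (
    EvaluationChain("record")
    .add(EvaluationChain("date").add(date_parts_valid))
    .add(EvaluationChain("geo").add(datum_with_coordinates))
)

# Hypothetical record: 30 February is not a date, and no datum is supplied.
bad = {"year": "2014", "month": "02", "day": "30",
       "decimalLatitude": "44.5", "decimalLongitude": "-79.8"}
messages = chain.evaluate(bad)
print(messages)
```

Because each rule only sees a dictionary, nothing in this sketch depends on the input being a Darwin Core Archive, which is one way to read the "format agnostic" note.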
Numerical values
Invalid characters
Blank values
Unique values (occurrenceID)
Adherence to controlled vocabularies
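Three of the checks listed above (blank values, unique values, controlled vocabularies) can be sketched in a few lines of Python. The function names and message wording are hypothetical, the sample records are made up for illustration, and the `basisOfRecord` vocabulary shown is only a partial set of commonly recommended Darwin Core values.

```python
def check_blank(records, field):
    """Return indexes of records whose field is missing or empty."""
    return [i for i, r in enumerate(records) if not (r.get(field) or "").strip()]


def check_unique(records, field):
    """Return values of the field that occur more than once."""
    seen, duplicates = set(), []
    for record in records:
        value = record.get(field)
        if value in seen and value not in duplicates:
            duplicates.append(value)
        seen.add(value)
    return duplicates


def check_vocabulary(records, field, allowed):
    """Return (index, value) pairs whose non-blank value is not in the vocabulary."""
    return [
        (i, r[field])
        for i, r in enumerate(records)
        if r.get(field) and r[field] not in allowed
    ]


# Partial, illustrative vocabulary for basisOfRecord (an assumption, not exhaustive).
BASIS_OF_RECORD = {"PreservedSpecimen", "FossilSpecimen", "LivingSpecimen",
                   "HumanObservation", "MachineObservation"}

# Made-up sample records.
records = [
    {"occurrenceID": "a", "basisOfRecord": "PreservedSpecimen"},
    {"occurrenceID": "a", "basisOfRecord": "specimen"},
    {"occurrenceID": "", "basisOfRecord": "HumanObservation"},
]

blank = check_blank(records, "occurrenceID")
duplicates = check_unique(records, "occurrenceID")
off_vocabulary = check_vocabulary(records, "basisOfRecord", BASIS_OF_RECORD)
print(blank, duplicates, off_vocabulary)
```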