LABEL PROCESSING METHOD
HELPINGSCIENCE.ORG
Developed by SilverBiology, Michael Giddens
GOAL (REPEAT 60+ MILLION TIMES)


  From This                                To This
                                            • StateProvince: Arkansas
                                            • County: Bradley
                                            • Genus: Botrychium
                                            • SpecificEpithet: biternatum
                                            • Authorship: (Sav.) Underwood
                                            • Collector: Sherri Leslie, Kaylon Cornish
                                            • CollectorNumber: 593
                                            • DateCollected: 1984-09-23
                                            • TRS: Sec. 3, T12S, R9W


Species Lookup: http://ecat-dev.gbif.org/usage/2650191
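
As an illustration of the target structure, the parsed label above could be held in a small record type. This is only a sketch: the class name `LabelRecord`, the snake_case attribute names, and the `scientific_name` helper are assumptions made for the example, not part of the HelpingScience system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LabelRecord:
    """Hypothetical container for the fields captured from one label."""
    state_province: str
    county: str
    genus: str
    specific_epithet: str
    authorship: str
    collectors: tuple[str, ...]
    collector_number: str
    date_collected: date
    trs: str  # Township/Range/Section, kept as free text

    def scientific_name(self) -> str:
        # Genus + epithet + authorship, joined the way the label prints them.
        return f"{self.genus} {self.specific_epithet} {self.authorship}"


# Values copied from the example label on this slide.
record = LabelRecord(
    state_province="Arkansas",
    county="Bradley",
    genus="Botrychium",
    specific_epithet="biternatum",
    authorship="(Sav.) Underwood",
    collectors=("Sherri Leslie", "Kaylon Cornish"),
    collector_number="593",
    date_collected=date(1984, 9, 23),
    trs="Sec. 3, T12S, R9W",
)

print(record.scientific_name())
print(record.date_collected.isoformat())
```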
WORKFLOW

1. Identify Labels
2. OCR Labels
3. Identify Primary DwC Fields
4. Lexical Grouping & Verification
5. Enter Values for Fields
6. Accept Value for Each Field
7. Assemble Label Data
IDENTIFYING LABELS

Step 1
• Sign In
• Request Image
    Click & Drag
    Click & Drag
    <Enter>
    Repeat

• Average Rate: 300/hr per person
EVERNOTE OCR PROCESSING

Sample JPG label → JSON output

• $5 per 1 GB per month
• Approximate cost per label: $0.001
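
The two prices quoted here can be related with a little arithmetic. The sketch below works out what they imply; the labels-per-GB and average-image-size figures are derived for illustration (assuming 1 GB = 1,000,000 KB) and are not stated in the slides, and the 60 million total is taken from the goal slide.

```python
# Figures quoted in the slides.
PRICE_PER_GB_USD = 5.0       # "$5/1GB per month"
COST_PER_LABEL_USD = 0.001   # "~ label cost: $0.001"
TOTAL_LABELS = 60_000_000    # "60+ million" from the goal slide


def labels_per_gb(price_per_gb: float, cost_per_label: float) -> float:
    """How many labels fit in 1 GB if each label costs cost_per_label."""
    return price_per_gb / cost_per_label


def avg_label_size_kb(n_labels_per_gb: float) -> float:
    """Implied average image size, assuming 1 GB = 1,000,000 KB."""
    return 1_000_000 / n_labels_per_gb


def total_ocr_cost(total_labels: int, cost_per_label: float) -> float:
    """OCR cost for the whole backlog at the quoted per-label rate."""
    return total_labels * cost_per_label


per_gb = labels_per_gb(PRICE_PER_GB_USD, COST_PER_LABEL_USD)
print(f"labels per GB:       {per_gb:,.0f}")
print(f"implied size/label:  {avg_label_size_kb(per_gb):,.0f} KB")
print(f"OCR cost, all labels: ${total_ocr_cost(TOTAL_LABELS, COST_PER_LABEL_USD):,.0f}")
```

Under these assumptions the quoted rate corresponds to roughly 5,000 labels per GB, and OCR for 60 million labels would come to about $60,000.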
AFTER EVERNOTE
STEP 2) IDENTIFYING LABELS

What we capture.
• Scientific Information (Family, Genus, Species, Subspecies, Author)
• Collection Information (Name, Number, Date)
• Geographical Information (Country, State, County, Locality, Lat/Lon, TRS)
• Determination Information (Determiner, Scientific Name, Date Determined)
• Extra Information (Accession Number, Type Status)

What we leave on the label.
• Habitat Information
• Locality Description
• Collector Notes
• Other
LEXICAL GROUPING

Internal Step

• Compare words to the OCR value and, if the value is distinct, assign it to a lexical set.
• Send image to data entry.
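
One possible reading of this step is sketched below: each image's OCR value for a field is compared against a list of known words, images whose value matches a word are collected into that word's lexical set, and everything else is routed to data entry. The function name, the case-insensitive matching, and the sample image IDs are assumptions for illustration; the slides do not describe the actual matching rules.

```python
from collections import defaultdict


def group_by_ocr_value(ocr_values, known_words):
    """Split images into lexical sets and a data-entry queue.

    ocr_values:  mapping of image id -> OCR text for one field
    known_words: words the OCR text is compared against
    Returns (lexical_sets, needs_data_entry).
    """
    # Case-insensitive lookup back to the canonical spelling (an assumption).
    canonical = {word.casefold(): word for word in known_words}
    lexical_sets = defaultdict(list)
    needs_data_entry = []
    for image_id, value in ocr_values.items():
        match = canonical.get(value.strip().casefold())
        if match is not None:
            lexical_sets[match].append(image_id)
        else:
            needs_data_entry.append(image_id)
    return dict(lexical_sets), needs_data_entry


# Hypothetical OCR output for the Genus field of four images.
ocr_values = {
    "img1": "Asplenium",
    "img2": "asplenium ",
    "img3": "Asplenlum",   # OCR misread, matches no known word
    "img4": "Botrychium",
}
sets, queue = group_by_ocr_value(ocr_values, ["Asplenium", "Botrychium"])
print(sets)
print(queue)
```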
BULK VALIDATION

Internal Step

• Look at the value that will be assigned to the list of images. If any image does not have the correct value, move it to the manual data entry blacklist.
• Repeat

• Based on Lexical Groups
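
A minimal sketch of this review loop, assuming a reviewer is shown one lexical group at a time and flags the images that do not carry the proposed value. The function name `bulk_validate` and the sample image IDs are hypothetical.

```python
def bulk_validate(value, image_ids, rejected):
    """Apply one value to a group of images, minus reviewer rejections.

    value:     the value proposed for every image in the group
    image_ids: images in the lexical group
    rejected:  images the reviewer marked as not having that value
    Returns (accepted, blacklist): accepted maps image id -> value,
    blacklist lists images to send to manual data entry.
    """
    rejected = set(rejected)
    accepted = {img: value for img in image_ids if img not in rejected}
    blacklist = [img for img in image_ids if img in rejected]
    return accepted, blacklist


# Hypothetical group: three images proposed as "Asplenium", one rejected.
accepted, blacklist = bulk_validate(
    "Asplenium", ["img1", "img2", "img7"], rejected=["img7"]
)
print(accepted)
print(blacklist)
```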
DATA ENTRY

Public Step

• Multiple Interfaces
    Dates
    Lat/Lng
    Names
    Scientific Names

• Users receive virtual tokens to use in the store for every correct word.
DATA ENTRY VARIATIONS
FIELD VERIFICATION

                     Value         Frequency
Computer:            Asplenium
Volunteer 2:         Asplenium     Asplenium: 2
Volunteer 3:         Asplenlum     Asplenlum: 1
Volunteer 4:         Asplenium     Asplenium: 3
                                   Asplenium: 1

Points Earned:   Volunteer 2 & 4
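
The tally above resembles a majority vote across the computer (OCR) value and the volunteer entries. The sketch below assumes exactly that: the most frequent value is accepted and volunteers who entered it earn points. The function name and the tie-breaking behaviour (the first value seen wins a tie) are assumptions, not something the slides specify.

```python
from collections import Counter


def verify_field(computer_value, volunteer_values):
    """Pick the most frequent value and list the volunteers who entered it.

    computer_value:   value produced by OCR
    volunteer_values: mapping of volunteer name -> entered value
    Returns (accepted_value, frequency, point_winners).
    """
    counts = Counter([computer_value, *volunteer_values.values()])
    # most_common keeps first-seen order among equal counts.
    accepted, _ = counts.most_common(1)[0]
    winners = [name for name, v in volunteer_values.items() if v == accepted]
    return accepted, dict(counts), winners


# Entries from the slide.
accepted, frequency, winners = verify_field(
    "Asplenium",
    {
        "Volunteer 2": "Asplenium",
        "Volunteer 3": "Asplenlum",
        "Volunteer 4": "Asplenium",
    },
)
print(accepted)
print(frequency)
print(winners)
```

With the slide's entries this accepts "Asplenium" and awards points to Volunteers 2 and 4, matching the "Points Earned" line.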
OCCURRENCE DATA

• Export Formats
    CSV
    Darwin Core Archive
    Other on request

• Filters
    By any combination of Darwin Core fields

• RESTful web services
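
To illustrate filtering by a combination of fields followed by CSV export, here is a small sketch using Python's standard `csv` module. The function `export_csv`, the equality-only filter semantics, and the second sample record are hypothetical; the field names follow the spelling used on the goal slide rather than any official export schema.

```python
import csv
import io


def export_csv(records, fieldnames, filters=None):
    """Return CSV text for the records that match every filter.

    records:    list of dicts keyed by field name
    fieldnames: columns to write, in order
    filters:    optional mapping of field name -> required value
    """
    filters = filters or {}
    rows = [
        r for r in records
        if all(r.get(field) == value for field, value in filters.items())
    ]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


FIELDS = ["StateProvince", "County", "Genus", "SpecificEpithet"]
records = [
    # Values from the example label in the slides.
    {"StateProvince": "Arkansas", "County": "Bradley",
     "Genus": "Botrychium", "SpecificEpithet": "biternatum"},
    # Made-up record, included only so the filter has something to exclude.
    {"StateProvince": "Arkansas", "County": "Drew",
     "Genus": "Asplenium", "SpecificEpithet": "platyneuron"},
]

csv_text = export_csv(records, FIELDS, {"StateProvince": "Arkansas", "County": "Bradley"})
print(csv_text)
```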
SUSTAINABILITY

• HelpingScience depends on a symbiotic relationship between collections, which provide specimen sheets, and volunteers, who perform data entry.
• Volunteers are given HS Tokens, to be used in the HS Store, in exchange for their time.
• The store is funded by a percentage of the cost per label, which is given back to the community.

The Store
• Fundraisers
    Micro loans given to botany undergraduate students for research
    Sponsorships for students to attend scientific conferences
    K-12 equipment funding for science departments
• Charitable Organizations
• Fund Small Herbaria Digitization
HELPINGSCIENCE.ORG

If you manage a collection and are interested in testing or processing labels, please contact:

mikegiddens@silverbiology.com

www.SilverBiology.com
