Taxonomy Assessments - Part Two
February 9, 2012




                                  Access Innovations, Inc.
             Leveraging Your Content Semantically
                                             Jay Ven Eman, Ph.D., CEO
                                                  j_ven_eman@accessinn.com
                                                      www.accessinn.com
                                                     www.dataharmony.com
                                                        +1.505.998.0800
                                                       Albuquerque, NM




© 2012. Access Innovations, Inc. All rights reserved.
Indexing
     Subject term assignment
     Permanent meta-data to indexed object
     Used for retrieval and evaluation
     Processes
      •     Manual
            •     Publisher
            •     3rd party aggregators
            •     Authors
      •     Automated methods


Integration / workflow
     Content sources
      •     Author submission system
      •     Books
      •     Conference proceedings
      •     Web sites
      •     Etc.
     Data Harmony MAIstro Server (classification system)
      •     Thesaurus Master
      •     M.A.I.
     Connections: APIs, client/server, Web services, HTTP-TCP/IP
     Content repositories
      •     Content Repository “A” or intermediate processes
      •     Content Repository “B”, etc.
Select the document collection
                                                                 CMS



     Please select the database and the document directory to load
CMS




Sample unstructured document




Run the documents through a metadata extraction
process to create well-formed, rich XML




                                                       • Automatic (per doc template)
                                                       • E.g. Dublin Core Metadata
                                                       • Bibliographic citation
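
As a minimal sketch of the kind of well-formed XML such an extraction step might emit, the snippet below builds a small Dublin Core record with Python's standard library. The field values are taken from the sample document used elsewhere in this deck; the wrapper element name `record` and the function name `build_dc_record` are assumptions for illustration, not Data Harmony output.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def build_dc_record(title, creator, date, subjects):
    """Return a Dublin Core style record as an XML string."""
    record = ET.Element("record")  # hypothetical wrapper element
    for tag, value in (("title", title), ("creator", creator), ("date", date)):
        ET.SubElement(record, f"{{{DC_NS}}}{tag}").text = value
    for subject in subjects:
        ET.SubElement(record, f"{{{DC_NS}}}subject").text = subject
    return ET.tostring(record, encoding="unicode")


xml_text = build_dc_record(
    "Solving the Challenge", "Jay Ven Eman", "09-14-11", ["Indexing", "Thesauri"]
)
print(xml_text)
```

A per-template extractor would fill these arguments from the document itself rather than from literals.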




Automatically add the taxonomy
terms




                                                    Entity extraction: People,
                                                      Places, Things
                                                    Conceptual indexing: using the
                                                      taxonomy
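
Conceptual indexing against a taxonomy can be illustrated with a deliberately simple phrase matcher. This is a sketch only: the toy taxonomy, its synonyms, and the function name `suggest_terms` are made up for illustration, and a production rule base would go well beyond literal matching.

```python
import re

# Toy taxonomy: preferred term -> synonyms (all hypothetical).
TAXONOMY = {
    "Indexing": ["indexing", "subject term assignment"],
    "Thesauri": ["thesaurus", "thesauri"],
    "Classification": ["classification", "categorization"],
}


def suggest_terms(text, taxonomy=TAXONOMY):
    """Return preferred terms whose synonyms occur in the text as whole phrases."""
    lowered = text.lower()
    found = []
    for preferred, synonyms in taxonomy.items():
        if any(re.search(rf"\b{re.escape(s)}\b", lowered) for s in synonyms):
            found.append(preferred)
    return found


print(suggest_terms("The process of indexing a content object begins with a thesaurus."))
```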




Classification Process or Assigned Indexing
     Unstructured
       09-14-11
       “Solving the Challenge”
       By Jay Ven Eman
       The process of indexing a content object begins with…

     Structured
       <Anchor><Date>09-14-11</Date>
       <TI>“Solving the Challenge”</TI>
       <BLH>By</BLH>
       <Author>
       <AU_FN>Jay</AU_FN>
       <AU_MI></AU_MI>
       <AU_LN>Ven Eman</AU_LN>
       </Author>
       <Body>The process of indexing a content object begins with…</Body>

       <Subject>Indexing</Subject>
       <Subject>Thesauri</Subject>
       <Subject>Standards</Subject>
       <Subject>Classification</Subject>
       </Anchor>

     Data Harmony MAIstro Server (classification system)
      •     Thesaurus Master
      •     M.A.I.
     Content Repository, e.g. database
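
Once a document is in the structured form shown on this slide, downstream systems can read the assigned subject terms directly. A minimal sketch using Python's standard library, with the slide's sample record as input (the names `SAMPLE` and `read_record` are illustrative):

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Anchor><Date>09-14-11</Date>
<TI>“Solving the Challenge”</TI>
<BLH>By</BLH>
<Author>
<AU_FN>Jay</AU_FN>
<AU_MI></AU_MI>
<AU_LN>Ven Eman</AU_LN>
</Author>
<Body>The process of indexing a content object begins with…</Body>
<Subject>Indexing</Subject>
<Subject>Thesauri</Subject>
<Subject>Standards</Subject>
<Subject>Classification</Subject>
</Anchor>"""


def read_record(xml_text):
    """Extract the author's name and the subject terms from a tagged record."""
    root = ET.fromstring(xml_text)
    author = root.find("Author")
    parts = (author.findtext(tag) for tag in ("AU_FN", "AU_MI", "AU_LN"))
    name = " ".join(part for part in parts if part)  # skip the empty middle initial
    subjects = [s.text for s in root.findall("Subject")]
    return name, subjects


name, subjects = read_record(SAMPLE)
print(name, subjects)
```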
Indexing
     Indexing measures
      •     Indexing experts
      •     Subject matter experts (SME)
      •     Hits, misses, & noise
      •     85% hits
     In conjunction with taxonomy measures
      •     Over & under used terms
      •     Over & under indexed content
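
One way to spot over- and under-used terms is to count how often each taxonomy term is assigned across the indexed collection. The sketch below does this with a counter; the sample assignments and the cut-off values `low` and `high` are arbitrary assumptions chosen for illustration.

```python
from collections import Counter


def term_usage(assignments, taxonomy_terms, low=1, high=3):
    """Split taxonomy terms into under-used (< low) and over-used (> high)."""
    counts = Counter(term for terms in assignments.values() for term in terms)
    under = sorted(t for t in taxonomy_terms if counts[t] < low)
    over = sorted(t for t in taxonomy_terms if counts[t] > high)
    return under, over


assignments = {  # document id -> assigned terms (made-up data)
    "doc1": ["Indexing", "Thesauri"],
    "doc2": ["Indexing"],
    "doc3": ["Indexing", "Standards"],
    "doc4": ["Indexing"],
}
taxonomy_terms = ["Classification", "Indexing", "Standards", "Thesauri"]
under, over = term_usage(assignments, taxonomy_terms)
print("under-used:", under)
print("over-used:", over)
```

The same counting, done per document instead of per term, would flag over- and under-indexed content.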



Indexing & Search Metrics
     Hit, Miss, Noise
     Subjective
      •     Relevance
      •     Aboutness
     Statistical
      •     Precision
      •     Recall
      •     Level of effort
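
Precision and recall can be computed from hit, miss, and noise counts if one assumes the usual mapping: hits are true positives, noise is false positives, and misses are false negatives. That mapping is an assumption here, not something the slides state. A small sketch with made-up counts:

```python
def precision_recall(hits, misses, noise):
    """Precision = hits / (hits + noise); recall = hits / (hits + misses)."""
    assigned = hits + noise      # everything the system assigned
    expected = hits + misses     # everything the human indexer would assign
    precision = hits / assigned if assigned else 0.0
    recall = hits / expected if expected else 0.0
    return precision, recall


# Made-up counts for illustration.
p, r = precision_recall(hits=85, misses=15, noise=10)
print(f"precision={p:.2f} recall={r:.2f}")
```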



Hit, Miss, Noise
     Hit – exactly what a human indexer would use
     Miss – human indexer would use, but system
      did not assign
     Noise – system assigned, but human did not
      •     Relevant noise – could have been assigned
      •     Irrelevant noise – just plain wrong
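
The three categories reduce to set operations on the terms a human indexer chose and the terms the system assigned. A minimal sketch with made-up term lists; splitting noise into relevant and irrelevant still requires human review, so it is not attempted here.

```python
def compare_indexing(human_terms, system_terms):
    """Classify system output against a human indexer's choices."""
    human, system = set(human_terms), set(system_terms)
    return {
        "hits": sorted(human & system),    # both chose the term
        "misses": sorted(human - system),  # human chose it, system did not
        "noise": sorted(system - human),   # system chose it, human did not
    }


result = compare_indexing(
    human_terms=["Indexing", "Thesauri", "Standards"],
    system_terms=["Indexing", "Thesauri", "Taxonomies"],
)
print(result)
```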




Subjective
     Relevance
      •     Reflects how akin it is to the user's request
     “Aboutness”
      •     Reflects the topical match between the document
            content and the term
      •     How well the topic describes what the document is
            about
     Varies with level of conceptual terms vs. factual
      terms in the thesaurus




Indexing
     All content types & sources
      •     Inventory control
      •     Everything in, everything out
     Document types
      •     Articles
      •     Proceedings
      •     Corporate




Link to Community Resources
(Source: Helen Atkins, AACR)
     Journal article on Topic A, linked to:
      •     Other journal articles on Topic A
      •     CME activity on Topic A
      •     Upcoming conference on Topic A
      •     Job posting for expert on Topic A
      •     Podcast interview with researcher working on Topic A
      •     Author networks, social networking, SME – Topic A
      •     Grant available for researchers working on Topic A
Indexing with Data Harmony® M.A.I.™
     Rule base development
      •     80/20 rule
      •     Indexing objectives
     GUI
     Time-to-market
      •     Level of effort to build
      •     Level of effort to maintain
      •     Less than all other alternatives when
            indexing for high precision & recall


Updating Rule Base
     Automatic for matching rules when using
      Data Harmony MAIstro™
     80/20 rule
     Re-index when 5% to 10% of the taxonomy has
      changed – arbitrary ranges:
      •     Monthly with small databases – 5k to 20k
      •     Quarterly with medium – 20k to 1 million
      •     Annual with large – greater than 1 million
     Depends on search software, too
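
The rule of thumb above can be written down as a small helper. The thresholds are the slide's own, which it calls arbitrary; treating the database sizes as record counts, the handling of boundary values, and the function names are assumptions for illustration.

```python
def reindex_interval(record_count):
    """Map database size to the suggested re-indexing interval."""
    if record_count > 1_000_000:
        return "annual"
    if record_count > 20_000:
        return "quarterly"
    return "monthly"  # small databases (the slide cites 5k to 20k)


def should_reindex(changed_terms, total_terms, threshold=0.05):
    """True once the changed share of the taxonomy reaches the threshold."""
    if total_terms <= 0:
        raise ValueError("total_terms must be positive")
    return changed_terms / total_terms >= threshold


print(reindex_interval(15_000), should_reindex(60, 1_000))
```

Databases below the slide's 5k lower bound fall into the monthly bucket in this sketch, which is a simplification.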

NAMES




What’s in a name?
     Juliet:
      "What's in a name? That which we call a rose
      By any other name would smell as sweet."
     Romeo and Juliet (II, ii, 1-2)




Magnitude of the Problem:
Facebook - 700 Million Users Projected for 2011 (Open-First)




         700 Million Names

        How will your boss, peers,
        anyone ever find you?


What’s in a name?
     My name
      •     Jay Ven Eman
      •     Ven Eman, Jay
      <First_Name>Jay</First_Name>
      <Last_Name>Ven Eman</Last_Name>
     Name variants
      •     Jay Von Eman
      •     Jay Van Eman
      •     Jay van Eman
      •     Jay ven Eman
      •     Jay Veneman
      •     Jay Venema
     Aliases
      •     William Henry McCarty
      •     Henry Antrim
      •     William H. Bonney
      •     Billy the Kid
     National & Cultural Conventions
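
Some of the variants on this slide differ only in case or spacing, others by a single letter. The sketch below shows one way candidate variants might be grouped, using a normalized key plus a similarity ratio from Python's `difflib`. The 0.8 threshold and the function names are assumptions, and string similarity alone cannot connect aliases such as Billy the Kid to a legal name.

```python
from difflib import SequenceMatcher


def name_key(name):
    """Lowercase and drop spaces so 'Jay van Eman' and 'Jay Van Eman' collide."""
    return "".join(name.lower().split())


def likely_same(a, b, threshold=0.8):
    """Heuristic: similarity of the normalized names is at or above the threshold."""
    return SequenceMatcher(None, name_key(a), name_key(b)).ratio() >= threshold


canonical = "Jay Ven Eman"
for variant in ["Jay Von Eman", "Jay van Eman", "Jay Veneman", "Billy the Kid"]:
    print(variant, likely_same(canonical, variant))
```

A heuristic like this only proposes candidates; an editor or an authority file still has to confirm that two strings refer to the same person.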
Names
     Computationally & editorially intense
     Author submissions
     Membership records & the like
     Industry initiatives – ORCID, VIVO
     Subject term disambiguation
     Inventory control basics apply here, too
     Difficulty level is high
     Constant maintenance needed


Thank you! Questions?

More Related Content

Similar to Taxonomy Assessments - Part Two

Taxonomies for Publishing
Taxonomies for PublishingTaxonomies for Publishing
Taxonomies for PublishingTSoholt
 
SharePoint Taxonomy and Metadata 11-19-09
SharePoint Taxonomy and Metadata 11-19-09SharePoint Taxonomy and Metadata 11-19-09
SharePoint Taxonomy and Metadata 11-19-09Stephanie Lemieux
 
“It’s not rocket science!” Applying CMS and semantic enrichment to transform...
“It’s not rocket science!”  Applying CMS and semantic enrichment to transform...“It’s not rocket science!”  Applying CMS and semantic enrichment to transform...
“It’s not rocket science!” Applying CMS and semantic enrichment to transform...Sarah Silveri, RSI Content Solutions
 
10 mistakes when moving to topic-based authoring
10 mistakes when moving to topic-based authoring10 mistakes when moving to topic-based authoring
10 mistakes when moving to topic-based authoringSharon Burton
 
Business Objects....is it LOV?
Business Objects....is it LOV?Business Objects....is it LOV?
Business Objects....is it LOV?Terry Smith
 
Don't Re-write Code to Get Better Analytics
Don't Re-write Code to Get Better AnalyticsDon't Re-write Code to Get Better Analytics
Don't Re-write Code to Get Better AnalyticsSplunk
 
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...Dr. Haxel Consult
 
Elsevier Smart Content LDR SemTech NYC Oct-17-2012
Elsevier Smart Content LDR SemTech NYC Oct-17-2012Elsevier Smart Content LDR SemTech NYC Oct-17-2012
Elsevier Smart Content LDR SemTech NYC Oct-17-2012Alan Yagoda
 
Why I teach Content Strategy in Information Architecture
Why I teach Content Strategy in Information ArchitectureWhy I teach Content Strategy in Information Architecture
Why I teach Content Strategy in Information ArchitectureMisty Weaver
 
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...Dr. Haxel Consult
 
Better front-end development in Atlassian plugins
Better front-end development in Atlassian pluginsBetter front-end development in Atlassian plugins
Better front-end development in Atlassian pluginsAtlassian
 
Taxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureTaxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureAccess Innovations, Inc.
 
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?TCUK Conference
 
Enforcing SharePoint Governance
Enforcing SharePoint GovernanceEnforcing SharePoint Governance
Enforcing SharePoint GovernanceRandy Williams
 
FatWire Tutorial For Site Studio Developers
FatWire Tutorial For Site Studio DevelopersFatWire Tutorial For Site Studio Developers
FatWire Tutorial For Site Studio DevelopersBrian Huff
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebAmit Sheth
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebAmit Sheth
 

Similar to Taxonomy Assessments - Part Two (20)

Taxonomies for Publishing
Taxonomies for PublishingTaxonomies for Publishing
Taxonomies for Publishing
 
SharePoint Taxonomy and Metadata 11-19-09
SharePoint Taxonomy and Metadata 11-19-09SharePoint Taxonomy and Metadata 11-19-09
SharePoint Taxonomy and Metadata 11-19-09
 
“It’s not rocket science!” Applying CMS and semantic enrichment to transform...
“It’s not rocket science!”  Applying CMS and semantic enrichment to transform...“It’s not rocket science!”  Applying CMS and semantic enrichment to transform...
“It’s not rocket science!” Applying CMS and semantic enrichment to transform...
 
10 mistakes when moving to topic-based authoring
10 mistakes when moving to topic-based authoring10 mistakes when moving to topic-based authoring
10 mistakes when moving to topic-based authoring
 
Business Objects....is it LOV?
Business Objects....is it LOV?Business Objects....is it LOV?
Business Objects....is it LOV?
 
Don't Re-write Code to Get Better Analytics
Don't Re-write Code to Get Better AnalyticsDon't Re-write Code to Get Better Analytics
Don't Re-write Code to Get Better Analytics
 
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
 
Elsevier Smart Content LDR SemTech NYC Oct-17-2012
Elsevier Smart Content LDR SemTech NYC Oct-17-2012Elsevier Smart Content LDR SemTech NYC Oct-17-2012
Elsevier Smart Content LDR SemTech NYC Oct-17-2012
 
Why I teach Content Strategy in Information Architecture
Why I teach Content Strategy in Information ArchitectureWhy I teach Content Strategy in Information Architecture
Why I teach Content Strategy in Information Architecture
 
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...
 
Better front-end development in Atlassian plugins
Better front-end development in Atlassian pluginsBetter front-end development in Atlassian plugins
Better front-end development in Atlassian plugins
 
Taxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureTaxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information Architecture
 
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?
 
(27.05) MOSSCA Invita - Búsqueda empresarial 2
(27.05) MOSSCA Invita - Búsqueda empresarial 2(27.05) MOSSCA Invita - Búsqueda empresarial 2
(27.05) MOSSCA Invita - Búsqueda empresarial 2
 
(28/05) MOSSCA Invita - Administración de Contenido Empresarial
(28/05) MOSSCA Invita - Administración de Contenido Empresarial(28/05) MOSSCA Invita - Administración de Contenido Empresarial
(28/05) MOSSCA Invita - Administración de Contenido Empresarial
 
Enforcing SharePoint Governance
Enforcing SharePoint GovernanceEnforcing SharePoint Governance
Enforcing SharePoint Governance
 
Alfresco content model
Alfresco content modelAlfresco content model
Alfresco content model
 
FatWire Tutorial For Site Studio Developers
FatWire Tutorial For Site Studio DevelopersFatWire Tutorial For Site Studio Developers
FatWire Tutorial For Site Studio Developers
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic Web
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic Web
 

More from Access Innovations, Inc.

Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy ResultsMaking AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy ResultsAccess Innovations, Inc.
 
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8Access Innovations, Inc.
 
Hindawi taxonomy and personalization 27.10 (1)
Hindawi taxonomy and personalization 27.10 (1)Hindawi taxonomy and personalization 27.10 (1)
Hindawi taxonomy and personalization 27.10 (1)Access Innovations, Inc.
 
Asco using ai-taxos-for meta-titles-february-2021
Asco using ai-taxos-for meta-titles-february-2021Asco using ai-taxos-for meta-titles-february-2021
Asco using ai-taxos-for meta-titles-february-2021Access Innovations, Inc.
 
Ai webinar 2 -what's in a name (consolidated pdf)
Ai webinar 2 -what's in a name (consolidated pdf)Ai webinar 2 -what's in a name (consolidated pdf)
Ai webinar 2 -what's in a name (consolidated pdf)Access Innovations, Inc.
 
Tagging overview - Why Keywords Don't Cut It
Tagging overview  - Why Keywords Don't Cut ItTagging overview  - Why Keywords Don't Cut It
Tagging overview - Why Keywords Don't Cut ItAccess Innovations, Inc.
 
DHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository InteroperabilityDHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository InteroperabilityAccess Innovations, Inc.
 
DHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project FundedDHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project FundedAccess Innovations, Inc.
 

More from Access Innovations, Inc. (20)

Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy ResultsMaking AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
 
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
 
Smart submit
Smart submitSmart submit
Smart submit
 
Plos taxonomy beyond search dhug 2021
Plos taxonomy beyond search   dhug 2021Plos taxonomy beyond search   dhug 2021
Plos taxonomy beyond search dhug 2021
 
Hindawi taxonomy and personalization 27.10 (1)
Hindawi taxonomy and personalization 27.10 (1)Hindawi taxonomy and personalization 27.10 (1)
Hindawi taxonomy and personalization 27.10 (1)
 
Data harmonycloudpowerpointclientfacing
Data harmonycloudpowerpointclientfacingData harmonycloudpowerpointclientfacing
Data harmonycloudpowerpointclientfacing
 
Data harmony update 2021
Data harmony update 2021 Data harmony update 2021
Data harmony update 2021
 
Atypon dhug2021
Atypon dhug2021Atypon dhug2021
Atypon dhug2021
 
Asco using ai-taxos-for meta-titles-february-2021
Asco using ai-taxos-for meta-titles-february-2021Asco using ai-taxos-for meta-titles-february-2021
Asco using ai-taxos-for meta-titles-february-2021
 
Asce more than just topic taxonomies
Asce more than just topic taxonomiesAsce more than just topic taxonomies
Asce more than just topic taxonomies
 
Acs discoverability-dhug2021
Acs discoverability-dhug2021Acs discoverability-dhug2021
Acs discoverability-dhug2021
 
Ai webinar 2 -what's in a name (consolidated pdf)
Ai webinar 2 -what's in a name (consolidated pdf)Ai webinar 2 -what's in a name (consolidated pdf)
Ai webinar 2 -what's in a name (consolidated pdf)
 
Tagging overview - Why Keywords Don't Cut It
Tagging overview  - Why Keywords Don't Cut ItTagging overview  - Why Keywords Don't Cut It
Tagging overview - Why Keywords Don't Cut It
 
Health Affairs - Why Keywords Don't Cut It
Health Affairs - Why Keywords Don't Cut ItHealth Affairs - Why Keywords Don't Cut It
Health Affairs - Why Keywords Don't Cut It
 
Why Keywords Don't Cut It
Why Keywords Don't Cut ItWhy Keywords Don't Cut It
Why Keywords Don't Cut It
 
Data Harmony update 2020 final
Data Harmony update 2020 finalData Harmony update 2020 final
Data Harmony update 2020 final
 
Data Harmony Update 2020 final
Data Harmony Update 2020 finalData Harmony Update 2020 final
Data Harmony Update 2020 final
 
DHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository InteroperabilityDHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository Interoperability
 
DHUG 2018 - Florida Thesis OCR
DHUG 2018 - Florida Thesis OCRDHUG 2018 - Florida Thesis OCR
DHUG 2018 - Florida Thesis OCR
 
DHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project FundedDHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
 

Recently uploaded

Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)cama23
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxleah joy valeriano
 

Recently uploaded (20)

Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
Taxonomy Assessments - Part Two

  • 1. Taxonomy Assessments - Part Two. February 9, 2012. Access Innovations, Inc.: Leveraging Your Content Semantically. Jay Ven Eman, Ph.D., CEO; j_ven_eman@accessinn.com; www.accessinn.com; www.dataharmony.com; +1.505.998.0800; Albuquerque, NM. © 2012. Access Innovations, Inc. All rights reserved.
  • 2. Indexing. Subject term assignment. Permanent metadata attached to the indexed object. Used for retrieval and evaluation. Processes: manual (publisher; 3rd-party aggregators; authors); automated methods.
  • 3. Integration / workflow (diagram). Labels: Author Submission System; Books; Conference Proceedings; etc.; APIs, Client/Server, Web Services, HTTP-TCP/IP; Content Repository “A” or Intermediate Processes; Content Repository “B”, etc.; Thesaurus Master; M.A.I.; Data Harmony MAIstro; Server; Classification System; Web Sites.
  • 4. Select the document collection. CMS. Please select the database and the document directory to load.
  • 5. CMS
  • 6. Sample unstructured document
  • 7. Run the documents through a metadata extraction process to create well-formed, rich XML: automatic (per document template); e.g. Dublin Core metadata; bibliographic citation.
  • 8. Automatically add the taxonomy terms. Entity extraction: people, places, things. Conceptual indexing: using the taxonomy.
  • 9. Classification Process or Assigned Indexing. Unstructured: 09-14-11; “Solving the Challenge”; By Jay Ven Eman; The process of indexing a content object begins with… Structured: <Anchor><Date>09-14-11</Date><TI>“Solving the Challenge”</TI><BLH>By</BLH><Author><AU_FN>Jay</AU_FN><AU_MI></AU_MI><AU_LN>Ven Eman</AU_LN></Author><Body>The process of indexing a content object begins with…</Body><Subject>Indexing</Subject><Subject>Thesauri</Subject><Subject>Standards</Subject><Subject>Classification</Subject></Anchor> Diagram labels: Thesaurus Master; M.A.I.; Data Harmony MAIstro; Server; Content Repository, e.g. Database; Classification System.
  • 10. Indexing. Indexing measures: indexing experts; subject matter experts (SME); hits, misses, & noise; 85% hits. In conjunction with taxonomy measures: over- & under-used terms; over- & under-indexed content.
  • 11. Indexing & Search Metrics. Hit, Miss, Noise. Subjective: relevance; aboutness. Statistical: precision; recall; level of effort.
  • 12. Hit, Miss, Noise. Hit – exactly what a human indexer would use. Miss – human indexer would use, but system did not assign. Noise – system assigned, but human did not. Relevant noise – could have been assigned. Irrelevant noise – just plain wrong.
  • 13. Subjective. Relevance: reflects how akin it is to the user's request. “Aboutness”: reflects the topical match between the document content and the term, i.e. how well the topic describes what the document is about. Varies with the level of conceptual terms vs. factual terms in the thesaurus.
  • 14. Indexing. All content types & sources: inventory control; everything in, everything out. Document types: articles; proceedings; corporate.
  • 15. Link to Community Resources (Source: Helen Atkins, AACR). Diagram labels: CME Activity on Topic A; Upcoming Conference on Topic A; Other Journal Articles on Topic A; Job Posting for Expert on Topic A; Journal Article on Topic A; Grant Available for Researchers Working on Topic A; Podcast Interview with Researcher Working on Topic A; Author Networks; Social Networking; SME – Topic A.
  • 16. Indexing with Data Harmony® M.A.I.™ Rule base development: 80/20 rule; indexing objectives. GUI. Time-to-market: level of effort to build; level of effort to maintain; less than all other alternatives when indexing for high precision & recall.
  • 17. Updating Rule Base. Automatic for matching rules when using Data Harmony MAIstro™. 80/20 rule. Re-index when 5% to 10% of the taxonomy has changed – arbitrary ranges: monthly with small databases (5k to 20k); quarterly with medium (20k to 1 million); annually with large (greater than 1 million). Depends on search software, too.
  • 18. NAMES
  • 19. What’s in a name? Juliet: "What's in a name? That which we call a rose / By any other name would smell as sweet." Romeo and Juliet (II, ii, 1-2)
  • 20. [No slide text]
  • 21. Magnitude of the Problem: Facebook - 700 Million Users Projected for 2011 (Open-First). 700 million names. How will your boss, peers, anyone ever find you?
  • 22. What’s in a name? My name: Jay Ven Eman; Ven Eman, Jay; <First_Name>Jay</First_Name> <Last_Name>Ven Eman</Last_Name>. Name variants: Jay Von Eman; Jay Van Eman; Jay van Eman; Jay ven Eman; Jay Veneman; Jay Venema. Aliases: William Henry McCarty; Henry Antrim; William H. Bonney; Billy the Kid. National & cultural conventions.
  • 23. Names. Computationally & editorially intense. Author submissions. Membership records & the like. Industry initiatives – ORCID, VIVO. Subject term disambiguation. Inventory control basics apply here, too. Difficulty level is high. Constant maintenance needed.
  • 24. Taxonomy Assessments - Part Two. February 9, 2012. Thank you! Questions? Access Innovations, Inc.: Leveraging Your Content Semantically. Jay Ven Eman, Ph.D., CEO; j_ven_eman@accessinn.com; www.accessinn.com; www.dataharmony.com; +1.505.998.0800; Albuquerque, NM.
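
The hit, miss, and noise measures from slides 11 and 12 can be illustrated with a short sketch. It assumes the human indexer's terms are the reference set and that precision and recall follow their usual definitions (hits over system-assigned terms, hits over human-assigned terms). The function name `score_indexing` and the sample terms are hypothetical and are not part of Data Harmony or M.A.I.

```python
def score_indexing(human_terms, system_terms):
    """Compare system-assigned subject terms against a human indexer's terms.

    Assumption: the human indexer's set is treated as the reference.
    """
    human = set(human_terms)
    system = set(system_terms)

    hits = human & system      # assigned by both human and system
    misses = human - system    # human would use, system did not assign
    noise = system - human     # system assigned, human did not

    # Standard definitions; 0.0 is returned when a denominator is empty.
    precision = len(hits) / len(system) if system else 0.0
    recall = len(hits) / len(human) if human else 0.0

    return {
        "hits": hits,
        "misses": misses,
        "noise": noise,
        "precision": precision,
        "recall": recall,
    }


if __name__ == "__main__":
    # Hypothetical example: four human terms, four system terms, three shared.
    result = score_indexing(
        ["Indexing", "Thesauri", "Standards", "Classification"],
        ["Indexing", "Thesauri", "Standards", "Metadata"],
    )
    print(sorted(result["hits"]), sorted(result["misses"]), sorted(result["noise"]))
    print(result["precision"], result["recall"])
```

Splitting noise into relevant and irrelevant, as slide 12 does, still needs an editorial judgment per term; the set arithmetic above cannot make that call.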

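The name-variant problem on slide 22 can be made concrete with a minimal sketch. It assumes a deliberately crude match key: inverted "Last, First" forms are reordered, case is ignored, and spaces and punctuation are dropped. The helpers `name_key` and `group_variants` are hypothetical illustrations, not how ORCID, VIVO, or Data Harmony resolve names.

```python
import re
from collections import defaultdict


def name_key(name):
    """Return a crude match key for a personal name (illustrative only)."""
    # Turn the inverted form "Last, First" into "First Last".
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    # Lowercase, then drop everything except the letters a-z.
    # Limitation: accented and non-Latin letters are dropped as well.
    return re.sub(r"[^a-z]", "", name.lower())


def group_variants(names):
    """Group names that share the same match key, keeping input order."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    variants = [
        "Jay Ven Eman",
        "Ven Eman, Jay",
        "Jay ven Eman",
        "Jay Veneman",
        "Jay Von Eman",
        "Jay Venema",
    ]
    for key, members in group_variants(variants).items():
        print(key, members)
```

The key collapses differences in word order, case, and spacing, but "Jay Von Eman" and "Jay Venema" still land in separate groups, and an alias such as "Billy the Kid" shares no key with "William Henry McCarty". Cases like these need a curated authority file or editorial review, which fits the slide's point that names are computationally and editorially intense.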
Editor's Notes

  1. PDF
  2. Post processing. “Labels” content item. But also classifies author.
  3. Thanks to Helen Atkins of AACR for this illustration. The real power of this is that the links can all go in all directions, so we take advantage of having the user’s attention regardless of how they step into our “web”. Continuing Medical Education (CME).
  4. Johnny Carson