Using MAI to
Filter News Data
NewsIndexer –
a case study in filtering
       Filters / categorizes / tags news content
       Manages massive information flow
       Based on Thesaurus Master and M.A.I.
        Specialized thesaurus
        Specialized rulebase
NewsIndexer’s vocabulary
   Broad and general subject matter
   Reflects coverage of typical news publications
   Over 5200 terms, nine levels deep
       Six top level categories
       Geographic terms
   Starter vocabulary
   Easily adapted and customized
NewsIndexer’s brain
   M.A.I. rulebase customized for news topics
   Words in text trigger M.A.I. rules
   Conditions in rules determine precise
    taxonomy term(s) to apply
       Rules capture human knowledge and analysis
       Rules use context to distinguish between
        homographs
                            Chicago Bears
                            Bear market
                            Bears in the woods
Why filter?
   Reduce noise to enhance retrieval precision
   Disambiguate homographs to increase accuracy
   Limit unnecessary detail to reduce data flow
   Direct data to targeted recipients
Filter to cut noise
   M.A.I. suggests terms as directed by rules
   Index with most specific appropriate terms
   Result: precision and accuracy in retrieval
Filter to disambiguate
   Common words used with very different
    meanings in different contexts
       Utilities –
         electricity / water / sewer?
         utility software?
       Architecture –
         of buildings?
         of computer systems?
   M.A.I. rule conditions differentiate concepts
       Information Architect doesn’t want to retrieve
        building blueprints
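
The idea of rule conditions that use context to separate homographs can be sketched roughly as follows. This is a minimal illustration in Python, not M.A.I. rule syntax: the `RULES` table, the context word lists, the function name `suggest_terms`, and every term name other than "Chicago Bears" are assumptions made up for the example.

```python
import re

# Hypothetical rule table: trigger word -> list of (context words, term to apply).
# This is NOT M.A.I. rule syntax; it only illustrates context-based conditions.
RULES = {
    "architecture": [
        ({"building", "buildings", "blueprint", "blueprints"}, "Architecture (buildings)"),
        ({"software", "computer", "system", "systems"}, "Computer architecture"),
    ],
    "bears": [
        ({"chicago", "nfl", "quarterback"}, "Chicago Bears"),
        ({"woods", "wildlife", "forest"}, "Bears (animals)"),
    ],
}


def suggest_terms(text):
    """Return terms whose trigger word and at least one context word occur in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    suggested = []
    for trigger, conditions in RULES.items():
        if trigger not in words:
            continue  # the rule only fires when its trigger word is present
        for context, term in conditions:
            if words & context:  # condition met: some context word is in the text
                suggested.append(term)
    return suggested
```

With this table, a sentence about the architecture of computer systems yields only the computing term, so a search for information architecture would not pull in building blueprints.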
I want it ALL!

   Rulebase filters data, yields ALL terms that
    meet conditions of M.A.I. rules
   Editor can select, reject and add terms
   Most specific appropriate term – as chosen
    by editor – is saved with the document
       Subject metadata
       XML format
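
As a rough illustration of saving the editor's chosen terms as subject metadata in XML, the sketch below uses Python's standard `xml.etree.ElementTree`. The element names (`document`, `subjects`, `term`), the `id` attribute, and the function name are assumptions for the example; the slides do not show the actual NewsIndexer output schema.

```python
import xml.etree.ElementTree as ET


def subject_metadata_xml(doc_id, terms):
    """Serialize the editor-approved terms for one document as an XML string.

    Element and attribute names are hypothetical, not the NewsIndexer schema.
    """
    root = ET.Element("document", id=doc_id)
    subjects = ET.SubElement(root, "subjects")
    for term in terms:
        ET.SubElement(subjects, "term").text = term
    return ET.tostring(root, encoding="unicode")


print(subject_metadata_xml("a1", ["Penicillin"]))
```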
[Diagram: suggested terms from many subject areas shown together: Red Sox, Crime, Baseball, Elections, Pharmaceuticals, Gun control, Health sciences, Medicine, Law, Antibiotics, Major League Baseball, Penicillin, Campaign finance, Politics]
[Diagram: taxonomy levels]
   Taxonomy Top Term: Health sciences
   2nd level: Health conditions, Medicine, Medical facilities
   3rd level: Pharmaceuticals (under Medicine)
   4th level: Antibiotics
   5th level: Penicillin
Filter to limit detail
   Want all terms or a select few?
   Roll up terms to the first, second, or third level
    in your taxonomy (up-posting)
   Good for automatic indexing
   Programmers can set filter to reduce detail
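
Up-posting can be sketched as a walk up the hierarchy until the term sits at or above a chosen level. The example below is a minimal sketch, assuming a simple child-to-parent table built from the Health sciences branch used in this deck; the `PARENT` structure and the function names are illustrative and are not how M.A.I. or Thesaurus Master store the vocabulary.

```python
# Child -> parent links for the Health sciences branch used in these slides.
# The dictionary layout is an assumption for illustration only.
PARENT = {
    "Health sciences": None,  # top term (level 1)
    "Health conditions": "Health sciences",
    "Medicine": "Health sciences",
    "Medical facilities": "Health sciences",
    "Pharmaceuticals": "Medicine",
    "Antibiotics": "Pharmaceuticals",
    "Penicillin": "Antibiotics",
}


def path_from_top(term):
    """Return the chain of terms from the top term down to `term`."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path[::-1]


def up_post(term, max_level):
    """Roll `term` up so it is no deeper than `max_level` (top term = level 1)."""
    path = path_from_top(term)
    return path[min(max_level, len(path)) - 1]


print(up_post("Penicillin", 2))  # Medicine
print(up_post("Penicillin", 1))  # Health sciences
```

A term that is already at or above the requested level is returned unchanged, so the filter only ever removes detail.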
[Diagram: terms Health sciences, Medicine, Pharmaceuticals, Antibiotics, Penicillin]
[Diagram: terms Health sciences, Medicine, Antibiotics, Penicillin, with Pharmaceuticals AND Antibiotics AND Penicillin combined]
[Diagram: up-posting within the taxonomy]
   Taxonomy Top Term: Health sciences
   2nd level: Health conditions, Medicine, Medical facilities
   3rd, 4th, and 5th levels: Pharmaceuticals, Antibiotics, Penicillin
   Up-post to third level
   Narrower terms go in Medicine bucket
No details –
just the big picture
   Index comprehensively and retain details
             BUT
   Display only general terms for end user

[Diagram: hierarchy Health sciences > Medicine > Pharmaceuticals > Antibiotics > Penicillin, labeled "Display higher level term" at the upper levels and "Index with most specific" at Penicillin]
[Diagram: terms Pharmaceuticals, Medicine, Antibiotics, Penicillin, with Health sciences AND Medicine AND Pharmaceuticals AND Antibiotics AND Penicillin combined]
[Diagram: up-posting to the top level]
   Penicillin, Antibiotics, Pharmaceuticals, Medicine, Health sciences
   Up-post to top level
   Narrower terms go in Health sciences bucket
Filter to direct data
   User expresses interest in general topics
       e.g., Technology, Environment, Law
   Materials indexed with those topics or any of
    their Narrower Terms are forwarded

              Applications:
                    User profiles
                    Interest groups
                    Specific departments
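
Forwarding by interest can be sketched as expanding each indexed term to include its broader terms, then matching that set against each profile. This is a minimal sketch under stated assumptions: the `PARENT` table reuses the Health sciences branch from this deck plus "Law" as a top term, and the profile names (`health_desk`, `legal_desk`) and function names are hypothetical.

```python
# Child -> parent links; layout and profile names are assumptions for illustration.
PARENT = {
    "Health sciences": None,
    "Medicine": "Health sciences",
    "Pharmaceuticals": "Medicine",
    "Antibiotics": "Pharmaceuticals",
    "Penicillin": "Antibiotics",
    "Law": None,
}


def with_broader(term):
    """Return `term` together with all of its broader terms."""
    found = set()
    while term is not None:
        found.add(term)
        term = PARENT.get(term)
    return found


def recipients(doc_terms, profiles):
    """Return the profiles whose general topics cover any term on the document."""
    expanded = set()
    for term in doc_terms:
        expanded |= with_broader(term)
    return sorted(name for name, interests in profiles.items() if interests & expanded)


profiles = {"health_desk": {"Health sciences"}, "legal_desk": {"Law"}}
print(recipients({"Penicillin"}, profiles))  # ['health_desk']
```

The document keeps its granular term (Penicillin); only the matching step looks at broader terms, so a user who asked for a general topic still receives material indexed at any narrower level.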
Specialized filtering –
NewsIndexer and IPTC
   International Press Telecommunications Council
    (IPTC) proposal for NewsCodes
   Part of News Industry Text Format (NITF)
   ~1300 terms describe topics of news articles
   Broad coverage (heavy on sports)

NewsIndexer rulebase can apply detailed
NewsIndexer terms and/or IPTC NewsCodes
 Comply with growing news standards
 Achieve greater detail for news indexing
[Diagram: NewsIndexer filtering workflow]
   News feed
   Thesaurus Master manages custom vocab
   M.A.I. adds metadata (vocab in TM)
   Cut noise, disambiguate
       RESULT: ALL terms that meet M.A.I. rule conditions
   Up-post to limit returns
       RESULT: Higher level categories, reduced data stream,
        for portal, targeted users, and other purposes
Filtering advantages
   For the End User
       Simpler, more manageable presentation of concepts
       Consistent with typical user’s search strategy
       Differentiated concepts associated with homographs
       Targeted information according to user profile
   For the Internal User
       Documents retain subject metadata reflecting
        granular indexing
       Precision search gets precision results
For more information and a live demo, visit

  www.newsindexer.com
