Data Harmony Version 3.9 Features Update


Marjorie M.K. Hlava, President and founder of Access Innovations, Inc., unveils the newest version and module updates of the Data Harmony indexing software suite.

Published in: Technology

  1. 1. Marjorie M.K. Hlava mhlava@accessinn.com Access Innovations, Inc. www.accessinn.com Leveraging your content semantically 10th Annual Data Harmony User Group Meeting
  2. 2. DH Technical Support Team • Development programming team • Lamine Idjeraoui ** • Allexander Lyons • Daniel Vasicek • Scott Roberts • Doug Vendcat • Customer support • Mary Garcia ** • Jack Bruce • Gabe Carr • Samantha Lewis • Documentation • Jack Bruce ** • Kirk Sanders • Gena San Nicolas • Barbara Gilles • Systems • Tom Peterson ** • SWCP
  3. 3. DH Customer Support Team • Sales and Licensing • Marjorie Hlava • Janice McIntyre • Bill Richardson • Jay Ven Eman ** • Leland Yates • Blog and Web team • Barbara Gilles • Melody Smith ** • Timothy Soholt ** • Marketing • Heather Kotula ** • Ashley Beard
  4. 4. Editorial Team: Taxonomy and Rule Building • Gabe Carr • Jack Bruce • Kathy Brown • Barbara Gilles • Bob Kasenchak ** • Samantha Lewis • Kirk Sanders • Tim Soholt • Gena San Nicolas • Alice Redmond-Neal • Eric Ziecker
  5. 5. Access Integrity • Kathy Brown • Jerry Jorgeson • John Kuranz ** • Leland Yates • Access Rule Building Team • Access Programming Team
  6. 6. Who’s Who? • Introduce yourself • Relationship to Data Harmony • Where do you use Data Harmony? • Project Name(s)
  7. 7. Access Innovations What do we do?
  8. 8. Four Divisions • Database Services • Data Harmony • NewsIndexer • National Information Center for Educational Media (NICEM) • MediaSleuth • Access Integrity • Medical Claims Compliance • Integracoder
  9. 9. Database Services • Database Design • Consulting • DTD / Metadata Schemas • Workflow Scheduling • Editorial Services • Metadata capture and creation • Tagging – XML, SGML • Abstracting • Indexing • Author disambiguation
  10. 10. Database Services - 2 • Taxonomy Construction • Thesaurus • Vocabulary • Ontology • Data Linking (linked data) • Authority Files – pick lists • Rule Bases • Semantic Enrichment • Data Format Conversion • Database Applications • Retrospective metadata tagging • Author disambiguation
  11. 11. Database Services - 3 • Applications development • Search – Lucene and Solr • Search Harmony interface • Web services layer • Link to user experience or user interface • Web calls • API setup and linking • www.accessinn.com
  12. 12. Data Harmony • Built for our use starting in 1987 • Visual Basic, C++, Java • Aid to the editorial and indexing processes • Alleviate the clerical aspects • Speed the tagging process • Guarantee accuracy, consistency, and depth of indexing
  13. 13. Data Harmony Suite – Main Modules • M.A.I. • Thesaurus Master • XIS • XML Intranet System • Administrative configuration module • “The Data Harmony Suite”
  14. 14. Tech stuff • Downloadable • Documentation revised 2014 • APIs for client server versions • Internet accessible Cloud and SaaS • Full multilingual display • Unicode - Accepts ASCII data • Entification tables converted • Drivers for display and print • For most languages
  15. 15. Data Harmony • Java • Platform independent • Applet modules • Web services • APIs • XML • TCP/IP • JSON and SSL on Web Start • GlassFish for extension support • www.dataharmony.com
  16. 16. Full multilingual display
  17. 17. Data Harmony • Machine Aided Indexing (M.A.I.) • Semantic, syntactic, morphological, etc. layer • Rule Builder for users • Concept Extractor for text • Statistics for Machine Learning • Use in automatic, batch, or assisted mode • Thesaurus Master • For creating taxonomies, thesauri, ontologies, and authority files • MAIstro • Thesaurus Master and M.A.I. combined
  18. 18. Data Harmony Extensions • Inline Tagging • Metadata Extractor • MAIChem • Search Harmony • SharePoint integration • Recommender
  19. 19. New • DH Author Submission System • Author / Name Disambiguation • MAIBatch GUI • Semantic Fingerprinting • Web Start • Sneak Peek at “Ontology Master”
  20. 20. Retiring • Automatic Summarizer • WebThes • ThesViewer
  21. 21. TaxoDiary • Daily blog • Weekly feature • 3+ items per day • Big archive • Launched in June 2010
  22. 22. DH Bulletin Board Exchange http://dhd.accessinn.com
  23. 23. Data Harmony Forum • Discussion threads • Solutions to reported problems • Access to the newest documentation • Announcements of features • Bug reports • Enhancement requests
  24. 24. Data Harmony Partners • EJ Press • MarkLogic • Really Strategies (RSuite) • Yuxi • Xquire • Publishing Technology • More…
  25. 25. Some DH Connectors & Exports… • ACD/Labs’ • Lucene (org. & Solr) • PerfectSearch • Oracle/Stellent Universal Content Management • Jive Software’s Clearspace • EJ Press • Publishing Technology • OpenOffice • MarkLogic’s MarkLogic Server • Microsoft’s SharePoint • North Plains • Temis • Synaptica • and more…
  26. 26. Other DH offerings • Off-the-shelf taxonomy • Term records • Browseable list • Rule bases • Consulting • Information architecture • DTD and schema creation • Search implementation
  27. 27. Knowledge Domains in over 40 subject areas. • Agriculture • Applied Technologies • Business (popular) • Business and Finance • Communications • Computer and Information Science (popular) • Computer Science • Consumer and Homemaking Education • Corporate Names • Counseling and Guidance • Economics • Education • Engineering • Environment • Geography (subject) • Geographical Place Names • Health and Safety • History • Language Arts • Languages • Literature and Drama • Mathematics • News • Occupations • Organizational Names • Personal Names • Physical Education and Recreation • Political Science • Psychology • Religion and Philosophy • Science (popular) • Science, Technology, and Medicine (STM) • Society • Sports • Technology • Visual and Performing Arts • US Industrial Codes (NAICS) • US Zip Codes and Places Go to TaxoBank for more!
  28. 28. NewsIndexer • Automatic indexing of newspapers • 8 topical areas • Maps to IPTC, NAICS, ICB, and GICS codes • Popular, automatic, and fast • Remote submission / ASP • 13 levels, filter to 3 • License and augment • www.newsindexer.com
  29. 29. National Information Center for Educational Media - NICEM • 667,000 records for non-print educational media • 23,000 producers and distributors • Based on school curriculum needs • Online and CD-ROMs • MARC cataloging • Thesaurus • Print • www.nicem.com
  30. 30. MediaSleuth • Online ordering of media from NICEM • Search Harmony implementation • Full e-commerce platform for ordering • Educational and popular materials • www.mediasleuth.com
  31. 31. Access Integrity, Inc. (AI2) • Medical Claims Compliance • Automatic ICD-9 suggestions • CPT rule base • HCPCS rule base • ICD-9 V 3 Hospitals • ICD-10 • Accurate, deep, consistent coding • Making medical billing efficient
  32. 32. Corporate Information • Closely held • Financed by • Sweat and Persistence • Good Cash Flow and Management • Since 1978 - 35 years in business • Marjorie M.K. Hlava • Jay Ven Eman • Joanna Ginter • www.accessinn.com • Woman Owned Small Business
  33. 33. UPDATE Data Harmony Users Group Meeting February 10-14, 2014
  34. 34. The 15 modules + extensions: What’s new • Admin Module • Author Submission System • Author / Name Disambiguation • Inline Tagging • Metadata Extractor • M.A.I. • MAIBatch GUI • MAIChem • Ontology Master • Thesaurus Master • Search Harmony • SharePoint • Recommender • Web Start • XIS
  35. 35. Data Harmony 2013 Stack Rule Base Term Key Record Concept Extractor Statistics Module Taxonomy Authority files All terms Alphabetic Permuted view XML (Extensible Markup Language) - Unicode Java Virtual Machine TCP/IP Transmission Control Protocol / Internet Protocol Native XML Content Creation Repository OWL Zthes SKOS XML MARC, etc. Administrative modules
  36. 36. Data Harmony 2014 Stack Rule Base Term Key Record Concept Extractor Statistics Module Taxonomy Authority files All terms Alphabetic Permuted view XML (Extensible Markup Language) - Unicode Java Virtual Machine TCP/IP Transmission Control Protocol / Internet Protocol Native XML Content Creation Repository OWL Zthes SKOS XML MARC, etc. Administrative modules
  37. 37. Admin Module • Configuration of Thesaurus Master, M.A.I., MAIstro • Separate Admin Module for XIS • MAIBatch added to MAIstro Admin Module
  38. 38. Author Submission Module: The author pastes the data into the document template, attaching images, graphs, etc. as necessary. Copyright © 2013 Access Innovations, Inc.
  39. 39. Author Submission Module: The author enters the data into the document template, attaching images and graphs as necessary. An API calls Data Harmony and generates a list of indexing terms based on the content.
  40. 40. Author Submission Module: Authors review the indexing and may change it. Content is stored in a data repository as HTML, XML, etc.
  41. 41. DH Author Submission System • Leveraging Records Management with Documentum, Author Submission, and MAIstro, Marjorie M.K. Hlava and Leland Yates, Access Innovations, Inc.
  42. 42. Admin Module
  43. 43. DH Author Submission System
  44. 44. Author Submission System Configuration Module • Configure any field • Index on any field • XML or XHTML • Link to the CMS
  45. 45. Author Disambiguation • Build a file of authors • Name: first, second, surname • DOIs published • Publication rank (first author, etc.) • Keywords for those DOIs • Affiliation(s) • Location(s): city, state, country, etc. • Co-authors (inferred by DOI) • Etc.
  46. 46. Affiliation Disambiguation • Build a file of affiliations • Name • Lab, institute, etc. name • DOI • Location • Full address • Keywords • Etc.
  47. 47. Author Disambiguation • Link the two databases • Build a web service to accept files • Auto-disambiguate incoming files • Review new or non-match to ensure accuracy • Leveraging Semantic Fingerprinting for Building Author Networks, Bob Kasenchak, Wednesday @ 9:30 AM
  48. 48. Inline Tagging • Full text tagging • Send search query directly to the place in the document where the concept is mentioned. • Flexible in XML and HTML views • Inline Tagging and Dictionary Connection, Gena San Nicolas, Wednesday @ 2:15
  49. 49. Inline tagging Web service • Use M.A.I. to put terms in context for high-precision indexing
  50. 50. Inline Tagging • Shows the exact point where the concept is mentioned • Mouse over to view the term record • Statistical summary, showing the number of times each term is mentioned in the article
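
The two ideas on this slide, locating each mention and summarizing how often each term occurs, can be illustrated with a minimal Python sketch. It uses plain case-insensitive string matching, not M.A.I.'s rule-based matching, and the function names, sample text, and term list are all hypothetical.

```python
import re
from collections import Counter


def find_mentions(text, terms):
    """Return (character offset, term) pairs for every mention, in text order."""
    mentions = []
    for term in terms:
        # Whole-word, case-insensitive match; a stand-in for real indexing rules.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            mentions.append((match.start(), term))
    return sorted(mentions)


def mention_counts(mentions):
    """Statistical summary: how many times each term is mentioned."""
    return Counter(term for _, term in mentions)


# Hypothetical article text and thesaurus terms, for illustration only.
text = "Indexing improves search. Automatic indexing uses a taxonomy."
terms = ["indexing", "taxonomy"]

mentions = find_mentions(text, terms)
counts = mention_counts(mentions)
```

The offsets in `mentions` are what a viewer would need to jump to the exact point of a mention; `counts` corresponds to the per-term summary.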
  51. 51. XML View for Inline Tagging Copyright © 2013 Access Innovations, Inc.
  52. 52. Metadata Extractor • Automatic creation from PDF digital layer • Position training needed • Dublin Core metadata • Bibliographic citation created • Automatic summarization added • Uses M.A.I. on full text • Can be linked to Author Disambiguation
  53. 53. Input file
  54. 54. Source file PDF digital layer
  55. 55. Metadata Extractor Full Record Display
  56. 56. Output in XML
  57. 57. Or use with HTML pages: <document> <title>Access Innovations - Knowledge Management Professionals</title> <document-type>Web Page</document-type> <copyright>© 2007 Access Innovations, Inc.</copyright> <address> <street>131 Adams NE</street> <city>Albuquerque</city> <state>New Mexico</state> </address> <subject-terms> <term>Data Harmony</term> <term>Indexing</term> <term>Taxonomies</term> </subject-terms> </document>
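
As a rough illustration of how a downstream application might consume a record like the one on this slide, the sketch below reads it with Python's standard `xml.etree.ElementTree`. The element names come from the slide; the `read_record` helper and the choice of fields to pull out are assumptions, not part of Data Harmony.

```python
import xml.etree.ElementTree as ET

# The record from the slide, reproduced with indentation for readability.
RECORD = """<document>
  <title>Access Innovations - Knowledge Management Professionals</title>
  <document-type>Web Page</document-type>
  <copyright>© 2007 Access Innovations, Inc.</copyright>
  <address>
    <street>131 Adams NE</street>
    <city>Albuquerque</city>
    <state>New Mexico</state>
  </address>
  <subject-terms>
    <term>Data Harmony</term>
    <term>Indexing</term>
    <term>Taxonomies</term>
  </subject-terms>
</document>"""


def read_record(xml_text):
    """Pull a few fields out of an extracted-metadata record (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "document_type": root.findtext("document-type"),
        "city": root.findtext("address/city"),
        "subject_terms": [t.text for t in root.findall("subject-terms/term")],
    }


record = read_record(RECORD)
```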
  58. 58. M.A.I. • M.A.I. is used to describe or categorize items by matching text to controlled vocabulary terms • Rule Builder • Concept Extractor • Statistics Collector • Test MAI
  59. 59. M.A.I. 2014 • Find in Test MAI • Export Fields function • Expanded warning and information labels • Expanded print functions • Rule error details • Emphasis tags • MAIBatch GUI
  60. 60. Find Function In Test MAI
  61. 61. Export with fields selection
  62. 62. Expanded warning and information labels • Delete term warning
  63. 63. Term warnings • Term with multiple Broader Terms warning • Remove relationship warning message
  64. 64. Move term functions Move a single term
  65. 65. Expanded print functions
  66. 66. Test the syntax of a rule
  67. 67. View information about a thesaurus term
  68. 68. MAIBatch GUI
  69. 69. MAIBatch input format • PDF • XML, nXML • Web content (HTML, HTM) • Plain text (TXT), rich text (RTF) • MS Word documents (DOC, DOCX)
  70. 70. Full window with suggested AND used terms
  71. 71. Select all or just some files to process
  72. 72. MAIBatch XML • Add Custom tags • Click on “XML tags” in the Settings menu.
  73. 73. MAIBatch - Adding files, viewing results • Upload File/Directory • Row of asterisks separates each document • File path of a document • Suggested thesaurus terms
  74. 74. Log Statistics • From source data to compare accuracy • By human editors assigning values • HIT • MISS • NOISE • From source file data: <USEDTERMS> <TERM>Term 1</TERM> <TERM>Term 2</TERM> </USEDTERMS>
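
One way to read the three values is as set comparisons between the terms the indexer suggested and the terms actually used in the source record. The sketch below assumes the usual interpretation (HIT: suggested and used; MISS: used but not suggested; NOISE: suggested but not used), which the slide itself does not spell out. The function names and the suggested-term list are hypothetical; only the `<USEDTERMS>` structure comes from the slide.

```python
import xml.etree.ElementTree as ET


def used_terms(xml_text):
    """Collect the terms listed in a <USEDTERMS> block of a source record."""
    root = ET.fromstring(xml_text)
    return {t.text.strip() for t in root.iter("TERM") if t.text}


def score(suggested, used):
    """Compare suggested terms with used terms (assumed HIT/MISS/NOISE meaning)."""
    suggested, used = set(suggested), set(used)
    return {
        "HIT": sorted(suggested & used),    # suggested and used
        "MISS": sorted(used - suggested),   # used but never suggested
        "NOISE": sorted(suggested - used),  # suggested but not used
    }


SOURCE = "<USEDTERMS><TERM>Term 1</TERM><TERM>Term 2</TERM></USEDTERMS>"

# Hypothetical indexer output for the same document.
result = score(["Term 2", "Term 3"], used_terms(SOURCE))
```

Under these assumptions, `result` reports "Term 2" as a HIT, "Term 1" as a MISS, and "Term 3" as NOISE.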
  75. 75. M.A.I. Statistics Module
  76. 76. Exporting MAIBatch results • Save as .txt file through export menu • Save to Log Spreadsheet .xls
  77. 77. MAIChem • Dictionaries • Full terms • Beginners • Enders • M.A.I. Concept Extractor • Links to graphical displays
  78. 78. Ontology Master • Sneak Peek • Built on Thesaurus Master • Full OWL and SKOS exports • Full directional relationships • Same extensive functionality • Bob Kasenchak – Wednesday @ 1:15 PM
  79. 79. Recommender
  80. 80. More Like This - Recommender
  81. 81. Search Harmony • Built to leverage semantically enriched text • Uses the thesaurus sections • BT-NT relationships for taxonomy tree • Type ahead from tab, permuted index • Related terms • Narrower terms
  82. 82. Copyright © 2005 - Access Innovations, Inc. Taxonomy view Thesaurus Term Record view
  83. 83. Search Presentation Layer Automatic completion and type ahead from thesaurus
  84. 84. Search Presentation Layer Related Narrower
  85. 85. Search Presentation Layer: The hierarchical view of the thesaurus is also a browseable view of the content. The numbers show the number of hits (1) for the term and (2) for the branch.
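
The distinction between a term count and a branch count can be sketched as follows: the term count is the number of documents indexed with that term, and the branch count also includes documents indexed with any of its narrower terms. This is a minimal illustration only; the sample hierarchy, document IDs, and function name are hypothetical, and the sketch assumes the hierarchy contains no cycles.

```python
def branch_docs(narrower, term_docs, term):
    """Documents indexed with `term` or with any narrower term beneath it."""
    docs = set(term_docs.get(term, ()))
    for child in narrower.get(term, ()):
        docs |= branch_docs(narrower, term_docs, child)
    return docs


# Hypothetical BT-NT hierarchy: each term maps to its narrower terms.
narrower = {
    "Science": ["Physics", "Chemistry"],
    "Physics": ["Optics"],
}

# Hypothetical index: each term maps to the documents tagged with it.
term_docs = {
    "Science": {"d1"},
    "Physics": {"d1", "d2"},
    "Optics": {"d3"},
    "Chemistry": {"d4"},
}

term_count = len(term_docs["Science"])                           # hits for the term
branch_count = len(branch_docs(narrower, term_docs, "Science"))  # hits for the branch
```

Using sets of document IDs rather than adding raw counts avoids counting a document twice when it is tagged with both a broader term and one of its narrower terms.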
  86. 86. Semantic Fingerprinting • People / Authors • Articles • Medical records • Organizations and affiliations • Point ads to users • Related to author disambiguation
  87. 87. Fully integrated SharePoint [Data Harmony fully integrated with MOSS.] Diagram labels: Thesaurus Master • Machine Aided Indexer (M.A.I.™) • Repository • Search Presentation: 90% accuracy • Browse by Subject • Auto-completion • Broader Terms • Narrower Terms • Related Terms • Client Taxonomy • Inline Tagging • Metadata and Entity Extractor • Automatic Summarization • Search Software • Client Data • Full Text HTML, PDF, Data Feeds, etc. Copyright © 2013 Access Innovations, Inc.
  88. 88. Select term store management located under Site Administration. Edit term sets to accurately reflect your document libraries and content types. Term sets can be individual taxonomies or flat controlled vocabulary lists.
  89. 89. Thesaurus Master - 2014 • Built for vocabulary control • Taxonomy • Thesaurus • Entities • Full standards compliance • ISO 25964 Parts 1 and 2 • NISO Z39.19 – 2010
  90. 90. Emphasis Is Available for Preferred Terms • bold, italics, or underline • Term with emphasized words • Term with enriched words • Change Term dialog with enhancement buttons
  91. 91. XML Emphasis Export
  92. 92. Full Path Export • Data Harmony Custom Features as Implemented for Triumph Learning • Kirk Sanders, Wednesday @ 11:00 • Emphasis • Full path export
  93. 93. Thesaurus Master 2014 • Emphasis tags – more • Wednesday @ 11:00 • Data Harmony Custom Features as Implemented for Triumph Learning, Kirk Sanders, Access Innovations, Inc.
  94. 94. Pattern analysis Domain associations
  95. 95. Pattern analysis Component gaps
  96. 96. Web Start • Replacing WebThes and ThesViewer • Allows auto-start from the browser • Full featured • Password access control • Everything from view only to full access
  97. 97. V
  98. 98. XIS • A XIS project consists of the following: • Folders that XIS uses. These are the “project folders.” • A schema (configuration file) called projects.MyProject.xml. • A XIS DTD, called “projects.dtd.”
  99. 99. XIS links to Thesaurus Master and M.A.I.
  100. 100. XIS and Lucene • Search within a search (recursive search) • New Lucene search • Using Lucene for Search within XIS, Allexander Lyons, Wednesday @ 11:45
  101. 101. DHUG 2015 • Albuquerque • February 16 – 20 • Call for papers is now open • Ideas for what to do better and differently VERY welcome
  102. 102. We Apply Imagination • Keep the System Flexible • Make the Applications Fun • Thank you! Marjorie M.K. Hlava, President, Access Innovations • 505-998-0800 • mhlava@accessinn.com
