XIS Lucene Indexing and Search
What is XIS?
• XIS is an XML schema-based database system used to store user data
• All records are stored in individual XML files
• Option to zip XML files is available through the XIS Project DTD
How XIS Data Is Stored
• Docsets
    ◦ Store records with multiple fields (similar to an SQL table)
    ◦ Can also have subfields and lists of field values nested within a record
    ◦ Can look up values from other fields in other Docsets or other Tables
• Tables
    ◦ Store a single list of values
    ◦ Can be referenced by other Docsets
    ◦ Can be directly accessible for editing or kept hidden from user view
How to Create a XIS Project
• Create DTD file for XIS Project
    ◦ Specify MAI Thesaurus to link to project
    ◦ Create Docsets and Tables
    ◦ Specify ID lengths for each Docset
    ◦ Create fields for Docsets
• Save DTD to dhserver/projects/projects/xml folder
• Create XIS Project folder under dhserver/data
• Create subfolders for each Docset under XIS Project folder as well as Tables directory
• XIS Projects can only be created by administrators
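The folder steps above can be sketched in Java. This is only an illustration of the layout the slide describes, not a Data Harmony tool: the project name, the Docset names, and the exact directory name `Tables` are assumptions, and the DTD itself still has to be written and saved by an administrator.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class XisProjectLayoutSketch {

    /**
     * Creates the folders named on the slide under a given dhserver root.
     * "project" and "docsets" are hypothetical values supplied by the caller;
     * the directory name "Tables" is an assumption.
     */
    static Path createLayout(Path dhserver, String project, List<String> docsets) {
        try {
            // Folder where the project DTD is saved.
            Files.createDirectories(dhserver.resolve("projects/projects/xml"));
            // Project folder under dhserver/data.
            Path projectDir = dhserver.resolve("data").resolve(project);
            // One subfolder per Docset.
            for (String docset : docsets) {
                Files.createDirectories(projectDir.resolve(docset));
            }
            // Plus a directory for the Tables.
            Files.createDirectories(projectDir.resolve("Tables"));
            return projectDir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Use a scratch location instead of a real server installation.
        Path root = Paths.get(System.getProperty("java.io.tmpdir"), "dhserver-sketch");
        Path projectDir = createLayout(root, "DemoProject", Arrays.asList("Articles", "Authors"));
        System.out.println("Created " + projectDir);
    }
}
```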
Starting a XIS Project
• Start Data Harmony server where project is located
• Log in to Admin module
• Start MAI Thesaurus
• Start XIS Project
• Index XIS Project, especially if just created
• Run startXis program
• Enter server, port, thesaurus, username, and password to log in
Indexing a XIS Project
XIS Login Screen
XIS Project View
XIS Docset View
XIS Table View
XIS Record Format
• Saved in XML file
• Starts with tag to represent Docset name along with ID as attribute
• Fields are listed within Docset tag along with values. Subfields are nested within their parent fields
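As a rough illustration of the record format described above, the sketch below parses a made-up record with the JDK's DOM parser. The Docset name `Article`, the attribute name `ID`, and all field names are hypothetical; a real XIS record follows whatever the project DTD defines.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XisRecordSketch {

    // Hypothetical record: root tag = Docset name, ID as attribute,
    // fields inside, subfields nested within their parent field.
    static final String SAMPLE =
          "<Article ID=\"000123\">"
        + "<Title>Using Lucene for Search</Title>"
        + "<Author><Name>J. Doe</Name><Affiliation>Example Org</Affiliation></Author>"
        + "</Article>";

    /** Flattens a record into path -> value pairs, e.g. "Author/Name" -> "J. Doe". */
    static Map<String, String> flatten(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            Map<String, String> out = new LinkedHashMap<>();
            out.put("@docset", root.getTagName());
            out.put("@id", root.getAttribute("ID"));
            walk(root, "", out);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed record", e);
        }
    }

    // Recurses into subfields; leaf fields are stored with their full path.
    // Repeated fields would overwrite each other in this simple map.
    private static void walk(Element parent, String prefix, Map<String, String> out) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n.getNodeType() != Node.ELEMENT_NODE) continue;
            Element e = (Element) n;
            String path = prefix.isEmpty() ? e.getTagName() : prefix + "/" + e.getTagName();
            if (hasElementChildren(e)) walk(e, path, out);
            else out.put(path, e.getTextContent().trim());
        }
    }

    private static boolean hasElementChildren(Element e) {
        NodeList children = e.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        flatten(SAMPLE).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Flattened paths such as `Author/Name` are one possible way to turn nested subfields into index field names; the slides do not say how XIS names them.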
XIS Search View
XIS Search Results
Current XIS Indexing and Search
• Uses text-based indexes
• Creates a large number of index files (one for each field)
• Generates temporary files for results
• Uses less reliable RandomAccessFile search
• Has a limited number of search operands
• Does not take numerical values into account
Lucene vs. Current XIS Index
• Fewer index files needed
• Allows for broader searches
    ◦ Fuzzy matching
    ◦ Start and end wildcard searches
• Recognizes numerical and date fields as such
• Can be utilized to remove stopwords
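The point about numerical fields can be illustrated without Lucene. The sketch below shows, in plain Java, what generally goes wrong when numbers are compared as text; it is not the XIS code, and the sample values are invented. Lucene avoids the problem with dedicated numeric field types such as `IntPoint` and `LongPoint`, which support true numeric range queries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NumericVsTextSketch {

    /** Sorts values the way a text-only index would: character by character. */
    static List<String> sortAsText(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    /** Sorts the same values by their numeric meaning. */
    static List<String> sortAsNumbers(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.comparingInt(Integer::parseInt));
        return copy;
    }

    /** Range check using string comparison only. */
    static boolean inRangeAsText(String value, String low, String high) {
        return value.compareTo(low) >= 0 && value.compareTo(high) <= 0;
    }

    /** Range check using parsed integers. */
    static boolean inRangeAsNumber(String value, String low, String high) {
        int v = Integer.parseInt(value);
        return v >= Integer.parseInt(low) && v <= Integer.parseInt(high);
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("9", "10", "100", "25");
        System.out.println("As text:    " + sortAsText(values));
        System.out.println("As numbers: " + sortAsNumbers(values));
        // Is 9 between 5 and 50? The text comparison gets this wrong.
        System.out.println("Text range:    " + inRangeAsText("9", "5", "50"));
        System.out.println("Numeric range: " + inRangeAsNumber("9", "5", "50"));
    }
}
```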
New Lucene Search Process
• Establish index reader to perform search
• Submit query string containing fields and parameters
• Return results
Other Lucene Functions
• Will be used for adding, updating, and deleting XIS records
• Indexes will be housed on Data Harmony server
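A minimal sketch of how the search process and the add, update, and delete operations might look with the Lucene Java API. It assumes a recent Lucene release (8 or later) with the `lucene-core` and `lucene-queryparser` modules on the classpath. The field names `id` and `title`, the sample records, and the in-memory directory are assumptions made for the demo, not the XIS implementation.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class XisLuceneSketch {

    // Builds one index document per record (hypothetical fields).
    static Document record(String id, String title) {
        Document doc = new Document();
        doc.add(new StringField("id", id, Field.Store.YES));     // exact-match key
        doc.add(new TextField("title", title, Field.Store.YES)); // analyzed text
        return doc;
    }

    // Establish an index reader, submit a query string, return the hit count.
    static int countHits(Directory dir, String queryString) throws Exception {
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            QueryParser parser = new QueryParser("title", new StandardAnalyzer());
            TopDocs top = searcher.search(parser.parse(queryString), 10);
            return top.scoreDocs.length;
        }
    }

    // Runs add, update, delete, then three searches against the result.
    static int[] demo() {
        try {
            Directory dir = new ByteBuffersDirectory(); // in-memory, for the demo only
            IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
            try (IndexWriter writer = new IndexWriter(dir, config)) {
                writer.addDocument(record("000001", "Lucene indexing"));
                writer.addDocument(record("000002", "Thesaurus management"));
                // Update = delete by key term, then add the new version.
                writer.updateDocument(new Term("id", "000002"),
                        record("000002", "Thesaurus search"));
                writer.deleteDocuments(new Term("id", "000001"));
                writer.commit();
            }
            return new int[] {
                countHits(dir, "title:search"),     // plain term query
                countHits(dir, "title:thesarus~"),  // fuzzy query on a misspelling
                countHits(dir, "title:lucene")      // targets the deleted record
            };
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        for (int hits : demo()) System.out.println(hits);
    }
}
```

For an index housed on a server's disk, `FSDirectory.open(path)` would take the place of the in-memory directory. Leading wildcards such as `*term` are rejected by the classic `QueryParser` unless `setAllowLeadingWildcard(true)` is called first.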
Any Questions?

Using Lucene for Search within XIS
