1. The document summarizes new features in Oracle Text 11g and the roadmap for Oracle's search products, including Oracle Text and Secure Enterprise Search.
2. Key new features in Oracle Text 11g include composite domain indexes, automatic language recognition with context-sensitive stemming, and offline index creation. Oracle Text 11.2.0.2 introduces entity extraction, name search, and a result set interface that returns XML results.
3. The roadmap discusses merging Oracle Text and Secure Enterprise Search and bringing additional natural language processing, partitioning, faceted navigation, and performance improvements to Oracle's search products.
2. Oracle Database 11g New Search Features and Roadmap
Roger Ford
Senior Principal Product Manager
3. Contents
• Oracle’s Search Products
• Oracle Text 11g New Features
• Oracle Text 11.2.0.2 New Features
– Entity Extraction
– Name Search
– Result Set Interface
• Search Product Roadmap
– Oracle Text
– Secure Enterprise Search
4. Oracle’s Search Products
• Oracle Text
– A SQL and PL/SQL based toolkit for creating full-text search applications
– Free with all database versions
– Previously known as Context Option, interMedia Text
• Secure Enterprise Search
– A complete search based on Oracle Text capabilities
– Crawlers for datasources such as web, email, document repositories, databases
– End-user query application and APIs for embedding
5. Oracle Text 11g New Features
• Composite Domain Indexes and SDATA sections
– Allows storage of structured info (eg numbers, dates) within text index
– Makes for much faster “mixed” queries
• Auto Lexer
– Automatic Language Recognition
– Segmentation and Stemming for 32 languages
– Context-sensitive stemming for 23 of these languages
• Off-line and time-limited index creation
– Enables rebuild of indexes offline in quiet periods for true 24x7 operation
7. 11.2.0.2 New Features - Summary
1. Entity Extraction
– Find “entities” such as people, countries, cities, states, zip codes, phone numbers etc from the text
– Use default dictionary and rules or define your own dictionary and rules based on regular expressions
2. Name Search (NDATA sections)
– Inexact searches; copes with misspellings, segmentation errors, contractions and word reversal
– Useful for many searches, but particularly good for names
3. ResultSet Interface
– Query request in XML and results returned as XML
– Avoids SQL layer and requirement to work within “SELECT” semantics
8. Entity Extraction
• Identify names, places, dates, times, etc
• Tag each occurrence with type and subtype
• Entities are defined by DICTIONARY and RULES
• Implemented by CTX_ENTITY package
– create_extract_policy – create a policy to which you can add extract rules
• Choose to use/not use built in rules and dictionary
– add_extract_rule – create an XML-based rule to define an entity
– add_stop_entity – prevent defined entities from being used
– compile – build the policy with its rules
– extract – get an XML-based list of entities for a doc
• Also can use ctxload to load user dictionary
12. Entity Extraction – Example 2: User rule
ctx_entity.create_extract_policy('mypolicy');
ctx_entity.add_extract_rule('mypolicy', 5,
'<rule>
<expression>((North|South)? America)</expression>
<type refid="1">xContinent</type>
</rule>');
ctx_entity.compile('mypolicy');
ctx_entity.extract('mypolicy', mydoc, mylang, myresults);
• Note the parentheses around the expression. refid="1" means take the first parenthesized expression, so "North America" or just "America".
• User-defined types must be prefixed with an "x", hence "xContinent"
<entities>
<entity id="0" offset="75" length="13" source="UserRule">
<text>North America</text>
<type>xContinent</type>
</entity>
</entities>
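To make the refid and offset/length fields concrete, here is a minimal Python sketch that applies the same regular expression with the standard `re` module. It is only an illustration of how a capture group maps to an entity record, not how CTX_ENTITY is implemented; the function name `extract_entities` and the dictionary layout are assumptions made for this example.

```python
import re

# Pattern copied from the user rule above; group 1 is the outer parenthesis.
RULE = re.compile(r"((North|South)? America)")

def extract_entities(text, rule=RULE, refid=1, entity_type="xContinent"):
    """Return one record per match, mirroring the fields of the XML output.

    Hypothetical helper for illustration only (not an Oracle API).
    """
    entities = []
    for i, m in enumerate(rule.finditer(text)):
        entities.append({
            "id": i,
            "offset": m.start(refid),                  # start of the chosen group
            "length": m.end(refid) - m.start(refid),   # length of the chosen group
            "text": m.group(refid),
            "type": entity_type,
        })
    return entities

if __name__ == "__main__":
    for entity in extract_entities("Sales grew in North America last year."):
        print(entity)
```

One detail this surfaces: with Python's regex engine, a bare "America" is matched together with the space in front of it, because the space sits outside the optional group. Whether Oracle's rule engine behaves the same way is not stated in the slides.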
13. Ent Ext: Adding a user dictionary
• Create file ud.xml:
<dictionary> <entities>
<entity> <value>Dow Jones Industrial Average</value> <type>xIndex</type> </entity>
<entity> <value>S&amp;P 500</value> <type>xIndex</type> </entity>
</entities> </dictionary>
• Create the policy with CTXLOAD (can add rules later)
ctxload -user scott/tiger -extract -name pol1 -file ud.xml
• Compile the policy
ctx_entity.compile('pol1');
• Results
<entity id="69" offset="1010" length="7" source="UserDictionary">
<text>S&P 500</text>
<type>xIndex</type>
</entity>
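Before loading a dictionary file, it can help to check that it is well-formed XML. The sketch below does that with Python's standard library and then runs a naive exact-match lookup to show how a dictionary entry could lead to the offset and length seen in the result. The dictionary content is written as well-formed XML, so the ampersand is escaped and the entities element is properly closed. The helper names `load_dictionary` and `find_entities` are hypothetical, and the lookup is a plain substring search, not Oracle's matching logic.

```python
import xml.etree.ElementTree as ET

# Same entries as ud.xml, written as well-formed XML.
UD_XML = """<dictionary>
  <entities>
    <entity><value>Dow Jones Industrial Average</value><type>xIndex</type></entity>
    <entity><value>S&amp;P 500</value><type>xIndex</type></entity>
  </entities>
</dictionary>"""

def load_dictionary(xml_text):
    """Parse the dictionary; ET.fromstring raises ParseError if it is malformed."""
    root = ET.fromstring(xml_text)
    return {e.findtext("value"): e.findtext("type") for e in root.iter("entity")}

def find_entities(text, dictionary):
    """Naive exact-match lookup returning (offset, length, text, type) tuples."""
    hits = []
    for value, entity_type in dictionary.items():
        start = text.find(value)
        while start != -1:
            hits.append((start, len(value), value, entity_type))
            start = text.find(value, start + 1)
    return sorted(hits)

if __name__ == "__main__":
    dictionary = load_dictionary(UD_XML)
    print(find_entities("The S&P 500 rose.", dictionary))
```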
14. Entity Extraction – other stuff
• Extracting only certain entity types:
– ctx_entity.extract('p1', mydoc, null, myresults,
'city,company,xContinent');
15. Name Search
• Searching names has many difficulties
– Spelling (steven = stephen)
– Alternate Names (fred = alfred, chuck = charles)
– Transcription (copying from spoken to written form)
– Transliteration (copying from one writing system to another)
– Segmentation (Mary Jane, Maryjane)
– First, Middle, and Last Name Classification
• Name search does intelligent matching across all
these issues
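As a rough illustration of why these cases need more than exact matching, the following Python sketch combines an alternate-name table, order-insensitive comparison, and an edit-based similarity score from the standard `difflib` module. This is not the algorithm Oracle Text uses. The `ALTERNATES` table, the function names, and the scoring approach are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical alternate-name table built from the pairs listed above.
ALTERNATES = {"fred": "alfred", "chuck": "charles", "stephen": "steven"}

def tokens(name):
    """Lower-case the name, split on whitespace, and map alternates to one form."""
    return [ALTERNATES.get(t, t) for t in name.lower().split()]

def name_similarity(a, b):
    """Score in [0, 1] that tolerates misspelling, reordering and segmentation."""
    ta, tb = tokens(a), tokens(b)
    # Joining without spaces makes "Mary Jane" and "Maryjane" identical.
    in_order = SequenceMatcher(None, "".join(ta), "".join(tb)).ratio()
    # Sorting tokens first makes "Jane Mary" and "Mary Jane" identical.
    any_order = SequenceMatcher(None, "".join(sorted(ta)), "".join(sorted(tb))).ratio()
    return max(in_order, any_order)

if __name__ == "__main__":
    for pair in [("Mary Jane", "Maryjane"), ("Chuck Berry", "Charles Berry"),
                 ("Jon Smith", "John Smith"), ("Alice Jones", "Bob Smith")]:
        print(pair, round(name_similarity(*pair), 2))
```

A real implementation would also need to handle transcription and transliteration, which this sketch does not attempt.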
17. NDATA section type
• Basic implementation for name search
• Limitations
– 511 characters
– 255 whitespace-delimited terms
– No offset information, therefore no:
• Highlighting / Markup
• NEAR or phrase search with NDATA
• Uses WORDLIST preference attributes:
– NDATA_ALTERNATE_SPELLING
– NDATA_BASE_LETTER
– NDATA_THESAURUS (for alternate names – default thesaurus provided)
– NDATA_JOIN_PARTICLES (list such as 'de:du:mc:mac')
• Query Syntax
– NDATA(fieldname, search terms [, order [, proximity ] ] )
18. Result Set Interface
• Some queries are difficult to express in SQL:
– eg "Give me the top 5 hits in each category"
• Result set interface uses a simple text query and an XML result set descriptor
• Hitlist is returned in XML according to result set descriptor
• Uses SDATA sections for
– Grouping
– Counting
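The "top 5 hits in each category" request combines grouping, counting and a per-group limit. The Python sketch below shows that shape of result on an in-memory list of hits. It only illustrates the output such a query is after; it says nothing about how the result set interface computes it. The function name and the hit fields (`score` plus a grouping field) are assumptions for this example.

```python
from collections import defaultdict

def top_hits_per_group(hits, group_key, n=5):
    """Group hits by one field, count each group, and keep the n best by score.

    `hits` is an iterable of dicts that each carry a 'score' and the grouping
    field (hypothetical layout, for illustration only).
    """
    groups = defaultdict(list)
    for hit in hits:
        groups[hit[group_key]].append(hit)
    return {
        key: {
            "count": len(members),
            "top": sorted(members, key=lambda h: h["score"], reverse=True)[:n],
        }
        for key, members in groups.items()
    }

if __name__ == "__main__":
    sample = [
        {"id": 1, "category": "news", "score": 10},
        {"id": 2, "category": "news", "score": 30},
        {"id": 3, "category": "blogs", "score": 5},
        {"id": 4, "category": "news", "score": 20},
    ]
    print(top_hits_per_group(sample, "category", n=2))
```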
23. Roadmap – merging Text and SES
Oracle Text: Full Control
• Fine-grained Index Options
• Data Storage Options
• Lexer Options
• Stoplists
• Use existing database
• RAC, Exadata
Secure Enterprise Search: Full Featured
• Built in database and mid-tier
• Crawlers for many sources
• Simple Query Interface
• End user GUI / API
• Embedded security
24. Coming Search Features
• Natural Language Processing enhancements
– Ontology based classification
– Question answering
• Automatic Partitioning
– Query load balancing
• Full support for faceted navigation (MVDATA sections)
• Functional completeness for Result Set Interface
– Result Iterator – streaming support
– Parallel Query
• Replication Support
– Golden Gate / Logical Standby / Streams
• Operator improvements
– NEAR2 – best query in one operator
– MNOT – mild not, eg YORK mnot NEW YORK
– Nested near
• Substring index and query performance improvements
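The "mild not" example above (YORK mnot NEW YORK) can be read as: match a document on YORK only if YORK appears somewhere other than inside NEW YORK. The Python sketch below implements that reading on a single string. It is an interpretation of the slide's one-line example, not Oracle's definition of the operator, and `mild_not` is a hypothetical helper name.

```python
import re

def mild_not(text, term, phrase):
    """Return True if `term` occurs at least once outside of `phrase`.

    Interpretation of the slide's example only, not Oracle's operator.
    """
    # Blank out every occurrence of the longer phrase first ...
    masked = re.sub(re.escape(phrase), " ", text, flags=re.IGNORECASE)
    # ... then look for the term as a whole word in what remains.
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, masked, flags=re.IGNORECASE) is not None

if __name__ == "__main__":
    print(mild_not("I flew to New York", "york", "new york"))
    print(mild_not("York is in England; New York is not", "york", "new york"))
```

A plain NOT would reject the second document because it contains NEW YORK; the mild form keeps it because YORK also occurs on its own.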
25. Coming Search Features - Continued
• Multiple enhancements to query performance
– BIGIO leverages Secure Files CLOBs
– Automatic optimization of indexes with “stage index”
– Two level index – keep common search terms in memory
• Partition maintenance without reindexing
• Off-load filtering from database server
• Section specific index options
– Choose different options, eg language, stopwords, PRINTJOINS for each section
• Regular expression based stopwords
• Forward Index
– Hugely improved performance for highlighting, snippets
• PDF “Native” Highlighting
• Unlimited SDATA, MDATA and Field Sections
26. The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions.
The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.