Sphinx is a fulltext search engine that provides more advanced indexing and querying capabilities than MySQL fulltext search. It uses an inverted index for fast searching and supports various ranking factors, search operators, and morphology tools. Sphinx can be easily integrated with MySQL for indexing and querying via SphinxQL.
Fulltext engine for non fulltext searches
Adrian Nuta
Or, better said: how Sphinx can help MySQL with queries that at first glance do not involve any fulltext searching.
Sphinx was built to help the database with fulltext queries, but it can also help where there is no text search at all: everyday queries that combine filtering, grouping and sorting for analytics, reporting or simply general usage.
In Sphinx, the fulltext query is executed first, creating a result set that is passed to the remaining operations (filters, groups, sorts). Reducing the size of the set that is interrogated makes the whole query not only faster but also less resource-hungry.
Because it is designed for speed, Sphinx can group and sort a lot faster, and it can do easy segmentation or fetch the top-N best matches per group in a single query.
The result is that heavy work done by database nodes can be offloaded to even a single Sphinx server.
Slides were presented at PerconaLive London 2013
2. Fulltext search in MySQL
● available for MyISAM and, since MySQL 5.6, for InnoDB
● limited indexing options
○ only minimum word length and a stopword list
● limited search options
○ boolean mode
○ natural language mode
○ with query expansion
3. Why Sphinx?
● GPLv2
● better performance
● lots of features, for both indexing and searching
● easy to transition from MySQL:
○ easy to index from MySQL
○ SphinxQL - access and query Sphinx using any MySQL client
4. MySQL vs Sphinx fulltext index
MySQL:
● B-tree index
● easy to update frequently, easy to access by PK
● columnar storage
● OLTP
Sphinx:
● inverted index
● hard to update, fast to read
● keyword based storage
● OLAP
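To make "inverted index" and "keyword based storage" concrete, here is a minimal sketch of the idea in Python. It is a toy model, not how Sphinx stores its index; the sample documents and the function names are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each keyword to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search_all(index, query):
    """Return the ids of documents that contain every keyword of the query."""
    postings = [set(index.get(word, ())) for word in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "find me fast",
    2: "find me slowly",
    3: "fast index",
}
index = build_inverted_index(docs)
print(index["fast"])                    # document ids that contain "fast"
print(search_all(index, "find fast"))   # documents that contain both words
```

Reading is a lookup per keyword plus a set intersection, which is why this layout is fast to read. Updating one document means touching the list of every keyword it contains, which is why it is hard to update.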
5. Simple fulltext search
MySQL:
mysql> SELECT * FROM myindex
WHERE MATCH(title, content) AGAINST ('find me fast');
Sphinx:
mysql> SELECT * FROM myindex
WHERE MATCH('find me fast');
6. More complete Sphinx search
mysql> SELECT * FROM index WHERE
MATCH('"a quorum search is made here"/4')
ORDER BY WEIGHT() DESC, id ASC
OPTION ranker = expr(
'sum(
exact_hit+10*(min_hit_pos==1)+lcs*(0.1*my_attr)
)*1000 +
bm25'
);
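The "/4" in the query above is the quorum operator: a document matches when it contains at least 4 of the quoted keywords. A rough Python sketch of that matching rule follows. The function name and the sample texts are hypothetical, and real Sphinx tokenization is more involved.

```python
import re


def quorum_match(text, keywords, threshold):
    """True when at least `threshold` distinct keywords occur in `text`."""
    words = set(re.findall(r"\w+", text.lower()))
    found = sum(1 for k in set(keywords.lower().split()) if k in words)
    return found >= threshold


query = "a quorum search is made here"
print(quorum_match("a search is made", query, 4))   # 4 of 6 keywords present
print(quorum_match("quorum here", query, 4))        # only 2 of 6 present
```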
7. Searching only on some fields
● Not possible in MySQL without declaring a separate index
● in Sphinx - field search operator:
mysql> SELECT * FROM myindex
WHERE MATCH('@(title,content) find me fast');
10. Ranking factors formulas
● bm25
● LCS - longest common subsequence between query and document
● word and hit counting
● tf_idf and idf
● word positioning
● possible to use attribute values
11. Ranking without field weighting
mysql> SELECT id,title,weight() FROM wikipedia WHERE MATCH('inverted index') OPTION
ranker=expr('sum(hit_count*user_weight)'), field_weights=(title=1,body=1);
+-----------+-----------------------+----------+
| id        | title                 | weight() |
+-----------+-----------------------+----------+
| 221501516 | Index (search engine) |      125 |
| 221487412 | Inverted index        |       47 |
Doc. 221501516: 1 hit in ‘title’ x 1 + 124 hits in ‘body’ x 1 = 125
Doc. 221487412: 2 hits in ‘title’ x 1 + 45 hits in ‘body’ x 1 = 47
12. Ranking with field weighting
mysql> SELECT id,title,WEIGHT() FROM index WHERE MATCH('inverted index') OPTION
ranker=expr('sum(hit_count*user_weight)'), field_weights=(title=100,body=1);
+-----------+-----------------------+----------+
| id        | title                 | WEIGHT() |
+-----------+-----------------------+----------+
| 221487412 | Inverted index        |      245 |
| 221501516 | Index (search engine) |      224 |
Doc. 221501516: 1 hit in ‘title’ x 100 + 124 hits in ‘body’ = 100 + 124 = 224
Doc. 221487412: 2 hits in ‘title’ x 100 + 45 hits in ‘body’ = 200 + 45 = 245
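The arithmetic behind sum(hit_count*user_weight) can be replayed in a few lines of Python. This is only a sketch of the formula, using the per-field hit counts quoted on the two ranking slides; the function and variable names are made up.

```python
def weight(hits, field_weights):
    """sum(hit_count * user_weight) over all fields of one document."""
    return sum(count * field_weights[field] for field, count in hits.items())


# Per-field hit counts for the two documents, as quoted on the slides.
hits = {
    221501516: {"title": 1, "body": 124},
    221487412: {"title": 2, "body": 45},
}

for fw in ({"title": 1, "body": 1}, {"title": 100, "body": 1}):
    ranked = sorted(hits, key=lambda d: weight(hits[d], fw), reverse=True)
    print(fw, [(d, weight(hits[d], fw)) for d in ranked])
```

With equal weights the document with many body hits wins (125 vs 47). With title=100 the document with two title hits moves to the top (245 vs 224).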
13. Words proximity
mysql> SELECT id,title,WEIGHT() FROM index
WHERE MATCH('@title list of football players') OPTION ranker=expr('sum(lcs)');
+-----------+---------------------------------------------------+----------+
| id        | title                                             | weight() |
+-----------+---------------------------------------------------+----------+
| 207381464 | List of football players from Amsterdam           |        4 |
| 221196229 | List of Football Kingz F.C. players               |        3 |
| 210456301 | List of Florida State University football players |        2 |
+-----------+---------------------------------------------------+----------+
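The weights 4, 3 and 2 above can be reproduced with a small Python sketch that measures the longest run of query words appearing in the document in the same order and next to each other. This is an assumption chosen because it matches the numbers on the slide; it is not Sphinx's actual implementation of lcs, and the helper names are invented.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def longest_common_run(query, document):
    """Length, in words, of the longest contiguous word run shared in order."""
    q, d = tokenize(query), tokenize(document)
    best = 0
    prev = [0] * (len(d) + 1)
    for qw in q:
        cur = [0] * (len(d) + 1)
        for j, dw in enumerate(d, start=1):
            if qw == dw:
                # Extend the run that ended at the previous word of both texts.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


query = "list of football players"
for title in (
    "List of football players from Amsterdam",
    "List of Football Kingz F.C. players",
    "List of Florida State University football players",
):
    print(longest_common_run(query, title), title)
```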
14. Word and hit count
mysql> SELECT id,title,WEIGHT() AS w FROM index WHERE MATCH('@title php | api') OPTION
ranker=expr('sum(hit_count)');
+---------+------------------------------------+------+
| id      | title                              | w    |
+---------+------------------------------------+------+
| 1000671 | PHP API gives PHP Warnings - tips? |    3 |
...
mysql> SELECT id,title,WEIGHT() AS w FROM index WHERE MATCH('@title php | api') OPTION
ranker=expr('sum(word_count)');
+---------+------------------------------------+------+
| id      | title                              | w    |
+---------+------------------------------------+------+
| 1000671 | PHP API gives PHP Warnings - tips? |    2 |
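The difference between the two factors is that hit_count counts every keyword occurrence, while word_count counts each matched keyword once. A toy Python version for a single field, with an invented helper name and simplified tokenization:

```python
import re


def hit_and_word_count(text, keywords):
    """Return (hit_count, word_count) for one field of one document."""
    tokens = re.findall(r"\w+", text.lower())
    wanted = {k.lower() for k in keywords}
    hits = [t for t in tokens if t in wanted]
    # Every occurrence is a hit; distinct matched keywords give the word count.
    return len(hits), len(set(hits))


print(hit_and_word_count("PHP API gives PHP Warnings - tips?", ["php", "api"]))
```

"PHP" occurs twice and "API" once, so the title has 3 hits but only 2 matched words.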
15. Position
mysql> select id,title,weight() as w from forum where match('@title sphinx php api')
option ranker=expr('sum(min_hit_pos)');
+---------+-------------------------------------------------------+------+
| id      | title                                                 | w    |
+---------+-------------------------------------------------------+------+
| 1004955 | how can i do a sample search use sphinx php api       |    9 |
| 1004900 | How to update fulltext field using sphinx api of PHP? |    7 |
| 1008783 | Update MVA-Attributes with the PHP-API Sphinx 2.0.2   |    6 |
| 1000498 | Limits in sphinx when using PHP sphinx API            |    3 |
how can i do a sample search use sphinx php api
1   2   3 4  5 6      7      8   9
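min_hit_pos is the position of the first keyword occurrence in the field, counting words from 1. A small Python sketch, assuming words are split on any non-alphanumeric character (Sphinx's real tokenizer depends on its charset settings), reproduces the four weights above:

```python
import re


def min_hit_pos(text, keywords):
    """1-based position of the first token matching any keyword, or 0 if none."""
    wanted = {k.lower() for k in keywords}
    for pos, token in enumerate(re.findall(r"\w+", text.lower()), start=1):
        if token in wanted:
            return pos
    return 0


keywords = ["sphinx", "php", "api"]
for title in (
    "how can i do a sample search use sphinx php api",
    "How to update fulltext field using sphinx api of PHP?",
    "Update MVA-Attributes with the PHP-API Sphinx 2.0.2",
    "Limits in sphinx when using PHP sphinx API",
):
    print(min_hit_pos(title, keywords), title)
```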
16. IDF
mysql> select id,title,weight() from wikipedia where match('@title (Polyphonic | Polysyllabic | Oberheim)')
option ranker=expr('sum(max_idf)*1000');
+-----------+------------------------+----------+
| id        | title                  | weight() |
+-----------+------------------------+----------+
| 165867281 | The Polysyllabic Spree |      112 |  Polysyllabic - rare
| 208650218 | Oberheim Xpander       |      108 |  Oberheim - not so rare
| 209138112 | Oberheim OB-8          |      108 |
| 180503990 | Polyphonic Era         |       85 |  Polyphonic - common
| 183135294 | Polyphonic C sharp     |       85 |
| 219939232 | Polyphonic HMI         |       85 |
+-----------+------------------------+----------+
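Why the rare keyword ranks first can be shown with the textbook inverse document frequency, log(N / df). This sketch does not use Sphinx's own IDF formula, and the collection size and document frequencies below are invented for illustration only; just the ordering carries over.

```python
import math

TOTAL_DOCS = 1_000_000  # hypothetical collection size
DOC_FREQ = {            # hypothetical number of documents containing each keyword
    "polysyllabic": 10,
    "oberheim": 50,
    "polyphonic": 5000,
}


def idf(total_docs, docs_with_term):
    """Textbook IDF: the rarer the term, the larger the value."""
    return math.log(total_docs / docs_with_term)


def max_idf(title):
    """Largest IDF among the known keywords that occur in the title."""
    matched = [idf(TOTAL_DOCS, DOC_FREQ[w]) for w in title.lower().split() if w in DOC_FREQ]
    return max(matched, default=0.0)


for title in ("The Polysyllabic Spree", "Oberheim Xpander", "Polyphonic Era"):
    print(round(max_idf(title), 2), title)
```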
18. Language morphology
Will the user search ‘shirt’ or ‘shirts’?
● stemming:
○ shirt = shirts
● index_exact_words for exact matching
● lemmatization:
○ men = man
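The difference between the two approaches: a stemmer applies rules to the word itself, while a lemmatizer looks the word up in a dictionary, so only the latter handles irregular forms such as men/man. A deliberately naive Python sketch follows; the single suffix rule and the one-entry dictionary are toy assumptions, not the morphology processors shipped with Sphinx.

```python
def naive_stem(word):
    """Strip a trailing plural 's' (toy rule, not a real stemmer)."""
    w = word.lower()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w


LEMMAS = {"men": "man"}  # hypothetical one-entry lemma dictionary


def lemmatize(word):
    """Look the word up in the dictionary; unknown words are returned unchanged."""
    w = word.lower()
    return LEMMAS.get(w, w)


print(naive_stem("shirts") == naive_stem("shirt"))  # the rule maps both to one form
print(naive_stem("men"), lemmatize("men"))          # only the dictionary knows "man"
```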
19. EF-S 18-200mm f/3.5-5.6
blend_chars
● act as both separators and valid chars
● 10-200mm with '-' as a blend char is indexed as 3 terms: 10-200mm, 10 and 200mm
● leading or trailing blend char behaviour can be configured to be stripped or indexed
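A rough Python sketch of what blending does to a token: the token is kept whole and also split at the blend character. The function is hypothetical and simplifies the real behaviour; in particular, leading and trailing blend characters are simply dropped here rather than being configurable.

```python
import re


def blended_tokens(text, blend_chars="-"):
    """Index each blended token whole and also split at the blend characters."""
    if not blend_chars:
        return re.findall(r"\w+", text.lower())
    blend = re.escape(blend_chars)
    tokens = []
    for raw in re.findall(rf"[\w{blend}]+", text.lower()):
        parts = [p for p in re.split(rf"[{blend}]", raw) if p]
        if len(parts) > 1:
            tokens.append(raw)   # the blended form, e.g. "10-200mm"
        tokens.extend(parts)     # the separated parts, e.g. "10", "200mm"
    return tokens


print(blended_tokens("10-200mm"))
```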
20. Sentence delimitation
mysql> INSERT INTO index VALUES(1,
'quick brown fox jumps over the lazy dog');
mysql> INSERT INTO index VALUES(2,
'The quick brown fox made it.
Where was the lazy dog?');
mysql> SELECT * FROM index WHERE
MATCH('brown fox SENTENCE lazy dog');
+------+
| id   |
+------+
|    1 |
+------+
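Only document 1 matches because SENTENCE requires both sides of the operator to occur within one sentence. The Python sketch below imitates that rule under simplifying assumptions: sentences end at '.', '?' or '!', and each side is treated as a set of words that must all be present. It is a toy model, not Sphinx's matcher.

```python
import re


def sentence_match(text, left, right):
    """True if some sentence contains every word of `left` and of `right`."""
    needed = set(left.lower().split()) | set(right.lower().split())
    for sentence in re.split(r"[.?!]", text.lower()):
        if needed <= set(re.findall(r"\w+", sentence)):
            return True
    return False


doc1 = "quick brown fox jumps over the lazy dog"
doc2 = "The quick brown fox made it. Where was the lazy dog?"
print(sentence_match(doc1, "brown fox", "lazy dog"))  # both sides in one sentence
print(sentence_match(doc2, "brown fox", "lazy dog"))  # sides split across two sentences
```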
21. Paragraph delimitation
mysql> INSERT INTO index VALUES(1,
'<p>The quick brown fox jumps over the lazy dog</p>');
mysql> INSERT INTO index VALUES(2,
'<p>The quick brown fox jumps</p><p>over the lazy dog</p>');
mysql> SELECT * FROM index WHERE
MATCH('brown fox PARAGRAPH lazy dog');
+------+
| id   |
+------+
|    1 |
+------+
22. More fulltext features
● bigrams
● more ranking factors: lccs, wlccs, atc
● phrase boundary chars
● HTML: indexing attributes, element removal
● RLP Chinese tokenization
● position step tuning