A number of tools provide full-text search, from embedded search libraries to dedicated search servers (Solr, Sphinx, Elasticsearch, etc.), but setting up and configuring them is time-consuming and requires considerable knowledge of the tools.
What if we could get comparable search results using the full-text search capabilities of Postgres? Developers already have a working knowledge of the database, so this should come naturally. It would also mean one less tool to manage.
Code: https://github.com/Syerram/postgres_search
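As a rough illustration of the idea (not the code from the linked repository), a Postgres full-text query pairs `to_tsvector` on the document column with `plainto_tsquery` on the user's input, matches them with the `@@` operator, and orders the rows by `ts_rank`. The sketch below only builds the SQL string and its parameters. The helper name `build_search_query`, the table `blog_post`, and the columns `id` and `body` are hypothetical names chosen for the example.

```python
def build_search_query(table, column, terms, config="english", limit=10):
    """Return (sql, params) for a ranked Postgres full-text search.

    `table` and `column` are hypothetical identifiers supplied by the caller,
    and the table is assumed to have an `id` column. Identifiers cannot be
    sent as query parameters, so they are checked before being interpolated.
    """
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    sql = (
        f"SELECT id, ts_rank(to_tsvector(%s::regconfig, {column}), query) AS rank "
        f"FROM {table}, plainto_tsquery(%s::regconfig, %s) AS query "
        f"WHERE to_tsvector(%s::regconfig, {column}) @@ query "
        "ORDER BY rank DESC LIMIT %s"
    )
    # Parameters appear in the same order as the %s placeholders above.
    params = [config, config, terms, config, limit]
    return sql, params


sql, params = build_search_query("blog_post", "body", "django postgres")
print(sql)
print(params)
```

With a DB-API driver such as psycopg2, the pair would be passed as `cursor.execute(sql, params)`. On larger tables, the Postgres documentation suggests a GIN index on the `to_tsvector` expression so the match does not scan every row.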
What is the best full text search engine for Python? - Andrii Soldatenko
Nowadays we can see lots of benchmarks and performance tests of different web frameworks and Python tools. When it comes to search engines, though, it’s difficult to find useful information, especially benchmarks or comparisons between different search engines. It’s hard to decide which search engine to select: Elasticsearch, Postgres Full Text Search, or maybe Sphinx or Whoosh. You face a difficult choice, which is why I am pleased to share my experience and benchmarks and focus on how to compare full-text search engines for Python.
A comparison of different solutions for full-text search in web applications using PostgreSQL and other technology. Presented at the PostgreSQL Conference West, in Seattle, October 2009.
The social phenomenon is here. We use lots of social applications every day, such as Facebook, Twitter, and Instagram. Many of these apps are built on a social graph and graph theory. I would like to share my knowledge and expertise on how to work with graphs and build a large social graph as the engine for a social network using Python and graph databases. We'll compare SQL and NoSQL approaches to friend relationships.
Full text search in PostgreSQL is a flexible and powerful facility for searching a collection of documents using natural language queries. We will discuss several new improvements to FTS in the PostgreSQL 9.6 release, such as phrase search, better dictionary support and tsvector editing functions. We will also present new features currently in development: RUM index support, which accelerates some important kinds of full text queries, a new and better ranking function for relevance search, loading dictionaries into shared memory, and support for searching multilingual content.
What's the great thing about a database? Why, it stores data of course! However, one feature that makes a database useful is the different data types that can be stored in it, and the breadth and sophistication of the data types in PostgreSQL is second-to-none, including some novel data types that do not exist in any other database software!
This talk will take an in-depth look at the special data types built right into PostgreSQL version 9.4, including:
* INET types
* UUIDs
* Geometries
* Arrays
* Ranges
* Document-based Data Types:
  * Key-value store (hstore)
  * JSON (text [JSON] & binary [JSONB])
We will also have some cleverly concocted examples to show how all of these data types can work together harmoniously.
Quite often "new" people are only "new" to Postgres. This is my summary of do's and don'ts when it comes to teaching Postgres, what to take note of, with emphasis on teaching
Developing and Deploying Apps with the Postgres FDW - Jonathan Katz
I couldn't wait to use the Postgres Foreign Data Wrapper (postgres_fdw) in a project; imagine being able to read and write data to many databases all from a single database! I finally found a project where it made sense to use this amazing technology.
I mapped out my architecture and began to code, and realized there were some things that did not work as expected: I could not call remote functions or insert into a table with a serial primary key and have it autoupdate. I found workarounds (which I will share), so the project went on.
We tested the setup, everything seemed to work well, and then we went to deploy to production. And then the real fun began.
Despite the title, I still love the Postgres FDW but wanted to provide some cautionary tales from a hybrid developer/DBA perspective on how to properly use them in your working environment. This talk will cover:
* Basic Postgres FDW setup in a development environment vs. production environment
* Handling some common FDW use cases that you think are trivial but are not
* Working with advanced Postgres constructs such as schemas and sequences with FDWs
* Putting it all together to make sure your production application is safe with your FDWs
* ...and when you really, really need to make a remote call and it is not supported by an FDW, how to do that too!
JDD 2016 - Tomasz Borek - DB for next project? Why, Postgres, of course - PROIDEA
While losing to Oracle in features, it's losing only marginally. While not as long on the market, it's still second best. While not as funky and shiny as the new NoSQL DBs, it's arguably the shiniest of all relational DBs, and it has a colourful history. So, let me tell you about PostgreSQL architecture and internals and walk you through the query path and optimization. Let me hint about no hinting, how and why. In another thread we'll talk about MVCC and vacuum, and if there is time for more, we'll have a round of questions.
This talk moves beyond the standard introduction to Elasticsearch and focuses on how Elasticsearch tries to fulfill its near-realtime contract. Specifically, I’ll show how Elasticsearch manages to be incredibly fast while handling huge amounts of data. After a quick introduction, we will walk through several search features and how the user can get the most out of Elasticsearch. This talk will go under the hood, exploring features like search, aggregations, highlighting, (non-)use of probabilistic data structures and more.
The latest version of my PostgreSQL introduction for IL-TechTalks, a free service to introduce the Israeli hi-tech community to new and interesting technologies. In this talk, I describe the history and licensing of PostgreSQL, its built-in capabilities, and some of the new things that were added in the 9.1 and 9.2 releases which make it an attractive option for many applications.
How EverTrue is building a donor CRM on top of ElasticSearch. We cover some of the issues around scaling ElasticSearch and which aspects of ElasticSearch we are using to deliver value to our customers.
An overview of how a web search engine is organized is provided. A key component of the AltaVista search engine, its indexing library, is described in more depth. The library manages a set of inverted files, and provides mechanisms to construct and optimize complex queries on those inverted files. The design goals were to enable efficient queries on bodies of text up to a few hundred gigabytes in size (e.g. AltaVista) without sacrificing too much generality, and without giving up on small applications (e.g. mail directories).
Full text search | Speech by Matteo Durighetto | PGDay.IT 2013 - Miriade Spa
Slides from Matteo Durighetto's talk at PGDay.IT 2013, Prato, 25 October 2013
Full-text search arises from the need to search for words or their derivatives within a document. The problem cannot always be solved with regular expressions: consider irregular plurals (where matching requires a dictionary) or the problem of computing the similarity of a word (for example, to find the most relevant topic and rank the results).
In this talk we will explore the distinctive features of PostgreSQL and its potential in this area.
Search and information discovery is a huge part of almost any modern site.
Solr is an incredibly powerful search tool that allows us to quickly add advanced search capabilities such as full-text search, faceting, autocomplete and spelling suggestions to our projects without much effort. We will be using 'django-haystack' to communicate between Django and Solr.
Scaling search to a million pages with Solr, Python, and Django - tow21
A talk given to DJUGL on 26 July 2010, describing and introducing Solr, and discussing how we use it at Timetric to drive navigation across over a million dataseries.
An overview talk about the basic internal design of any modern search engine. It covers compressed lists of documents and positions, how the search for matching documents (and the various operators) then works with them, how ranking of the found documents works, and how additional (non-text) document attributes can be organized and used for filtering and aggregation. Where possible, all known implementation variants are mentioned (how it can be done in general, how it is done in Sphinx, how in Lucene).
Practical continuous quality gates for development process - Andrii Soldatenko
There are a lot of books and publications about continuous integration. But in my experience it’s difficult to find information about how to set up quality gates between automated tests and continuous integration practice in your current project. After reading several articles and even a couple of books you will understand how it works. But what next? I will share practical tips and tricks on how to lift the iron curtain between your automated tests and continuous quality practice today. That is why I am pleased to share my experience in this presentation.
1. Introduction: basic concepts and definitions
1.1. What a “file” is
1.2. The role of files in the modern world, and the myth that files are unnecessary
1.3. File storage, AKA the file system
1.3.1. Internal structure
1.3.1.1. Vintage and journaled; why a journal is needed
1.3.1.2. Flat and hierarchical
1.3.1.3. Access control
1.3.2. POSIX
1.3.2.1. Random reads
1.3.2.2. Random writes
1.3.2.3. Atomic operations
1.3.3. Bells and whistles
1.3.3.1. Compression, encryption, deduplication
1.3.3.2. Snapshots
1.4. Read and write caching
2. HighLoad is the network
2.1. What “HighLoad” actually is, or “does cutting corners lead to trouble”
2.2. Access protocols: stateless and stateful
2.3. Fault tolerance and its two faces
2.3.1. Data integrity
2.3.2. Uninterrupted writes and reads
2.4. The CAP theorem
3. So what is the problem?
3.1. We take a great big storage system and…
3.1.1. Local cache?!
3.1.2. Concurrent writes?!!
3.1.3. We take OCFS2 and…
3.1.3.1. What do you mean, “the VMs are crashing”?!
3.1.3.2. And why is it so slow?
3.1.4. Also, a great big storage system is rather hard to get your hands on
3.2. We take CEPH/Lustre/LeoFS and…
3.2.1. Why is it so slow?!
3.2.2. What do you mean, “rebalancing”?!
3.3. And a little about backups
3.3.1. Backup is not fault tolerance
3.4. And once again about atomic operations
3.5. So why can't we just put the files in a database?
4. So what do we do?
4.1. First of all, it depends on what our task is
4.1.1. Do we need to economize?
4.1.2. POSIX - do we need it?
4.1.3. Large files - do we need them?
4.1.4. Atomic operations - do we need them?
4.1.5. Versioning - do we need versioning?
4.1.6. How large should our storage be?
4.1.7. And are we going to delete files?
4.1.8. And what will the load profile be?
4.2. I’m feeling lucky - for some combinations…
The final complete poem written by Edgar Allan Poe, the famous American poet. Many believe Poe's wife, Virginia Eliza Clemm Poe, is the most likely inspiration.
Search Engine-Building with Lucene and Solr, Part 1 (SoCal Code Camp LA 2013) - Kai Chan
These are the slides for the session I presented at SoCal Code Camp Los Angeles on November 10, 2013.
http://www.socalcodecamp.com/socalcodecamp/session.aspx?sid=cc1e6803-b0ec-4832-b8df-e15ea7bd7694
Search Engine-Building with Lucene and Solr - Kai Chan
These are the slides for the session I presented at SoCal Code Camp San Diego on July 27, 2013.
http://www.socalcodecamp.com/socalcodecamp/session.aspx?sid=6b28337d-6eae-4003-a664-5ed719f43533
Beyond Wordcount with spark datasets (and scalaing) - Nide PDX Jan 2018 - Holden Karau
Apache Spark is one of the most popular big data systems, but once the shiny finish starts to wear off you can find yourself wondering if you've accidentally deployed a Ford Pinto into production. This talk will look at the challenges that come with scaling Spark jobs. Also, the talk will explore Spark's new(ish) Dataset/DataFrame API, as well as how it’s evolving in Spark 2.3 with improved Python support.
If you're already a Spark user, come to find out why it’s not all your fault. If you aren't already a Spark user, come to find out how to save yourself from some of the pitfalls once you move beyond the example code.
Check out Holden's newest book, High Performance Spark, for more information!
From https://niketechtalksjan2018.splashthat.com/
Big Data Grows Up - A (re)introduction to Cassandra - Robbie Strickland
For the last several years Cassandra has been the heavyweight in the NoSQL space. But its massive scalability was accompanied by a bare bones feature set, a substantial learning curve, and a Thrift-based RPC mechanism that left newbies bewildered by a sea of potential client libraries–all with their own fragmented semantics. Over the last year that’s all changed, culminating in the recently unveiled Cassandra 2.0. In this talk I’ll bring you up to speed on Cassandra Query Language, cursors, the new native libraries, lightweight transactions, virtual nodes, and loads of other new goodies. Whether you’re completely new to Cassandra or a seasoned veteran who wants the latest scoop, this talk has something for you.
How does a full-text search engine work? How is the index built and searched? Can I use PostgreSQL as a full-text search engine or should I go for a more specialised solution? How does one configure and use PostgreSQL search?
This presentation covers all those aspects, based on the work we did to index teowaki.com. It was presented at PgConf EU 2014 in Madrid.
PostgreSQL - It's kind've a nifty database - Barry Jones
This presentation was given to a company that makes software for churches that is considering a migration from SQL Server to PostgreSQL. It was designed to give a broad overview of features in PostgreSQL with an emphasis on full-text search, various datatypes like hstore, array, xml, json as well as custom datatypes, TOAST compression and a taste of other interesting features worth following up on.
[Session given at Engage 2019, Brussels, 15 May 2019]
In this session, Tim Davis (Technical Director at The Turtle Partnership Ltd) takes you through the new Domino Query Language (DQL), how it works, and how to use it in LotusScript, in Java, and in the new domino-db Node.js module. Introduced in Domino 10, DQL provides a simple, efficient and powerful search facility for accessing Domino documents. Originally only used in the domino-db Node.js module, with 10.0.1 DQL also became available to both LotusScript and Java. This presentation will provide code examples in all three languages, ensuring you will come away with a good understanding of DQL and how to use it in your projects.
Out of the box, Accumulo's strengths are difficult to appreciate without first building an application that showcases its capabilities to handle massive amounts of data. Unfortunately, building such an application is non-trivial for many would-be users, which affects Accumulo's adoption.
In this talk, we introduce Datawave, a complete ingest, query, and analytic framework for Accumulo. Datawave, recently open-sourced by the National Security Agency, capitalizes on Accumulo's capabilities, provides an API for working with structured and unstructured data, and boasts a robust, flexible, and scalable backend.
We'll do a deep dive into Datawave's project layout, table structures, and APIs in addition to demonstrating the Datawave quickstart—a tool that makes it incredibly easy to hit the ground running with Accumulo and Datawave without having to develop a complete application.
Data Exploration with Apache Drill: Day 1 - Charles Givre
Study after study shows that data scientists and analysts spend between 50% and 90% of their time preparing their data for analysis. Using Drill, you can dramatically reduce the time it takes to go from raw data to insight. This course will show you how.
The course materials for this presentation are available at https://github.com/cgivre/data-exploration-with-apache-drill
HelsinkiJS - Clojurescript for Javascript Developers - Juho Teperi
Web development is nowadays dominated by many compile-to-JS languages. ClojureScript is one such language. This talk will give an overview of the ClojureScript ecosystem.
MYSQL Query Anti-Patterns That Can Be Moved to Sphinx - Pythian
PalominoDB European team lead Vladimir Fedorkov will discuss how to handle query bottlenecks that can result from increases in dataset size and traffic.
Similar to Full Text search in Django with Postgres (20)
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
2. Search is everywhere
Search expectations
● FAST
● Full Text search
● Linguistic support (“craziness | crazy”)
● Ranking
● Fuzzy Searching
● More like this
3. Django
● SLOW
● `icontains` is a dumbed-down version of search
● Searching across tables is painful
● No relevancy, ranking or similar-word matching unless done manually
● No easy way to do fuzzy searching
4. Other Alternatives
● Solr
● ElasticSearch
● AWS CloudSearch
● Sphinx
● etc*
If you’re using any of the above, use Haystack
5. Postgres Search
● FAST
● Simple to implement
● Supports search features like Full Text, Ranking, Boosting, Fuzzy, etc.
6. Django
Live Example
● Search Students by name or by course
● Use South migration to create tsvector
column
● Store title in Search table
● Update Search table via Celery on Save of
Student data
https://github.com/Syerram/postgres_search
7. GIN, GIST
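● GiST stores a lossy, hash-based signature per document; GIN is an inverted index (lexeme → rows)
● GIN lookups are about 3x faster than GiST
● GIN takes about 3x longer to build than GiST, and is moderately slower to update
● GIN indexes are about 2-3x larger than GiST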
A GIN index
CREATE INDEX student_index ON students USING gin(to_tsvector('english', name));
Source http://www.postgresql.org/docs/9.2/static/textsearch-indexes.html
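To make the GIN/GiST difference concrete, here is a toy sketch in plain Python. It is not how Postgres implements either index; the sample rows, `SIGNATURE_BITS` and all function names are made up for illustration. The inverted index answers a lookup exactly, while the signature approach only yields candidates that must be rechecked against the real rows.

```python
import zlib

# Hypothetical sample data: row id -> set of lexemes in that row.
DOCS = {
    1: {"kirk", "python"},
    2: {"spock", "java"},
    3: {"kirk", "java"},
}

SIGNATURE_BITS = 16  # deliberately tiny, so hash collisions are plausible


def build_inverted_index(docs):
    """GIN-like idea: map each lexeme to the set of row ids containing it."""
    index = {}
    for row_id, lexemes in docs.items():
        for lexeme in lexemes:
            index.setdefault(lexeme, set()).add(row_id)
    return index


def signature(lexemes):
    """GiST-like idea: hash every lexeme to one bit of a fixed-length signature."""
    sig = 0
    for lexeme in lexemes:
        sig |= 1 << (zlib.crc32(lexeme.encode("utf-8")) % SIGNATURE_BITS)
    return sig


def signature_candidates(signatures, lexeme):
    """Rows whose signature has the lexeme's bit set (may include false positives)."""
    wanted = signature([lexeme])
    return {row_id for row_id, sig in signatures.items() if (sig & wanted) == wanted}


inverted = build_inverted_index(DOCS)
signatures = {row_id: signature(lexemes) for row_id, lexemes in DOCS.items()}

exact = inverted.get("kirk", set())                 # exact answer, no recheck
candidates = signature_candidates(signatures, "kirk")  # superset of the answer
confirmed = {row_id for row_id in candidates if "kirk" in DOCS[row_id]}  # recheck
print(exact, candidates, confirmed)
```

The recheck step is the price of the smaller, lossy signature; the inverted index pays instead with more entries to store and maintain.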
8. Full Text Search
● All text should be preprocessed using
tsvector and queried using tsquery
● Both reduce the text to lexemes
SELECT to_tsvector('How much wood would a woodchuck chuck If a woodchuck could
chuck wood?')
"'chuck':7,12 'could':11 'much':2 'wood':3,13 'woodchuck':6,10 'would':4"
● Both are required for searching to work on normal text: a plain string is cast to tsquery without normalization, so 'chucks' never matches the stored lexeme 'chuck'
SELECT to_tsvector('How much wood would a woodchucks chucks If a woodchucks could chucks woods?') @@ 'chucks' -- False
SELECT to_tsvector('How much wood would a woodchucks chucks If a woodchucks could chucks woods?') @@ to_tsquery('chucks') -- True
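The normalization step can be imitated with a toy model in plain Python. This is only a sketch: `STOP_WORDS` holds just the words this example needs, and `toy_stem` (strip a trailing "s") is a crude stand-in for the real Snowball stemmer. All names here are hypothetical.

```python
import re

STOP_WORDS = {"how", "a", "if"}  # toy list, not the real English stop-word list


def toy_stem(word):
    """Crude stand-in for stemming: drop a trailing 's' from longer words."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def toy_tsvector(text):
    """Map each lexeme to the 1-based positions where it occurs."""
    vector = {}
    words = re.findall(r"[a-z]+", text.lower())
    for position, word in enumerate(words, start=1):
        if word in STOP_WORDS:
            continue  # stop words are dropped but still consume a position
        vector.setdefault(toy_stem(word), []).append(position)
    return vector


def toy_tsquery(term):
    """Normalize the search term the same way as the document."""
    return toy_stem(term.lower())


plain = toy_tsvector(
    "How much wood would a woodchuck chuck If a woodchuck could chuck wood?"
)
plural = toy_tsvector(
    "How much wood would a woodchucks chucks If a woodchucks could chucks woods?"
)

print(plain)
print("chucks" in plural)               # raw term vs. normalized document
print(toy_tsquery("chucks") in plural)  # both sides normalized
```

The raw term misses because the document only stores the stemmed form; once the query goes through the same normalization, both sides agree.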
9. Full Text Search (Contd.)
● Technically you don’t need an index, but on large tables the search will be slow
SELECT * FROM students WHERE to_tsvector('english', name) @@ to_tsquery('english', 'Kirk')
● GIN or GiST index on a tsvector column
CREATE INDEX <index_name> ON <table_name> USING gin(<col_name>);
● Expression based (the text search configuration must be named explicitly)
CREATE INDEX <index_name> ON <table_name> USING gin(to_tsvector('english', COALESCE(<col_a>,'') || ' ' || COALESCE(<col_b>,'')));
10. Boosting
● Boost certain results over others
● Still matching
● Use ts_rank to boost results
e.g.
…ORDER BY ts_rank(document,
to_tsquery('python')) DESC
11. Ranking
● Importance of search term within document
e.g.
Search term found in title > description > tag
● Use setweight to assign importance to each field
when preparing Document
e.g.
setweight(to_tsvector('english', post.title), 'A') ||
setweight(to_tsvector('english', post.description), 'B') ||
setweight(to_tsvector('english', post.tags), 'C')
...
-- In the search query, use ts_rank to order by ranking
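When the weighted document is built from application code (for example in a migration or a raw query), a small helper keeps the `setweight` chain in one place. The sketch below only builds SQL strings; `weighted_document_sql`, `ranked_search_sql` and the `post` table are hypothetical names, and column and table names are interpolated directly, so they must come from trusted code, never from user input. The search term itself stays a `%s` placeholder for the DB driver.

```python
VALID_WEIGHTS = "ABCD"


def weighted_document_sql(weighted_columns, config="english"):
    """Build 'setweight(to_tsvector(...), W) || ...' for (column, weight) pairs."""
    parts = []
    for column, weight in weighted_columns:
        if weight not in VALID_WEIGHTS or len(weight) != 1:
            raise ValueError("weight must be one of A, B, C, D")
        parts.append(
            "setweight(to_tsvector('{0}', coalesce({1}, '')), '{2}')".format(
                config, column, weight
            )
        )
    return " || ".join(parts)


def ranked_search_sql(table, document_sql, config="english"):
    """Build a query that filters with @@ and orders by ts_rank.

    The two %s placeholders both take the search term.
    """
    return (
        "SELECT *, ts_rank({doc}, to_tsquery('{cfg}', %s)) AS rank "
        "FROM {table} "
        "WHERE {doc} @@ to_tsquery('{cfg}', %s) "
        "ORDER BY rank DESC"
    ).format(doc=document_sql, cfg=config, table=table)


document = weighted_document_sql(
    [("post.title", "A"), ("post.description", "B"), ("post.tags", "C")]
)
query = ranked_search_sql("post", document)
print(query)
```

With a DB-API cursor this would be run as `cursor.execute(query, [term, term])`. For large tables, storing the weighted document in a tsvector column and indexing it avoids recomputing it per row.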
12. Trigram
● Group of 3 consecutive chars from a string
● Similarity between strings is measured by the # of trigrams they share
e.g. "hello": "h", "he", "hel", "ell", "llo", "lo", and "o"
"hallo": "h", "ha", "hal", "all", "llo", "lo", and "o"
Number of matches: 4 ("h", "llo", "lo", "o")
● Use similarity() from the pg_trgm extension to find related terms. It returns a value between 0 and 1, where 0 is no match and 1 is an exact match
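The trigram idea fits in a few lines of plain Python. This is a sketch, not pg_trgm itself: the function names are made up, and the default padding (two spaces on each side) is chosen to reproduce the seven trigrams listed on this slide. The pg_trgm documentation describes a slightly different padding, two spaces in front and one behind, which can be tried through `pad_right=1`.

```python
def trigrams(word, pad_left=2, pad_right=2):
    """Return the set of 3-character windows over the space-padded word."""
    padded = " " * pad_left + word.lower() + " " * pad_right
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b, **padding):
    """Shared trigrams divided by all distinct trigrams of both words."""
    ta, tb = trigrams(a, **padding), trigrams(b, **padding)
    return len(ta & tb) / len(ta | tb)


shared = trigrams("hello") & trigrams("hallo")
print(sorted(shared))                           # the 4 matches from the slide
print(similarity("hello", "hallo"))             # slide-style padding
print(similarity("hello", "hallo", pad_right=1))  # padding as documented for pg_trgm
print(similarity("hello", "hello"))             # identical strings
```

Because the score is a ratio over the union of trigrams, short strings swing more per differing character than long ones.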
13. Soundex/Metaphone
● Soundex is the oldest algorithm and is only good for English names
● Soundex converts a name to a code of length 4
e.g. "Anthony" == "Anthoney" => "A535" == "A535"
● Create the index itself on the Soundex or Metaphone expression (a plain B-tree index serves the equality lookup)
e.g. CREATE INDEX idx_name ON tb_name (soundex(col_name));
SELECT ... FROM tb_name WHERE soundex(col_name) = soundex('...')
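For intuition about why "Anthony" and "Anthoney" collide, here is a minimal sketch of classic American Soundex in plain Python. It follows the textbook rules (keep the first letter, code consonants, collapse repeated codes, let H and W pass through, pad to four characters) and is not guaranteed to match the Postgres `soundex()` function on every edge case.

```python
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for letter in letters:
        CODES[letter] = digit


def soundex(name):
    """Return the 4-character Soundex code, or '' if the name has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(letter, "")  # vowels get no code and reset 'previous'
        if code and code != previous:
            result += code
        previous = code
    return (result + "000")[:4]


print(soundex("Anthony"), soundex("Anthoney"))
print(soundex("Robert"), soundex("Rupert"))
```

Both spellings reduce to the same consonant skeleton, which is exactly what makes an equality lookup on the coded value work as a fuzzy name match.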
14. Pro & Con
Pros
● Quick implementation
● Much easier to change the document format and refresh the index
● Speed comparable to other search engines
● Cost effective
Cons
● Not as flexible as pure search engines, like Solr
● Not as fast as Solr, though fast enough for human users
● Tied to Postgres
● Indexes can get pretty large, but so can search engine indexes
15. Django ORM
● Implements Full Text Search
from django.db import models
from djorm_pgfulltext.fields import VectorField
from djorm_pgfulltext.models import SearchManager

class StudentCourse(models.Model):
    ...
    search_index = VectorField()
    objects = SearchManager(
        fields=('student__user__name', 'course__name'),
        config='pg_catalog.english',  # this is the default
        search_field='search_index',  # this is the default
        auto_update_search_field=True
    )
● StudentCourse.objects.search("David")
https://github.com/djangonauts/djorm-ext-pgfulltext
16. Next Steps
● Add Ranking, Boosting, Fuzzy Search to djorm pgfulltext
e.g. StudentCourse.objects.search("David & Python").rank("Python")
     StudentCourse.objects.fuzzy_search("Jython").rank("Python")
     StudentCourse.objects.soundex("Davad").rank("Java") & More
● Continue to add examples to postgres_search
17. Tips
● Use a separate DB if necessary, or use Materialized Views
● Don’t index everything. Limit your searchable data
● Analyze using `EXPLAIN` and ts_stat
● Create indexes on the fly using CREATE INDEX CONCURRENTLY
● Don’t pull Foreign Key objects in search