This document provides an overview of full text search capabilities including:
- Tokenization, stemming, stop words, and word boundaries to preprocess text for searching
- Language translation examples showing Japanese translations of an English query
- Accuracy considerations for language translation in search
- Performance factors like hosted databases, search indexing, and Rails integration
- Code examples configuring ActiveRecord models to work with a full text search engine
Playing is simple, even a child can do it, but designing something simple is hard. How can we combine prototyping with production software to get our ideas in front of real people? How can we evolve our software over time? How do we measure if something is fun?
I will talk about how Ruby’s flexibility and a strong testing ethos can bring some sanity to this uncertain world. And when I say testing, I’m not just talking about RSpec, Cucumber or Capybara, I’ll share stories from Mightyverse about how we test whether our software actually “works” for the people who use it — sharing failures, I mean, learning, as well as success.
Sarah Allen, Magma Conf 2015
This talk explores the power of transparency to create with higher quality at lower cost, looking at open source community process, code and documentation, as well as lean startup open business, customer, and product development processes.
July 2015, Brighton Ruby
Sarah Allen introduces some theories of play and how to apply these and other ideas from games to making other kinds of software fun, and then how our work can be influenced by ideas of play.
Internet security: a landscape of unintended consequences, Sarah Allen
Increasingly, software is connected to the internet. How do we design software that will do what it was designed to do without making humans and connected systems vulnerable?
Sarah Allen shares lessons learned from Shockwave and Flash, and the kinds of modern exploits that ought to keep you up at night, along with both modern and time-tested techniques that every developer should know.
Code Mesh LDN 2019
RTMP: how did we get to now? (Demuxed 2019), Sarah Allen
RTMP: web video innovation or Web 1.0 hack… how did we get to now? (Demuxed 2019)
One of the creators of RTMP will take you back to a time before Firefox, Safari, and Chrome, when Internet Explorer was used by the majority of people on the Web, and over 98% of browsers had Flash installed. RTMP was first prototyped in late 2000 and released in July 2002. Sarah Allen shares the untold story of the origins of this protocol — careful design choices and unexpected hacks that led to a de-facto standard that still drives the majority of live web video today.
Rocky Mountain Ruby 9/30/2016
I share stories and examples from open source, business and community organizing: how communication about what we do is as important as the work itself. I'll also dive into coding as communication with an example of good API design highlighting the expressiveness of the Ruby language.
Feb 2016, Government Transformation conference
Sarah will tell the story about how innovation was inspired at the Federal Government. She will explore what 18F is and how this internal digital agency was formed within government. She will highlight a specific project that has been incredibly successful at encouraging collaboration between federal government employees from different agencies around task sharing. Sarah will also discuss how Open Source software is used by 18F and what impact that has had.
Transparency is a powerful means of making change. Open source increases the speed of software development and leads to higher quality code. These patterns of how we make software are changing how we do business and how our governments work. These aren’t just patterns of how we write code; these are patterns of how we interact with each other, teach and learn new skills, and experiment with new ideas. When we make our work visible, we expand its potential, and increase the chances of dramatic, unexpected impact.
Ruby Conf Taiwan, Sept 12, 2015
Sarah Allen, Mightyverse @mightyverse, AltConf, June 2015
Making your app fun to use requires more than sprinkling a little gamification on top. It requires thoughtful imagination and experimentation. In this talk, I highlight some expert perspectives on theories of play and behavioral psychology, and how we can apply these ideas in mobile app design. I also share prototyping techniques and how to validate whether a design will actually be fun.
Ruby in the US Government for Ruby World Conference, Sarah Allen
In the United States, Ruby is a common technology choice for startups and is also gaining popularity in large companies. In contrast, Ruby is rarely used for US Government projects. Why do startups favor Ruby while the government makes other choices?
I have been both a startup founder and government employee. After developing a Ruby on Rails web app for my startup Mightyverse from 2009, I worked as a Presidential Innovation Fellow within the Obama administration. I will discuss work in both spheres, and highlight the common themes in the development process.
I love Ruby, but last year I found myself at the Smithsonian Institution coding in, of all things, PHP & Drupal. And I realized that despite my ambivalence towards those technologies, I had no compelling-enough reason to propose Ruby as an alternative. How did we get to this point? I’ll tell 3 reasons we didn't use Ruby, and reflect on whether these are things we want, or problems we should solve.
Sarah Allen talks about her experience as a Presidential Innovation Fellow at the Smithsonian, then poses the question: why was Drupal a good fit for her project, and how did Ruby and Rails fall short?
This is a review of transcription projects outside of the Smithsonian. This presentation is not comprehensive. It focuses on the breadth of user experience choices for engaging with volunteers.
An overview of video for the mobile web with a "lean startup" case study about how supporting web video on mobile had both expected and unexpected positive effects on Mightyverse metrics.
JRubyConf, May 2012
Test-driven development is mom-and-apple-pie to Rubyists, but knowing that a product will work goes well beyond bug-free code. How do you catch a design flaw early when all your tests are green? We'll look at some techniques for vetting your go-to-market strategy and other things you should be doing *before* you start writing code.
Test First Teaching and the path to TDD, Sarah Allen
Test First Teaching background and methodology for teaching programming using automated test frameworks, and how this relates to (and can lead to learning) test-driven development.
12. SELECT text FROM phrases WHERE text like '%run%';
Can you run this to the post office for me?
I'm going for a run, want to come along?
Cross country running
I'm too drunk to drive.
I am running out of battery power.
Work is not like wolf - it won't run away.
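The "drunk" result above is a side effect of substring matching: `'%run%'` matches any text containing the letters "run". A minimal sketch of that behavior, assuming an in-memory SQLite table as a stand-in for the `phrases` table (the talk's actual database is not SQLite, and the last sample row is added here for contrast):

```python
import sqlite3

# Sample rows: the first three come from the slide, the last is a
# hypothetical non-match added for contrast.
PHRASES = [
    "Can you run this to the post office for me?",
    "Cross country running",
    "I'm too drunk to drive.",
    "Where is the train station?",
]


def like_search(pattern):
    """Return phrases whose text matches a SQL LIKE pattern."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE phrases (text TEXT)")
    conn.executemany(
        "INSERT INTO phrases (text) VALUES (?)", [(p,) for p in PHRASES]
    )
    rows = conn.execute(
        "SELECT text FROM phrases WHERE text LIKE ? ORDER BY rowid", (pattern,)
    ).fetchall()
    conn.close()
    return [text for (text,) in rows]


matches = like_search("%run%")
for text in matches:
    print(text)
```

The query returns the "drunk" phrase along with the genuine "run" phrases, because LIKE knows nothing about word boundaries.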
13. SELECT text FROM phrases WHERE vectors @@ 'run'::tsquery;
Can you run this to the post office for me?
Sorry I am running really late.
I'm going for a run, want to come along?
Cross country running
I am running out of battery power.
Work is not like wolf - it won't run away.
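The full text query above matches "running" but not "drunk" because it compares word stems rather than substrings. A rough sketch of the idea, assuming a toy suffix-stripping stemmer (`crude_stem` is a hypothetical helper, far simpler than the stemming Postgres actually performs):

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def crude_stem(word):
    """Toy stemmer: strip a trailing 'ing', then undo a doubled consonant."""
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        if len(word) >= 2 and word[-1] == word[-2]:
            word = word[:-1]
    return word


def matches_stem(text, query):
    """True if any token in text shares a stem with the query word."""
    target = crude_stem(query.lower())
    return any(crude_stem(token) == target for token in tokenize(text))


for phrase in [
    "Sorry I am running really late.",
    "Cross country running",
    "I'm too drunk to drive.",
]:
    print(phrase, matches_stem(phrase, "run"))
```

Matching on whole tokens after stemming is what keeps "drunk" out of the results while letting "running" in.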
14.
15. Tokenization and Stemming
Google App Engine /JRuby / Lucene
http://full-text-search.appspot.com
http://github.com/ultrasaurus/full-text-search-appengine
26. a about above after again against all am an and any are
aren't as at be because been before being below between
both but by can't cannot could couldn't did didn't do does
doesn't doing don't down during each few for from further had
hadn't has hasn't have haven't having he he'd he'll he's her
here here's hers herself him himself his how how's i i'd i'll i'm
i've if in into is isn't it it's its itself let's me more most mustn't
my myself no nor not of off on once only or other ought our
ours ourselves out over own same shan't she she'd she'll
she's should shouldn't so some such than that that's the their
theirs them themselves then there there's these they they'd
they'll they're they've this those through to too under until up
very was wasn't we we'd we'll we're we've were weren't what
what's when when's where where's which while who who's
whom why why's with won't would wouldn't you you'd you'll
you're you've your yours yourself yourselves
http://www.ranks.nl/resources/stopwords.html
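As an illustration of how a list like this is typically applied, the sketch below drops stop words from a phrase before indexing. It assumes a small subset of the list above and a simplistic whitespace tokenizer; `remove_stop_words` is a hypothetical helper, not part of any search engine's API.

```python
# Small subset of the stop word list above; a real filter would load the
# complete list.
STOP_WORDS = {"a", "am", "for", "i", "of", "out", "the", "this", "to", "you"}


def remove_stop_words(text):
    """Lowercase, strip basic punctuation, and drop stop words."""
    cleaned = text.lower().replace("?", "").replace(".", "")
    return [token for token in cleaned.split() if token not in STOP_WORDS]


terms = remove_stop_words("I am running out of battery power.")
print(terms)
```

Only the content-bearing words remain, which keeps the index smaller and stops very common words from dominating matches.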
54.
Target Text                                 Target Language   Source Language
We’re running out of daylight               en                ja
Could you run this?                         en                ja
Cross‐country running                       en                ja
I’m going for a run, want to come along?    en                ja
60. I’m going for a run, want to come along? en ja
ha shi ri ni iku ke do iAtsho ni ki ma su ka?
Ikuko Kobayashi
2009‐11‐29 20:36:47 UTC
http://….16ec695a-8fce-4277-bdd4.flv
61. I’m going for a run, want to come along? en ja
ha shi ri ni iku ke do iAtsho ni ki ma su ka?
Ikuko Kobayashi
2009‐11‐29 20:36:47 UTC
http://….16ec695a-8fce-4277-bdd4.flv
http://….Japanese_ikuko_kobayashi.jpg
Postgres: in-database “tsvector”, partial indexes, acts_as_tsearch

MySQL FULLTEXT indices are fully indexed fields which support stopwords, boolean searches, and relevancy ratings: http://onlamp.com/pub/a/onlamp/2003/06/26/fulltext.html
Note: MySQL FULLTEXT requires the MyISAM storage engine
Comparison of MySQL vs. PostgreSQL: http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL

Solr/Lucene: separate index; language features: faceted search, similar documents (you may also like…)
Sphinx: typically installed on the same machine, directly accesses your database
Word boundaries are understood by context in Chinese, Japanese, Korean, and Thai
CJK word boundaries are not handled in MySQL 5: http://blogs.sun.com/soapbox/entry/fulltext_and_asian_languages_with
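To see why whitespace tokenization breaks down here, the sketch below splits a Japanese phrase on spaces and then falls back to overlapping character bigrams. Bigram indexing is a common workaround and is an assumption of this example, not something these notes prescribe; the sample phrase and `char_bigrams` are illustrative.

```python
def char_bigrams(text):
    """Split text into overlapping two-character tokens, ignoring spaces."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


phrase = "走りに行く"  # sample phrase: "going for a run"

# Whitespace splitting finds no boundaries, so the phrase is one token.
whitespace_tokens = phrase.split()
print(whitespace_tokens)

# Bigrams give the index something smaller than the whole phrase to match.
bigram_tokens = char_bigrams(phrase)
print(bigram_tokens)
```

Bigrams trade precision for recall: they match without a dictionary, but they can also match across real word boundaries.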
Rethinking Full-Text Search for Multilingual Databases
Jeffrey Sorensen and Salim Roukos, IBM T. J. Watson Research Center, Yorktown Heights, New York, <sorenj|roukos>@us.ibm.com
Stop words can cause problems when using a search engine to search for phrases that include them, particularly in names such as 'The Who', 'The The', or 'Take That'
http://en.wikipedia.org/wiki/Stop_words
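A small sketch of that failure mode, assuming a three-word stop list and a hypothetical `index_terms` helper that mimics what a stop-word filter leaves behind for each name:

```python
# Three entries that appear in common English stop word lists.
STOP_WORDS = {"the", "who", "that"}


def index_terms(query):
    """Return the query terms that survive stop-word removal."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


for name in ["The Who", "The The", "Take That"]:
    print(name, index_terms(name))
```

Two of the three names reduce to an empty query, so an engine that filters stop words has nothing left to search for.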
Think of a blank canvas... don’t think about Solr or Sphinx; first think about what people are trying to find and what will help them most.
Maybe browse is more im