How to Gain Greater Business Intelligence from Lucene/Solrlucenerevolution
Presented by Patrick Beaucamp | Bpm-Conseil - See conference video - http://www.lucidimagination.com/devzone/events/conferences/lucene-revolution-2012
Vanilla, an Open Source business intelligence application by bpm-conseil.com, offers unique features such as report indexing through an embedded Lucene integration. Using Vanilla and Lucene, developers can manage both report indexing and external document indexing, which ultimately saves end users time when they search for specific keywords such as "product code," or "customer code." Vanilla can build upon an existing Solr/Lucene installation that takes care of all the indexing processes while Vanilla takes care of the Reporting/Dashboard creation. During this presentation, attendees will learn how we moved from embed Lucene Api to a Solr/Lucene platform and all the technical and business benefits from this architecture in terms of clustering, caching and access mode.
Domain-driven design (DDD) is an approach to developing software for complex needs by deeply connecting the implementation to an evolving model of the core business concepts.
How to Gain Greater Business Intelligence from Lucene/Solrlucenerevolution
Presented by Patrick Beaucamp | Bpm-Conseil - See conference video - http://www.lucidimagination.com/devzone/events/conferences/lucene-revolution-2012
Vanilla, an Open Source business intelligence application by bpm-conseil.com, offers unique features such as report indexing through an embedded Lucene integration. Using Vanilla and Lucene, developers can manage both report indexing and external document indexing, which ultimately saves end users time when they search for specific keywords such as "product code," or "customer code." Vanilla can build upon an existing Solr/Lucene installation that takes care of all the indexing processes while Vanilla takes care of the Reporting/Dashboard creation. During this presentation, attendees will learn how we moved from embed Lucene Api to a Solr/Lucene platform and all the technical and business benefits from this architecture in terms of clustering, caching and access mode.
Domain-driven design (DDD) is an approach to developing software for complex needs by deeply connecting the implementation to an evolving model of the core business concepts.
Got data? Let's make it searchable! This interactive presentation will demonstrate getting documents into Solr quickly, will provide some tips in adjusting Solr's schema to match your needs better, and finally will discuss how showcase your data in a flexible search user interface. We'll see how to rapidly leverage faceting, highlighting, spell checking, and debugging. Even after all that, there will be enough time left to outline the next steps in developing your search application and taking it to production.
Cognitum Ontorion: Knowledge Representation and Reasoning SystemCognitum
Presentation from FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS which was held in Lodz, between 13 - 16 of September, 2015.
Pawel Kaplanski - CTO of Cognitum, presented an approach to hybrid agent architecture, which is implemented on top of a scalable Knowledge Representation & Reasoning (KRR) system.
To learn more, visit our website: http://www.cognitum.eu/
Get ready to lock and load through this quick overview of some of the newest most innovative, tools around. Source: http://www.takipiblog.com/7-new-tools-java-developers-should-know/
[CB19] Spyware, Ransomware and Worms. How to prevent the next SAP tragedy by ...CODE BLUE
On this presentation, we will raise awareness on how the SAP Internet facing systems are particularly vulnerable to Spyware, Ransomware and Worms due to their inherent complexity.
We will also introduce (for the first time in Asia ) the “Project ARSAP”. This project is a semi-automatic mechanism which main goal is to detect and register all the SAP systems that are exposed to the Internet, extracting the system’s metadata and cataloging the assets in base of their Geo-location, system type, version, installed components and potential risk of compromise.
We will present a brief introduction to SAP, defining its architecture / entry points and explain with great detail the methodology behind the “ARSAP” project.
Then, three different scenarios were malware could strike SAP will be showcased. We will start by recreating a real SAP cyber-attack, where a company got attacked via malicious emails and we will move forward to some other complex techniques that could allow anyone, directly from the Internet to compromise the whole Interfacing SAP system and jump to the adjacent network.
This presentation will have several live demos where the attendees will be able to observe the entire attack workflow. We will conclude the presentation by presenting some suggested remediations and conclusions.
You’ve taken your first steps into Node.js. You’ve learned how to initialize your projects, you’ve played with some dependencies, and you’re ready to get into some serious Node work. In this session, we’ll dive further into Node as a framework. We’ll learn how to master Node’s inherently asynchronous nature, take advantage of Node’s events and streams capabilities, and learn about sophisticated Node deployments at scale. Participants will leave with a richer understanding of what Node has to offer and higher confidence in dealing with some of Node’s more difficult concepts.
Data Engineer's Lunch 90: Migrating SQL Data with ArcionAnant Corporation
In Data Engineer's Lunch 90, Eric Ramseur teaches our audience how to use Arcion.
From best practices to real-world examples, this talk will provide you with the knowledge and insights you need to ensure a successful migration of your SQL data. So whether you're new to data migration or looking to improve your existing process, join us and discover how Arcion can help you achieve your goals.
Moved to https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmxMilen Dyankov
This slide deck will be removed from here in the future. It has been moved to : https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmx
My talk at the @media Ajax conference in London in November 2007 about the non-technical steps you can take to make JavaScript and Ajax work for larger teams.
* Open source search with Solr/Lucene gives you the power to turn a wide range of information into fast, useful, relevant results!
* LucidWorks for Solr gives you a tested, release-stable certified distribution of open source search with enhanced tools and installation for building search apps quickly and reliably.
http://www.lucidimagination.com/How-We-Can-Help/webinar-from-search-to-found
Attendees will learn how eBay Germany has implemented Solr, why Solr was selected, which Solr features are utilized. and how Solr is configured and used in production. Recommended best practices will be profiled alomng with eBay Kleinanzeigen plans for future deployment of Solr.
2019 StartIT - Boosting your performance with BlackfireMarko Mitranić
A workshop held in StartIT as part of Catena Media learning sessions.
We aim to dispel the notion that large PHP applications tend to be sluggish, resource-intensive and slow compared to what the likes of Python, Erlang or even Node can do. The issue is not with optimising PHP internals - it's the lack of proper introspection tools and getting them into our every day workflow that counts! In this workshop we will talk about our struggles with whipping PHP Applications into shape, as well as work together on some of the more interesting examples of CPU or IO drain.
Just a few years back, lack of a standard way to document, govern or describe a contract for the APIs acted as a deterrent to API adoption within the enterprise. WSDL 2.0 and WADL provided early support, but they couldn’t truly capture the essence of RESTful APIs. Recently we have seen the emergence of several description languages. New ways to describe and document APIs have emerged such as Swagger, RAML, API Blueprint and others, each taking a slightly different approach.
A ridiculously long presentation from IBM Connect 2013, formerly Lotusphere, from Rob Novak @IBMRockStar and Jerald Mahurin @SociallyCurious on the tools, language, and methods we used to transition from Domino, Quickr and overall web developers to becoming IBM Connections 4.0 developers. From the abstract:
With IBM Connections 4.0, IBM has released the most important new platform - yes platform - for social business development since the Notes client. As a Domino developer, you have excelled. Now, faced with an entire new glossary of terms, new concepts in customization and development, and a whole new set of tools, it could take some time to get up to speed. This session will help you cut weeks off that ramp-up time by showing you exactly what a Connections development environment looks like. We'll cover how to choose your tools and toolkits as well as configuration for development and testing. From the fundamentals of skill gap identification to real working samples, this session is sure to give you a huge head start.
BYO/DIY Analytics Platform (MeasureCamp Presentation by Clancy Childs)Clancy Childs
The slides accompanying Clancy Childs' talk at Measurecamp V (2014) in London. Might be missing a lot if you weren't at the session, but basically covering some of the design decisions, pitfalls, technology choices and requirements when choosing to build your own analytics / eventing platform and data warehouse.
Benchmarking Web Application Scanners for YOUR OrganizationDenim Group
Web applications pose significant risks for organizations. The selection of an appropriate scanning product or service can be challenging because every organization develops their web applications differently and decisions made by developers can cause wide swings in the value of different scanning technologies. To make a solid, informed decision, organizations need to create development team- and organization-specific benchmarks for the effectiveness of potential scanning technologies. This involves creating a comprehensive model of false positives, false negatives and other factors prior to mandating analysis technologies and making decisions about application risk management. This presentation provides a model for evaluating application analysis technologies, introduces an open source tool for benchmarking and comparing tool effectiveness, and outlines a process for making organization-specific decisions about analysis technology selection.
Text Classification Powered by Apache Mahout and Lucenelucenerevolution
Presented by Isabel Drost-Fromm, Software Developer, Apache Software Foundation/Nokia Gate 5 GmbH at Lucene/Solr Revolution 2013 Dublin
Text classification automates the task of filing documents into pre-defined categories based on a set of example documents. The first step in automating classification is to transform the documents to feature vectors. Though this step is highly domain specific Apache Mahout provides you with a lot of easy to use tooling to help you get started, most of which relies heavily on Apache Lucene for analysis, tokenisation and filtering. This session shows how to use facetting to quickly get an understanding of the fields in your document. It will walk you through the steps necessary to convert your text documents into feature vectors that Mahout classifiers can use including a few anecdotes on drafting domain specific features.
Configure
Presented by Markus Klose, Search + Big Data Consultant SHI Elektronische Medien GmbH at Lucene/Solr Revolution 2013 Dublin
Kibana4Solr is search-driven, scalable, browser based and extremely user friendly (also for non-technical users). Logs are everywhere. Any device, system or human can potentially produce a huge amount of information saved in logs. The amount of available logs and their semi-structured nature make a meaningful processing in real-time quite a difficult task. Thus, valuable business insights stored in logs might be not found. Kibana4Solr is a search-driven approach to handle that challenge. It offers user-friendly and browser-based dashboard which can be easily customized to particular needs. In the session the Kibana4Solr will be introduced. Some light will be shed on the architectural features of Kibana4Solr. Some ideas will be given in terms of possible business uses cases. And finally a live demo of Kibana4Solr will be shown.
Configure
More Related Content
Similar to The Seven Deadly Sins of Solr - By Jay Hill
Got data? Let's make it searchable! This interactive presentation will demonstrate getting documents into Solr quickly, will provide some tips in adjusting Solr's schema to match your needs better, and finally will discuss how showcase your data in a flexible search user interface. We'll see how to rapidly leverage faceting, highlighting, spell checking, and debugging. Even after all that, there will be enough time left to outline the next steps in developing your search application and taking it to production.
Cognitum Ontorion: Knowledge Representation and Reasoning SystemCognitum
Presentation from FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS which was held in Lodz, between 13 - 16 of September, 2015.
Pawel Kaplanski - CTO of Cognitum, presented an approach to hybrid agent architecture, which is implemented on top of a scalable Knowledge Representation & Reasoning (KRR) system.
To learn more, visit our website: http://www.cognitum.eu/
Get ready to lock and load through this quick overview of some of the newest most innovative, tools around. Source: http://www.takipiblog.com/7-new-tools-java-developers-should-know/
[CB19] Spyware, Ransomware and Worms. How to prevent the next SAP tragedy by ...CODE BLUE
On this presentation, we will raise awareness on how the SAP Internet facing systems are particularly vulnerable to Spyware, Ransomware and Worms due to their inherent complexity.
We will also introduce (for the first time in Asia ) the “Project ARSAP”. This project is a semi-automatic mechanism which main goal is to detect and register all the SAP systems that are exposed to the Internet, extracting the system’s metadata and cataloging the assets in base of their Geo-location, system type, version, installed components and potential risk of compromise.
We will present a brief introduction to SAP, defining its architecture / entry points and explain with great detail the methodology behind the “ARSAP” project.
Then, three different scenarios were malware could strike SAP will be showcased. We will start by recreating a real SAP cyber-attack, where a company got attacked via malicious emails and we will move forward to some other complex techniques that could allow anyone, directly from the Internet to compromise the whole Interfacing SAP system and jump to the adjacent network.
This presentation will have several live demos where the attendees will be able to observe the entire attack workflow. We will conclude the presentation by presenting some suggested remediations and conclusions.
You’ve taken your first steps into Node.js. You’ve learned how to initialize your projects, you’ve played with some dependencies, and you’re ready to get into some serious Node work. In this session, we’ll dive further into Node as a framework. We’ll learn how to master Node’s inherently asynchronous nature, take advantage of Node’s events and streams capabilities, and learn about sophisticated Node deployments at scale. Participants will leave with a richer understanding of what Node has to offer and higher confidence in dealing with some of Node’s more difficult concepts.
Data Engineer's Lunch 90: Migrating SQL Data with ArcionAnant Corporation
In Data Engineer's Lunch 90, Eric Ramseur teaches our audience how to use Arcion.
From best practices to real-world examples, this talk will provide you with the knowledge and insights you need to ensure a successful migration of your SQL data. So whether you're new to data migration or looking to improve your existing process, join us and discover how Arcion can help you achieve your goals.
Moved to https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmxMilen Dyankov
This slide deck will be removed from here in the future. It has been moved to : https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmx
My talk at the @media Ajax conference in London in November 2007 about the non-technical steps you can take to make JavaScript and Ajax work for larger teams.
* Open source search with Solr/Lucene gives you the power to turn a wide range of information into fast, useful, relevant results!
* LucidWorks for Solr gives you a tested, release-stable certified distribution of open source search with enhanced tools and installation for building search apps quickly and reliably.
http://www.lucidimagination.com/How-We-Can-Help/webinar-from-search-to-found
Attendees will learn how eBay Germany has implemented Solr, why Solr was selected, which Solr features are utilized. and how Solr is configured and used in production. Recommended best practices will be profiled alomng with eBay Kleinanzeigen plans for future deployment of Solr.
2019 StartIT - Boosting your performance with BlackfireMarko Mitranić
A workshop held in StartIT as part of Catena Media learning sessions.
We aim to dispel the notion that large PHP applications tend to be sluggish, resource-intensive and slow compared to what the likes of Python, Erlang or even Node can do. The issue is not with optimising PHP internals - it's the lack of proper introspection tools and getting them into our every day workflow that counts! In this workshop we will talk about our struggles with whipping PHP Applications into shape, as well as work together on some of the more interesting examples of CPU or IO drain.
Just a few years back, lack of a standard way to document, govern or describe a contract for the APIs acted as a deterrent to API adoption within the enterprise. WSDL 2.0 and WADL provided early support, but they couldn’t truly capture the essence of RESTful APIs. Recently we have seen the emergence of several description languages. New ways to describe and document APIs have emerged such as Swagger, RAML, API Blueprint and others, each taking a slightly different approach.
A ridiculously long presentation from IBM Connect 2013, formerly Lotusphere, from Rob Novak @IBMRockStar and Jerald Mahurin @SociallyCurious on the tools, language, and methods we used to transition from Domino, Quickr and overall web developers to becoming IBM Connections 4.0 developers. From the abstract:
With IBM Connections 4.0, IBM has released the most important new platform - yes platform - for social business development since the Notes client. As a Domino developer, you have excelled. Now, faced with an entire new glossary of terms, new concepts in customization and development, and a whole new set of tools, it could take some time to get up to speed. This session will help you cut weeks off that ramp-up time by showing you exactly what a Connections development environment looks like. We'll cover how to choose your tools and toolkits as well as configuration for development and testing. From the fundamentals of skill gap identification to real working samples, this session is sure to give you a huge head start.
BYO/DIY Analytics Platform (MeasureCamp Presentation by Clancy Childs)Clancy Childs
The slides accompanying Clancy Childs' talk at Measurecamp V (2014) in London. Might be missing a lot if you weren't at the session, but basically covering some of the design decisions, pitfalls, technology choices and requirements when choosing to build your own analytics / eventing platform and data warehouse.
Benchmarking Web Application Scanners for YOUR OrganizationDenim Group
Web applications pose significant risks for organizations. The selection of an appropriate scanning product or service can be challenging because every organization develops their web applications differently and decisions made by developers can cause wide swings in the value of different scanning technologies. To make a solid, informed decision, organizations need to create development team- and organization-specific benchmarks for the effectiveness of potential scanning technologies. This involves creating a comprehensive model of false positives, false negatives and other factors prior to mandating analysis technologies and making decisions about application risk management. This presentation provides a model for evaluating application analysis technologies, introduces an open source tool for benchmarking and comparing tool effectiveness, and outlines a process for making organization-specific decisions about analysis technology selection.
Similar to The Seven Deadly Sins of Solr - By Jay Hill (20)
Text Classification Powered by Apache Mahout and Lucenelucenerevolution
Presented by Isabel Drost-Fromm, Software Developer, Apache Software Foundation/Nokia Gate 5 GmbH at Lucene/Solr Revolution 2013 Dublin
Text classification automates the task of filing documents into pre-defined categories based on a set of example documents. The first step in automating classification is to transform the documents to feature vectors. Though this step is highly domain specific Apache Mahout provides you with a lot of easy to use tooling to help you get started, most of which relies heavily on Apache Lucene for analysis, tokenisation and filtering. This session shows how to use facetting to quickly get an understanding of the fields in your document. It will walk you through the steps necessary to convert your text documents into feature vectors that Mahout classifiers can use including a few anecdotes on drafting domain specific features.
Configure
Presented by Markus Klose, Search + Big Data Consultant SHI Elektronische Medien GmbH at Lucene/Solr Revolution 2013 Dublin
Kibana4Solr is search-driven, scalable, browser based and extremely user friendly (also for non-technical users). Logs are everywhere. Any device, system or human can potentially produce a huge amount of information saved in logs. The amount of available logs and their semi-structured nature make a meaningful processing in real-time quite a difficult task. Thus, valuable business insights stored in logs might be not found. Kibana4Solr is a search-driven approach to handle that challenge. It offers user-friendly and browser-based dashboard which can be easily customized to particular needs. In the session the Kibana4Solr will be introduced. Some light will be shed on the architectural features of Kibana4Solr. Some ideas will be given in terms of possible business uses cases. And finally a live demo of Kibana4Solr will be shown.
Configure
Building Client-side Search Applications with Solrlucenerevolution
Presented by Daniel Beach, Search Application Developer, OpenSource Connections
Solr is a powerful search engine, but creating a custom user interface can be daunting. In this fast paced session I will present an overview of how to implement a client-side search application using Solr. Using open-source frameworks like SpyGlass (to be released in September) can be a powerful way to jumpstart your development by giving you out-of-the box results views with support for faceting, autocomplete, and detail views. During this talk I will also demonstrate how we have built and deployed lightweight applications that are able to be performant under large user loads, with minimal server resources.
Integrate Solr with real-time stream processing applicationslucenerevolution
Presented by Timothy Potter, Founder, Text Centrix
Storm is a real-time distributed computation system used to process massive streams of data. Many organizations are turning to technologies like Storm to complement batch-oriented big data technologies, such as Hadoop, to deliver time-sensitive analytics at scale. This talk introduces on an emerging architectural pattern of integrating Solr and Storm to process big data in real time. There are a number of natural integration points between Solr and Storm, such as populating a Solr index or supplying data to Storm using Solr’s real-time get support. In this session, Timothy will cover the basic concepts of Storm, such as spouts and bolts. He’ll then provide examples of how to integrate Solr into Storm to perform large-scale indexing in near real-time. In addition, we'll see how to embed Solr in a Storm bolt to match incoming tuples against pre-configured queries, commonly known as percolator. Attendees will come away from this presentation with a good introduction to stream processing technologies and several real-world use cases of how to integrate Solr with Storm.
Configure your Solr cluster to handle hundreds of millions of documents without even noticing, handle queries in milliseconds, use Near Real Time indexing and searching with document versioning. Scale your cluster both horizontally and vertically by using shards and replicas. In this session you'll learn how to make your indexing process blazing fast and make your queries efficient even with large amounts of data in your collections. You'll also see how to optimize your queries to leverage caches as much as your deployment allows and how to observe your cluster with Solr administration panel, JMX, and third party tools. Finally, learn how to make changes to already deployed collections —split their shards and alter their schema by using Solr API.
Presented by Rafal Kuć, Consultant and Software engineer, , Sematext Group, Inc.
Even though Solr can run without causing any troubles for long periods of time it is very important to monitor and understand what is happening in your cluster. In this session you will learn how to use various tools to monitor how Solr is behaving at a high level, but also on Lucene, JVM, and operating system level. You'll see how to react to what you see and how to make changes to configuration, index structure and shards layout using Solr API. We will also discuss different performance metrics to which you ought to pay extra attention. Finally, you'll learn what to do when things go awry - we will share a few examples of troubleshooting and then dissect what was wrong and what had to be done to make things work again.
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiledlucenerevolution
In a recent project with the United States Patent and Trademark Office, Opensource Connections was asked to prototype the next generation of patent search - using Solr and Lucene. An important aspect of this project was the implementation of BRS, a specialized search syntax used by patent examiners during the examination process. In this fast paced session we will relate our experiences and describe how we used a combination of Parboiled (a Parser Expression Grammar [PEG] parser), Lucene Queries and SpanQueries, and an extension of Solr's QParserPlugin to build BRS search functionality in Solr. First we will characterize the patent search problem and then define the BRS syntax itself. We will then introduce the Parboiled parser and discuss various considerations that one must make when designing a syntax parser. Following this we will describe the methodology used to implement the search functionality in Lucene/Solr. Finally, we will include an overview our syntactic and semantic testing strategies. The audience will leave this session with an understanding of how Solr, Lucene, and Parboiled may be used to implement their own custom search parser.
Many of us tend to hate or simply ignore logs, and rightfully so: they’re typically hard to find, difficult to handle, and are cryptic to the human eye. But can we make logs more valuable and more usable if we index them in Solr, so we can search and run real-time statistics on them? Indeed we can, and in this session you’ll learn how to make that happen. In the first part of the session we’ll explain why centralized logging is important, what valuable information one can extract from logs, and we’ll introduce the leading tools from the logging ecosystems everyone should be aware of - from syslog and log4j to LogStash and Flume. In the second part we’ll teach you how to use these tools in tandem with Solr. We’ll show how to use Solr in a SolrCloud setup to index large volumes of logs continuously and efficiently. Then, we'll look at how to scale the Solr cluster as your data volume grows. Finally, we'll see how you can parse your unstructured logs and convert them to nicely structured Solr documents suitable for analytical queries.
Real-time Inverted Search in the Cloud Using Lucene and Stormlucenerevolution
Building real-time notification systems is often limited to basic filtering and pattern matching against incoming records. Allowing users to query incoming documents using Solr's full range of capabilities is much more powerful. In our environment we needed a way to allow for tens of thousands of such query subscriptions, meaning we needed to find a way to distribute the query processing in the cloud. By creating in-memory Lucene indices from our Solr configuration, we were able to parallelize our queries across our cluster. To achieve this distribution, we wrapped the processing in a Storm topology to provide a flexible way to scale and manage our infrastructure. This presentation will describe our experiences creating this distributed, real-time inverted search notification framework.
Solr's Admin UI - Where does the data come from?lucenerevolution
Like many Web-Applications in the past, the Solr Admin UI up until 4.0 was entirely server based. It used separate code on the server to generate their Dashboards, Overviews and Statistics. All that code had to be maintained and still ... you weren't really able to use that kind of data for the things you needed it for. It was wrapped into HTML, most of the time difficult to extract and changed the structure from time to time w/o announcement. After a short look back, we're going to look into the current state of the Solr Admin UI - a client-side application, running completely in your browser. We'll see how it works, where it gets its data from and how you can get the very same data and wire that into your own custom applications, dashboards and/oder monitoring systems.
Steve will show how and why to use Solr’s new Schemaless Mode, under which document indexing can be performed with no up-front schema configuration. Solr uses content clues to choose among a predefined set of field types and then automatically add previously unseen fields to the schema.
High Performance JSON Search and Relational Faceted Browsing with Lucenelucenerevolution
Presented by Renaud Delbru, Co-Founder, SindiceTech
In this presentation, we will discuss how Lucene and Solr can be used for very efficient search of tree-shaped schemaless document, e.g. JSON or XML, and can be then made to address both graph and relational data search. We will discuss the capabilities of SIREn, a Lucene/Solr plugin we have developed to deal with huge collections of tree-shaped schemaless documents, and how SIREn is built using Lucene extensibility capabilities (Analysis, Codec, Flexible Query Parser). We will compare it with Lucene's BlockJoin Query API in nested schemaless data intensive scenarios. We will then go through use cases that show how relational or graph data can be turned into JSON documents using Hadoop and Pig, and how this can be used in conjunction with SIREn to create relational faceting systems with unprecedented performance. Take-away lessons from this session will be awareness about using Lucene/Solr and Hadoop for relational and graph data search, as well as the awareness that it is now possible to have relational faceted browsers with sub-second response time on commodity hardware.
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMlucenerevolution
In this session we will show how to build a text classifier using the Apache Lucene/Solr with libSVM libraries. We classify our corpus of job offers into a number of predefined categories. Each indexed document (a job offer) then belongs to zero, one or more categories. Known machine learning techniques for text classification include naïve bayes model, logistic regression, neural network, support vector machine (SVM), etc. We use Lucene/Solr to construct the features vector. Then we use the libsvm library known as the reference implementation of the SVM model to classify the document. We construct as many one-vs-all svm classifiers as there are classes in our setting, then using the Hadoop MapReduce Framework we reconcile the result of our classifiers. The end result is a scalable multi-class classifier. Finally we outline how the classifier is used to enrich basic solr keyword search.
Faceted search is a powerful technique to let users easily navigate the search results. It can also be used to develop rich user interfaces, which give an analyst quick insights about the documents space. In this session I will introduce the Facets module, how to use it, under-the-hood details as well as optimizations and best practices. I will also describe advanced faceted search capabilities with Lucene Facets.
Presented by Shai Erera, Researcher, IBM
Lucene's arsenal has recently expanded to include two new modules: Index Sorting and Replication. Index sorting lets you keep an index consistently sorted based on some criteria (e.g. modification date). This allows for efficient search early-termination as well as achieve better index compression. Index replication lets you replicate a search index to achieve high-availability, fault tolerance as well as take hot index backups. In this talk we will introduce these modules, discuss implementation and design details as well as best practices.
As part of their work with large media monitoring companies, Flax has developed a technique for applying tens of thousands of stored Lucene queries to a document in under a second. We'll talk about how we built intelligent filters to reduce the number of actual queries applied and how we extended Lucene to extract the exact hit positions of matches, the challenges of implementation, and how it can be used, including applications that monitor hundreds of thousands of news stories every day.
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...lucenerevolution
Presented by Xavier Sanchez Loro, Ph.D, Trovit Search SL
This session aims to explain the implementation and use case for spellchecking in Trovit search engine. Trovit is a classified ads search engine supporting several different sites, one for each on country and vertical. Our search engine supports multiple indexes in multiple languages, each with several millions of indexed ads. Those indexes are segmented in several different sites depending on the type of ads (homes, cars, rentals, products, jobs and deals). We have developed a multi-language spellchecking system using solr and lucene in order to help our users to better find the desired ads and avoid the dreaded 0 results as much as possible. As such our goal is not pure orthographic correction, but also suggestion of correct searches for a certain site.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.