Solr metrics allow monitoring of Solr system behavior and performance optimization. Key aspects include:
- Solr collects metrics using the Dropwizard Metrics framework, organizing data into registries by component like JVM, core, query handler, etc.
- Metrics include counters, timers, histograms and gauges capturing values like request rates, cache sizes, commit times.
- The /admin/metrics handler provides access to metrics from all registries through flexible selection criteria.
- Configuration in solr.xml specifies metric reporters like JMX, Ganglia, Graphite to export data to external systems.
Hive on spark is blazing fast or is it finalHortonworks
This presentation was given at the Strata + Hadoop World, 2015 in San Jose.
Apache Hive is the most popular and most widely used SQL solution for Hadoop. To keep pace with Hadoop’s increasingly vital role in the Enterprise, Hive has transformed from a batch-only, high-latency system into a modern SQL engine capable of both batch and interactive queries over large datasets. Hive’s momentum is accelerating: With Spark integration and a shift to in-memory processing on the horizon, Hive continues to expand the boundaries of Big Data.
In this talk the speakers examined Hive performance, past, present and future. In particular they looked at Hive’s origins as a petabyte scale SQL engine.
Through some numbers and graphs, they showed how Hive became 100x faster by moving beyond MapReduce, by vectorizing execution and by introducing a cost-based optimizer.
They detailed and discussed the challenges of scalable SQL on Hadoop.
The looked into Hive’s sub-second future, powered by LLAP and Hive on Spark.
And showed just how fast Hive on Spark really is.
This presentation shortly describes key features of Apache Cassandra. It was held at the Apache Cassandra Meetup in Vienna in January 2014. You can access the meetup here: http://www.meetup.com/Vienna-Cassandra-Users/
Understanding of Apache kafka metrics for monitoring SANG WON PARK
2019 kafka conference seould에서 발표한 "Apache Kafka 모니터링을 위한 Metrics 이해" 슬라이드 자료
기존 2018년 자료에서 모니터링 관점에서 중요한 metrcis를 중심으로 정리하였고, 2019년 기준으로 추가/변경된 metrics를 반영하였다.
주용 내용은
- 업무에 최적화된 apache kafka 모니터링을 하려면?
- 어떤 정보를 모니터링 해야 할까?
- 적시성 관점의 모니터링 지표 (TotalTimeMs에 대한 세부 구조 이해)
- 안정성 관점의 모니터링 지표 (데이터 유실이 없이 중단없는 서비스)
- 언제 apache kafka 클러스터를 확장해야 할까? (어떤 지표를 봐야 할까?)
위 모든 지표는 producer/broker/consumer 3가지 측면에서 검토하였다.
컨퍼런스 영상 링크 : https://www.youtube.com/watch?v=p2RGsTOCHAg
Presented by Adrien Grand, Software Engineer, Elasticsearch
Although people usually come to Lucene and related solutions in order to make data searchable, they often realize that it can do much more for them. Indeed, its ability to handle high loads of complex queries make Lucene a perfect fit for analytics applications and, for some use-cases, even a credible replacement for a primary data-store. It is important to understand the design decisions behind Lucene in order to better understand the problems it can solve and the problems it cannot solve. This talk will explain the design decisions behind Lucene, give insights into how Lucene stores data on disk and how it differs from traditional databases. Finally, there will be highlights of recent and future changes in Lucene index file formats.
Hive on spark is blazing fast or is it finalHortonworks
This presentation was given at the Strata + Hadoop World, 2015 in San Jose.
Apache Hive is the most popular and most widely used SQL solution for Hadoop. To keep pace with Hadoop’s increasingly vital role in the Enterprise, Hive has transformed from a batch-only, high-latency system into a modern SQL engine capable of both batch and interactive queries over large datasets. Hive’s momentum is accelerating: With Spark integration and a shift to in-memory processing on the horizon, Hive continues to expand the boundaries of Big Data.
In this talk the speakers examined Hive performance, past, present and future. In particular they looked at Hive’s origins as a petabyte scale SQL engine.
Through some numbers and graphs, they showed how Hive became 100x faster by moving beyond MapReduce, by vectorizing execution and by introducing a cost-based optimizer.
They detailed and discussed the challenges of scalable SQL on Hadoop.
The looked into Hive’s sub-second future, powered by LLAP and Hive on Spark.
And showed just how fast Hive on Spark really is.
This presentation shortly describes key features of Apache Cassandra. It was held at the Apache Cassandra Meetup in Vienna in January 2014. You can access the meetup here: http://www.meetup.com/Vienna-Cassandra-Users/
Understanding of Apache kafka metrics for monitoring SANG WON PARK
2019 kafka conference seould에서 발표한 "Apache Kafka 모니터링을 위한 Metrics 이해" 슬라이드 자료
기존 2018년 자료에서 모니터링 관점에서 중요한 metrcis를 중심으로 정리하였고, 2019년 기준으로 추가/변경된 metrics를 반영하였다.
주용 내용은
- 업무에 최적화된 apache kafka 모니터링을 하려면?
- 어떤 정보를 모니터링 해야 할까?
- 적시성 관점의 모니터링 지표 (TotalTimeMs에 대한 세부 구조 이해)
- 안정성 관점의 모니터링 지표 (데이터 유실이 없이 중단없는 서비스)
- 언제 apache kafka 클러스터를 확장해야 할까? (어떤 지표를 봐야 할까?)
위 모든 지표는 producer/broker/consumer 3가지 측면에서 검토하였다.
컨퍼런스 영상 링크 : https://www.youtube.com/watch?v=p2RGsTOCHAg
Presented by Adrien Grand, Software Engineer, Elasticsearch
Although people usually come to Lucene and related solutions in order to make data searchable, they often realize that it can do much more for them. Indeed, its ability to handle high loads of complex queries make Lucene a perfect fit for analytics applications and, for some use-cases, even a credible replacement for a primary data-store. It is important to understand the design decisions behind Lucene in order to better understand the problems it can solve and the problems it cannot solve. This talk will explain the design decisions behind Lucene, give insights into how Lucene stores data on disk and how it differs from traditional databases. Finally, there will be highlights of recent and future changes in Lucene index file formats.
Presto on Apache Spark: A Tale of Two Computation EnginesDatabricks
The architectural tradeoffs between the map/reduce paradigm and parallel databases has been a long and open discussion since the dawn of MapReduce over more than a decade ago. At Facebook, we have spent the past several years in independently building and scaling both Presto and Spark to Facebook scale batch workloads, and it is now increasingly evident that there is significant value in coupling Presto’s state-of-art low-latency evaluation with Spark’s robust and fault tolerant execution engine.
G1 Garbage Collector: Details and TuningSimone Bordet
The G1 Garbage Collector is the low-pause replacement for CMS, available since JDK 7u4. This session will explore in details how the G1 garbage collector works (from region layout, to remembered sets, to refinement threads, etc.), what are its current weaknesses, what command line options you should enable when you use it, and advanced tuning examples extracted from real world applications.
Presentation at Strata Data Conference 2018, New York
The controller is the brain of Apache Kafka. A big part of what the controller does is to maintain the consistency of the replicas and determine which replica can be used to serve the clients, especially during individual broker failure.
Jun Rao outlines the main data flow in the controller—in particular, when a broker fails, how the controller automatically promotes another replica as the leader to serve the clients, and when a broker is started, how the controller resumes the replication pipeline in the restarted broker.
Jun then describes recent improvements to the controller that allow it to handle certain edge cases correctly and increase its performance, which allows for more partitions in a Kafka cluster.
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...Monica Beckwith
Learn what you need to know to experience nirvana in the evaluation of G1 GC even if your are migrating from Parallel GC to G1, or CMS GC to G1 GC
You also get a walk through of some case study data
G1 GC
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinDataStax Academy
You know you need Cassandra for it's uptime and scaling, but what about that data model? Let's bridge that gap and get you building your game changing app. We'll break down topics like storing objects and indexing for fast retrieval. You will see by understanding a few things about Cassandra internals, you can put your data model in the spotlight. The goal of this talk is to get you comfortable working with data in Cassandra throughout the application lifecycle. What are you waiting for? The cameras are waiting!
Staying Ahead of the Curve with Spring and Cassandra 4VMware Tanzu
SpringOne 2020
Staying Ahead of the Curve with Spring and Cassandra 4
Mark Paluch, Spring Data Project Lead at VMware
Alexandre Dutra, Technical Manager at DataStax
This is the presentation I made on JavaDay Kiev 2015 regarding the architecture of Apache Spark. It covers the memory model, the shuffle implementations, data frames and some other high-level staff and can be used as an introduction to Apache Spark
Presto on Apache Spark: A Tale of Two Computation EnginesDatabricks
The architectural tradeoffs between the map/reduce paradigm and parallel databases has been a long and open discussion since the dawn of MapReduce over more than a decade ago. At Facebook, we have spent the past several years in independently building and scaling both Presto and Spark to Facebook scale batch workloads, and it is now increasingly evident that there is significant value in coupling Presto’s state-of-art low-latency evaluation with Spark’s robust and fault tolerant execution engine.
G1 Garbage Collector: Details and TuningSimone Bordet
The G1 Garbage Collector is the low-pause replacement for CMS, available since JDK 7u4. This session will explore in details how the G1 garbage collector works (from region layout, to remembered sets, to refinement threads, etc.), what are its current weaknesses, what command line options you should enable when you use it, and advanced tuning examples extracted from real world applications.
Presentation at Strata Data Conference 2018, New York
The controller is the brain of Apache Kafka. A big part of what the controller does is to maintain the consistency of the replicas and determine which replica can be used to serve the clients, especially during individual broker failure.
Jun Rao outlines the main data flow in the controller—in particular, when a broker fails, how the controller automatically promotes another replica as the leader to serve the clients, and when a broker is started, how the controller resumes the replication pipeline in the restarted broker.
Jun then describes recent improvements to the controller that allow it to handle certain edge cases correctly and increase its performance, which allows for more partitions in a Kafka cluster.
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...Monica Beckwith
Learn what you need to know to experience nirvana in the evaluation of G1 GC even if your are migrating from Parallel GC to G1, or CMS GC to G1 GC
You also get a walk through of some case study data
G1 GC
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinDataStax Academy
You know you need Cassandra for it's uptime and scaling, but what about that data model? Let's bridge that gap and get you building your game changing app. We'll break down topics like storing objects and indexing for fast retrieval. You will see by understanding a few things about Cassandra internals, you can put your data model in the spotlight. The goal of this talk is to get you comfortable working with data in Cassandra throughout the application lifecycle. What are you waiting for? The cameras are waiting!
Staying Ahead of the Curve with Spring and Cassandra 4VMware Tanzu
SpringOne 2020
Staying Ahead of the Curve with Spring and Cassandra 4
Mark Paluch, Spring Data Project Lead at VMware
Alexandre Dutra, Technical Manager at DataStax
This is the presentation I made on JavaDay Kiev 2015 regarding the architecture of Apache Spark. It covers the memory model, the shuffle implementations, data frames and some other high-level staff and can be used as an introduction to Apache Spark
Organizations continue to adopt Solr because of its ability to scale to meet even the most demanding workflows. Recently, LucidWorks has been leading the effort to identify, measure, and expand the limits of Solr. As part of this effort, we've learned a few things along the way that should prove useful for any organization wanting to scale Solr. Attendees will come away with a better understanding of how sharding and replication impact performance. Also, no benchmark is useful without being repeatable; Tim will also cover how to perform similar tests using the Solr-Scale-Toolkit in Amazon EC2.
Solr Exchange: Introduction to SolrCloudthelabdude
SolrCloud is a set of features in Apache Solr that enable elastic scaling of search indexes using sharding and replication. In this presentation, Tim Potter will provide an architectural overview of SolrCloud and highlight its most important features. Specifically, Tim covers topics such as: sharding, replication, ZooKeeper fundamentals, leaders/replicas, and failure/recovery scenarios. Any discussion of a complex distributed system would not be complete without a discussion of the CAP theorem. Mr. Potter will describe why Solr is considered a CP system and how that impacts the design of a search application.
Activate '18 talk: Lucene and Solr 8 will be released near the end of 2018. This session will explore new features, performance improvements and breaking changes.
Solr Compute Cloud - An Elastic SolrCloud Infrastructure Nitin S
Scaling search platforms for serving hundreds of millions of documents with low latency and high throughput workloads at an optimized cost is an extremely hard problem. BloomReach has implemented Sc2, which is an elastic Solr infrastructure for Big Data applications, supporting heterogeneous workloads and hosted in the cloud. It dynamically grows/shrinks search servers to provide application and pipeline level isolation, NRT search and indexing, latency guarantees, and application-specific performance tuning. In addition, it provides various high availability features such as differential real-time streaming, disaster recovery, context aware replication, and automatic shard and replica rebalancing, all with a zero downtime guarantee for all consumers. This infrastructure currently serves hundreds of millions of documents in millisecond response times with a load ranging in the order of 200-300K QPS.
This presentation will describe an innovate implementation of scaling Solr in an elastic fashion. It will review the architecture and take a deep dive into how each of these components interact to make the infrastructure truly elastic, real time, and robust while serving latency needs.
MeetUp Monitoring with Prometheus and Grafana (September 2018)Lucas Jellema
This presentation introduces the concept of monitoring - focusing on why and how and finally on the tools to use. It introduces Prometheus (metrics gathering, processing, alerting), application instrumentation and Prometheus exporters and finally it introduces Grafana as a common companion for dashboarding, alerting and notifications. This presentations also introduces the handson workshop - for which materials are available from https://github.com/lucasjellema/monitoring-workshop-prometheus-grafana
Search is the Tip of the Spear for Your B2B eCommerce StrategyLucidworks
With ecommerce experiencing explosive growth, it seems intuitive that the B2B segment of that ecosystem is mirroring the same trajectory. That said, B2B has very different needs when it comes to transacting with the same style of experiences that we see in B2C. For instance, B2B ecommerce is about precision findability, whereas B2C customers can convert at higher rates when they’re just browsing online. In order for the B2B buying experience to be successful, search needs to be tuned to meet the unique needs of the segment.
In this webinar with Forrester senior analyst Joe Cicman, you’ll learn:
-Which verticals in B2B will drive the most growth, and how machine-learning powered personalization tactics can be deployed to support those specific verticals
-Why an omnichannel selling approach must be deployed in order to see success in B2B
-How deploying content search capabilities will support a longer sales cycle at scale
-What the next steps are to support a robust B2B commerce strategy supported by new technology
Speakers
Joe Cicman, Senior Analyst, Forrester
Jenny Gomez, VP of Marketing, Lucidworks
Customer loyalty starts with quickly responding to your customer’s needs. When it comes to resolving open support cases, time is of the essence. Time spent searching for answers adds up and creates inefficiencies in resolving cases at scale. Relevant answers need to be a few clicks away and easily accessible for agents directly from their service console.
We will explore how Lucidworks’ Agent Insights application automatically connects agents with the correct answers and resources. You’ll learn how to:
-Configure a proactive widget in an agent’s case view page to access resources across third-party systems (such as Sharepoint, Confluence, JIRA, Zendesk, and ServiceNow).
-Easily set up query pipelines to autonomously route assets and resources that are relevant to the case-at-hand—directly to the right agent.
-Identify subject matter experts within your support data and access tribal knowledge with lightning-fast speed.
How Crate & Barrel Connects Shoppers with Relevant ProductsLucidworks
Lunch and Learn during Retail TouchPoints #RIC21 virtual event.
***
Crate & Barrel’s previous search solution couldn’t provide its shoppers with an online search and browse experience consistent with the customer-centric Crate & Barrel brand. Meanwhile, Crate & Barrel merchandisers spent the bulk of their time manually creating and maintaining search rules. The search experience impacted customer retention, loyalty, and revenue growth.
Join this lunch & learn for an interactive chat on how Crate & Barrel partnered with Lucidworks to:
-Improve search and browse by modernizing the technology stack with ML-based personalization and merchandising solutions
-Enhance the experience for both shoppers and merchandisers
-Explore signals to transform the omnichannel shopping experience
Questions? Visit https://lucidworks.com/contact/
Learn how to guide customers to relevant products using eCommerce search, hyper-personalisation, and recommendations in our ‘Best-In-Class Retail Product Discovery’ webinar.
Nowadays, shoppers want their online experience to be engaging, inspirational and fulfilling. They want to find what they’re looking for quickly and easily. If the sought after item isn’t available, they want the next best product or content surfaced to them. They want a website to understand their goals as though they were talking to a sales assistant in person, in-store.
In this webinar, we explore IMRG industry data insights and a best-in-class example of retail product discovery. You’ll learn:
- How AI can drive increased revenue through hyper-personalised experiences
- How user intent can be easily understood and results displayed immediately
- How merchandisers can be empowered to curate results and product placement – all without having to rely on IT.
Presented by:
Dave Hawkins, Principal Sales Engineer - Lucidworks
Matthew Walsh, Director of Data & Retail - IMRG
Connected Experiences Are Personalized ExperiencesLucidworks
Many companies claim personalization and omnichannel capabilities are top priorities. Few are able to deliver on those experiences.
For a recent Lucidworks-commissioned study, Forrester Consulting surveyed 350+ global business decision-makers to see what gets in the way of achieving these goals. They discovered that inefficient technology, lack of behavioral insights, and failure to tie initiatives to enterprise-wide goals are some of the most frequent blockers to personalization success.
Join guest speaker, Forrester VP and Principal Analyst, Brendan Witcher, and Lucidworks CEO, Will Hayes, to hear the results of the Forrester Consulting study, how to avoid “digital blindness,” and how to apply VoC data in real-time to delight customers with personalized experiences connected across every touchpoint.
In this webinar, you’ll learn:
- Why companies who utilize real-time customer signals report more effective personalization
- How to connect employees and customers in a shared experience through search and browse
- How Lucidworks clients Lenovo, Morgan Stanley and Red Hat fast-tracked improvements in conversion, engagement and customer satisfaction
Featuring
- Will Hayes, CEO, Lucidworks
- Brendan Witcher, VP, Principal Analyst, Forrester
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Lucidworks
Intelligent Policing. Leveraging Data to more effectively Serve Communities.
Policing in the next decade is anticipated to be very different from historical methods. More data driven, more focused on the intricacies of communities they serve and more open and collaborative to make informed recommendations a reality. Whether its social populations, NIBRS or organization improvement that’s the driver, the IT requirement is largely the same. Provide 360 access to large volumes of siloed data to gain a full 360 understanding of existing connections and patterns for improved insight and recommendation.
Join us for a round table discussion of how the Toronto Police Service is better serving their community through deploying a unified intelligent data platform.
Data innovation improves officers' engagement with existing data and streamlines investigation workflows by enhancing collaboration. This improved visibility into existing police data allows for a more intelligent and responsive police force.
In this webinar, we'll cover:
-The technology needs of an intelligent police force.
-How a Global Search improves an officer's interaction with existing data.
Featuring:
-Simon Taylor, VP, Worldwide Channels & Alliances, Lucidworks
-Michael Cizmar, Managing Director, MC+A
-Ian Williams, Manager of Analytics & Innovation, Toronto Police Service
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...Lucidworks
Policing in the next decade is anticipated to be very different from historical methods. More data driven, more focused on the intricacies of communities they serve and more open and collaborative to make informed recommendations a reality. Whether its social populations, NIBRS or organization improvement that’s the driver, the IT requirement is largely the same. Provide 360 access to large volumes of siloed data to gain a full 360 understanding of existing connections and patterns for improved insight and recommendation.
Join us for a round table discussion of how the Toronto Police Service is better serving their community through deploying a unified intelligent data platform.
Data innovation improves officers' engagement with existing data and streamlines investigation workflows by enhancing collaboration. This improved visibility into existing police data allows for a more intelligent and responsive police force.
In this webinar, we'll cover:
The technology needs of an intelligent police force.
How a Global Search improves an officer's interaction with existing data.
Featuring
-Simon Taylor, VP, Worldwide Channels & Alliances, Lucidworks
-Michael Cizmar, Managing Director, MC+A
-Ian Williams, Manager of Analytics & Innovation, Toronto Police Service
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Lucidworks
Wish your conversion rates were higher? Can’t figure out how to efficiently and effectively serve all the visitors on your site? Embarrassed by the quality of your product discovery experience? The bar is high and the influx of online shopping over recent months has reminded us that the opportunities are real. We’re all deep in holiday prep, but let’s take a few minutes to think about January 2021 and beyond. How can we position ourselves for success with our customers and against our competition?
Grab your lunch and let’s dive into three strategies that need to be part of your 2021 roadmap. You don’t need an army to get there. But you do need to take action and capitalize on the shoppers abandoning the product discovery journey on your site.
In this session, attendees will find out how to:
-Take control of merchandising at scale;
-Implement hands-free search relevancy; and
-Address personalization challenges.
AI-Powered Linguistics and Search with Fusion and RosetteLucidworks
For a personalized search experience, search curation requires robust text interpretation, data enrichment, relevancy tuning and recommendations. In order to achieve this, language and entity identification are crucial.
For teams working on search applications, advanced language packages allow them to achieve greater recall without sacrificing precision.
Join us for a guided tour of our new Advanced Linguistics packages, available in Fusion, thanks to the technology partnership between Lucidworks and Basistech.
We’ll explore the application of language identification and entity extraction in the context of search, along with practical examples of personalizing search and enhancing entity extraction.
In this webinar, we’ll cover:
-How Fusion uses the Rosette Basic Linguistics and Entity Extraction packages
-Tips for improving language identification and treatment as well as data enrichment for personalization
-Speech2 demo modeling Active Recommendation
-Use Rosette’s packages with Fusion Pipelines to build custom entities for specific domain use cases
Featuring:
-Radu Miclaus, Director of Product, AI and Cloud, Lucidworks, Lucidworks
-Robert Lucarini, Senior Software Engineer, Lucidworks
-Nick Belanger, Solutions Engineer, Basis Technology
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentLucidworks
Before COVID-19, almost 80% of the US workforce worked service in jobs that involve in-person interaction with strangers. Now, leaders of service organizations must reshape their offerings during the pandemic and prepare for whatever the new normal turns out to be. Our three panelists will share ideas for adapting their service businesses, now that closer-than-six-feet isn’t an option.
Join Lucidworks as we talk shop with 3 service business leaders, covering:
-Common impacts of the pandemic on service businesses (and what to do about them),
-How service teams can maintain a human touch across virtual channels, and
-Plans for the future, before and after the pandemic subsides.
Featuring
-Sara Nathan, President & CEO, AMIGOS
-Anthony Carruesco, Founder, AC Fly Fishing
-sara bradley, chef and proprietor, freight house
-Justin Sears, VP Product Marketing, Lucidworks
Webinar: Smart answers for employee and customer support after covid 19 - EuropeLucidworks
The COVID-19 pandemic has forced companies to support far more customers and employees through digital channels than ever before. Many are turning to chatbots to help meet increasing demand, but traditional rules-based approaches can’t keep up. Our new Smart Answers add-on to Lucidworks Fusion makes existing chatbots and virtual assistants more intelligent and more valuable to the people you serve.
Smart Answers for Employee and Customer Support After COVID-19Lucidworks
Watch our on-demand webinar showcasing Smart Answers on Lucidworks Fusion. This technology makes existing chatbots and virtual assistants more intelligent and more valuable to the people you serve.
In this webinar, we’ll cover off:
-How search and deep learning extend conversational frameworks for improved experiences
-How Smart Answers improves customer care, call deflection, and employee self-service
-A live demo of Smart Answers for multi-channel self-service support
Applying AI & Search in Europe - featuring 451 ResearchLucidworks
In the current climate, it’s now more important than ever to digitally enable your workforce and customers.
Hear from Simon Taylor, VP Global Partners & Alliances, Lucidworks and Matt Aslett, Research Vice President, 451 Research to get the inside scoop on how industry leaders in Europe are developing and executing their digital transformation strategies.
In this webinar, we’ll discuss:
The top challenges and aspirations European business and technology leaders are solving using AI and search technology
Which search and AI use cases are making the biggest impact in industries such as finance, healthcare, retail and energy in Europe
What technology buyers should look for when evaluating AI and search solutions
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyLucidworks
In this webinar with 451 Research, you'll understand how retailers are using AI to predict customer intent and learn which key performance metrics are used by more than 120 online retailers in Lucidworks’ 2019 Retail Benchmark Survey.
In this webinar, you’ll learn:
● What trends and opportunities are facing the ecommerce industry in 2020
● Why search is the universal path to understanding customer intent
● How large online retailers apply AI to maximize the effectiveness of their personalization efforts
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Lucidworks
Nordstrom Rack | Hautelook curates and serves customers a wide selection of on-trend apparel, accessories, and shoes at an everyday savings of up to 75 percent off regular prices. With over a million visitors shopping across different platforms every day, and a realization that customers have become accustomed to robust and personalized search interactions, Nordstrom Rack | Hautelook launched an initiative over a year ago to provide data science-driven digital experiences to their customers.
In this session, we’ll discuss Nordstrom Rack | Hautelook’s journey of operationalizing a hefty strategy, optimizing a fickle infrastructure, and rallying troops around a single vision of building an expansible machine-learning driven product discovery engine.
The audience will learn about:
-The key technical challenges and outcomes that come with onboarding a solution
-The lessons learned of creating and executing operational design
-The use of Lucidworks Fusion to plug custom data science models into search and browse applications to understand user intent and deliver personalized experiences
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceLucidworks
Knowledge graphs and machine learning are on the rise as enterprises hunt for more effective ways to connect the dots between the data and the business world. With newer technologies, the digital workplace can dramatically improve employee engagement, data-driven decisions, and actions that serve tangible business objectives.
In this webinar, you will learn
-- Introduction to knowledge graphs and where they fit in the ML landscape
-- How breakthroughs in search affect your business
-- The key features to consider when choosing a data discovery platform
-- Best practices for adopting AI-powered search, with real-world examples
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
2. 2
whoami
• Lucene / Solr user, contributor, commi2er
• Author of Luke – The Index Toolbox
• Lucidworks Fusion developer
3. 3
Agenda
• MoAvaAon – why Solr needs metrics
• Design – how Solr collects metrics and what data is being collected
• ImplementaAon – key components
• ConfiguraAon – what to collect and how to report it
• Examples of metrics and integraAons with external systems
• Future development
5. 5
Why metrics?
• DevOps need tools for monitoring the system behavior
• ProducAon troubleshooAng, eg. FD leaks, outlier requests
- Profiling is not an opAon in producAon deployments
• OpAmizaAon of the system on various levels
- OS, collecAon, node, shard, doc rouAng, …
• Especially useful in locked-‐down deployments
6. 6
JIRA
• Long-‐standing request – first created in 2013!
• Many contributors (and watchers!)
• “metrics” JIRA component
- Containing now approx. 60 issues
• Key JIRAs:
- SOLR-‐4735 iniAal framework
- SOLR-‐9812 /admin/metrics handler
• First released in Solr 6.4.0
- With important bug fixes released in 6.4.2
8. 8
Dropwizard Metrics
• High-‐performance lightweight metrics framework
• Metric types
- counter: monotonically increasing counter
- number of processed docs
- meter: counter + moving average (rate), 1-‐, 5-‐ and 15-‐minute
- system load average, rate of requests
- histogram: histogram of values (exponenAally decaying by default)
- result sizes, IO read sizes
- Amer: meter and histogram of event duraAons
- commit Ames, query Ames, request Ames
- gauge: instantaneous reading of a value
- current heap size, number of cores, TLOG buffer size
9. 9
Where the data is collected from?
• JVM metrics
- GC, heap, threads, class loading, OS load / mem / FDs, etc
• Je2y / HTTP metrics
- connecAons, thread pools, …
• Container metrics
- number of cores, data paths, admin handler metrics
• Per-‐SolrCore metrics
- All RequestHandler-‐s: request counters and Amers
- Searcher and cache stats
- ReplicaAon
- Index-‐level Amers and histograms
- Other components
• SolrCloud metrics (opAonal)
- Aggregated from SolrCloud nodes
JVM
Je&y
/
HTTP
CoreContainer
SolrCore
…
Components
Solr instance
10. 10
Registries
• Metric groups for each major aspect of a Solr instance:
- jvm, jetty, node (CoreContainer), core (SolrCore)
- see SolrInfoBean.Group
• One registry per group, and one for each SolrCore
- Easier to manage core metrics throughout core life-‐cycle
• No persistence across node restarts
- SolrCore metrics persist across core reloads
solr.jvm
solr.node
solr.core.collec=on1
11. 11
Registry names
• Hierarchical, dot-‐separated
• Always prefixed with solr.
• Overridable using System properAes:
-Dsolr.core.collection1=solr.myCollection
- This is useful eg. to collapse per-‐replica registries into one registry with aggregated
metrics
• SolrCloud “core” registry name example:
SolrCore name: collection1_shard1_replica_n3
Registry name: solr.core.collection1.shard1.replica_n3
solr.jvm
solr.node
solr.core.collec=on1
12. 12
Metric names
• Hierarchical dot-‐separated
• By convenAon names start with component category
- eg. CONTAINER, CORE, QUERY …
- see SolrInfoBean.Category
• Request handler metrics follow this naming:
<category> . <handler name or scope> . <metric name>
• Examples:
QUERY./select.requestTimes
UPDATE.updateShardHandler.threadPool.recoveryExecutor.completed
solr.jvm
solr.node
solr.core.collec=on1
CORE.fs.totalSpace
SEARCHER.new
QUERY./get.requests
CACHE.core.fieldCache
15. 15
Components
• SolrMetricManager
- One central component to manage registries and reporters
• MetricRegistry
(Dropwizard API)
- Type-‐safe Map keeping related metric instances and their names
• SolrMetricProducer (interface)
- Creates and registers metric instances
- Many exisAng Solr components now implement this interface
• SolrMetricReporter (abstract class)
- Reports collected metrics to external agents and/or files
- Several implementaAons available out of the box
• MetricsHandler (at /admin/metrics)
- Provides access to all local metric registries
17. 17
/admin/metrics handler
• Shows metrics from all or selected registries
• Flexible selecAon criteria:
- registry by: group (e.g. jetty, node), or registry name (e.g.
solr.core.collection1)
- filter metrics by a list of prefixes (or regexes)
- retrieve only some properAes using property parameter
- Retrieve single metrics / properAes using fully-‐qualified key (7.1)
21. 21
Metrics vs. Solr 6.x MBeans
• Naming of categories and groups has slightly changed
- More fiqng categories for some components
• Solr 6.x sAll uses independent implementaAons for Metrics and MBeans
- Some staAsAcs are either unavailable in each API or reported differently
• Solr 7.x uses only Metrics API to report MBean stats
- This includes also nested and non-‐numeric values
- <jmx> element in solrconfig.xml is no longer supported – instead use
SolrJMXReporter in solr.xml
- AutomaAcally added if missing and when an MBeanServer is detected
24. 24
Metrics collection
• Already happening J
- Minimal overhead, in the order of μs/req and < 0.5 MB / core
• New secAon in solr.xml: <solr><metrics>
- Reporter configuraAon
- Custom metric implementaAons
- Some debug configuraAon
- Detailed histograms of index and TLOG processing Ames, per core
25. 25
Reporters
• Extend SolrMetricReporter
• Configured in solr.xml <solr><metrics><reporter>
• Several implementaAons provided:
- JMX: fully hierarchical view in e.g. JConsole
- Ganglia, Graphite, SLF4J: send periodic reports of selected metrics
- Easy API – create new ones!
- h2ps://github.com/vthacker/solr-‐metrics-‐influxdb
• Created for each selected registry, using group and/or registry list a2ributes
- If neither is present the reporter is created for all registries
26. 26
Reporter configuration details
• Required a2ributes: name (unique per registry), class (FQCN)
• OpAonal a2ributes:
- group – comma-‐separated list of registry groups, eg. core,jvm
- registry – comma-‐separated list of registry prefixes, eg.
solr.node,solr.core.coll
• OpAonal initArgs:
- filter -‐ report only metrics with that prefix, e.g. QUERY./select
- period -‐ how oven metrics will be reported, in seconds
- ... other, depending on implementaAon, e.g. logger name for SLF4j
* NOTE: for a given configuraAon, separate reporter instances are created for each matching registry
28. 28
Advanced configuration
• LimitaAons of default histogram / Amer in Dropwizard Metrics
- Uses ExponenAallyDecayingReservoir (EDR) sampling BUT assumes
normal distribuAon
- If distribuAon is skewed then rare outliers may never be captured or retained long enough
- EDR is tuned to prefer last 5 minutes of data – but keeps only 1028
random samples
- May LOSE criAcal min / max / percenAle data under higher rate of
updates
- May report obsolete values to snapshot because retained data is
replaced randomly
- Internal values are “decayed” only during updates – no updates means
values are stuck!
≠
29. 29
Advanced configuration
• Custom parameters and implementaAons for metrics
- <solr><metrics><suppliers> secAon in solr.xml
- Users can provide their own implementaAons of counters, meters, Amers and
histograms
• Solving the issue with EDR
- Use different reservoir size, or different reservoir implementaAon
- Several other implementaAons available, with tradeoffs, eg. SlidingTimeWindowReservoir
- Use your own histogram implementaAons
- h2p://github.com/vladimir-‐bukhtoyarov/rolling-‐metrics
* NOTE: metric reporters retrieve metric snapshots concurrently and at arbitrary :mes,
DO NOT use implementa:ons that reset to 0 a>er each snapshot!
31. 31
SolrCloud metrics (7.x)
• Shard metrics
- Reported from replicas to shard leaders
• Node metrics
- Reported from mulAple registries on each node to Overseer
• ParAally aggregated (simple sum, avg, mean, stddev, string lists)
- Some aggregaAons wouldn’t make sense, eg. Histograms
• AutomaAcally collected by /metrics/collector handler
• Configured in solr.xml using special shard and cluster groups
39. 39
Metrics in 7.x
• Adding more configurability
• Be2er defaults for reservoirs
• Autoscaling framework
- Autoscaling acAons are largely based on metrics, eg.
- freedisk, sysLoadAvg, cores, heapUsage, system properAes
- May use any metric value eg. metrics:solr.node:CONTAINER.fs.usableSpace
• Using metrics for feedback control in Solr clusters
- Support for modeling and simulaAon of dynamic behavior (SOLR-‐11285)
40. 40
Summary
• Metrics are a lightweight mechanism for collecAng detailed insights into Solr
operaAon
- Provided now by most Solr components
- Easy to add new metrics
• Metrics can be reported to external systems in mulAple formats and protocols
- Several popular systems already supported
- Easy to add new reporters
• Metrics provide key data for SolrCloud autoscaling
• How do you want to use metrics?