This talk was given during DevOps Con 2017.
Have you ever spent time digging through various terminals, greping, lessing, awking and trying to find that few log lines that may be important? Have you every done that under time pressure, because mission critical services were not working? Have you every heard from your developers that they can’t tell you anything, because they don’t have access to application logs? Have you ever considered a centralized storage for logs, but time and resources are not on your side?
If you said yes, to any of the above questions, than this talk is for you. During the talk we’ll introduce you to the world of log centralization and analysis, both when it comes to open source, but also commercial tools. We will go from top to bottom and learn how to setup log centralization and analysis for servers, virtualized environments and containers. We will get from log shipping, through centralized buffering to storage and analysis to show you, that having a centralized log analysis tool is not a rocket science.
Finally, you will see how useful is to combine the logs from all your servers in a single place for blazingly fast correlation.
Microservices, Continuous Delivery, and Elasticsearch at Capital OneNoriaki Tatsumi
This presentation focuses on the implementation of Continuous Delivery and Microservices principles in Capital One’s
cybersecurity data platform – which ingests ~6 TB of data every day, and where Elasticsearch is a core component.
For the Docker users out there, Sematext's DevOps Evangelist, Stefan Thies, goes through a number of different Docker monitoring options, points out their pros and cons, and offers solutions for Docker monitoring. Webinar contains actionable content, diagrams and how-to steps.
Building a Versatile Analytics Pipeline on Top of Apache Spark with Mikhail C...Databricks
It is common for consumer Internet companies to start off with popular third-party tools for analytics needs. Then, when the user base and the company grows, they end up building their own analytics data pipeline and query engine to cope with their data scale, satisfy custom data enrichment and reporting needs and achieve high quality of their data. That’s exactly the path that was taken at Grammarly, the popular online proofreading service.
In this session, Grammarly will share how they improved business and marketing analytics, previously done with Mixpanel, by building their own in-house analytics engine and application on top of Apache Spark. Chernetsov wil touch upon several Spark tweaks and gotchas that they experienced along the way:
– Outputting data to several storages in a single Spark job
– Dealing with Spark memory model, building a custom spillable data-structure for your data traversal
– Implementing a custom query language with parser combinators on top of Spark sql parser
– Custom query optimizer and analyzer when you want not exactly sql
– Flexible-schema storage and query against multi-schema data with schema conflicts
– Custom aggregation functions in Spark SQL
This talk was given during DevOps Con 2017.
Have you ever spent time digging through various terminals, greping, lessing, awking and trying to find that few log lines that may be important? Have you every done that under time pressure, because mission critical services were not working? Have you every heard from your developers that they can’t tell you anything, because they don’t have access to application logs? Have you ever considered a centralized storage for logs, but time and resources are not on your side?
If you said yes, to any of the above questions, than this talk is for you. During the talk we’ll introduce you to the world of log centralization and analysis, both when it comes to open source, but also commercial tools. We will go from top to bottom and learn how to setup log centralization and analysis for servers, virtualized environments and containers. We will get from log shipping, through centralized buffering to storage and analysis to show you, that having a centralized log analysis tool is not a rocket science.
Finally, you will see how useful is to combine the logs from all your servers in a single place for blazingly fast correlation.
Microservices, Continuous Delivery, and Elasticsearch at Capital OneNoriaki Tatsumi
This presentation focuses on the implementation of Continuous Delivery and Microservices principles in Capital One’s
cybersecurity data platform – which ingests ~6 TB of data every day, and where Elasticsearch is a core component.
For the Docker users out there, Sematext's DevOps Evangelist, Stefan Thies, goes through a number of different Docker monitoring options, points out their pros and cons, and offers solutions for Docker monitoring. Webinar contains actionable content, diagrams and how-to steps.
Building a Versatile Analytics Pipeline on Top of Apache Spark with Mikhail C...Databricks
It is common for consumer Internet companies to start off with popular third-party tools for analytics needs. Then, when the user base and the company grows, they end up building their own analytics data pipeline and query engine to cope with their data scale, satisfy custom data enrichment and reporting needs and achieve high quality of their data. That’s exactly the path that was taken at Grammarly, the popular online proofreading service.
In this session, Grammarly will share how they improved business and marketing analytics, previously done with Mixpanel, by building their own in-house analytics engine and application on top of Apache Spark. Chernetsov wil touch upon several Spark tweaks and gotchas that they experienced along the way:
– Outputting data to several storages in a single Spark job
– Dealing with Spark memory model, building a custom spillable data-structure for your data traversal
– Implementing a custom query language with parser combinators on top of Spark sql parser
– Custom query optimizer and analyzer when you want not exactly sql
– Flexible-schema storage and query against multi-schema data with schema conflicts
– Custom aggregation functions in Spark SQL
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...Sematext Group, Inc.
This talk covers the basics of centralizing logs in Elasticsearch and all the strategies that make it scale with billions of documents in production. Topics include:
- Time-based indices and index templates to efficiently slice your data
- Different node tiers to de-couple reading from writing, heavy traffic from low traffic
- Tuning various Elasticsearch and OS settings to maximize throughput and search performance
- Configuring tools such as logstash and rsyslog to maximize throughput and minimize overhead
During this brief walkthrough of the setup, configuration and use of the toolset we will show you how to find the trees from the forest in today's modern cloud environments and beyond.
DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using El...PROIDEA
YouTube: https://www.youtube.com/watch?v=1cCD5axQf9U&list=PLnKL6-WWWE_VtIMfNLW3N3RGuCUcQkDMl&index=7
Time-based data, especially logs are all around us. Every application, system or hardware piece logs something - from simple messages, to large stack traces. In this talk we will learn how to build and tune resilient log aggregation pipeline using Elasticsearch and Kafka as its heart. We will start by looking at the overall architecture and how we can connect Elasticsearch and Kafka together. We will look at how to scale our system through a hybrid approach using a combination of time- and size-based indices, and also how to divide the cluster in tiers in order to handle the potentially spiky load in real-time. Then, we'll look at tuning individual nodes. We'll cover everything from commits, buffers, merge policies and doc values to OS settings like disk scheduler, SSD caching, and huge pages. Finally, we'll take a look at the pipeline of getting the logs to Elasticsearch and how to make it fast and reliable: where should buffers live, which protocols to use, where should the heavy processing be done (like parsing unstructured data), and which tools from the ecosystem can help.
Using Elastic to Monitor Everything - Christoph Wurm, Elastic - DevOpsDays Te...DevOpsDays Tel Aviv
"Elasticsearch has come a long way: Started as a distributed search engine in 2009, it's now the tool of choice for even the largest websites (e.g. Facebook, Github, Ebay). Half-way to 2016 the ELK stack helped it become firmly embedded in many centralised log management systems (e.g. Netflix, Uber).
We're now midair in the next step, with the first folks using it for metrics. NASA is using it to monitor the Curiosity rover, Blizzard and Riot to monitor vast online gaming worlds.
This talk will focus on what makes this transition from more unstructured to structured data possible.
"
DOD 2016 - Stefan Thies - Monitoring and Log Management for Docker Swarm and...PROIDEA
YouTube: https://www.youtube.com/watch?v=1HBP6LkKwLc&list=PLnKL6-WWWE_VtIMfNLW3N3RGuCUcQkDMl&index=13
The high level of automation for the container and microservice lifecycle makes the monitoring of Kubernetes or Swarm more challenging than in more traditional, more static deployments. Any static setup to monitor specific application containers does not work because orchestration tools like Kubernetes or Swarm make their own decisions according to the defined deployment rules. In this talk you will learn how DevOps can cope with challenges in Monitoring and Log Management on Docker Swarm and Kubernetes. We will start with the basics of container monitoring and logging, including APIs and tools, followed by an overview of the key metrics of both platforms. We will speak about cluster-wide deployments for monitoring and log management solutions and how to discover services for log collection and monitoring, tagging of logs and metrics. Finally, we will share insights derived from monitoring a 4700 node Swarm cluster, as part of the Swarm3k project.
Docker is all the rage these days. While one doesn't hear much about Solr on Docker, we're here to tell you not only that it can be done, but also share how it's done.
We'll quickly go over the basic Docker ideas - containers are lighter than VMs, they solve "but it worked on my laptop" issues - so we can dive into the specifics of running Solr on Docker.
We'll do a live demo showing you how to run Solr master - slave as well as SolrCloud using containers, how to manage CPU assignments, constraint memory and use Docker data volumes when running Solr in containers. We will also show you how to create your own containers with custom configurations.
Finally, we'll address one of the core Solr questions - which deployment type should I use? We will demonstrate performance differences between the following deployment types:
- Single Solr instance running on a bare metal machine
- Multiple Solr instances running on a single bare metal machine
- Solr running in containers
- Solr running on virtual machine
- Solr running on virtual machine using unikernel
For each deployment type we'll address how it impacts performance, operational flexibility and all other key pros and cons you ought to keep in mind.
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...Sematext Group, Inc.
This talk covers the basics of centralizing logs in Elasticsearch and all the strategies that make it scale with billions of documents in production. Topics include:
- Time-based indices and index templates to efficiently slice your data
- Different node tiers to de-couple reading from writing, heavy traffic from low traffic
- Tuning various Elasticsearch and OS settings to maximize throughput and search performance
- Configuring tools such as logstash and rsyslog to maximize throughput and minimize overhead
During this brief walkthrough of the setup, configuration and use of the toolset we will show you how to find the trees from the forest in today's modern cloud environments and beyond.
DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using El...PROIDEA
YouTube: https://www.youtube.com/watch?v=1cCD5axQf9U&list=PLnKL6-WWWE_VtIMfNLW3N3RGuCUcQkDMl&index=7
Time-based data, especially logs are all around us. Every application, system or hardware piece logs something - from simple messages, to large stack traces. In this talk we will learn how to build and tune resilient log aggregation pipeline using Elasticsearch and Kafka as its heart. We will start by looking at the overall architecture and how we can connect Elasticsearch and Kafka together. We will look at how to scale our system through a hybrid approach using a combination of time- and size-based indices, and also how to divide the cluster in tiers in order to handle the potentially spiky load in real-time. Then, we'll look at tuning individual nodes. We'll cover everything from commits, buffers, merge policies and doc values to OS settings like disk scheduler, SSD caching, and huge pages. Finally, we'll take a look at the pipeline of getting the logs to Elasticsearch and how to make it fast and reliable: where should buffers live, which protocols to use, where should the heavy processing be done (like parsing unstructured data), and which tools from the ecosystem can help.
Using Elastic to Monitor Everything - Christoph Wurm, Elastic - DevOpsDays Te...DevOpsDays Tel Aviv
"Elasticsearch has come a long way: Started as a distributed search engine in 2009, it's now the tool of choice for even the largest websites (e.g. Facebook, Github, Ebay). Half-way to 2016 the ELK stack helped it become firmly embedded in many centralised log management systems (e.g. Netflix, Uber).
We're now midair in the next step, with the first folks using it for metrics. NASA is using it to monitor the Curiosity rover, Blizzard and Riot to monitor vast online gaming worlds.
This talk will focus on what makes this transition from more unstructured to structured data possible.
"
DOD 2016 - Stefan Thies - Monitoring and Log Management for Docker Swarm and...PROIDEA
YouTube: https://www.youtube.com/watch?v=1HBP6LkKwLc&list=PLnKL6-WWWE_VtIMfNLW3N3RGuCUcQkDMl&index=13
The high level of automation for the container and microservice lifecycle makes the monitoring of Kubernetes or Swarm more challenging than in more traditional, more static deployments. Any static setup to monitor specific application containers does not work because orchestration tools like Kubernetes or Swarm make their own decisions according to the defined deployment rules. In this talk you will learn how DevOps can cope with challenges in Monitoring and Log Management on Docker Swarm and Kubernetes. We will start with the basics of container monitoring and logging, including APIs and tools, followed by an overview of the key metrics of both platforms. We will speak about cluster-wide deployments for monitoring and log management solutions and how to discover services for log collection and monitoring, tagging of logs and metrics. Finally, we will share insights derived from monitoring a 4700 node Swarm cluster, as part of the Swarm3k project.
Docker is all the rage these days. While one doesn't hear much about Solr on Docker, we're here to tell you not only that it can be done, but also share how it's done.
We'll quickly go over the basic Docker ideas - containers are lighter than VMs, they solve "but it worked on my laptop" issues - so we can dive into the specifics of running Solr on Docker.
We'll do a live demo showing you how to run Solr master - slave as well as SolrCloud using containers, how to manage CPU assignments, constraint memory and use Docker data volumes when running Solr in containers. We will also show you how to create your own containers with custom configurations.
Finally, we'll address one of the core Solr questions - which deployment type should I use? We will demonstrate performance differences between the following deployment types:
- Single Solr instance running on a bare metal machine
- Multiple Solr instances running on a single bare metal machine
- Solr running in containers
- Solr running on virtual machine
- Solr running on virtual machine using unikernel
For each deployment type we'll address how it impacts performance, operational flexibility and all other key pros and cons you ought to keep in mind.
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)Sematext Group, Inc.
In this talk from Lucene/Solr Revolution 2015, Solr and centralized logging experts Radu Gheorghe and Rafal Kuć cover topics like: flow in Logstash, flow in rsyslog, parsing JSON, log shipping, Solr tuning, time-based collections and tiered clusters.
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerSematext Group, Inc.
Sematext engineer Rafal Kuc (@kucrafal) walks through the details of running high-performance, fault tolerant Elasticsearch clusters on Docker. Topics include: Containers vs. Virtual Machines, running the official Elasticsearch container, container constraints, good network practices, dealing with storage, data-only Docker volumes, scaling, time-based data, multiple tiers and tenants, indexing with and without routing, querying with and without routing, routing vs. no routing, and monitoring. Talk was delivered at DevOps Days Warsaw 2015.
An updated talk about how to use Solr for logs and other time-series data, like metrics and social media. In 2016, Solr, its ecosystem, and the operating systems it runs on have evolved quite a lot, so we can now show new techniques to scale and new knobs to tune.
We'll start by looking at how to scale SolrCloud through a hybrid approach using a combination of time- and size-based indices, and also how to divide the cluster in tiers in order to handle the potentially spiky load in real-time. Then, we'll look at tuning individual nodes. We'll cover everything from commits, buffers, merge policies and doc values to OS settings like disk scheduler, SSD caching, and huge pages.
Finally, we'll take a look at the pipeline of getting the logs to Solr and how to make it fast and reliable: where should buffers live, which protocols to use, where should the heavy processing be done (like parsing unstructured data), and which tools from the ecosystem can help.
From Zero to Hero - Centralized Logging with Logstash & ElasticsearchSematext Group, Inc.
Originally presented at DevOpsDays Warsaw 2014. How to set up centralized logging either using ELK stack - Logstash, Elasticsearch, and Kibana or using Logsene.
Sematext's DevOps Evangelist, Stefan Thies (@seti321), takes a Docker Logging tour through the different log collection options Docker users have, the pros and cons of each, specific and existing Docker logging solutions, tooling, the role of syslog, log shipping to ELK Stack, and more. Q&A session at end.
Ingestion and Dimensions Compute and Enrich using Apache ApexApache Apex
Presenter: Devendra Tagare - DataTorrent Engineer, Contributor to Apex, Data Architect experienced in building high scalability big data platforms.
This talk will be a deep dive into ingesting unbounded file data and streaming data from Kafka into Hadoop. We will also cover data enrichment and dimensional compute. Customer use-case and reference architecture.
Presentation on Secondary Indexes from the 9/11/12 HBase Contributor's Meetup. It discusses the current state of the discussion and some possible future directions.
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...HostedbyConfluent
Real-time connectivity of databases and systems is critical in enterprises adopting digital transformation to support super-fast decisioning to drive applications like fraud detection, digital payments, recommendation engines. This talk will focus on the many functions that database streaming serves with Kafka, Spark and Aerospike. We will explore how to eliminate the wall between transaction processing and analytics by synthesizing streaming data with system of record data, to gain key insights in real-time.
Powering Interactive Data Analysis at Pinterest by Amazon RedshiftJie Li
In the last six month, we have set up Amazon Redshift to power our interactive data analysis at Pinterest. It has tremendously improved the speed of analyzing our data.
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...Sungmin Kim
How to build Business Intelligence System from scratch on AWS (Day1, Day2)
------------------------------------------------------------------------------------------
2020-03-18(수)~19(목) 2일 동안 온라인으로 진행한 Online AWS Analytics Immersion Day 전체 발표 자료 입니다.
BI(Business Intelligence) 시스템을 설계하는 과정에서 AWS Analytics 서비스들을 어떻게 활용할 수 있는지 설명 드리고자 만든 자료 입니다.
Target Audience
-------------------
Online Analytics Immersion Day는 다음과 같은 고객을 대상으로 진행됩니다.
- AWS Analytics Services (ex. Kinesis, Athena, Redshift, EMR, etc)의 기본 개념을 알고 있지만, 이러한 서비스 활용 방법 및 데이터 분석 시스템 구축 과정이 궁금하신 분
- 데이터 분석 시스템을 구축한 경험은 있지만, 자신이 만든 시스템을 아키텍처 관점에서
어떻게 평가하고 확인할 수 있는지 궁금하신 분
The database market is large and filled with many solutions. In this talk, Seth Luersen from MemSQL we will take a look at what is happening within AWS, the overall data landscape, and how customers can benefit from using MemSQL within the AWS ecosystem.
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceAmazon Web Services
Everything generates logs. Applications, infrastructure, security ... everything. Keeping track of the flood of log data is a big challenge, yet critical to your ability to understand your systems and troubleshoot (or prevent) issues. In this session, we will use both Amazon CloudWatch and application logs to show you how to build an end-to-end log analytics solution. First, we cover how to configure an Amazon Elaticsearch Service domain and ingest data into it using Amazon Kinesis Firehose, demonstrating how easy it is to transform data with Firehose. We look at best practices for choosing instance types, storage options, shard counts, and index rotations based on the throughput of incoming data and configure a secure analytics environment. We demonstrate how to set up a Kibana dashboard and build custom dashboard widgets. Finally, we dive deep into the Elasticsearch query DSL and review approaches for generating custom, ad-hoc reports.
Descubre las características disponibles con demostraciones: la replicación entre clústeres, los índices bloqueados de Elasticsearch, los espacios de Kibana y los datos de integraciones en Beats y Logstash.
A data lake can be used as a source for both structured and unstructured data - but how? We'll look at using open standards including Spark and Presto with Amazon EMR, Amazon Redshift Spectrum and Amazon Athena to process and understand data.
Level: Intermediate
Speakers:
Tony Nguyen - Senior Consultant, ProServe, AWS
Hannah Marlowe - Consultant - Federal, AWS
Sudhir Menon, Founder and COO of SnappyData explains how you can tackle Data Gravity, Kubernetes, and strategies/best practices to run, scale, and leverage stateful containers in production.
Similar to Building Resilient Log Aggregation Pipeline with Elasticsearch & Kafka (20)
This talk was given during Activate Conference 2019. Lucene has a lot of options for configuring similarity, and Solr inherits them. Similarity makes the base of your relevancy score: how similar is this document to the query? The default similarity (BM25) is a good start, but you may need to tweak it for your use-case. In this session, you will learn how BM25 works and how you may want to change its parameters. Then, we'll move to other similarity classes: DFR, DFI, IB and LM. You will learn the thinking behind them, how that thinking translates to the similarity score, and which parameters allow you to tweak how score evolves based on things like term frequency or document length. By the end, you’ll have a good understanding of which similarity options are likely to work well for your use-case. You'll know which tunables are available and whether you need to implement a custom similarity class. As an example, we’ll focus on E-commerce, where you often end up ignoring term frequency altogether.
Key Takeaway
1) What are the built-in Lucene/Solr similarities and what they do
2) Which similarity to use for which use-case
3) How to use a custom similarity class in Solr
Learn more about search relevance and similarity: sematext.com/blog/search-relevance-solr-elasticsearch-similarity
This talk was given during DockerCon EU 2018.
It ain't just a whim - to be able to continue innovating, we’ve moved our good old static production to containers. We needed to be elastic, fast, reliable and production ready at any time - that's why we chose Docker. But like in most enterprises, lots of our apps run on the JVM and most JVMs’ ergonomics assume they “own” the server they are running on. So how do you containerize JVM apps? Should you really increase JVM heap if you have spare memory? What about OS caches? What are the differences between JDK 8, 9 and 10 when it comes to container awareness? Outages because of out of memory errors? Slowness because of long garbage collection and poor environment visibility? Long story short, in this session, we’ll look at the gotchas of running JVM apps in containers and teach you how to avoid costly mistakes.
Top 3 things attendees will learn:
1. Key differences between various JVM versions relevant for containerized Java apps.
2. Best practices for running JVM in containers.
3. Avoiding common pitfalls when running containerized JVM applications.
This talk was given during Monitorama EU 2018.
Observability, like other ops practices, has hard and soft benefits. No logs - no root cause, that’s a hard benefit. A soft benefit is when we have more confidence in an observable system. Then we can be more productive in developing it. The trouble with soft benefits like confidence, is how to measure them. Does observability actually make us more productive? How about other activities, such as post-mortems? Why is alert fatigue so bad? Turns out, there are plenty of studies about the impact of such activities on our brain, our behavior, our productivity. In this session, we’ll explore what [neuro]science says about such practices so that:
We turn soft benefits into hard benefits
We can encourage a culture where we get the benefits and avoid the traps
Be prepared for surprises, as some “best practices” aren’t “best” at all.
This talk was given during Lucene Revolution 2017.
They say optimize is bad for you, they say you shouldn't do it, they say it will invalidate operating system caches and make your system suffer. This is all true, but is it true in all cases?
In this presentation we will look closer on what optimize or better called force merge does to your Solr search engine. You will learn what segments are, how they are built and how they are used by Lucene and Solr for searching. We will discuss real-life performance implications regarding Solr collections that have many segments on a single node and compare that to the Solr where the number of segments is moderate and low. We will see what we can do to tune the merging process to trade off indexing performance for better query performance and what pitfalls are there waiting for us. Finally, at the end of the talk we will discuss possibilities of running force merge to avoid system disruption and still benefit from query performance boost that single segment index provides.
This talk was given during Lucene Revolution 2017 and has two goals: first, to discuss the tradeoffs for running Solr on Docker. For example, you get dynamic allocation of operating system caches, but you also get some CPU overhead. We'll keep in mind that Solr nodes tend to be different than your average container: Solr is usually long running, takes quite some RSS and a lot of virtual memory. This will imply, for example, that it makes more sense to use Docker on big physical boxes than on configurable-size VMs (like Amazon EC2).
The second goal is to discuss issues with deploying Solr on Docker and how to work around them. For example, many older (and some of the newer) combinations of Docker, Linux Kernel and JVM have memory leaks. We'll go over Docker operations best practices, such as using container limits to cap memory usage and prevent the host OOM killer from terminating a memory-consuming process - usually a Solr node. Or running Docker in Swarm mode over multiple smaller boxes to limit the spread of a single issue.
Elasticsearch and Solr for Logs + info on Rsyslog, Kibana, Logstash, and Apache Flume for log shipping logs. VIDEO at: http://blog.sematext.com/2014/02/26/video-and-presentation-indexing-and-searching-logs-with-elasticsearch-or-solr/
37 slides about taking care of your SolrCluster - Collections API, Core API, dynamic schema modification, segment merging, hard vs. soft commit, caches, monitoring, performance, JMX, it's all in here.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
14. Daily indices are a good start
2016.11.18 2016.11.19 2016.11.22 2016.11.23. . .
Indexing is faster for smaller indices
Deletes are cheap
Search can be performed on indices that are needed
Static indices are cache friendly
indexing
most searches
We delete whole indices
54. Buffer types
Disk || memory || combined hybrid approach
On source || centralized
App
Buffer
App
Buffer
file or local log shipper
easy scaling – fewer moving parts
often with the use of lightweight shipper
App
App
Kafka / Redis / Logstash / etc…
one place for all changes
extra features made easy (like TTL)
ES
ES