Talk I did on log aggregation with the ELK stack at Leeds DevOps. Covers how we process over 800,000 logs per hour at LateRooms, and the cultural changes this has helped drive.
This document summarizes an ELK meetup that took place on March 2nd 2015. It discusses using ELK for log processing, in public clouds like AWS, and activities like kite surfing. The document also provides information on Wind Analytics and their next steps, monitoring large AWS environments, implementing ELK with the right architecture, and Logz.io which provides an ELK as a service solution and insights. It includes demos of Logz.io's architecture and log processing. The meetup concluded with information on job opportunities at Logz.io.
Interactive learning analytics dashboards with ELK (Elasticsearch, Logstash, Kibana), by Andrii Vozniuk
My workshop at the Learning Analytics Summer Institute (LASI) 2016: http://lasi16.snola.es/#!/schedule/113
Educational data continues to grow in volume, velocity and variety. Making sense of educational data under these conditions requires deploying and using appropriately scalable, real-time processing tools that support a flexible data schema. Elasticsearch is one of the popular open-source tools that meets these requirements. Initially envisioned as a search engine capable of operating at scale and in real time, Elasticsearch is used by organisations such as Wikimedia and GitHub, which deal with big data on a daily basis. In addition, Elasticsearch is increasingly used as an analytics platform thanks to its scalable architecture and expressive query language. Until recently, the exploitation of Elasticsearch for (learning) analytics by practitioners was hindered by a high entry barrier due to the complexity and specificities of the query language. This is currently changing with the ongoing development of Kibana, an open-source tool that lets users conduct analysis and build visualisations of Elasticsearch data through a graphical user interface. Kibana does not require the user to dive into the technical details of the queries (although it is still possible) and hence makes big educational data visualisations accessible to regular users. The additional value of Kibana comes into play when several visualisations are combined on a single dashboard, enabling multiple coordinated views for interactive exploratory analysis. Elasticsearch and Kibana, together with Logstash, are part of an analytics stack often referred to as ELK. Logstash supports data acquisition from multiple sources (including Twitter, RSS and event logs) thanks to its rich set of available connectors, and custom connectors can be developed for case-specific sources.
Beyond the values mentioned above, ELK enables building an analytics infrastructure that is decoupled from the learning platform, i.e., the learning environment (with its analytics functionality) and the data storage can be hosted separately without affecting the end-user experience.
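To give a feel for the query complexity that Kibana hides, here is a sketch of the kind of Elasticsearch aggregation request a single dashboard panel might issue, e.g. daily activity with a unique-student count. The index and field names (`@timestamp`, `student_id`) are illustrative assumptions, not taken from the talk:

```json
{
  "size": 0,
  "query": { "range": { "@timestamp": { "gte": "now-7d" } } },
  "aggs": {
    "activity_per_day": {
      "date_histogram": { "field": "@timestamp", "calendar_interval": "1d" },
      "aggs": {
        "unique_students": { "cardinality": { "field": "student_id" } }
      }
    }
  }
}
```

In Kibana the same result is a few clicks on a date histogram visualisation, which is precisely the entry-barrier reduction the workshop describes.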
Log analysis using Logstash, Elasticsearch and Kibana, by Avinash Ramineni
This document provides an overview of Logstash, Elasticsearch, and Kibana for log analysis. It discusses how logging is used for troubleshooting, security, and monitoring. It then introduces Logstash as an open-source log collection and parsing tool. Elasticsearch is described as a search and analytics engine that indexes log data from Logstash. Kibana provides a web interface for visualizing and searching logs stored in Elasticsearch. The document concludes by discussing a demo, installation, scaling, and deployment considerations for these log analysis tools.
The document discusses log aggregation and analysis using the Elastic Stack. It describes how the Elastic Stack collects logs from various sources using lightweight data shippers called Beats. The logs are then processed and structured by Logstash before being stored in Elasticsearch for exploration and visualization using Kibana. Demos are provided showing how the Elastic Stack can parse nginx logs, capture logs from a Django application, and monitor node metrics.
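The nginx-parsing demo mentioned above boils down to turning an unstructured access-log line into a structured event, which in Logstash is done with a grok filter. As a minimal sketch of the same extraction in plain Python (the field names mirror what a combined-format log contains; the sample line is made up):

```python
import re

# Combined log format, as written by nginx and Apache by default.
# The named groups mirror what Logstash's %{COMBINEDAPACHELOG}-style
# grok pattern would extract.
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_access_log_line(line):
    """Turn one raw access-log line into a structured dict, or None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["status"] = int(event["status"])
    event["bytes"] = 0 if event["bytes"] == "-" else int(event["bytes"])
    return event

line = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
event = parse_access_log_line(line)
print(event["method"], event["status"], event["bytes"])  # GET 200 2326
```

Once events are structured like this, storing them in Elasticsearch makes per-field search and aggregation (status codes, paths, byte counts) straightforward.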
Toronto High Scalability meetup - Scaling ELK, by Andrew Trossman
The document discusses scaling logging and monitoring infrastructure at IBM. It describes:
1) User scenarios that generate varying amounts of log data, from small internal groups generating 3-5 TB/day to many external users generating kilobytes to gigabytes per day.
2) The architecture uses technologies like OpenStack, Docker, Kafka, Logstash, Elasticsearch, Grafana to process and analyze logs and metrics.
3) Key aspects of scaling include automating deployments with Heat and Ansible, optimizing components like Logstash and Elasticsearch, and techniques like sharding indexes across multiple nodes.
ELK (Elasticsearch, Logstash, Kibana) is an open source toolset for centralized logging, where Logstash collects, parses, and filters logs, Elasticsearch stores and indexes logs for search, and Kibana visualizes logs. Logstash processes logs through an input, filter, output pipeline using plugins. It can interpret various log formats and event types. Elasticsearch allows real-time search and scaling through replication/sharding. Kibana provides browser-based dashboards and visualization of Elasticsearch query results.
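The input, filter, output pipeline described above maps directly onto the three sections of a Logstash configuration file. A minimal sketch (the file path and Elasticsearch host are assumptions for illustration):

```conf
input {
  # Tail an access log on disk.
  file { path => "/var/log/nginx/access.log" }
}
filter {
  # Parse the raw line into fields, then use its timestamp as the event time.
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
  date { match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ] }
}
output {
  # Index the structured event for search and Kibana dashboards.
  elasticsearch { hosts => ["localhost:9200"] }
}
```

Each section accepts any number of plugins, which is how the same pipeline shape accommodates the many log formats and event types mentioned above.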
A presentation about the deployment of an ELK stack at bol.com
At bol.com we use Elasticsearch, Logstash and Kibana in a log search system that allows our developers and operations people to easily access and search through log events coming from all layers of our infrastructure.
The presentation explains the initial design and its failures, then the latest design (mid 2014) and its improvements. Finally, a set of tips is given on scaling Logstash and Elasticsearch.
These slides were first presented at the Elasticsearch NL meetup on September 22nd 2014 at the Utrecht bol.com HQ.
How bol.com makes sense of its logs, using the Elastic technology stack, by Renzo Tomà
Bol.com uses the Elastic (ELK) stack to make sense of logs from over 1,600 servers and 500-600 million events per day. Key aspects of their system include:
1. Shipping JSON-formatted log events from sources like Apache, databases, and applications to Redis queues to allow multiple Logstash instances to process events in real-time without data loss.
2. Enriching log events with information like request IDs to correlate requests across services, and IP-to-role mappings to identify client roles.
3. Using Elasticsearch aggregations and transformations to generate a directed graph of service dependencies based on logs, to help understand their distributed architecture.
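The third point, deriving a directed service-dependency graph from enriched log events, is essentially counting caller-to-callee edges, which in Elasticsearch would be a nested terms aggregation. A small pure-Python sketch of the same idea (the service names and field names are invented for illustration, not taken from bol.com's data):

```python
from collections import Counter

# Each enriched log event carries the calling role (from the IP-to-role
# mapping) and the service that handled the request.
events = [
    {"client_role": "webshop", "service": "basket-service"},
    {"client_role": "webshop", "service": "search-service"},
    {"client_role": "basket-service", "service": "price-service"},
    {"client_role": "webshop", "service": "basket-service"},
]

def dependency_graph(events):
    """Count directed caller -> callee edges, as a terms aggregation would."""
    return Counter((e["client_role"], e["service"]) for e in events)

for (caller, callee), count in sorted(dependency_graph(events).items()):
    print(f"{caller} -> {callee} ({count} calls)")
```

Feeding the resulting edge list into any graph renderer yields the architecture picture the talk describes, kept up to date simply because the logs are.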
Centralized Logging System Using ELK Stack, by Rohit Sharma
The document discusses setting up a centralized logging system (CLS) using the ELK stack. The ELK stack consists of Logstash to capture and filter logs, Elasticsearch to index and store logs, and Kibana to visualize logs. Logstash agents on each server ship logs to Logstash, which filters and sends logs to Elasticsearch for indexing. Kibana queries Elasticsearch and presents logs through interactive dashboards. A CLS provides benefits like log analysis, auditing, compliance, and a single point of control. The ELK stack is an open-source solution that is scalable, customizable, and integrates with other tools.
This document introduces the (B)ELK stack, which consists of Beats, Elasticsearch, Logstash, and Kibana. It describes each component and how they work together. Beats are lightweight data shippers that collect data from logs and systems. Logstash processes and transforms data from inputs like Beats. Elasticsearch stores and indexes the data. Kibana provides visualization and analytics capabilities. The document provides examples of using each tool and tips for working with the ELK stack.
This presentation deals with logging in the course of mobile development, namely describing an open-source logging environment built with the ELK stack (Elasticsearch, Logstash and Kibana).
Presentation by Igor Rudyk (Software Engineer, GlobalLogic, Lviv), delivered at Mobile TechTalk Lviv on April 28, 2015.
More details - http://globallogic.com.ua/mobile-techtalk-lviv-2015-report
This document discusses the ELK stack, which consists of Elasticsearch, Logstash, and Kibana. It provides an overview of each component, including that Elasticsearch is a search and analytics engine, Logstash is a data collection engine, and Kibana is a data visualization platform. The document then discusses setting up an ELK stack to index and visualize application logs.
This document discusses using the ELK stack (Elasticsearch, Logstash, Kibana) for log analysis. It describes the author's experience using Splunk and alternatives like Graylog and Elasticsearch before settling on the ELK stack. The key components - Logstash for input, Elasticsearch for storage and searching, and Kibana for the user interface - are explained. Troubleshooting tips are provided around checking that the components are running and communicating properly.
Log management has always been a complex topic, and over time various solutions of differing complexity have been tried, often difficult to integrate into one's application stack. We give a general overview of the main systems for advanced real-time log aggregation (Fluentd, Graylog, etc.) and explain what drove us to choose ELK to solve a need of our client: making logs readable by non-technical people.
The ELK stack (Elasticsearch, Logstash, Kibana) lets developers consult logs during debugging and in production without relying on the sysadmin staff. We demonstrate how we deployed the ELK stack and implemented it to parse and structure
Magento's application logs.
This document provides an overview of Presto as a Service in Treasure Data, including how Treasure Data deploys and monitors Presto. Key points include:
- Treasure Data offers Presto as an interactive query engine accessible through its API and web console.
- Treasure Data uses blue-green deployments and a private Maven repository to deploy new Presto versions with no downtime.
- Treasure Data monitors Presto using its REST API and collects query logs to analyze performance and detect anomalies.
- Treasure Data implements multi-tenancy in Presto by allocating resources like worker nodes based on customers' price plans and resource usage.
Kibana + Timelion: time series with the Elastic Stack, by Sylvain Wallez
The document discusses Kibana and Timelion, which are tools for visualizing and analyzing time series data in the Elastic Stack. It provides an overview of Kibana's evolution and capabilities for creating dashboards. Timelion is introduced as a scripting language that allows users to transform, aggregate, and calculate on time series data from multiple sources to create visualizations. The document demonstrates Timelion's expression language, which includes functions, combinations, filtering, and attributes to process and render time series graphs.
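To illustrate the expression language described above, here is a sketch of a Timelion expression that chains functions to transform and label two series. The index name and query string are assumptions for illustration:

```
.es(index=logs-*, q='status:500').label('5xx errors'),
.es(index=logs-*, q='status:500').movingaverage(10).label('smoothed')
```

Each comma-separated expression renders as one series on the same chart, which is how Timelion combines multiple sources and calculations in a single visualization.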
Monitoring, Hold the Infrastructure - Getting the Most out of AWS Lambda – Da..., by Amazon Web Services
This document discusses monitoring AWS Lambda functions. It begins with an introduction to AWS Lambda and important concepts like triggers, statelessness, and serverlessness. It then covers how to create and add Lambda functions to infrastructure, and provides examples of common uses. The document emphasizes that collecting data is cheap but not having it when needed can be expensive. It outlines three options for monitoring Lambda functions and how Datadog specifically handles it by adding lines to CloudWatch logs. The presentation concludes with a thank you and opportunities to follow up.
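The "adding lines to CloudWatch logs" approach mentioned above relies on the fact that anything a Lambda handler prints ends up in CloudWatch Logs, where a monitoring tool can parse it into metrics. A hedged Python sketch of that pattern; the pipe-delimited line shape, metric name and tag here are assumptions for illustration, not Datadog's exact specification:

```python
import time

def report_metric(name, value, metric_type="count", tags=()):
    """Emit a metric as a structured, parseable log line. Inside a Lambda
    handler this print lands in CloudWatch Logs, from which a log-based
    monitor can extract the metric without extra infrastructure."""
    tag_part = "#" + ",".join(tags) if tags else ""
    line = f"MONITORING|{int(time.time())}|{value}|{metric_type}|{name}|{tag_part}"
    print(line)
    return line

def handler(event, context):
    # Business logic would go here; report one processed order.
    report_metric("orders.processed", 1, tags=("env:prod",))
    return {"statusCode": 200}
```

The appeal of this option is exactly the document's point that collecting data is cheap: emitting a line costs almost nothing, while lacking the data during an incident can be expensive.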
This document summarizes recent updates to Norikra, an open source stream processing server. Key updates include:
1) The addition of suspended queries, which allow queries to be temporarily stopped and resumed later, and NULLABLE fields, which handle missing fields as null values.
2) New listener plugins that allow processing query outputs in customizable ways, such as pushing to users, enqueueing to Kafka, or filtering records.
3) Dynamic plugin reloading that loads newly installed plugins without requiring a restart, improving uptime.
Centralized logging system using MongoDB, by Vivek Parihar
This talk covers the need for a centralized logging system and showcases its architecture: how we ended up building it, why we needed such a system, what problems we faced, how MongoDB fits into this, and what others can learn from it.
I also covered how MongoDB can be used to turn the logging system into a real-time analytics and alerting system.
The major use case of this system is keeping track of meaningful events, such as:
1. How many users registered?
2. How many registrations failed?
3. The most frequent errors during a given operation.
4. Real-time analytics and alerts.
5. Identifying possible threats.
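The event-tracking questions above reduce to filtered counts over event documents; in MongoDB the second one would be roughly `db.events.count_documents({"type": "registration", "ok": False})`. A pure-Python sketch of the same queries over sample documents (the field names and sample data are illustrative assumptions):

```python
from collections import Counter

# Sample event documents, as they might land in a logging collection.
events = [
    {"type": "registration", "ok": True},
    {"type": "registration", "ok": False, "error": "email_taken"},
    {"type": "registration", "ok": True},
    {"type": "login", "ok": False, "error": "bad_password"},
    {"type": "login", "ok": False, "error": "bad_password"},
]

# 1. How many users registered?
registered = sum(1 for e in events if e["type"] == "registration" and e["ok"])
# 2. How many registrations failed?
failed = sum(1 for e in events if e["type"] == "registration" and not e["ok"])
# 3. Most frequent errors across all failed events.
top_errors = Counter(e["error"] for e in events if not e["ok"]).most_common()

print("registered:", registered)
print("failed registrations:", failed)
print("most common errors:", top_errors)
```

Running the same counts continuously over incoming events, and alerting when a threshold is crossed, is the real-time analytics and alerting use the talk describes.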
This document summarizes Johan Gustavsson's presentation on scaling Hadoop in the cloud. It discusses replacing an on-premise Hadoop cluster with Plazma storage on S3 and job execution in isolated pools. It also covers Treasure Data's Patchset project which aims to support multiple Hadoop versions and allow job-preserving restarts of the Elephant server.
Spark is used to perform in-memory transformations on customer data collected by Totango to generate analytics and insights. Luigi is used as a workflow engine to manage dependencies between batch processing tasks like metrics generation, health scoring, and alerting. The tasks are run on Spark and output to S3. A custom Gameboy controller provides monitoring and management of the Luigi workflow.
Technologies, Data Analytics Service and Enterprise Business, by Satoshi Tagomori
This document discusses technologies for data analytics services for enterprise businesses. It begins by defining enterprise businesses as those "not about IT" and data analytics services as providing insights into business metrics like customer reach, ad views, purchases, and more using data. It then outlines some key technologies needed for such services, including data management systems, distributed processing systems, queues and schedulers, tools for connecting systems, and methods for controlling jobs and workflows with retries to handle failures. Specific challenges around deadlines, idempotent operations, and replay-able workflows are also addressed.
PHP Johannesburg meetup - talk 2014 - Scaling PHP in the enterprise, by Sarel van der Walt
This document discusses scaling PHP applications for enterprise environments. It provides tips on optimizing various aspects of PHP applications and infrastructure to improve scalability. These include optimizing databases, caching, background tasks, frameworks, monitoring, and more. Specific technologies and strategies mentioned include Redis, memcached, haproxy, MySQL optimization techniques like archiving, and moving work to the client side where possible using techniques like AngularJS.
Dapper: the microORM that will change your life, by Davide Mauri
ORM or Stored Procedures? Code First or Database First? Ad-Hoc Queries? Impedance Mismatch? If you're a developer, or a DBA working with developers, you have heard all these terms at least once in your life, usually in the middle of a heated debate about one or the other. Well, thanks to StackOverflow's Dapper, all these fights are finished. Dapper is a blazing-fast microORM that allows developers to map SQL queries to classes automatically, leaving (and encouraging) the usage of stored procedures, parameterized statements and all the good stuff that SQL Server offers (JSON and TVPs are supported too!). In this session I'll show how to use Dapper in your projects, from the very basics to some more complex usages that will help you create *really fast* applications without the burden of huge and complex ORMs. The days of Impedance Mismatch are finally over!
Leveraging Databricks for Spark PipelinesRose Toomey
How Coatue Management saved time and money by moving Spark pipelines to Databricks.
Talk given at AWS + Databricks ML Dev Day workshop in NYC on 27 February 2020.
Leveraging Databricks for Spark pipelinesRose Toomey
How Coatue Management saved time and money by moving Spark pipelines to Databricks.
Talk given at AWS + Databricks ML Dev Day workshop in NYC on 27 February 2020.
The document discusses the hype around NoSQL databases and provides guidance on selecting the right database solution. It summarizes different database types and evaluates databases based on characteristics like concurrency control, data storage, replication, and transaction support. The document advises profiling applications carefully before selecting a database and avoiding premature decoupling of data.
PostgreSQL is the new NoSQL - at Devoxx 2018Quentin Adam
Have you seen the latest updates for traditional RDBNS lately? It's insane. They are all catching up and won't be left out. While all NoSQL stores are proposing SQL, all RDMS are proposing top notch JSON support. And it does not stop there.
Latest PostgreSQL version have added new scalability features like table partitioning, query parallelism, pub/sub framework, a new quorum system for data sync. They have also improved their window functions for better time series queryability.
And as it happens, we are using some of these new functionalities at Clever Cloud. In this talk I will showcase some of them to try to convince you that PostgreSQL is the new NoSQL.
talk is recorded here: https://www.youtube.com/watch?v=t8-BQjWJFKw
https://dvbe18.confinabox.com/talk/BLA-3308/PostgreSQL_is_the_new_NoSQL
The document discusses various hosting solutions for Drupal including web hosting, virtual private servers, dedicated servers, and Amazon EC2. It provides details on the costs, reliability, customization options, and maintenance requirements for each solution. Additionally, it covers some key terms and tools related to using Amazon EC2, such as instances, AMIs, EBS, S3 storage, the command line interface, and the ElasticFox browser plugin.
Deferred Processing in Ruby - Philly rb - August 2011rob_dimarco
The document discusses various options for deferred processing and queuing in Ruby, including Delayed::Job, Resque, Amazon SQS, and AMQP. It provides an overview of how each works, how to install and use them, their advantages and disadvantages, and when each may or may not be a good fit for different needs.
Amazing Speed: Elasticsearch for the .NET Developer- Adrian Carr, Codestock 2015Adrian Carr
The document summarizes a presentation about using Elasticsearch to improve search performance for applications with large amounts of data. It describes how the presenter previously used Elasticsearch at a previous job to speed up searches of a growing product catalog. The presenter then demonstrates how to install and use Elasticsearch with .NET applications using the Nest client library. Issues that may arise with integrating Elasticsearch into existing applications are also discussed, such as differences from relational databases and potential rework of user interfaces.
PostgreSQL is a well-known relational database. But in the last few years, it has gained capabilities that previously belonged only to "NoSQL" databases. In this talk, I describe several of PostgreSQL that give it such capabilities.
Slides for a talk.
Talk abstract:
In the dark of the night, if you listen carefully enough, you can hear databases cry. But why? As developers, we rarely consider what happens under the hood of widely used abstractions such as databases. As a consequence, we rarely think about the performance of databases. This is especially true to less widespread, but often very useful NoSQL databases.
In this talk we will take a close look at NoSQL database performance, peek under the hood of the most frequently used features to see how they affect performance and discuss performance issues and bottlenecks inherent to all databases.
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...BigDataCloud
This document discusses using Amazon Elastic MapReduce (EMR) for cost-effective big data processing. It describes the author's experience using EMR to process 1TB of log data per week for a startup. Key advantages of EMR include only paying for usage, no hardware to maintain, and ability to customize cluster resources for different jobs. The author outlines best practices learned, such as splitting logs by type and processing in smaller windows, as well as next steps like using spot instances and NoSQL for improved performance and cost savings.
Slides for GUUG FFG2018 talk on rsyslog and containers. Describes the initial steps the rsyslog project took towards containers, uses cases seen by the team, problems we have seen and use of docker inside rsyslog's CI.
From a student to an apache committer practice of apache io tdbjixuan1989
This talk is introduce by Xiangdong Huang, who is a PPMC of Apache IoTDB (incubating) project, at Apache Event at Tsinghua University in China.
About the Event:
The open source ecosystem plays more and more important role in the world. Open source software is widely used in operating systems, cloud computing, big data, artificial intelligence, and industrial Internet. Many companies have gradually increased their participation in the open source community. Developers with open source experience are increasingly valued and favored by large enterprises. The Apache Software Foundation is one of the most important open source communities, contributing a large number of valuable open source software and communities to the world.
The invited guests of this lecture are all from ASF community, including the chairman of the Apache Software Foundation, three Apache members, Top 5 Apache code committers (according to Apache annual report), the first Committer in the Hadoop project in China, several Apache project mentors or VPs, and many Apache Committers. They will tell you what the open source culture is, how to join the Apache open source community, and the Apache Way.
Public Terabyte Dataset Project: Web crawling with Amazon Elastic MapReduceHadoop User Group
The document discusses the Public Terabyte Dataset Project which aims to create a large crawl of top US domains for public use on Amazon's cloud. It describes how the project uses various Amazon Web Services like Elastic MapReduce and SimpleDB along with technologies like Hadoop, Cascading, and Tika for web crawling and data processing. Common issues encountered include configuration problems, slow performance from fetching all web pages or using Tika language detection, and generating log files instead of results.
A presentation on the selection criteria, testing + evaluation and successful, zero-downtime migration to MongoDB. Additionally details on Wordnik's speed and stability are covered as well as how NoSQL technologies have changed the way Wordnik scales.
The document discusses a presentation on using PostgreSQL as a schemaless database. It provides an overview of different document storage options in PostgreSQL, including XML, hstore, and JSON. It then describes some performance tests conducted to compare loading and querying data stored in these PostgreSQL document formats versus a traditional relational schema and MongoDB. The test results showed PostgreSQL with a relational schema performed best for bulk loading, while PostgreSQL with B-tree indexes outperformed hstore, XML, JSON and MongoDB for primary key lookups. Hstore indexes were much slower than B-tree indexes for simple queries.
Get more than a cache back! The Microsoft Azure Redis Cache (NDC Oslo)Maarten Balliauw
The document discusses Azure Cache and Redis. It provides an overview of Redis, including its data types, transactions, pub/sub capabilities, scripting, and sharding/partitioning. It then discusses common patterns for using Redis, such as caching, counting likes on Facebook, getting the latest reviews, rate limiting, and autocompletion. The document emphasizes that Redis is very flexible and can be used for more than just caching, acting as a general datastore. It concludes by recommending a Redis reference book for further learning.
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNATomas Cervenka
Tomáš Červenka will discuss Hive, an open-source data warehousing system built on Hadoop that provides SQL-like queries over large datasets. He will explain what Hive is useful for (big data analytics and processing), and not useful for (real-time queries and algorithms difficult to parallelize). He will demonstrate how to get started with Hive using Amazon EMR and provide a sample query, and discuss how VisualDNA uses Hive for analytics, reporting pipelines, and machine learning inference. Tips provided include using fast instance types, compression, and partitioning data.
Generating privacy-protected synthetic data using Secludy and MilvusZilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty, is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Dandelion Hashtable: beyond billion requests per second on a commodity serverAntonios Katsarakis
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables, that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. In a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Monitoring and Managing Anomaly Detection on OpenShift.pdfTosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for life science domain, where you retriever information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
Digital Marketing Trends in 2024 | Guide for Staying AheadWask
https://www.wask.co/ebooks/digital-marketing-trends-in-2024
Feeling lost in the digital marketing whirlwind of 2024? Technology is changing, consumer habits are evolving, and staying ahead of the curve feels like a never-ending pursuit. This e-book is your compass. Dive into actionable insights to handle the complexities of modern marketing. From hyper-personalization to the power of user-generated content, learn how to build long-term relationships with your audience and unlock the secrets to success in the ever-shifting digital landscape.
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way to break data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is re-paid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
3. Home-growing a metrics culture
Needed visibility of live issues
Had trialled off-the-shelf tools before (Splunk)
They hadn't gained traction
Still wanted the data
4. Options...
Tried Splunk
...a bit pricey: you pay for hardware and for the volume of data indexed
Looked at cloud-based options; they were also expensive
19. Why the Queue?
● Resilience
● Single source of data for everyone
● Logstash used to recommend RabbitMQ; now they recommend Redis
● We still use RabbitMQ; it works for us
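As a sketch, a shipper/indexer pair in this shape might look like the following Logstash configs (hostnames, paths, exchange and queue names are illustrative, not our real setup):

```conf
# Shipper: tail application logs and publish them to RabbitMQ
input {
  file { path => "/var/log/app/*.log" }
}
output {
  rabbitmq {
    host          => "rabbitmq.example.com"
    exchange      => "logs"
    exchange_type => "topic"
    key           => "app.logs"
  }
}

# Indexer (a separate Logstash instance): consume the queue, index into Elasticsearch
input {
  rabbitmq {
    host  => "rabbitmq.example.com"
    queue => "logstash"
    key   => "app.#"
  }
}
output {
  elasticsearch { host => "es.example.com" }
}
```

The queue in the middle is what lets the indexers fall behind (or be restarted) without losing events.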
20. Kibana
● Easy to build dashboards
● Gateway drug to Elasticsearch queries
● Examples!
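Every Kibana panel is just an Elasticsearch query underneath — for example, an events-per-hour histogram boils down to a `date_histogram` aggregation roughly like this (index and field names are illustrative):

```json
GET /logstash-2015.03.02/_search
{
  "size": 0,
  "aggs": {
    "events_per_hour": {
      "date_histogram": { "field": "@timestamp", "interval": "1h" }
    }
  }
}
```

Once people see the query a dashboard generates, writing their own is a small step.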
27. Mistake: Using Elasticsearch as a TSDB
Lots of our graphs only cared about top-level values; those should use a time-series database (such as Graphite) instead.
Elasticsearch's use case is more in-depth data analysis.
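For those top-level values, Graphite's plaintext protocol is about as simple as it gets: one `metric value timestamp` line per datapoint, sent to carbon (port 2003 by default). A minimal sketch, with a made-up metric name and host:

```python
import socket
import time

def graphite_line(path, value, timestamp=None):
    """Format one metric in Graphite's plaintext protocol: 'path value unix_ts'."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)

def send_metric(path, value, host="graphite.example.com", port=2003):
    """Push a single top-level value straight to carbon's plaintext port."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(graphite_line(path, value).encode("ascii"))
```

No index, no mapping, no JSON: for "how many per minute" style graphs that's all a TSDB needs.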
28. Mistake: Trying to keep too much data
● Nodes running out of memory or disk space is bad
● Long GC pauses can cause nodes to drop out of the cluster
● That can lead to split brain
● More shards = more memory usage; watch your scaling
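Shard count per index is one of the knobs here: Elasticsearch 1.x defaults to five shards plus a replica for every daily logstash index, which adds up fast. An index template can pin it down — the values below are illustrative, not a recommendation:

```json
PUT /_template/logstash
{
  "template": "logstash-*",
  "settings": {
    "index.number_of_shards": 2,
    "index.number_of_replicas": 1
  }
}
```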
29. Scaling
We hit two bottlenecks:
- Ingestion (solved with SSDs)
- Search (solved by scaling horizontally)
1.4.0 brings stability improvements and should handle OOM better
30. Other Mistakes
Should have automated
sooner
(Good chef/puppet support)
Should have used
“normal” logstash more
More
node
More
awesome??
31. What went right?
● Free and easy access to data
● It doesn't need to be in Elasticsearch, but the tooling makes it easy
● Give people access and they'll seek out the data to drive decisions: start the feedback loop
● Dev/Test instance
32. ELK in the wild
Data Driven QA
Data Driven...Managering
33. But wait, there's more!
Curator, Kibana 4 (woo, aggregations!), alerting, linking logs together…
Too much to cover here!
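Curator, for instance, closes or deletes old indices on a schedule; with the 3.x-era CLI (flags vary between Curator versions, so check yours) deleting month-old daily indices looks roughly like:

```shell
# Assumes daily logstash-YYYY.MM.DD indices; run from cron
curator --host localhost delete indices --older-than 30 --time-unit days --timestring '%Y.%m.%d'
```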
34. Thanks for Listening!
More: elasticsearch.org, logstash.net
Blog: www.tegud.net
Twitter: @tegud
Github: www.github.com/tegud
Come say hi!