Logstash is a tool for managing logs that provides input, filter, and output plugins to collect, parse, and deliver log data. It treats logs as events that pass through the input, filter, and output phases; popular plugins include file, redis, grok, and elasticsearch. The document also provides guidance on using Logstash in a clustered configuration with an agent-and-server model to optimize log collection, processing, and storage.
2. Why use Logstash?
• We already have Splunk, syslog-ng, Chukwa, Graylog2, Scribe, Flume, and so on.
• But we want a free, lightweight, and reliable framework for our logs:
• not free --> Splunk
• heavy Java --> Scribe, Flume
• loses data --> syslog
• not flexible --> nxlog
3. How does Logstash work?
• Like the others, Logstash has input/filter/output plugins.
• Attention: Logstash processes events, not (only) log lines!
• "Inputs generate events, filters modify them, outputs ship them elsewhere." -- [the life of an event in logstash]
• "events are passed from each phase using internal queues......Logstash sets each queue size to 20." -- [the life of an event in logstash]
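The three phases above map directly onto the sections of a Logstash config file. A minimal agent-side sketch (the file path, the grep condition, and the redis key are illustrative; option names follow the 1.1.x era discussed in these slides, so adjust for your version):

```
input {
  file {
    type => "linux-syslog"
    path => "/var/log/messages"
  }
}
filter {
  # only keep lines that look like errors
  grep {
    type  => "linux-syslog"
    match => [ "@message", "error" ]
  }
}
output {
  # ship events to a redis list for the server to consume
  redis {
    host      => "127.0.0.1"
    data_type => "list"
    key       => "logstash"
  }
}
```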
8. Usage in cluster - agent install
• Only an 'all-in-one' jar is available for download at http://logstash.net/
• The full source (Ruby and JRuby) is at http://github.com/logstash/
• But we want a lightweight agent in the cluster.
11. Usage in cluster - server install
• The server is just another agent that runs some filters and storage outputs.
• Message queue (RabbitMQ is too heavy; Redis is just enough):
– yum install redis-server
– service redis-server start
• Storage: MongoDB / Elasticsearch / Riak
• Visualization: Kibana / statsd / Riemann / OpenTSDB
• Run:
– java -jar logstash-1.1.0-monolithic.jar agent -f logstash/etc/server.conf
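A server.conf for this layout would pull events off the Redis list and index them into Elasticsearch. A sketch under the same assumptions as above (host names are illustrative; option names follow the 1.1.x era and may differ in your release):

```
input {
  # consume events the agents pushed onto the redis list
  redis {
    host      => "127.0.0.1"
    data_type => "list"
    key       => "logstash"
    type      => "redis-input"
  }
}
output {
  # index into Elasticsearch (the embedded instance works out of the box)
  elasticsearch {
    host => "127.0.0.1"
  }
}
```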
13. Usage in cluster - grok
• jls-grok is a pattern tool written in Ruby (runs on JRuby).
• Lots of examples can be found at:
https://github.com/logstash/logstash/tree/master/patterns
• Here are my "nginx" patterns:
– NGINXURI %{URIPATH}(?:%{URIPARAM})*
– NGINXACCESS [%{HTTPDATE}] %{NUMBER:code:int} %{IP:client} %{HOSTNAME} %{WORD:method} %{NGINXURI:req} %{URIPROTO}/%{NUMBER:version} %{IP:upstream}(:%{POSINT:port})? %{NUMBER:upstime:float} %{NUMBER:reqtime:float} %{NUMBER:size:int} "(%{URIPROTO}://%{HOST:referer}%{NGINXURI:referer}|-)" %{QS:useragent} "(%{IP:x_forwarder_for}|-)"
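Under the hood, grok patterns are shorthand for named-capture regexes. A rough Python sketch of the idea, using a simplified access-log line (the field names mirror a few of the NGINXACCESS captures above; the regex fragments are illustrative, not the official grok definitions):

```python
import re

# A simplified expansion of a grok-style pattern into named captures.
LINE_RE = re.compile(
    r'\[(?P<timestamp>[^\]]+)\] '   # [%{HTTPDATE}]
    r'(?P<code>\d{3}) '             # %{NUMBER:code:int}
    r'(?P<client>\d+\.\d+\.\d+\.\d+) '  # %{IP:client}
    r'(?P<method>\w+) '             # %{WORD:method}
    r'(?P<req>\S+)'                 # %{NGINXURI:req}
)

def parse(line):
    """Return a dict of captured fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["code"] = int(fields["code"])  # grok's :int conversion
    return fields

print(parse('[16/Sep/2012:10:00:00 +0800] 200 1.2.3.4 GET /index.html'))
# → {'timestamp': '16/Sep/2012:10:00:00 +0800', 'code': 200,
#    'client': '1.2.3.4', 'method': 'GET', 'req': '/index.html'}
```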
14. Usage in cluster - elasticsearch
• Elasticsearch is a production-ready search engine built on Lucene for cloud computing.
• More information at:
– http://www.elasticsearch.cn/
• Logstash already ships with an embedded Elasticsearch!
• Attention: if you want to build your own distributed Elasticsearch cluster, make sure the server version matches the client version used by Logstash!
16. Usage in cluster - elasticsearch
• The embedded web front end for ES is too simple, sometimes naïve~ Try Kibana and elasticsearch-head.
• https://github.com/rashidkpc/Kibana
• https://github.com/mobz/elasticsearch-head.git
• Attention: there is a bug in ES ---- bring your external network interface down (ifdown) before ES starts, and back up (ifup) afterwards. Otherwise your Ruby client cannot connect to the ES server!
17. Try it please!
• Ah, don't want to install, install, install, and install again?
• Here is a killer application:
– sudo zypper install virtualbox rubygems
– gem install vagrant
– git clone https://github.com/mediatemple/log_wrangler.git
– cd log_wrangler
– PROVISION=1 vagrant up
18. Other output example
• For monitoring (example):
– filter {
–   grep {
–     type => "linux-syslog"
–     match => [ "@message", "(error|ERROR|CRITICAL)" ]
–     add_tag => [ "nagios-update" ]
–     add_field => [ "nagios_host", "%{@source_host}",
–                    "nagios_service", "the name of your nagios service check" ]
–   }
– }
– output {
–   nagios {
–     commandfile => "/usr/local/nagios/var/rw/nagios.cmd"
–     tags => "nagios-update"
–     type => "linux-syslog"
–   }
– }
19. Other output example
• For metrics:
– output {
–   statsd {
–     increment => "apache.response.%{response}"
–     count => [ "apache.bytes", "%{bytes}" ]
–   }
– }
20. Advanced Questions
• Is Ruby 1.8.7 stable enough?
• Try the Message::Passing module on CPAN; I love Perl~
• Is Elasticsearch fast enough?
• Try Sphinx; see the report in the ELSA project:
– In designing ELSA, I tried the following components but found them too slow. Here they are ordered from fastest to slowest for indexing speeds (non-scientifically tested):
1. Tokyo Cabinet
2. MongoDB
3. TokuDB MySQL plugin
4. Elastic Search (Lucene)
5. Splunk
6. HBase
7. CouchDB
8. MySQL Fulltext
• http://code.google.com/p/enterprise-log-search-and-archive/wiki/Documentation#Why_ELSA?
21. Advanced Testing
• How many events/sec can Elasticsearch hold?
• - Logstash::Output::Elasticsearch (HTTP) only indexes 200+ msg/sec per thread.
• - Tried the _bulk API myself, using the Perl ElasticSearch::Transport::HTTPLite module.
• -- speed testing result: 2500+ msg/sec
• -- testing record:
http://chenlinux.com/2012/09/16/elasticsearch-bulk-index-speed-testing/
WHY?!
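The _bulk API gets its speed from batching many index operations into one newline-delimited request body, which is then POSTed to /_bulk. A minimal Python sketch of how such a body is assembled (index and type names are illustrative):

```python
import json

def build_bulk_body(docs, index="logstash-2012.09.16", doc_type="nginx"):
    """Build a newline-delimited _bulk body: one action line followed by
    one document line per event, terminated by a trailing newline."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

body = build_bulk_body([{"code": 200, "client": "1.2.3.4"},
                        {"code": 404, "client": "5.6.7.8"}])
print(body)  # ready to POST to http://localhost:9200/_bulk
```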
22. Maybe…
• Logstash uses an experimental module: we can see that Logstash::Output::ElasticsearchHTTP uses ftw as its HTTP client, and it cannot hold a bulk size larger than 200!!
• So we suggest using a multi-output block in agent.conf instead.
23. Advanced ES Settings(1)--problems
• Kibana can search data using the facets APIs. But when you index URLs, they get auto-split on '/'~~
• A facet search on the IP field over 10 million messages takes 0.1s, but on the URL field... ah, timeout!
• When you check your indices size, you will find that (indices size / indices count) : message length ~~ 10:1 !!
24. Advanced ES Settings(2)--solution
• Set an Elasticsearch default _mapping template!
• In fact, ES "stores" the index data and then "stores" the stored data... Yes! If you don't set "store": "no", all the data is stored twice.
• ES also has many analyzer plugins. They automatically split words by whitespace, path hierarchy, keyword, etc.
• So set "index": "not_analyzed", and faceting over 100k+ URLs finishes in 1s.
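A default-mapping index template along those lines might look like this sketch (the field names are illustrative; the syntax follows the pre-1.0 Elasticsearch index-template API of this era):

```json
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "_all": { "enabled": false },
      "properties": {
        "url":    { "type": "string", "index": "not_analyzed", "store": "no" },
        "client": { "type": "ip", "store": "no" }
      }
    }
  }
}
```

With "index": "not_analyzed", the URL is indexed as a single term instead of being split on '/', which is what makes faceting on it fast; "store": "no" avoids keeping a second copy of each field outside _source.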
25. Advanced ES Settings(2)--solution
• Optimize:
• Calling the _optimize API every day may decrease the index size somewhat~
• You can find those solutions at:
• https://github.com/logstash/logstash/wiki/Elasticsearch-Storage-Optimization
• https://github.com/logstash/logstash/wiki/Elasticsearch----Using-index-templates-&-dynamic-
26. Advanced Input -- question
• Now we know how to disable the _all field, but there are still duplicated fields: @fields and @message!
• Logstash searches ES in the @message field by default, but Logstash::Filter::Grok by default captures variables into @fields from @message!
• How to solve this?
27. Advanced Input -- solution
• Some other systems, like Message::Passing, have encode/decode steps in addition to input/filter/output.
• In fact Logstash has them too~ but renames them 'format'.
• So we can define the message format ourselves, just using log_format in nginx.conf.
• (example as follows)
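A log_format that emits each access-log line as a ready-made JSON event might look like this sketch (the variable-to-field mapping is a hypothetical illustration; the field layout follows the @timestamp/@fields event format of 1.1.x-era Logstash, and the file path matches the tail command on the json_event slide):

```nginx
log_format json '{"@timestamp":"$time_iso8601",'
                '"@fields":{'
                  '"client":"$remote_addr",'
                  '"status":"$status",'
                  '"request":"$request_uri",'
                  '"upstreamtime":"$upstream_response_time",'
                  '"size":"$body_bytes_sent"}}';

access_log /data/nginx/logs/access.json json;
```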
29. Advanced Input -- json_event
• Now define the input block with a format:
– input {
–   stdin {
–     type => "nginx"
–     format => "json_event"
–   }
– }
• And start it on the command line:
– tail -F /data/nginx/logs/access.json
–   | sed 's/upstreamtime":-/upstreamtime":0/'
–   | /usr/local/logstash/bin/logstash -f /usr/local/logstash/etc/agent.conf &
• Attention: upstreamtime may be "-" if the status is 400; the sed rewrites it to 0.
30. Advanced Web GUI
• Write your own website using the Elasticsearch RESTful API to search, as follows:
– curl -XPOST http://es.domain.com:9200/logstash-2012.09.18/nginx/_search?pretty=1 -d '
  {
    "query": {
      "range": {
        "@timestamp": {
          "from": "now-1h",
          "to": "now"
        }
      }
    },
    "facets": {
      "curl_test": {
        "date_histogram": {
          "key_field": "@timestamp",
          "value_field": "url",
          "interval": "5m"
        }
      }
    },
    "size": 0
  }
  '
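The same search can be issued from any HTTP client; a minimal Python sketch that builds the request body (the date_histogram facet follows the pre-1.0 Elasticsearch facets API shown on this slide):

```python
import json

def last_hour_histogram_query(interval="5m"):
    """Body for a facets search: the last hour of events, bucketed by interval."""
    return {
        "query": {
            "range": {"@timestamp": {"from": "now-1h", "to": "now"}}
        },
        "facets": {
            "curl_test": {
                "date_histogram": {
                    "key_field": "@timestamp",
                    "value_field": "url",
                    "interval": interval,
                }
            }
        },
        "size": 0,  # we only want the facet buckets, not the hits
    }

# POST this (json.dumps'd) to http://es.domain.com:9200/logstash-2012.09.18/nginx/_search
print(json.dumps(last_hour_histogram_query(), indent=2))
```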
31. Additional Message::Passing demo
• I wrote a demo using the Message::Passing, Regexp::Log, ElasticSearch, and other Perl modules, working similarly to the Logstash usage shown here.
• See:
– http://chenlinux.com/2012/09/16/message-passing-agent/
– http://chenlinux.com/2012/09/16/regexp-log-demo-for-nginx/
– http://chenlinux.com/2012/09/16/message-passing-filter-demo/