Visualizing HPCC Systems Log Data Using ELK

As part of the 2018 HPCC Systems Summit Community Day event:

Up first, Aramis Tanelus, American Heritage School, briefly discusses his poster, HPCC Systems Robotics Sensor Interface.

Following, Rodrigo Pastrana and Miguel Vazquez present their breakout session in the Systems Tools track.

Find out how to use ELK (a powerful log management stack composed of Elasticsearch, Logstash and Kibana) to visualize your HPCC Systems log data and deliver real-time actionable insights, trend analytics, system monitoring and much more. In this session, we will walk through log data extraction and visualization dashboard creation, and discuss related HPCC Systems and ELK insights.

Rodrigo Pastrana is an Architect with the HPCC Systems supercomputer, focusing on platform integration and plug-in development. He has been a member of the HPCC core technology team for over five years and a member of the LexisNexis team for seven. Rodrigo is the principal developer of WsSQL, the HPCC JDBC connector, the HPCC Java APIs library and tools, and the Dynamic ESDL component. He has more than fifteen years of experience in design, research and development of state-of-the-art technology, including IBM’s embedded text-to-speech and voice recognition products and Eclipse’s device development environment. Rodrigo holds an MS and BS in Computer Engineering from the University of Florida and during his professional career has filed more than ten patent disclosures through the USPTO.

Miguel Vazquez is a Consulting Software Engineer and has been with LexisNexis for 6 years. He brings 10+ years of application development experience to his role within the company. He oversees front-end development for several components, including ECL Watch, Dynamic ESDL, and Configuration Manager. He specializes in JavaScript and is a fan of the latest JavaScript trends.

Published in: Data & Analytics

  1. 1. Innovation and Reinvention Driving Transformation
      October 9, 2018
      2018 HPCC Systems® Community Day
      Rodrigo Pastrana & Miguel Vazquez
      Visualizing HPCC Systems Log Data Using ELK
  2. 2. Who are we?
      • Rodrigo Pastrana, Architect, HPCC Systems
      • Miguel Vazquez, Consulting SWE, HPCC Systems
  3. 3. Why visualize HPCC Systems log data?
      • Component logs contain a wealth of raw information
      • This data can be used for debugging, profiling, billing, accounting, analyzing, etc.
      • These actions are difficult to do with raw log data
      • Visualizing this data can help you find the needle in the haystack
  4. 4. A visualization is worth a thousand (or more) log entries…
  5. 5. Discussion Topics
      • What is ELK (and why)
      • HPCC Systems Log Details
      • ELK Topology and other Considerations
      • Sample ELK Topology
      • Sample ESP Transaction Info Processing
      • ELK Component Configuration
      • Demo Visualization of ESP Transaction data
  6. 6. Elasticsearch, Logstash, Kibana (and Beats): the Elastic Stack
      • Powerful, flexible open-source stack for log analytics from Elastic
      • Arguably the de facto standard for log processing and analytics
      Components
      • Beats: lightweight, single-purpose log data shippers
      • Logstash: ingests data from a number of different sources, then parses, filters, mutates, and stashes it
      • Elasticsearch: search and analytics engine; acts as the data store
      • Kibana: visualization front end
  7. 7. HPCC Systems Component Logs
      • Important to understand HPCC Systems log basics!
      • Logs generated for major HPCC Systems components
        • ESP, ROXIE, THOR, DALI, Sasha, DafileServ, DFUServer, ECLAgent, etc.
      • Log files are time-stamped and rolled daily
        • Running log link provided
      • Default location: /var/log/HPCCSystems/<componentname>
      • Default log message format (can be edited in configuration):
        • SequenceNumber, Date-TimeStamp, ProcessId, ThreadId, QuotedMessage
        • QuotedMessage: contains the actual log message
        • Aggregation of the other fields: uniquely identifies a log message instance
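The default log message format described on the slide above can be illustrated with a small parser. This is a minimal sketch, not HPCC Systems code: the regular expression, the `parse_log_line` helper, and the sample line are assumptions made for illustration (an 8-character hexadecimal sequence number, a `YYYY-MM-DD HH:MM:SS.mmm` timestamp, two integer IDs, and a double-quoted message).

```python
import re
from datetime import datetime

# Assumed layout of one default-format line (illustrative, not authoritative):
#   <8-char hex sequence> <date time.millis> <process id> <thread id> "<message>"
LOG_LINE = re.compile(
    r'^(?P<sequence>[0-9A-Fa-f]{8})\s+'
    r'(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+'
    r'(?P<process_id>\d+)\s+'
    r'(?P<thread_id>\d+)\s+'
    r'"(?P<message>.*)"\s*$',
    re.DOTALL,
)


def parse_log_line(line):
    """Split one log line into its five fields, or return None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["timestamp"] = datetime.strptime(fields["timestamp"], "%Y-%m-%d %H:%M:%S.%f")
    fields["process_id"] = int(fields["process_id"])
    fields["thread_id"] = int(fields["thread_id"])
    return fields


# Hypothetical sample line; only the quoted message shape comes from the slides.
sample = '0000002A 2018-10-09 14:03:27.512 12345 67890 "TxSummary[activeReqs=4;rcv=1ms;total=45ms;]"'
record = parse_log_line(sample)
print(record["sequence"], record["process_id"], record["message"])
```

Taken together, the sequence number, timestamp, process ID, and thread ID identify a single log entry, which is why the sketch keeps them as separate typed fields.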
  8. 8. ELK Implementation - Design Considerations
      • Several considerations when setting up an ELK system to process log data
      • What are we doing with the data?
        • Troubleshooting? Monitoring? Analytics? Reporting?
      • Type and volume of data
      • Data sensitivity
      • Resources required
      • Security
      • Compliance
      • Disaster recovery
      • Data backup and others
  9. 9. Our approach (to illustrate the point)
      • We will target ESP transactions and ROXIE completed queries
        • ESP logs conveniently provide transaction summary log entries
        • ROXIE provides similar query-complete summary log entries
        • Focuses on the most important info and minimizes the amount of data processed through ELK
        • This also prevents us from exposing sensitive data (PII, SPII, or business data)
      • We will target a remote ELK cluster
        • Processing a small subset of logs
        • No sensitive data!
        • Minimal resource contention with HPCC Systems components
  10. 10. Overall Topology [figure: ROXIE, ROXIE, ESP, …]
  11. 11. Metricbeat - Monitor cluster system health (optional)
  12. 12. Metricbeat - Monitor node system health (optional)
  13. 13. Set up Filebeat to forward HPCC Systems log entries
      • Filebeat is primarily responsible for tailing files, filtering, and multi-line stitching
      • Filebeat configuration lives in filebeat.yml
      • Declare a Filebeat “prospector” for each log message type to be forwarded
        • Set the prospector to target the running component log file
        • In the case of an HPCC ESP component labeled “myesp”:
          • /var/log/HPCCSystems/myesp/esp.log
        • Define a custom field “component” set to the component type (ESP, ROXIE, etc.)
          • The “component” field is subsequently used by Logstash
        • Declare regex patterns to include or exclude message types
          • include_lines: standard component log columns + “TxSummary …”
        • Declare regex patterns to handle multi-line TxSummary messages as a single entity
  14. 14. Filebeat ESP prospector (filebeat.yml)

        filebeat.prospectors:
        - type: log
          enabled: true
          # Targeting current log file (default 'myesp' ESP log location)
          paths:
            - /var/log/HPCCSystems/myesp/esp.log
          # Custom field declaration
          fields:
            component: esp
          fields_under_root: true
          encoding: plain
          # RegEx describes default log entry with quoted message with a leading "TxSummary"
          include_lines: ['([A-Z-0-9]{8})\s+(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3})\s+(\d+)\s+(\d+)\s+("TxSummary)']
          # Rule treats multi-line messages as single entity
          multiline.pattern: '([A-Z-0-9]{8})\s+(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3})\s+(\d+)\s+(\d+)\s+(")'
          multiline.negate: true
          multiline.match: after
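To make the interplay of the multiline rule and include_lines concrete, the following Python sketch imitates the two steps on a few lines. It is only an approximation of Filebeat's behavior under stated assumptions: the `stitch` and `include` helpers, the simplified header regex, and the sample lines are all hypothetical. With negate set to true and match set to after, a line that does not look like the start of a log entry is appended to the entry before it; the include filter is then applied to the stitched entries.

```python
import re

# Simplified stand-in for the header columns used in the slide's patterns (assumption).
HEADER = r'[0-9A-Fa-f]{8}\s+\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}\s+\d+\s+\d+\s+'
ENTRY_START = re.compile(HEADER + r'"')          # role of multiline.pattern
TX_SUMMARY = re.compile(HEADER + r'"TxSummary')  # role of include_lines


def stitch(lines):
    """negate: true, match: after -> non-matching lines join the preceding entry."""
    events = []
    for line in lines:
        if ENTRY_START.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


def include(events):
    """Keep only stitched entries whose message starts with TxSummary."""
    return [event for event in events if TX_SUMMARY.match(event)]


# Hypothetical log excerpt: the TxSummary message is broken across two lines.
raw = [
    '00000001 2018-10-09 14:03:27.100 111 222 "Starting request"',
    '00000002 2018-10-09 14:03:27.512 111 222 "TxSummary[activeReqs=4;rcv=1ms;',
    'total=45ms;]"',
    '00000003 2018-10-09 14:03:28.000 111 222 "Request done"',
]
forwarded = include(stitch(raw))
print(len(forwarded), "event(s) forwarded")
```

Only the stitched TxSummary entry survives, which is the small, non-sensitive subset of the log that the approach on the earlier slides aims to ship.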
  15. 15. filebeat.yml details continued…
      • Prospector definitions for other components are very similar
        • Target the appropriate component log file
        • Create the custom “component” field, populated with the appropriate label
        • Regex to include or exclude entry types of interest
      • Configure Filebeat to forward the filtered log entries to a remote Logstash instance
        • Logstash is assumed to listen for these messages on particular address(es) and ports (X.Y.Z.W:5044 for this example):

        #----------------------------- Logstash output ---------------------
        output.logstash:
          # The Logstash hosts
          hosts: ["X.Y.Z.W:5044"]
  16. 16. Process HPCC Systems log messages through Logstash [figure: ROXIE, ROXIE, ESP, …]
      • Capture HPCC Systems log messages
      • Parse
      • Structure
      • Filter
      • Mutate
      • Create and populate Elasticsearch indexes
  17. 17. Logstash input setup (logstash.conf)
      • Set up the Logstash input mechanism
        • Many options (stdin, s3, redis, pipe, file, log4j, kafka, etc.)
      • Let’s enable input for Filebeat messages via port 5044

        input {
          beats {
            port => 5044
          }
        }
  18. 18. Set up Logstash to process ESP transactions (logstash.conf)

        filter {
          # Rule for ESP based messages
          if [component] == "esp" {
            # Capture messages with known log format:
            # expected sequence of space delimited fields (fieldtype:fieldname)
            grok {
              match => { "message" => "%{BASE16NUM:sequence} %{TIMESTAMP_ISO8601:logtimestamp} +%{INT:processID} +%{INT:threadID} %{QUOTEDSTRING:logmessage}" }
            }
            # Parses string field "logmessage" into key value pairs
            kv {
              source => "logmessage"
              field_split => "[;"
              value_split => "="
            }

      Sample “logmessage” value:

        "TxSummary[activeReqs=4;rcv=1ms;user=ausr@127.0.0.1;req=POST wssmc.ACTIVITY v1.2;total=45ms;]"
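The key-value step on the slide above can be approximated in a few lines of Python. This is a simplified emulation under stated assumptions, not the Logstash kv filter itself: the `parse_kv` helper is hypothetical, and it simply splits on any character listed in field_split and then on the first value_split character, dropping fragments that contain no key. The input string is the sample “logmessage” shown on the slide.

```python
import re


def parse_kv(logmessage, field_split="[;", value_split="="):
    """Rough stand-in for kv: split into fields, then into key and value."""
    # Every character in field_split is treated as a field delimiter.
    tokens = re.split("[" + re.escape(field_split) + "]", logmessage)
    pairs = {}
    for token in tokens:
        key, separator, value = token.partition(value_split)
        if separator and key:
            pairs[key.strip()] = value.strip()
    return pairs


# Sample "logmessage" taken from the slide, including the surrounding quotes.
logmessage = ('"TxSummary[activeReqs=4;rcv=1ms;user=ausr@127.0.0.1;'
              'req=POST wssmc.ACTIVITY v1.2;total=45ms;]"')
for key, value in parse_kv(logmessage).items():
    print(key, "=", value)
```

Using both "[" and ";" as field delimiters is what lets the leading "TxSummary[" prefix fall away without a separate stripping step: the fragment before "[" has no "=" and is ignored.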
  19. 19. Set up Logstash to process ESP transactions (continued…) (logstash.conf)

            mutate {
              # Important to filter out noise
              remove_field => [ "message", "@version" ]
              # Assign meaningful field names
              rename => {
                "total" => "TotalTrxmS"
                "rcv" => "TimeReceived"
              }
              # Assign field type for aggregation purpose
              convert => {
                "threadID" => "integer"
                "processID" => "integer"
              }
            }
          # Create similar rules for other log message types
          } else if [component] == "roxie"
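The effect of the mutate rule above on a single event can be sketched as a plain dictionary transformation. This is an illustration only: the `mutate` function and the sample event are hypothetical, and Logstash applies its mutate operations in its own fixed order, which gives the same result here because the three operations touch different fields.

```python
def mutate(event):
    """Imitate the slide's mutate block on a dict-shaped event."""
    out = dict(event)
    # remove_field: drop the raw line and bookkeeping field (noise).
    for name in ("message", "@version"):
        out.pop(name, None)
    # rename: give the parsed TxSummary keys more meaningful names.
    for old, new in (("total", "TotalTrxmS"), ("rcv", "TimeReceived")):
        if old in out:
            out[new] = out.pop(old)
    # convert: store the IDs as integers so they can be aggregated.
    for name in ("threadID", "processID"):
        if name in out:
            out[name] = int(out[name])
    return out


# Hypothetical event as it might look after the grok and kv steps.
event = {
    "message": "<raw log line>",
    "@version": "1",
    "component": "esp",
    "processID": "12345",
    "threadID": "67890",
    "total": "45ms",
    "rcv": "1ms",
}
print(mutate(event))
```

Note that the renamed timing fields still hold strings such as "45ms"; the slide converts only the two ID fields.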
  20. 20. Set up Logstash to process ESP transactions (output to ES) (logstash.conf)

        # Define Logstash output rules
        output {
          if [component] == "esp" {
            # Forward processed messages to ES
            elasticsearch {
              hosts => ["yourelasticsearchaddress:9200"]
              # Important to establish appropriate indexing mechanism
              index => "esp-log-%{+YYYY.MM.dd}"
            }
          # Similar rules for ROXIE based messages
          } else if [component] == "roxie" { … }
        }
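The index setting above yields one index per day. The sketch below shows how such a name is derived from an event timestamp, assuming the date portion is formatted from the event time in UTC; the `daily_index` helper and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def daily_index(prefix, event_time):
    """Build a '<prefix>-YYYY.MM.dd' index name from a timezone-aware timestamp."""
    utc_time = event_time.astimezone(timezone.utc)
    return prefix + "-" + utc_time.strftime("%Y.%m.%d")


# An event logged on the day of the talk (hypothetical time of day).
event_time = datetime(2018, 10, 9, 14, 3, 27, tzinfo=timezone.utc)
print(daily_index("esp-log", event_time))

# An evening event in a UTC-4 zone falls into the next day's index.
local_evening = datetime(2018, 10, 9, 22, 0, tzinfo=timezone(timedelta(hours=-4)))
print(daily_index("esp-log", local_evening))
```

Daily indexes keep each index small and make it straightforward to expire old log data by dropping whole indexes.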
  21. 21. Confirm EL indexes are created
      http://yourkibanaip:5601/app/kibana#/dev_tools/console
  22. 22. Let’s create some visualizations
  23. 23. Discover your newly created log events
  24. 24. Some of Kibana’s visualization toolset
  25. 25. Visualization creation
      • Visualize > > Select a visualization type > Enter some metrics > Save
  26. 26. Deeper dive into what our users are doing
  27. 27. Visualizations using a search query
  28. 28. Let’s tie it all together with a Dashboard
      • Dashboard > > > Save
  29. 29. Available in 7.0
      • Ability to embed into the ECL Watch UI
  30. 30. Questions?
      Useful Links
      • https://hpccsystems.com/blog/ELK_visualizations
      • https://www.elastic.co/guide/index.html
      • https://github.com/rpastrana/hpcc-elk
      • http://cdn.hpccsystems.com/releases/CE-Candidate-7.0.0/docs/EN_US/HPCCSystemAdministratorsGuide_EN_US-7.0.0-rc2.pdf#page=26
      Contact Us
      • Rodrigo.Pastrana@lexisnexisrisk.com
      • Miguel.Vazquez@lexisnexisrisk.com
  31. 31. Thank you
