Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit Data



Warren Strange, Principal Systems Engineer, ForgeRock, presents a Breakout Session on the ELK Stack at the 2014 IRM Summit in Phoenix, Arizona.



  1. IRM Summit 2014. Customer Intelligence: Using the ELK stack (Elasticsearch, Logstash and Kibana) to analyse ForgeRock OpenAM audit data
  2. Make pretty pictures for the boss
  3. Coincidence?
  4. OpenDJ, OpenIDM and OpenAM produce copious amounts of audit data. Analysis of that data is left as an exercise for the reader. Many great SIEM tools exist, but there is a desire for an Open Source solution for data analysis.
  5. What is the ELK stack?
     ● Elasticsearch: “NoSQL” database
     ● Logstash: log collection and transformation
     ● Kibana: data visualizer for Elasticsearch
  6. Yes, but what does ELK do? It collects, analyses and visualizes any kind of data, at scale: GitHub (8 million repos), SoundCloud (30M users), The Guardian (40M documents). It answers questions such as:
     ● Where are my users coming from?
     ● What is the traffic in North America vs. Europe?
     ● Why do I see an employee logging in from Canada?
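The “where are my users coming from?” question maps directly onto an Elasticsearch terms aggregation over the country field that the logstash geoip filter adds. A minimal sketch in Python; the field name `geoip.country_name` and the index name in the comment are assumptions, not from the slides:

```python
import json

# Hypothetical aggregation body: count audit events per country.
# "geoip.country_name" is the field the logstash geoip filter
# typically emits -- treat it as an assumption for this sketch.
agg_body = {
    "size": 0,  # no individual hits, just the aggregation buckets
    "aggs": {
        "countries": {
            "terms": {"field": "geoip.country_name"}
        }
    }
}

# With a running cluster this body would be POSTed to, e.g.,
#   http://localhost:9200/logstash-*/_search
print(json.dumps(agg_body, indent=2))
```

Kibana builds essentially this query for you when you add a terms panel, which is why the dashboards in the demo need no hand-written queries.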
  7. Elasticsearch
     ● NoSQL, REST/JSON, document-oriented, schemaless, “big data” full-text search engine
     ● Apache 2.0 license
     ● Sweet spot is rapid full-text search and ad hoc queries
     ● Not a replacement for an RDBMS: not transactional, not ACID, etc.
     ● Built on the Apache Lucene project
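To make “document oriented” and “ad hoc queries” concrete, here is a sketch of one audit event as a schemaless Elasticsearch document, plus a full-text match query against it. The field values are invented; only the field names follow the amAccess columns used later in the talk:

```python
import json

# A made-up OpenAM access event as a JSON document -- no schema
# needs to be declared before indexing it.
doc = {
    "time": "2014-05-14 09:30:00",
    "LoginID": "bob",
    "IPAddr": "203.0.113.7",
    "MessageID": "AUTHENTICATION-105",
    "type": "amAccess",
}

# A full-text "match" query: the rapid ad hoc search that is
# Elasticsearch's sweet spot.
query = {"query": {"match": {"LoginID": "bob"}}}

# Over REST these would go to endpoints such as:
#   PUT  /logstash-2014.05.14/amAccess/1   (index the document)
#   POST /logstash-2014.05.14/_search      (run the query)
print(json.dumps(doc))
print(json.dumps(query))
```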
  8. Logstash
     ● Swiss army knife of log collection, transformation and forwarding
     ● JRuby based, with a correspondingly large footprint :-(
     ● lumberjack: a Go-based collector that feeds into logstash; very lightweight, small footprint
  9. Kibana
  10. Logstash flow
      ● Input: source files, database, syslog, etc.
      ● Filters: grep, regex, geoIP, ...
      ● Output: elasticsearch, file, db, syslog
      “Plugin”-based architecture: add new plugins for input, output and filters.
  11. Logstash example
      ● Input: file amAccess.* (read from the OpenAM access logs), tagged with type amAccess
      ● Filter: map the IP address to a geo location
      ● Output: elasticsearch:9100 (write the enriched result to Elasticsearch)
  12. Geek Alert!!
  13. Input section:

      input {
        file {
          type => "amAccess"
          path => "/logs/am/log/amAuthentication.*"
        }
      }

      Wildcards can be used in the path. Data is tagged with a type; use this to classify and search by type.
  14. Filter section:

      filter {
        if [type] == "amAccess" {
          csv {
            columns => ["time", "Data", "LoginID", "ContextID", "IPAddr",
                        "LogLevel", "Domain", "LoggedBy", "MessageID",
                        "ModuleName", "NameID", "HostName"]
            separator => " "
          }
          date {
            match => ["time", "yyyy-MM-dd HH:mm:ss"]
          }
          geoip {
            database => "/usr/share/GeoIP/GeoIP.dat"
            source => ["IPAddr"]
          }
        }
      }

      The filter applies only to the amAccess type: parse the line as CSV, normalize the date to a common format, and enrich the record with geo location data.
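What the csv filter does to each log line can be sketched in plain Python. The sample line below is invented, and the assumption that fields are quoted and tab-separated (the separator in the slide's config renders as whitespace) is mine:

```python
import csv
import io

# Column names from the logstash csv filter above.
COLUMNS = ["time", "Data", "LoginID", "ContextID", "IPAddr", "LogLevel",
           "Domain", "LoggedBy", "MessageID", "ModuleName", "NameID",
           "HostName"]

# Invented amAccess-style line: 12 quoted, tab-separated fields.
line = ('"2014-05-14 09:30:00"\t"Login Success"\t"bob"\t"ctx42"\t'
        '"203.0.113.7"\t"INFO"\t"dc=example,dc=com"\t"amAuth"\t'
        '"AUTHENTICATION-105"\t"DataStore"\t"bob"\t"openam.example.com"')

# csv handles the quoting; zip maps positional fields onto names,
# which is the named-field record the filter hands to Elasticsearch.
row = next(csv.reader(io.StringIO(line), delimiter="\t"))
event = dict(zip(COLUMNS, row))
print(event["IPAddr"])  # -> 203.0.113.7
```

The geoip step then looks up `event["IPAddr"]` in a MaxMind database and appends the resulting country and city fields to the same record.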
  15. Output section:

      output {
        stdout { codec => rubydebug }
        elasticsearch { host => "localhost" }
      }

      Send the data to Elasticsearch and to stdout.
  16. Demo Time. As seen on YouTube (49 views!)
  17. ELK demo environment (diagram): Apache:80/443 with a Policy Agent, OpenAM, OpenDJ and OpenIDM write log files; logstash ships them to elasticsearch:9100; Kibana visualizes the data.
  18. Marketing genius? Where to hold the next ForgeRock Summit: Europe, the USA, or Canada? The boss asks you to find out, pronto:
      ● What country are customers visiting the ForgeRock website from?
      ● How are they authenticating (ForgeRock account, or federated)?
  19. The next IRM Summit location: We have beer! Bring your toque!
  20. Next Steps
      ● Delivery models: cloud or appliance?
      ● Interested in collaborating? Share logstash configs, Kibana reports, etc.
      ● Puppet/Chef/Ansible/Docker installers?