Using AWS, Terraform, and Ansible for
DreamPort Projects - the Splunk Cluster
How we used (and are still using) tools such as AWS, Terraform, and Ansible to
automate everything about a Splunk cluster.
Intro
The Who, the What, the Why, and the How
Hands on Keys – Live Demo
Summary, Questions, Extra Deep Dives
On the Agenda Today...
Prerequisites – Terms and Tools
• Basic understanding of AWS and cloud computing platforms
• Aware of configuration management/orchestration tools such as
Terraform and Ansible
• Aware of the concepts of Docker
• Need to have a basic understanding of Splunk and a Splunk cluster
• PLEASE ASK QUESTIONS.
The Who – Me, MISI, and DreamPort
• Bill Cawthra - Cloud Infrastructure Architect
• I play with little fluffy clouds all day (AWS, Google Cloud, Azure)
• MISI/DreamPort - Support and help develop various cyber security projects
through collaboration with .gov, private industry, community, and .edu
• DreamPort projects – over 20 projects/AWS environments, usually 30-90
days long (some are notably longer)
• https://misi.tech/#about
• https://dreamport.tech/about-us.php
The What and the Why - The Splunk Evaluation
• We wanted to build a Splunk cluster to analyze its machine learning capabilities.
• The data set was 9 TB of Zeek data
• 20 users accessing this data at a time (so fairly light on the frontend)
• But very intense work done on the backend (indexers)
• Big beefy i3.8xlarge instances… The instance store gives fast IO but is ephemeral,
so we used Splunk SmartStore
• With the help of many people at Splunk (Bryan Pluta, Tyler Muth, Matt Toth, and
others), we came up with a design to fit these requirements
• We are going to use AWS, Terraform, and Ansible as our tools of choice
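The SmartStore choice above can be illustrated with a minimal sketch that renders an indexes.conf fragment pointing an index at an S3 remote volume. This is not taken from the project: the volume name remote_store, the bucket, the prefix, and the index name are hypothetical placeholders, and the real configuration in the cluster may set more options.

```python
from typing import List


def render_smartstore_conf(bucket: str, prefix: str, index_names: List[str]) -> str:
    """Render an indexes.conf fragment that stores index data on an S3 remote volume.

    The volume name "remote_store" and all arguments are illustrative assumptions.
    """
    lines = [
        "[volume:remote_store]",
        "storageType = remote",
        f"path = s3://{bucket}/{prefix}",
        "",
    ]
    for name in index_names:
        lines += [
            f"[{name}]",
            # remotePath tells Splunk to keep the master copy of warm buckets remotely
            "remotePath = volume:remote_store/$_index_name",
            # local paths are still needed; the fast instance store acts as a cache
            f"homePath = $SPLUNK_DB/{name}/db",
            f"coldPath = $SPLUNK_DB/{name}/colddb",
            f"thawedPath = $SPLUNK_DB/{name}/thaweddb",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_smartstore_conf("example-bucket", "smartstore", ["zeek"]))
```

Because the remote volume holds the durable copy, losing an ephemeral instance store only costs the local cache, which is why the two fit together.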
The How - AWS
• Amazon Web Services; provides an on-demand
computing platform
• "Elastic" resources
• Allows us to rapidly scale out and scale down
• Very easy to manage many disparate projects
• Best datacenter money can buy
The How - Terraform
• Our infrastructure configuration tool of choice
• This "frames the house"; creating the AWS resources (VPC, security
groups, instances, IAM policies, IAM roles, S3 buckets, etc)
• Enforces configuration from the very start (no GUI, no artisanally
crafted architecture)
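To make "framing the house" concrete, here is a minimal sketch that writes a Terraform configuration in Terraform's JSON syntax (a .tf.json file) from Python. It is an illustration under assumptions, not the project's code: the resource name splunk_indexer, the AMI ID, the tags, and the instance count are hypothetical; only the i3.8xlarge instance type comes from the talk.

```python
import json


def indexer_config(count: int, ami_id: str, instance_type: str = "i3.8xlarge") -> dict:
    """Build a Terraform JSON document declaring `count` indexer instances.

    All names and values here are placeholders for illustration.
    """
    return {
        "resource": {
            "aws_instance": {
                "splunk_indexer": {
                    "count": count,
                    "ami": ami_id,
                    "instance_type": instance_type,
                    "tags": {
                        # Terraform evaluates ${...} inside JSON strings as a template
                        "Name": "splunk-indexer-${count.index}",
                        "role": "indexer",
                    },
                }
            }
        }
    }


def write_config(path: str, config: dict) -> None:
    """Write the configuration where `terraform plan` can pick it up."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)


if __name__ == "__main__":
    write_config("indexers.tf.json", indexer_config(3, "ami-EXAMPLE"))
```

The point is the same as on the slide: the instance definition lives in a file under version control, so nothing is clicked together by hand.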
The How - Ansible - Drywall, Paint,
and Fixtures
• Our automation and configuration management tool of choice
• Handles configuration of systems
• Handles automation tasks (upgrade and reboot of systems… and ingest orchestration!)
• Does everything after the "house is framed"
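One way to hand the framed house over to Ansible is a dynamic inventory: a script that prints JSON grouping hosts by role. The sketch below builds that JSON structure. The host names, IP addresses, and group names are hypothetical, and the project may well use a static inventory or an AWS inventory plugin instead.

```python
import json
from collections import defaultdict
from typing import Dict, List


def build_inventory(instances: List[Dict[str, str]]) -> dict:
    """Group instances by role in the JSON shape Ansible expects from an inventory script."""
    groups = defaultdict(lambda: {"hosts": []})
    hostvars = {}
    for inst in instances:
        groups[inst["role"]]["hosts"].append(inst["name"])
        hostvars[inst["name"]] = {"ansible_host": inst["ip"]}
    inventory = dict(groups)
    # _meta.hostvars lets Ansible read per-host variables without extra script calls
    inventory["_meta"] = {"hostvars": hostvars}
    return inventory


if __name__ == "__main__":
    example = [
        {"name": "idx1", "role": "indexers", "ip": "10.0.0.11"},
        {"name": "idx2", "role": "indexers", "ip": "10.0.0.12"},
        {"name": "sh1", "role": "search_heads", "ip": "10.0.0.21"},
    ]
    print(json.dumps(build_inventory(example), indent=2))
```

With groups like these, a play can target only the indexers for an upgrade and reboot while leaving the search heads alone.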
The How - Docker
• Easy binary management (example: to upgrade, just docker pull
splunk:<VERSION>)
• The docker-splunk project makes it very easy to assign roles and access
variables
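As a rough sketch of role assignment through the container, the function below assembles a docker run command line that sets the role through environment variables. The image name splunk/splunk, the version string, and the container naming scheme are assumptions for illustration; the environment variable names follow the docker-splunk documentation referenced later in this deck, so check them there before relying on them.

```python
from typing import List


def splunk_run_command(version: str, role: str, image: str = "splunk/splunk") -> List[str]:
    """Return the argv for starting one Splunk container with the given role.

    The admin password is passed through from the caller's environment
    (-e NAME with no value) so it never appears in the command line.
    """
    return [
        "docker", "run", "-d",
        "--name", f"splunk-{role}",
        "-e", "SPLUNK_START_ARGS=--accept-license",
        "-e", f"SPLUNK_ROLE={role}",
        "-e", "SPLUNK_PASSWORD",
        f"{image}:{version}",
    ]


if __name__ == "__main__":
    # "9.1.2" is a placeholder version, not the one used in the project
    print(" ".join(splunk_run_command("9.1.2", "splunk_indexer")))
```

Upgrading then amounts to changing the version argument and recreating the container, which matches the "just docker pull" workflow on the slide.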
The How - Infrastructure Diagram
Before We Go Live
• I will be covering things at a high level
• I will be skipping many things
• Ask questions if you want to see XYZ
• Look at the code on your own too!
• It’s tricky to balance keeping the talk concise against showing the detail of the code
• Need to avoid turning this into a code review session…
• If something looks confusing or wrong, I probably made a mistake.
Before We Go Live - Resources
• https://github.com/TheDreamPort/splunk-infrastructure (sanitized
version of this project)
• Also great references:
• https://splunk.github.io/splunk-ansible/ - Splunk Ansible reference
• https://splunk.github.io/docker-splunk/ - Splunk Docker
TO THE TERMINAL AND BROWSER
Conclusion
• We automate automate automate
• Which means, we configure/deploy everything programmatically
• Ingest is automated
• Makes it so easy to redo
• Break up the automation into logical pieces
• It is not fun having a single mega-script
Extra Notes - Splunk Ingest
• Ingest the 9 TB of data in batches (basically did it a month at a time) and
wait for completion
• Limited disk space on the ingesters
• Minimize impact of mistakes
• Had to be very specific on what was ingested; did not want to duplicate
data
• Ingest process would attempt to detect if a file had been ingested
• Had to verify data was properly ingested (document count of files vs
document count in Splunk)
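The ingest notes above can be sketched as three small steps: group files into monthly batches, skip files already ingested, and compare expected against indexed counts. This is a simplified illustration, not the project's ingest code. It assumes paths that start with a YYYY-MM-DD directory and that the set of ingested files and the per-file counts are supplied by the caller; how the real process tracks them is not shown in the slides.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def group_by_month(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Batch files by month, assuming paths such as '2020-03-15/conn.log.gz'."""
    batches = defaultdict(list)
    for path in sorted(paths):
        day = path.split("/")[0]       # '2020-03-15'
        batches[day[:7]].append(path)  # key on '2020-03'
    return dict(batches)


def pending_files(batch: List[str], already_ingested: Set[str]) -> List[str]:
    """Drop files that were ingested before, to avoid duplicating data."""
    return [path for path in batch if path not in already_ingested]


def mismatched_files(expected: Dict[str, int], indexed: Dict[str, int]) -> List[str]:
    """Return files whose indexed document count differs from the source count."""
    return [path for path, count in expected.items() if indexed.get(path) != count]


if __name__ == "__main__":
    files = ["2020-03-15/conn.log.gz", "2020-03-16/dns.log.gz", "2020-04-01/conn.log.gz"]
    for month, batch in group_by_month(files).items():
        todo = pending_files(batch, {"2020-03-15/conn.log.gz"})
        print(month, todo)
```

Keeping each batch to one month limits how much disk the ingesters need at once and how much has to be redone when a batch fails verification.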
Extra Notes - Monitoring and Logging
• Delicious dashboards using Grafana
• Graphs the Prometheus metric data
• Can graph Loki events too (logs)
Questions? Comments?

Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 

Using AWS, Terraform, and Ansible to Automate Splunk at Scale

  • 1. Using AWS, Terraform, and Ansible for DreamPort Projects - the Splunk Cluster How we used (and are still using) tools such as AWS, Terraform, and Ansible to automate everything about a Splunk cluster.
  • 2. Intro The Who, the What, the Why, and the How Hands on Keys – Live Demo Summary, Questions, Extra Deep Dives On the Agenda Today...
  • 3. Prerequisites – Terms and Tools • Basic understanding of AWS and cloud computing platforms • Aware of configuration management/orchestration tools such as Terraform and Ansible • Aware of the concepts of Docker • Need to have a basic understanding of Splunk and a Splunk cluster • PLEASE ASK QUESTIONS.
  • 4. The Who – Me, MISI, and DreamPort • Bill Cawthra - Cloud Infrastructure Architect • I play with little fluffy clouds all day (AWS, Google Cloud, Azure) • MISI/DreamPort - Support and help develop various cyber security projects through collaboration with .gov, private industry, community, and .edu • DreamPort projects – over 20 projects/AWS environments, usually 30-90 days long (some are notably longer) • https://misi.tech/#about • https://dreamport.tech/about-us.php
  • 5. The What and the Why - The Splunk Evaluation • We wanted to build a Splunk cluster to analyze its machine learning capabilities. • The data set was 9 TB of Zeek data • 20 users accessing this data at a time (so fairly light on the frontend) • But very intense work done on the backend (indexers) • Big beefy i3.8xlarge instances… Use the instance-store for fast IO (but ephemeral! Therefore we used Splunk SmartStore) • With the help of many people at Splunk (Bryan Pluta, Tyler Muth, Matt Toth, and others), we came up with a design to fit these requirements • We are going to use AWS, Terraform, and Ansible as our tools of choice
  • 6. The How - AWS • Amazon Web Services; provides an on-demand computing platform • "Elastic" resources • Allows us to rapidly scale out and scale down • Very easy to manage many disparate projects • Best datacenter money can buy
  • 7. The How - Terraform • Our infrastructure configuration tool of choice • This "frames the house"; creating the AWS resources (VPC, security groups, instances, IAM policies, IAM roles, S3 buckets, etc.) • Enforces configuration from the very start (no GUI. No artisanally crafted architecture)
  • 8. The How - Ansible - Drywall, Paint, and Fixtures • Our automation and configuration management tool of choice • Handles configuration of systems • Handles automation tasks (upgrade and reboot of systems… and ingest orchestration!) • Does everything after the "house is framed"
  • 9. The How - Docker • Easy binary management (example: to upgrade, just docker pull splunk:<VERSION>) • The docker-splunk project makes it very easy to assign roles, access variables
  • 10. The How - Infrastructure Diagram
  • 11. Before We Go Live • I will be covering things at a high level • I will be skipping many things • Ask questions if you want to see XYZ • Look at the code on your own too! • It’s tricky to balance being concise in a talk and detail of the code • Need to avoid turning this into a code review session… • If something looks confusing or wrong, I probably made a mistake.
  • 12. Before We Go Live - Resources • https://github.com/TheDreamPort/splunk-infrastructure (sanitized version of this project) • Also great references: • https://splunk.github.io/splunk-ansible/ - Splunk Ansible reference • https://splunk.github.io/docker-splunk/ - Splunk Docker
  • 13. TO THE TERMINAL AND BROWSER
  • 14. Conclusion • We automate, automate, automate • Which means we configure/deploy everything programmatically • Ingest is automated • Makes it so easy to redo • Break up the automation into logical pieces • It is not fun having a single mega-script
  • 15. Extra Notes - Splunk Ingest • Ingested the 9 TB of data in batches (basically a month at a time), waiting for each batch to complete • Limited disk space on the ingesters • Minimize impact of mistakes • Had to be very specific on what was ingested; did not want to duplicate data • Ingest process would attempt to detect if a file had been ingested • Had to verify data was properly ingested (document count of files vs document count in Splunk)
  • 16. Extra Notes - Monitoring and Logging • Delicious dashboards using Grafana • Graphs the Prometheus metric data • Can graph Loki events too (logs)
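
The batch ingest described on slide 15 (a month at a time, skip files already ingested, then compare document counts) could be sketched roughly as follows. This is only an illustration, not the project's actual Ansible-driven process: the `<name>.YYYY-MM-DD.log` file naming, the `ingest_file` and `count_in_index` callables, and the in-memory `already_ingested` set are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set


def group_by_month(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group paths by a YYYY-MM key.

    Assumes a hypothetical '<name>.YYYY-MM-DD.log' naming scheme.
    """
    batches: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        date_part = path.rsplit(".", 2)[-2]   # e.g. '2019-03-14'
        batches[date_part[:7]].append(path)   # e.g. '2019-03'
    return dict(batches)


def ingest_in_batches(
    paths: Iterable[str],
    already_ingested: Set[str],
    ingest_file: Callable[[str], int],     # hypothetical: sends a file, returns documents sent
    count_in_index: Callable[[str], int],  # hypothetical: documents indexed for that file
) -> List[str]:
    """Ingest one month at a time and return the files whose counts do not match."""
    mismatches: List[str] = []
    batches = group_by_month(paths)
    for month in sorted(batches):
        sent_counts: Dict[str, int] = {}
        for path in sorted(batches[month]):
            if path in already_ingested:
                continue  # never send the same file twice
            sent_counts[path] = ingest_file(path)
            already_ingested.add(path)
        # Verify the whole batch before moving on to the next month.
        for path, sent in sent_counts.items():
            if count_in_index(path) != sent:
                mismatches.append(path)
    return mismatches
```

Keeping the batch boundary at one month limits how much has to be redone when a mistake slips through, which matches the "minimize impact of mistakes" point on the slide.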

Editor's Notes

  1. Splunk search-head (1): c5d.12xlarge (48 vCPU, 96 GB) • Splunk indexer (9): i3.8xlarge (32 vCPU, 244 GB each), 7600 GB of instance storage • Splunk universal-forwarders (4): i3.2xlarge (8 vCPU, 61 GB each), 1900 GB of instance storage • Splunk master-node (1): i3.large (2 vCPU, 15 GB) • Splunk monitor (1): i3.large (2 vCPU, 15 GB)
  2. If you want to follow along or poke around the code and find the flaws, go here.
  3. If you want to follow along or poke around the code and find the flaws, go here.
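
As a rough worked example of the sizing in note 1, the sketch below totals the instance storage per tier and compares it with the 9 TB data set. It assumes the storage figures are per instance, that 1 TB = 1000 GB, and that the "ingesters" with limited disk space are the universal forwarders. It compares raw capacity only and ignores compression, replication, and SmartStore caching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    count: int
    storage_gb_each: int  # instance storage per node, taken from note 1

    @property
    def total_storage_gb(self) -> int:
        return self.count * self.storage_gb_each


INDEXERS = Tier("indexer", count=9, storage_gb_each=7600)
FORWARDERS = Tier("universal-forwarder", count=4, storage_gb_each=1900)
DATASET_GB = 9 * 1000  # 9 TB of Zeek data, decimal units assumed


def holds_dataset(tier: Tier, dataset_gb: int = DATASET_GB) -> bool:
    """True if the tier's combined raw instance storage covers the data set."""
    return tier.total_storage_gb >= dataset_gb


if __name__ == "__main__":
    for tier in (INDEXERS, FORWARDERS):
        print(tier.name, tier.total_storage_gb, holds_dataset(tier))
    # indexer 68400 True
    # universal-forwarder 7600 False
```

Under these assumptions the forwarders together hold less than the full data set, which is consistent with ingesting in batches rather than staging everything at once.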