SlideShare a Scribd company logo
1 of 22
Download to read offline
Logging @ Scale on AWS
Visualization > Plaintext
Who am I?
• Chris	Riddell
• Tech	Co-Founder	of	some	start	ups,	Senior	Software	Engineer
• Big	Data	guy	– Redshift,	S3,	EC2,	EMR,	Hive,	Spark,	Dynamo	and	
many	others….
• I	manage	big	AWS	infrastructure	from	software	to	architecture	to	
DevOps (start	up	life...)
• Java	(+	others)	/	AWS	/	geek
• AWS	Professionally	Certified	Solution	Architect	(ask	me	how!)
• Not	a	logging	expert	- like	to	think	I’m	getting	close	J
Logging @ Scale on AWS
• Use	cases
• What	to	log
• Commercial	options
• Common	tools
• Plausible	architectures
• Demo!	EC2	w/Fluentd ->	Kinesis	Firehose ->	Elasticsearch
Why log?
• Get	visibility	into	app	health,	centrally	accessible	and	
searchable
• Alerts	&	fast	error	event	response	– Agility!
• Not	having	to	login	to	individual	instances	and	tail
• Contextual	tracking e.g.	user	behaviour
• Finding	code	optimisations
• …and	many	many	other	reasons!
First have common language
• What	is	DEBUG,	INFO,	WARN	and	ERROR	used	for	in	your	
organisation?
• Have	common	language	for	what	should	be	logged	where
• Bad	leveling	messes	up	your	storage
• E.g.	DEBUG	logs	going	to	your	expensive	Elasticsearch store,	when	
they	never	need	to	be	searched
• One	guy’s	opinion:	
http://stackoverflow.com/a/8021604/3843660
Centralising the logs
• Let’s	get	them	off	the	host
• Basic	DIY:
• Syslog-ng,	rsyslogd,	nxlog
• Advanced	DIY:
• Splunk forwarder,	Logstash,	Flume,	Fluentd
• Third	party
• SaaS
Commercial options: SaaS
• LogEntries,	Sumo	Logic, Loggly,	Splunk Cloud,	PaperTrails,	
AWS	CloudWatch Logs…
• Typically	RESTful JSON	log	dump	APIs	
• Search	&	visualizations	are	core	features
• Most	have	a	free	tier	
• Many	libraries	available	for	various	languages	and/or	
packaged	versions
• Costs	go	up	with	data	size,	retention	period	and	user	count
• Nice	to	have:	User	defined	alerts;	S3	archival…..
Self-hosted solutions
• Splunk – Enterprise	solution	consisting	of	forwarders,	search	
heads,	and	indexers.	Licensed.
• ELK/EFK	stack	=	Elasticsearch,	Logstash or	Fluentd, Kibana
• Today	=	Fluentd (forwarder),	Kinesis	Firehose,	Elasticsearch,	
Kibana
• FKFEK	stack?
Elasticsearch?
• For	search!
• Indexes	
• Shards	- Distributed	&	scales	out
• Replicas
• JSON	REST	API
• Apache	Lucene
• Kibana is	an	Elasticsearch plugin	that	provides	a	nice	
interface	to	the	search	data	*with	visualizations*
Logstash & Fluentd agents
• Packaged	install
• Input	and	output	logs
• Centralise	your	instance	logs
• Often	used	as	a	syslog	tail’er or	as	a	local	HTTP	log	endpoint
• Parse/transform/filter/tag
• Store	or	Forward
• “Logstash emphasizes	flexibility	and	
interoperability whereas Fluentd prioritizes	simplicity	and	
robustness”	- http://goo.gl/f5I4cL
Lo
Architecture – Pre October 2015
Today’s Solution
Lo
Syslog/Fluentd
on	EC2
Kinesis	
Firehose Elasticsearch
Demo: Set up EC2 and Fluentd
• We	spin	up	a	default	AWS	AMI	EC2	instance	with	role	
permission	to	push	data	to	Firehose,	access	via	SSH	&	HTTP)
• We	SSH	in	and	install	Fluentd
• curl -L https://td-
toolbelt.herokuapp.com/sh/install-redhat-td-
agent2.sh | sh
• /usr/sbin/td-agent-gem install fluent-plugin-
kinesis #install	AWS	FH	plugin
• Then	configure	fluentd to	push	our	syslog’s
Demo: Fluentd config (/etc/td-agent/td-
agent.conf)
## Syslog reader. Configure port 42185 to send events to in rsyslog config
<source>
type syslog
port 42185
bind 0.0.0.0
tag system
</source>
## Filters to transform records and add metadata
<filter **>
type record_transformer
enable_ruby
<record>
@timestamp ${require 'time'; Time.now.utc.iso8601}
</record>
</filter>
## Output to Firehose using the instance role
<match **>
@type kinesis_firehose
region us-west-2
delivery_stream_name logs
flush_interval 2s
</match>
Demo: restart log agents and serve HTTP
# After /etc/td-agent/td-agent.conf has been setup
# Send syslog to fluentd listener
echo "*.* @127.0.0.1:42185" | sudo tee /etc/rsyslog.d/22-
fluent.conf
sudo service td-agent restart
sudo service rsyslog restart
# Let’s make a web server for you to push your own logs!
mkdir web
cd web
echo 'Hello!' > index.html
sudo python -m SimpleHTTPServer 80 |& logger -t httpsvr &
Demo: Setting up Elasticsearch
• We	set	up	AWS	Elasticsearch Service:	Some	notes:
• Dedicated	master	- performs	cluster	management	tasks,	doesn’t	
hold	data
• Metrics:	The	usual	stuff.	Note	the	JVMMemoryPressuremetric.	
Amazon	recommends	scale	up/out	if	>	85%
• Cluster	status	is	yellow	on	single	node	because	replicas	cannot	be	
assigned.	Add	a	node	or	change	the	setting
• Cluster	has	it’s	own	access	policy.	If	you	choose	instance	role	
access	control,	you	must	sign	all	requests	to	ES	(use	AWS	SDKs).	
You	will	not	be	able	to	access	Kibana on	this	setting
• Check	what	size	the	instance	store	is	on	your	selected	instance	
type,	or	use	EBS
Demo: Setting up Kinesis Firehose
• We	create	a	logs	delivery	stream	with	Elasticsearch as	the	
target
• Firehose:	Some	notes:
• A	pipeline	to	push	data	in	at	high	scale
• Dump	data	in,	and	it	buffers	recrods(logs	in	our	case)	them	before	
pushing	to	Elasticsearch and	optionally	S3	(Redshift	also	
supported)
• Pay	per	GB	of	ingestion	$0.035USD	(each	record	rounded	to	
nearest	5kb)
• Different	destinations	(e.g.	WARN/ERROR	to	Elasticsearch but	rest	
only	to	S3)	would	need	different	Firehose delivery	streams
Limitations? Further features?
• AWS’s	Elasticsearch is	limited	on	plugins,	but	very	good	out	of	the	box	
settings
• Fast	way	to	get	high	throughput,	high	scale	logs	into	a	stored	index	with	
S3	backups,	and	visualisation!
• Further	features	using	AWS	Lambda	as	glue:	
• Alerts	
• CloudTrial/S3	logs	ingestion	to	FH/Elasticsearch,	
• Deletion	of	old	indexes	at	a	specified	period	(we	want	the	last	X	days	only)	
• and	more…
• S3	bucket	policy	for	storage	optimisation (eg old	stuff	to	glacier)
• Your	custom	applications:	Push	directly	to	Fluentd’s HTTP	endpoint,	not	
via	syslog	(more	flexibility	and	tagging)
• Further	customisations to	Fluentd config
The end!
• Thanks!
• @ChrisJRiddell
• @ParrotAnalytics
• Hiring	Intermediate	/Senior	Java	Engineers	J
• Upcoming:	Web	dev &	more	engineers
• https://parrot-analytics.workable.com/ to	apply

More Related Content

What's hot

Security Day What's (nearly) New
Security Day What's (nearly) NewSecurity Day What's (nearly) New
Security Day What's (nearly) NewAmazon Web Services
 
Automated Compliance and Governance with AWS Config and AWS CloudTrail - June...
Automated Compliance and Governance with AWS Config and AWS CloudTrail - June...Automated Compliance and Governance with AWS Config and AWS CloudTrail - June...
Automated Compliance and Governance with AWS Config and AWS CloudTrail - June...Amazon Web Services
 
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013Amazon Web Services
 
Transparency and Control with AWS CloudTrail and AWS Config
Transparency and Control with AWS CloudTrail and AWS ConfigTransparency and Control with AWS CloudTrail and AWS Config
Transparency and Control with AWS CloudTrail and AWS ConfigAmazon Web Services
 
Account Separation and Mandatory Access Control
Account Separation and Mandatory Access ControlAccount Separation and Mandatory Access Control
Account Separation and Mandatory Access ControlAmazon Web Services
 
Using AWS CloudTrail and AWS Config to Enhance Governance and Compliance of A...
Using AWS CloudTrail and AWS Config to Enhance Governance and Compliance of A...Using AWS CloudTrail and AWS Config to Enhance Governance and Compliance of A...
Using AWS CloudTrail and AWS Config to Enhance Governance and Compliance of A...Amazon Web Services
 
February 2016 Webinar Series - Introducing VPC Support for AWS Lambda
February 2016 Webinar Series - Introducing VPC Support for AWS LambdaFebruary 2016 Webinar Series - Introducing VPC Support for AWS Lambda
February 2016 Webinar Series - Introducing VPC Support for AWS LambdaAmazon Web Services
 
(SEC304) Architecting for HIPAA Compliance on AWS
(SEC304) Architecting for HIPAA Compliance on AWS(SEC304) Architecting for HIPAA Compliance on AWS
(SEC304) Architecting for HIPAA Compliance on AWSAmazon Web Services
 
AWS APAC Webinar Week - Understanding AWS Storage Options
AWS APAC Webinar Week - Understanding AWS Storage OptionsAWS APAC Webinar Week - Understanding AWS Storage Options
AWS APAC Webinar Week - Understanding AWS Storage OptionsAmazon Web Services
 
Building Serverless Chat Bots - AWS August Webinar Series
Building Serverless Chat Bots - AWS August Webinar SeriesBuilding Serverless Chat Bots - AWS August Webinar Series
Building Serverless Chat Bots - AWS August Webinar SeriesAmazon Web Services
 
Getting Started with Docker on AWS
Getting Started with Docker on AWSGetting Started with Docker on AWS
Getting Started with Docker on AWSAmazon Web Services
 
SEC303 Automating Security in Cloud Workloads with DevSecOps
SEC303 Automating Security in Cloud Workloads with DevSecOpsSEC303 Automating Security in Cloud Workloads with DevSecOps
SEC303 Automating Security in Cloud Workloads with DevSecOpsAmazon Web Services
 
(SEC403) Diving into AWS CloudTrail Events w/ Apache Spark on EMR
(SEC403) Diving into AWS CloudTrail Events w/ Apache Spark on EMR(SEC403) Diving into AWS CloudTrail Events w/ Apache Spark on EMR
(SEC403) Diving into AWS CloudTrail Events w/ Apache Spark on EMRAmazon Web Services
 
BDA403 How Netflix Monitors Applications in Real-time with Amazon Kinesis
BDA403 How Netflix Monitors Applications in Real-time with Amazon KinesisBDA403 How Netflix Monitors Applications in Real-time with Amazon Kinesis
BDA403 How Netflix Monitors Applications in Real-time with Amazon KinesisAmazon Web Services
 
Account Separation and Mandatory Access Control
Account Separation and Mandatory Access ControlAccount Separation and Mandatory Access Control
Account Separation and Mandatory Access ControlAmazon Web Services
 
(SEC306) Turn on CloudTrail: Log API Activity in Your AWS Account | AWS re:In...
(SEC306) Turn on CloudTrail: Log API Activity in Your AWS Account | AWS re:In...(SEC306) Turn on CloudTrail: Log API Activity in Your AWS Account | AWS re:In...
(SEC306) Turn on CloudTrail: Log API Activity in Your AWS Account | AWS re:In...Amazon Web Services
 

What's hot (20)

Security Day What's (nearly) New
Security Day What's (nearly) NewSecurity Day What's (nearly) New
Security Day What's (nearly) New
 
Deep Dive on AWS IoT
Deep Dive on AWS IoTDeep Dive on AWS IoT
Deep Dive on AWS IoT
 
Automated Compliance and Governance with AWS Config and AWS CloudTrail - June...
Automated Compliance and Governance with AWS Config and AWS CloudTrail - June...Automated Compliance and Governance with AWS Config and AWS CloudTrail - June...
Automated Compliance and Governance with AWS Config and AWS CloudTrail - June...
 
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
Getting Maximum Performance from Amazon Redshift (DAT305) | AWS re:Invent 2013
 
Transparency and Control with AWS CloudTrail and AWS Config
Transparency and Control with AWS CloudTrail and AWS ConfigTransparency and Control with AWS CloudTrail and AWS Config
Transparency and Control with AWS CloudTrail and AWS Config
 
Account Separation and Mandatory Access Control
Account Separation and Mandatory Access ControlAccount Separation and Mandatory Access Control
Account Separation and Mandatory Access Control
 
Using AWS CloudTrail and AWS Config to Enhance Governance and Compliance of A...
Using AWS CloudTrail and AWS Config to Enhance Governance and Compliance of A...Using AWS CloudTrail and AWS Config to Enhance Governance and Compliance of A...
Using AWS CloudTrail and AWS Config to Enhance Governance and Compliance of A...
 
February 2016 Webinar Series - Introducing VPC Support for AWS Lambda
February 2016 Webinar Series - Introducing VPC Support for AWS LambdaFebruary 2016 Webinar Series - Introducing VPC Support for AWS Lambda
February 2016 Webinar Series - Introducing VPC Support for AWS Lambda
 
Monitoring and Alerting
Monitoring and AlertingMonitoring and Alerting
Monitoring and Alerting
 
(SEC304) Architecting for HIPAA Compliance on AWS
(SEC304) Architecting for HIPAA Compliance on AWS(SEC304) Architecting for HIPAA Compliance on AWS
(SEC304) Architecting for HIPAA Compliance on AWS
 
AWS APAC Webinar Week - Understanding AWS Storage Options
AWS APAC Webinar Week - Understanding AWS Storage OptionsAWS APAC Webinar Week - Understanding AWS Storage Options
AWS APAC Webinar Week - Understanding AWS Storage Options
 
Building Serverless Chat Bots - AWS August Webinar Series
Building Serverless Chat Bots - AWS August Webinar SeriesBuilding Serverless Chat Bots - AWS August Webinar Series
Building Serverless Chat Bots - AWS August Webinar Series
 
Getting Started with Docker on AWS
Getting Started with Docker on AWSGetting Started with Docker on AWS
Getting Started with Docker on AWS
 
SRV408 Deep Dive on AWS IoT
SRV408 Deep Dive on AWS IoTSRV408 Deep Dive on AWS IoT
SRV408 Deep Dive on AWS IoT
 
SEC303 Automating Security in Cloud Workloads with DevSecOps
SEC303 Automating Security in Cloud Workloads with DevSecOpsSEC303 Automating Security in Cloud Workloads with DevSecOps
SEC303 Automating Security in Cloud Workloads with DevSecOps
 
(SEC403) Diving into AWS CloudTrail Events w/ Apache Spark on EMR
(SEC403) Diving into AWS CloudTrail Events w/ Apache Spark on EMR(SEC403) Diving into AWS CloudTrail Events w/ Apache Spark on EMR
(SEC403) Diving into AWS CloudTrail Events w/ Apache Spark on EMR
 
Sec301 Security @ (Cloud) Scale
Sec301 Security @ (Cloud) ScaleSec301 Security @ (Cloud) Scale
Sec301 Security @ (Cloud) Scale
 
BDA403 How Netflix Monitors Applications in Real-time with Amazon Kinesis
BDA403 How Netflix Monitors Applications in Real-time with Amazon KinesisBDA403 How Netflix Monitors Applications in Real-time with Amazon Kinesis
BDA403 How Netflix Monitors Applications in Real-time with Amazon Kinesis
 
Account Separation and Mandatory Access Control
Account Separation and Mandatory Access ControlAccount Separation and Mandatory Access Control
Account Separation and Mandatory Access Control
 
(SEC306) Turn on CloudTrail: Log API Activity in Your AWS Account | AWS re:In...
(SEC306) Turn on CloudTrail: Log API Activity in Your AWS Account | AWS re:In...(SEC306) Turn on CloudTrail: Log API Activity in Your AWS Account | AWS re:In...
(SEC306) Turn on CloudTrail: Log API Activity in Your AWS Account | AWS re:In...
 

Viewers also liked

Softnix Logger Centralized Log Management
Softnix Logger Centralized Log ManagementSoftnix Logger Centralized Log Management
Softnix Logger Centralized Log ManagementSoftnix Technology
 
Batch and Stream processing with SQL
Batch and Stream processing with SQLBatch and Stream processing with SQL
Batch and Stream processing with SQLSATOSHI TAGOMORI
 
Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)
Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)
Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)Matt Fuller
 
Presto for the Enterprise @ Hadoop Meetup
Presto for the Enterprise @ Hadoop MeetupPresto for the Enterprise @ Hadoop Meetup
Presto for the Enterprise @ Hadoop MeetupWojciech Biela
 
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Matt Fuller
 
Hadoop World 2011: Big Data Architecture: Integrating Hadoop with Other Enter...
Hadoop World 2011: Big Data Architecture: Integrating Hadoop with Other Enter...Hadoop World 2011: Big Data Architecture: Integrating Hadoop with Other Enter...
Hadoop World 2011: Big Data Architecture: Integrating Hadoop with Other Enter...Cloudera, Inc.
 
Log management principle and usage
Log management principle and usageLog management principle and usage
Log management principle and usageBikrant Gautam
 
Presto as a Service - Tips for operation and monitoring
Presto as a Service - Tips for operation and monitoringPresto as a Service - Tips for operation and monitoring
Presto as a Service - Tips for operation and monitoringTaro L. Saito
 
Big Data: SQL query federation for Hadoop and RDBMS data
Big Data:  SQL query federation for Hadoop and RDBMS dataBig Data:  SQL query federation for Hadoop and RDBMS data
Big Data: SQL query federation for Hadoop and RDBMS dataCynthia Saracco
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScyllaDB
 
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...Amazon Web Services
 
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseHybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseDataWorks Summit
 
Choosing Your Log Management Approach: Buy, Build or Outsource
Choosing Your Log Management Approach: Buy, Build or OutsourceChoosing Your Log Management Approach: Buy, Build or Outsource
Choosing Your Log Management Approach: Buy, Build or OutsourceAnton Chuvakin
 
7 Reasons your existing SIEM is not enough
7 Reasons your existing SIEM is not enough7 Reasons your existing SIEM is not enough
7 Reasons your existing SIEM is not enoughCloudAccess
 
NIST 800-92 Log Management Guide in the Real World
NIST 800-92 Log Management Guide in the Real WorldNIST 800-92 Log Management Guide in the Real World
NIST 800-92 Log Management Guide in the Real WorldAnton Chuvakin
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Sadayuki Furuhashi
 
Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine kiran palaka
 
(DVO315) Log, Monitor and Analyze your IT with Amazon CloudWatch
(DVO315) Log, Monitor and Analyze your IT with Amazon CloudWatch(DVO315) Log, Monitor and Analyze your IT with Amazon CloudWatch
(DVO315) Log, Monitor and Analyze your IT with Amazon CloudWatchAmazon Web Services
 

Viewers also liked (20)

Softnix Logger Centralized Log Management
Softnix Logger Centralized Log ManagementSoftnix Logger Centralized Log Management
Softnix Logger Centralized Log Management
 
Batch and Stream processing with SQL
Batch and Stream processing with SQLBatch and Stream processing with SQL
Batch and Stream processing with SQL
 
Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)
Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)
Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)
 
Presto for the Enterprise @ Hadoop Meetup
Presto for the Enterprise @ Hadoop MeetupPresto for the Enterprise @ Hadoop Meetup
Presto for the Enterprise @ Hadoop Meetup
 
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
 
Prestogres internals
Prestogres internalsPrestogres internals
Prestogres internals
 
Hadoop World 2011: Big Data Architecture: Integrating Hadoop with Other Enter...
Hadoop World 2011: Big Data Architecture: Integrating Hadoop with Other Enter...Hadoop World 2011: Big Data Architecture: Integrating Hadoop with Other Enter...
Hadoop World 2011: Big Data Architecture: Integrating Hadoop with Other Enter...
 
Log management principle and usage
Log management principle and usageLog management principle and usage
Log management principle and usage
 
Presto as a Service - Tips for operation and monitoring
Presto as a Service - Tips for operation and monitoringPresto as a Service - Tips for operation and monitoring
Presto as a Service - Tips for operation and monitoring
 
Big Data: SQL query federation for Hadoop and RDBMS data
Big Data:  SQL query federation for Hadoop and RDBMS dataBig Data:  SQL query federation for Hadoop and RDBMS data
Big Data: SQL query federation for Hadoop and RDBMS data
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
 
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
 
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseHybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
 
Choosing Your Log Management Approach: Buy, Build or Outsource
Choosing Your Log Management Approach: Buy, Build or OutsourceChoosing Your Log Management Approach: Buy, Build or Outsource
Choosing Your Log Management Approach: Buy, Build or Outsource
 
7 Reasons your existing SIEM is not enough
7 Reasons your existing SIEM is not enough7 Reasons your existing SIEM is not enough
7 Reasons your existing SIEM is not enough
 
Presto - SQL on anything
Presto  - SQL on anythingPresto  - SQL on anything
Presto - SQL on anything
 
NIST 800-92 Log Management Guide in the Real World
NIST 800-92 Log Management Guide in the Real WorldNIST 800-92 Log Management Guide in the Real World
NIST 800-92 Log Management Guide in the Real World
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
 
Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine
 
(DVO315) Log, Monitor and Analyze your IT with Amazon CloudWatch
(DVO315) Log, Monitor and Analyze your IT with Amazon CloudWatch(DVO315) Log, Monitor and Analyze your IT with Amazon CloudWatch
(DVO315) Log, Monitor and Analyze your IT with Amazon CloudWatch
 

Similar to AWS Meet-up: Logging At Scale on AWS

Splunk: Forward me the REST of those shells
Splunk: Forward me the REST of those shellsSplunk: Forward me the REST of those shells
Splunk: Forward me the REST of those shellsAnthony D Hendricks
 
Adding Security and Compliance to Your Workflow with InSpec
Adding Security and Compliance to Your Workflow with InSpecAdding Security and Compliance to Your Workflow with InSpec
Adding Security and Compliance to Your Workflow with InSpecMandi Walls
 
Collabnix Online Webinar: Integrated Log Analytics & Monitoring using Docker ...
Collabnix Online Webinar: Integrated Log Analytics & Monitoring using Docker ...Collabnix Online Webinar: Integrated Log Analytics & Monitoring using Docker ...
Collabnix Online Webinar: Integrated Log Analytics & Monitoring using Docker ...Ajeet Singh Raina
 
CodeIgniter Ant Scripting
CodeIgniter Ant ScriptingCodeIgniter Ant Scripting
CodeIgniter Ant ScriptingAlbert Rosa
 
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek PROIDEA
 
Docker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackDocker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackJakub Hajek
 
DEF CON 24 - workshop - Craig Young - brainwashing embedded systems
DEF CON 24 - workshop - Craig Young - brainwashing embedded systemsDEF CON 24 - workshop - Craig Young - brainwashing embedded systems
DEF CON 24 - workshop - Craig Young - brainwashing embedded systemsFelipe Prado
 
IBM Think Session 8598 Domino and JavaScript Development MasterClass
IBM Think Session 8598 Domino and JavaScript Development MasterClassIBM Think Session 8598 Domino and JavaScript Development MasterClass
IBM Think Session 8598 Domino and JavaScript Development MasterClassPaul Withers
 
OSSEC @ ISSA Jan 21st 2010
OSSEC @ ISSA Jan 21st 2010OSSEC @ ISSA Jan 21st 2010
OSSEC @ ISSA Jan 21st 2010wremes
 
Service Discovery using etcd, Consul and Kubernetes
Service Discovery using etcd, Consul and KubernetesService Discovery using etcd, Consul and Kubernetes
Service Discovery using etcd, Consul and KubernetesSreenivas Makam
 
Alabama CyberNow 2018: Cloud Hardening and Digital Forensics Readiness
Alabama CyberNow 2018: Cloud Hardening and Digital Forensics ReadinessAlabama CyberNow 2018: Cloud Hardening and Digital Forensics Readiness
Alabama CyberNow 2018: Cloud Hardening and Digital Forensics ReadinessToni de la Fuente
 
Best Practices for Design Hardware APIs
Best Practices for Design Hardware APIsBest Practices for Design Hardware APIs
Best Practices for Design Hardware APIsMatt Haines
 
Building an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult StepsBuilding an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult StepsDigitalOcean
 
Introduction to Node.js
Introduction to Node.jsIntroduction to Node.js
Introduction to Node.jsRichard Lee
 
Rundeck: The missing tool
Rundeck: The missing toolRundeck: The missing tool
Rundeck: The missing toolArtur Martins
 
Automating AWS Compliance with InSpec
Automating AWS Compliance with InSpec Automating AWS Compliance with InSpec
Automating AWS Compliance with InSpec Matt Ray
 
Fluentd - RubyKansai 65
Fluentd - RubyKansai 65Fluentd - RubyKansai 65
Fluentd - RubyKansai 65N Masahiro
 
Aws security with HIDS using Ossec
Aws security with HIDS using OssecAws security with HIDS using Ossec
Aws security with HIDS using OssecGaurav Harsola
 
Aws security with HIDS, OSSEC
Aws security with HIDS, OSSECAws security with HIDS, OSSEC
Aws security with HIDS, OSSECMayank Gaikwad
 
Containerisation Hack of a Legacy Software Solution - Alex Carter - CodeMill ...
Containerisation Hack of a Legacy Software Solution - Alex Carter - CodeMill ...Containerisation Hack of a Legacy Software Solution - Alex Carter - CodeMill ...
Containerisation Hack of a Legacy Software Solution - Alex Carter - CodeMill ...CodeMill digital skills
 

Similar to AWS Meet-up: Logging At Scale on AWS (20)

Splunk: Forward me the REST of those shells
Splunk: Forward me the REST of those shellsSplunk: Forward me the REST of those shells
Splunk: Forward me the REST of those shells
 
Adding Security and Compliance to Your Workflow with InSpec
Adding Security and Compliance to Your Workflow with InSpecAdding Security and Compliance to Your Workflow with InSpec
Adding Security and Compliance to Your Workflow with InSpec
 
Collabnix Online Webinar: Integrated Log Analytics & Monitoring using Docker ...
Collabnix Online Webinar: Integrated Log Analytics & Monitoring using Docker ...Collabnix Online Webinar: Integrated Log Analytics & Monitoring using Docker ...
Collabnix Online Webinar: Integrated Log Analytics & Monitoring using Docker ...
 
CodeIgniter Ant Scripting
CodeIgniter Ant ScriptingCodeIgniter Ant Scripting
CodeIgniter Ant Scripting
 
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek
 
Docker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackDocker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic Stack
 
DEF CON 24 - workshop - Craig Young - brainwashing embedded systems
DEF CON 24 - workshop - Craig Young - brainwashing embedded systemsDEF CON 24 - workshop - Craig Young - brainwashing embedded systems
DEF CON 24 - workshop - Craig Young - brainwashing embedded systems
 
IBM Think Session 8598 Domino and JavaScript Development MasterClass
IBM Think Session 8598 Domino and JavaScript Development MasterClassIBM Think Session 8598 Domino and JavaScript Development MasterClass
IBM Think Session 8598 Domino and JavaScript Development MasterClass
 
OSSEC @ ISSA Jan 21st 2010
OSSEC @ ISSA Jan 21st 2010OSSEC @ ISSA Jan 21st 2010
OSSEC @ ISSA Jan 21st 2010
 
Service Discovery using etcd, Consul and Kubernetes
Service Discovery using etcd, Consul and KubernetesService Discovery using etcd, Consul and Kubernetes
Service Discovery using etcd, Consul and Kubernetes
 
Alabama CyberNow 2018: Cloud Hardening and Digital Forensics Readiness
Alabama CyberNow 2018: Cloud Hardening and Digital Forensics ReadinessAlabama CyberNow 2018: Cloud Hardening and Digital Forensics Readiness
Alabama CyberNow 2018: Cloud Hardening and Digital Forensics Readiness
 
Best Practices for Design Hardware APIs
Best Practices for Design Hardware APIsBest Practices for Design Hardware APIs
Best Practices for Design Hardware APIs
 
Building an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult StepsBuilding an Observability Platform in 389 Difficult Steps
Building an Observability Platform in 389 Difficult Steps
 
Introduction to Node.js
Introduction to Node.jsIntroduction to Node.js
Introduction to Node.js
 
Rundeck: The missing tool
Rundeck: The missing toolRundeck: The missing tool
Rundeck: The missing tool
 
Automating AWS Compliance with InSpec
Automating AWS Compliance with InSpec Automating AWS Compliance with InSpec
Automating AWS Compliance with InSpec
 
Fluentd - RubyKansai 65
Fluentd - RubyKansai 65Fluentd - RubyKansai 65
Fluentd - RubyKansai 65
 
Aws security with HIDS using Ossec
Aws security with HIDS using OssecAws security with HIDS using Ossec
Aws security with HIDS using Ossec
 
Aws security with HIDS, OSSEC
Aws security with HIDS, OSSECAws security with HIDS, OSSEC
Aws security with HIDS, OSSEC
 
Containerisation Hack of a Legacy Software Solution - Alex Carter - CodeMill ...
Containerisation Hack of a Legacy Software Solution - Alex Carter - CodeMill ...Containerisation Hack of a Legacy Software Solution - Alex Carter - CodeMill ...
Containerisation Hack of a Legacy Software Solution - Alex Carter - CodeMill ...
 

Recently uploaded

New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 

Recently uploaded (20)

New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 

AWS Meet-up: Logging At Scale on AWS

  • 1. Logging @ Scale on AWS Visualization > Plaintext
  • 2. Who am I? • Chris Riddell • Tech Co-Founder of some start ups, Senior Software Engineer • Big Data guy – Redshift, S3, EC2, EMR, Hive, Spark, Dynamo and many others…. • I manage big AWS infrastructure from software to architecture to DevOps (start up life...) • Java (+ others) / AWS / geek • AWS Professionally Certified Solution Architect (ask me how!) • Not a logging expert - like to think I’m getting close J
  • 3. Logging @ Scale on AWS • Use cases • What to log • Commercial options • Common tools • Plausible architectures • Demo! EC2 w/Fluentd -> Kinesis Firehose -> Elasticsearch
  • 4. Why log? • Get visibility into app health, centrally accessible and searchable • Alerts & fast error event response – Agility! • Not having to login to individual instances and tail • Contextual tracking e.g. user behaviour • Finding code optimisations • …and many many other reasons!
  • 5. First have common language • What is DEBUG, INFO, WARN and ERROR used for in your organisation? • Have common language for what should be logged where • Bad leveling messes up your storage • E.g. DEBUG logs going to your expensive Elasticsearch store, when they never need to be searched • One guy’s opinion: http://stackoverflow.com/a/8021604/3843660
  • 6. Centralising the logs • Let’s get them off the host • Basic DIY: • Syslog-ng, rsyslogd, nxlog • Advanced DIY: • Splunk forwarder, Logstash, Flume, Fluentd • Third party • SaaS
  • 7. Commercial options: SaaS • LogEntries, Sumo Logic, Loggly, Splunk Cloud, PaperTrails, AWS CloudWatch Logs… • Typically RESTful JSON log dump APIs • Search & visualizations are core features • Most have a free tier • Many libraries available for various languages and/or packaged versions • Costs go up with data size, retention period and user count • Nice to have: User defined alerts; S3 archival…..
  • 8. Self-hosted solutions • Splunk – Enterprise solution consisting of forwarders, search heads, and indexers. Licensed. • ELK/EFK stack = Elasticsearch, Logstash or Fluentd, Kibana • Today = Fluentd (forwarder), Kinesis Firehose, Elasticsearch, Kibana • FKFEK stack?
  • 9. Elasticsearch? • For search! • Indexes • Shards - Distributed & scales out • Replicas • JSON REST API • Apache Lucene • Kibana is an Elasticsearch plugin that provides a nice interface to the search data *with visualizations*
  • 10. Logstash & Fluentd agents • Packaged install • Input and output logs • Centralise your instance logs • Often used as a syslog tail’er or as a local HTTP log endpoint • Parse/transform/filter/tag • Store or Forward • “Logstash emphasizes flexibility and interoperability whereas Fluentd prioritizes simplicity and robustness” - http://goo.gl/f5I4cL
  • 11. Lo
  • 12. Architecture – Pre October 2015
  • 15. Demo: Set up EC2 and Fluentd • We spin up a default AWS AMI EC2 instance with role permission to push data to Firehose, access via SSH & HTTP) • We SSH in and install Fluentd • curl -L https://td- toolbelt.herokuapp.com/sh/install-redhat-td- agent2.sh | sh • /usr/sbin/td-agent-gem install fluent-plugin- kinesis #install AWS FH plugin • Then configure fluentd to push our syslog’s
  • 16. Demo: Fluentd config (/etc/td-agent/td- agent.conf) ## Syslog reader. Configure port 42185 to send events to in rsyslog config <source> type syslog port 42185 bind 0.0.0.0 tag system </source> ## Filters to transform records and add metadata <filter **> type record_transformer enable_ruby <record> @timestamp ${require 'time'; Time.now.utc.iso8601} </record> </filter> ## Output to Firehose using the instance role <match **> @type kinesis_firehose region us-west-2 delivery_stream_name logs flush_interval 2s </match>
  • 17. Demo: restart log agents and serve HTTP # After /etc/td-agent/td-agent.conf has been setup # Send syslog to fluentd listener echo "*.* @127.0.0.1:42185" | sudo tee /etc/rsyslog.d/22- fluent.conf sudo service td-agent restart sudo service rsyslog restart # Let’s make a web server for you to push your own logs! mkdir web cd web echo 'Hello!' > index.html sudo python -m SimpleHTTPServer 80 |& logger -t httpsvr &
  • 18. Demo: Setting up Elasticsearch • We set up AWS Elasticsearch Service: Some notes: • Dedicated master - performs cluster management tasks, doesn’t hold data • Metrics: The usual stuff. Note the JVMMemoryPressuremetric. Amazon recommends scale up/out if > 85% • Cluster status is yellow on single node because replicas cannot be assigned. Add a node or change the setting • Cluster has it’s own access policy. If you choose instance role access control, you must sign all requests to ES (use AWS SDKs). You will not be able to access Kibana on this setting • Check what size the instance store is on your selected instance type, or use EBS
  • 19. Demo: Setting up Kinesis Firehose • We create a logs delivery stream with Elasticsearch as the target • Firehose: Some notes: • A pipeline to push data in at high scale • Dump data in, and it buffers recrods(logs in our case) them before pushing to Elasticsearch and optionally S3 (Redshift also supported) • Pay per GB of ingestion $0.035USD (each record rounded to nearest 5kb) • Different destinations (e.g. WARN/ERROR to Elasticsearch but rest only to S3) would need different Firehose delivery streams
  • 20.
  • 21. Limitations? Further features? • AWS’s Elasticsearch is limited on plugins, but very good out of the box settings • Fast way to get high throughput, high scale logs into a stored index with S3 backups, and visualisation! • Further features using AWS Lambda as glue: • Alerts • CloudTrial/S3 logs ingestion to FH/Elasticsearch, • Deletion of old indexes at a specified period (we want the last X days only) • and more… • S3 bucket policy for storage optimisation (eg old stuff to glacier) • Your custom applications: Push directly to Fluentd’s HTTP endpoint, not via syslog (more flexibility and tagging) • Further customisations to Fluentd config
  • 22. The end! • Thanks! • @ChrisJRiddell • @ParrotAnalytics • Hiring Intermediate /Senior Java Engineers J • Upcoming: Web dev & more engineers • https://parrot-analytics.workable.com/ to apply