Audit logs are an essential part of the dataflow when multiple users modify business-critical data. Knowing who modified the data is sometimes not enough; it is also important to know when and what changed in order to track the trail. In a single monolithic application this was easier, since all the data is centralised in one place, but with microservices it is much harder due to different ownership and data models. Because audit logs are so critical, any lost change is a potential loss. This session focuses on how we built a centralised audit-logging platform that can handle any kind of change performed on different data models, and that scales to whatever level is required using the managed cloud services Kafka (MSK) and OpenSearch (Elasticsearch). UltraWarm storage also helps to search the data even faster. I will show the whole architecture as well as a demo of the audit log service, including how to search the data in OpenSearch (Elasticsearch).
[DSC Europe 23] Muhammad Arslan - A Journey of Auditlogs from Kafka to Elastic Search
1. A Journey of Auditlogs from
Kafka to Open Search (Elastic
Search)
Muhammad Arslan
Big Data/AI Architect
2. Agenda
• Overview
• Service Architecture
• Kafka
• Open Search (Elastic Search)
• Monitoring and Alerting
• Security
• Data Backup
• Disaster Recovery
• Questions?
3.
4. Overview
• What is an audit log?
• What changed and who changed it
• Previous state vs current state
• Audit logs are an essential part of critical applications
• Gets tricky when you have multiple microservices
• A centralized solution that supports all kinds of logs
• Don’t lose any logs, as they are critical
• Two-phase commit to track when and what happened
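The "previous state vs current state" idea above can be sketched as a small diff. This is a minimal sketch, assuming a flat JSON-like record; the field names (`entityId`, `changedBy`, `changes`) are illustrative, not the talk's actual schema.

```python
# Sketch: an audit-log entry capturing who changed what and when, built by
# diffing the previous and current state of a record.
from datetime import datetime, timezone

def build_audit_entry(entity_id, user, previous, current):
    # Keep only the fields whose value actually changed.
    changed = {
        field: {"from": previous.get(field), "to": current.get(field)}
        for field in set(previous) | set(current)
        if previous.get(field) != current.get(field)
    }
    return {
        "entityId": entity_id,
        "changedBy": user,
        "changedAt": datetime.now(timezone.utc).isoformat(),
        "changes": changed,
    }

entry = build_audit_entry(
    "order-42", "alice",
    previous={"status": "open", "amount": 100},
    current={"status": "shipped", "amount": 100},
)
```

An entry like this carries both the "who/when" and the "what changed" in one event, which is what the write API later pushes downstream.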
5. Requirements
• SLOs (99% < 100 ms, 75% < 30 ms)
• Scales easily
• No downtime (thanks to blue-green deployments)
• Whole infrastructure managed with Terraform
• No cloud vendor lock-in
• Alerts and monitoring
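As a worked example of the SLO targets above, a nearest-rank percentile check over observed latencies might look like this; the sample values and function names are illustrative only.

```python
# Sketch: verifying the SLO targets (99% < 100 ms, 75% < 30 ms)
# against a sample of observed request latencies.

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

def meets_slo(latencies_ms):
    # Both targets must hold for the SLO to pass.
    return (percentile(latencies_ms, 99) < 100
            and percentile(latencies_ms, 75) < 30)

latencies = [5, 8, 12, 20, 25, 28, 40, 90]  # example measurements (ms)
```

In production these percentiles would come from the monitoring stack rather than being computed by hand, but the check is the same.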
6. Prestudy POC
• Full cloud-native
  • Managed services: Kinesis, Kinesis Data Firehose, ElasticSearch
  • Custom services: REST API to validate and push to Kinesis; REST API for search
• Managed services + custom loader
  • Managed services: Kinesis, ElasticSearch
  • Custom services: REST API to validate and push to Kinesis; REST API for search; Kinesis consumer that writes to ES
• SNS + SQS
  • Managed services: SNS, SQS, ElasticSearch
  • Custom services: REST API to validate and push to SNS; REST API for search; custom SQS consumer/writer for each destination
• Direct writes to ES
  • Managed services: ElasticSearch
  • Custom services: REST API to validate and push to Elasticsearch and others
• Kafka + ES
  • Managed services: ElasticSearch, Kafka
  • Custom services: REST API to validate and push to Kafka; REST API for search; custom Kafka consumer/writer for each destination
10. Kafka
• The write REST API receives the event and puts it into a Kafka topic
• Retention: 5 days
• 10 partitions per topic (as a start)
• Messages are serialized and compressed in Avro format
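With 10 partitions, keying events by entity keeps all changes to one entity in order on the same partition. This is a sketch of that idea only: the talk does not say how events are keyed, and real clients use different hash functions (crc32 here for illustration; the Java client defaults to murmur2).

```python
# Sketch: deterministic partition assignment for the 10-partition topic,
# keyed by entity id so one entity's changes stay ordered.
import zlib

NUM_PARTITIONS = 10  # "as a start", per the slide

def partition_for(entity_id: str) -> int:
    """Map an entity id to a partition via crc32 (illustrative choice)."""
    return zlib.crc32(entity_id.encode("utf-8")) % NUM_PARTITIONS
```

In practice the Kafka client's own partitioner does this when a message key is set; the point is that the key, not the producer, decides ordering.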
12. Kafka to S3 connector (Backup)
• Connector that copies all events to an S3 bucket as backup
• Files are stored on an hourly basis in Avro format
• Can be used to reprocess or replay the data
• Retention time: 3 months, deleted automatically afterwards
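Hourly backup files are typically laid out with time-based key prefixes so replays can target a time range. The layout below is hypothetical; the talk does not show the connector's actual partitioning scheme, and the `backup/` prefix and file naming are assumptions.

```python
# Sketch: a hypothetical S3 key layout for the hourly Avro backup files.
from datetime import datetime, timezone

def backup_key(topic: str, ts: datetime) -> str:
    """Build an hour-partitioned S3 object key for a backup file."""
    return (
        f"backup/{topic}/"
        f"year={ts.year:04d}/month={ts.month:02d}/day={ts.day:02d}/"
        f"hour={ts.hour:02d}/{topic}+{int(ts.timestamp())}.avro"
    )

key = backup_key("auditlog", datetime(2023, 11, 1, 9, tzinfo=timezone.utc))
```

Hive-style `year=/month=/day=/hour=` prefixes make it easy to list exactly the hours that need reprocessing, and a 3-month S3 lifecycle rule can handle the automatic deletion.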
13. OpenSearch (ElasticSearch)
• Amazon OpenSearch Service (successor to Amazon Elasticsearch Service) is a
managed service that makes it easy to deploy, operate, and scale OpenSearch
clusters in the AWS Cloud.
• Amazon OpenSearch Service supports OpenSearch and legacy Elasticsearch
OSS.
• OpenSearch is a fully open-source search and analytics engine for use cases
such as log analytics, real-time application monitoring, and clickstream
analysis.
• You can scale your cluster with a single API call or a few clicks in the console.
17. Index template
• An index template is a way to tell Elasticsearch how to configure an index when it is created.
• Templates are configured prior to index creation.
• When an index is created (either manually or through indexing a document), the template settings are used as a basis for creating the index.
• GET _index_template/auditlog-template
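As a sketch, the `auditlog-template` queried above might be defined like this. The index pattern, shard counts, and field names are assumptions; the talk only shows the GET request, not the template body.

```json
PUT _index_template/auditlog-template
{
  "index_patterns": ["auditlog-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    },
    "mappings": {
      "properties": {
        "entityId":  { "type": "keyword" },
        "changedBy": { "type": "keyword" },
        "changedAt": { "type": "date" },
        "previous":  { "type": "object", "enabled": false },
        "current":   { "type": "object", "enabled": false }
      }
    }
  }
}
```

Any new `auditlog-*` index then picks up these settings and mappings automatically at creation time.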
19. Policy template
• Use a policy to manage an index that doesn’t roll over.
• You can specify a lifecycle policy when you create the index, or apply a policy directly to an existing index.
• Maintains the state of the index:
  • Hot
  • Warm (UltraWarm)
  • Cold
  • Delete
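The hot/warm/delete lifecycle above can be expressed as an OpenSearch Index State Management (ISM) policy along these lines. The transition ages (7 days to UltraWarm, 90 days to delete) are assumptions chosen to match the retention figures elsewhere in the talk, not values the slides state.

```json
PUT _plugins/_ism/policies/auditlog-policy
{
  "policy": {
    "description": "Move auditlog indices from hot to UltraWarm, then delete",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          { "state_name": "warm", "conditions": { "min_index_age": "7d" } }
        ]
      },
      {
        "name": "warm",
        "actions": [ { "warm_migration": {} } ],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "90d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "delete": {} } ],
        "transitions": []
      }
    ]
  }
}
```

The `warm_migration` action moves an index to UltraWarm storage on Amazon OpenSearch Service; the policy then deletes the index once it ages past the final threshold.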
23. Data backup and recovery
• Automatic snapshots with 1-week retention
• Manual snapshots can be taken and transferred to an S3 bucket automatically
• Data can be restored at any time from automatic or manual snapshots
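For the manual snapshots, a sketch of the snapshot API calls is below. The repository name, bucket, region, and role ARN are all hypothetical placeholders; on Amazon OpenSearch Service the repository-registration request must additionally be signed with credentials allowed to pass the snapshot role.

```json
PUT _snapshot/auditlog-s3-repo
{
  "type": "s3",
  "settings": {
    "bucket": "auditlog-snapshots",
    "region": "eu-central-1",
    "role_arn": "arn:aws:iam::123456789012:role/AuditlogSnapshotRole"
  }
}

PUT _snapshot/auditlog-s3-repo/manual-2023-11-01
```

Restoring works the same way in reverse, via `POST _snapshot/auditlog-s3-repo/manual-2023-11-01/_restore`.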
29. Security
• The OpenSearch UI is behind an OAuth proxy
• No IAM- or SSO-based security
• Currently we can’t track which user performed which queries, other than via the logs
30.
31. Data Backup
• Automatic snapshots
• Manual snapshots
• Create a restore-snapshots runbook
• Replay from S3 to handle data inconsistency (S3 replay docs)
32. Disaster Recovery : Exercise
• Exercise in the integration environment
• Update the domain, i.e. add more data nodes and lower the instance size
• Stop the ES connector replicas and monitoring
• Drop all daily indices
• Apply the monthly ingestion policy
• Apply the new data model pointing to the monthly ingestion policy
• Restart the ES connector replicas and monitoring