Introduction to Amazon Kinesis Data Streams

Presented By: Prateek Gupta

KnolX Etiquettes
Lack of etiquette and manners is a huge turn-off.
Punctuality
Join the session 5 minutes prior to
the session start time. We start on
time and conclude on time!
Feedback
Make sure to submit constructive
feedback for every session, as it is
very helpful for the presenter.
Silent Mode
Keep your mobile devices in silent
mode. Feel free to step out of the
session if you need to take
an urgent call.
Avoid Disturbance
Avoid unwanted chit chat during
the session.

Our Agenda
01 What is Streaming Data?
02 Amazon Kinesis Data Streams
03 High-Level Architecture
04 Key Concepts and Terminology
05 Basic Operations
06 Demo

What is Streaming Data?
Streaming data is data that is generated continuously, in real time, by many (often thousands of)
data sources and delivered to a system for processing.
Key Points:
● Real-time
● Continuous flow
● Variety of sources
● Variety of formats
● Requires specialized processing
Examples:
● E-commerce purchases
● Game data
● Information from social networks
● Log data
● Stock prices
● GPS data
● IoT Sensor Data

Amazon Kinesis Data Streams
Amazon Kinesis Data Streams is a real-time streaming data service by AWS. It makes it easy to
collect and process real-time streaming data at high scale.
Some key points to understand:
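● Real-time data: records become available to consumers shortly after they are written
● Highly scalable: capacity grows with the number of shards in a stream
● Data sources: many producers can write to the same stream at once
● Processing: multiple consumer applications can read the same stream
● Cost-effective: pricing is based on usage
● Easy to use: the service is fully managed by AWS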

High-Level Architecture
● The producers continually push data to Kinesis Data Streams, and the consumers process the data in
real time.
● Once a consumer has processed the data, the results are stored using an AWS service such as
Amazon DynamoDB, Amazon Redshift, or Amazon S3.
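
The flow above can be sketched in a few lines of plain Python. This is only an in-memory illustration, not the Kinesis API: a deque stands in for the stream, a Counter stands in for the result store (DynamoDB, Redshift, or S3 on the slide), and the names produce and consume and the page-view events are assumptions made for the example.

```python
from collections import Counter, deque


def produce(stream, events):
    """Producer side: append each event to the end of the stream."""
    for event in events:
        stream.append(event)


def consume(stream, store):
    """Consumer side: read events in arrival order and aggregate them."""
    while stream:
        page = stream.popleft()
        store[page] += 1  # count page views per page


stream = deque()   # stand-in for a Kinesis data stream (assumption)
store = Counter()  # stand-in for the downstream result store (assumption)

produce(stream, ["home", "cart", "home"])
consume(stream, store)
print(dict(store))
```

In the real service the producer and consumer are separate applications that run continuously, so the stream is never simply drained once as it is here.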

Key Concepts and Terminology
➢ Producer: It is an application that puts the data records into Amazon Kinesis Data
Streams.
➢ Consumer: It is an application that retrieves the data records from Amazon Kinesis Data
Streams and processes them.
➢ Kinesis Data Stream:
○ A Kinesis data stream is a set of shards.
○ Each shard has a sequence of data records.
○ Each data record has a sequence number.
○ Data is retained for 24 hours by default.

➢ Shard:
○ A shard is a uniquely identified sequence of data records
○ A stream is composed of one or more shards, each of which provides a fixed unit of
capacity.
○ Each shard supports writes of up to 1,000 records per second (to a maximum of 1 MB/sec) and
reads of up to 2 MB/sec, with at most 5 read (GetRecords) calls per second.
○ The data capacity of a stream is a function of the number of shards.
○ If the data rate increases, increase the number of shards allocated to the stream.
➢ Data Record:
○ A data record is the unit of data stored in a Kinesis data stream.
○ Each data record is composed of a sequence number, a partition key, and a data
blob (up to 1 MB).
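
As a worked example of how shard count follows from data rate, the sketch below sizes a provisioned stream from the per-shard limits listed under Shard (1 MB/sec or 1,000 records/sec for writes, 2 MB/sec for reads). The function name estimate_shards and the sample traffic figures are assumptions for illustration, and the result is a lower bound rather than a sizing recommendation.

```python
import math

# Per-shard limits as listed on the slide.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def estimate_shards(write_mb_per_sec, write_records_per_sec, read_mb_per_sec):
    """Return the smallest shard count that satisfies all three limits."""
    needed = max(
        write_mb_per_sec / WRITE_MB_PER_SHARD,
        write_records_per_sec / WRITE_RECORDS_PER_SHARD,
        read_mb_per_sec / READ_MB_PER_SHARD,
    )
    # A stream always has at least one shard.
    return max(1, math.ceil(needed))


# Hypothetical workload: 5 MB/sec written as 3,000 records/sec, 12 MB/sec read.
# The read side dominates: 12 / 2 = 6 shards.
print(estimate_shards(5, 3000, 12))
```

If the write rate later doubles to 10 MB/sec, the same calculation gives 10 shards, which is the "increase the number of shards" step described above.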

➢ Sequence Number:
○ A sequence number is an identifier assigned to each data record; it is unique within the record's shard.
○ It allows consumers to read records in order and to track which records have already been processed.
➢ Partition Key:
○ A partition key is a meaningful identifier that is associated with each record.
○ It is used by the service to determine which shard to store the record in.
○ It is specified by the data producer when putting data into a data stream.
○ Records with the same partition key are stored together in the same shard.
➢ Retention Period:
○ The length of time that data records remain stored in an Amazon Kinesis data stream.
○ The default retention period for a stream is 24 hours (configurable up to 365 days).
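
To make the partition key bullets concrete: Kinesis Data Streams hashes each partition key with MD5 into a 128-bit integer, and each shard owns a contiguous range of that hash space. The sketch below mimics that routing locally. The helper names and the even split of the hash space across shards are assumptions for illustration; a real stream reports each shard's actual hash key range in the describe-stream output.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def hash_key(partition_key):
    """Map a partition key to an integer in [0, 2**128)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_ranges(shard_count):
    """Split the hash space into contiguous ranges (even split is an assumption)."""
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key, ranges):
    """Return the index of the shard whose range contains the key's hash."""
    value = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= value <= end:
            return index
    raise ValueError("hash value outside all shard ranges")


ranges = shard_ranges(4)
for key in ["user-1", "user-2", "user-1"]:
    # The same key always lands on the same shard.
    print(key, "-> shard", shard_for(key, ranges))
```

Because routing depends only on the hash, a partition key with many distinct values spreads records across shards, while a single hot key sends all of its records to one shard.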

➢ Capacity Mode:
○ The capacity mode determines how capacity is managed and the usage charges for a data
stream.
○ Currently, in Kinesis Data Streams, we can choose between an on-demand mode and a
provisioned mode for our data streams.

Basic Operations
Amazon Kinesis Data Streams provides a number of operations that can be performed on a data
stream. Here are some basic operations:
● create-stream
● describe-stream
● list-streams
● put-record
● get-shard-iterator
● get-records
● split-shard
● merge-shards
● delete-stream
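
For readers who prefer code to the command line, the operations above map onto snake_case methods of the boto3 Kinesis client (boto3 is the AWS SDK for Python). The sketch below strings several of them together. The function name round_trip, the stream name, and the payload are assumptions for illustration; the client is passed in as an argument, so the block itself needs no AWS credentials. With real credentials it would be created with boto3.client("kinesis").

```python
import time


def round_trip(client, stream_name, payload, partition_key, max_wait_s=60):
    """Create a one-shard stream, write one record, read it back, delete the stream."""
    # create-stream: provisioned mode with a single shard.
    client.create_stream(StreamName=stream_name, ShardCount=1)

    # describe-stream: poll until the stream is ready to accept records.
    deadline = time.monotonic() + max_wait_s
    while True:
        description = client.describe_stream(StreamName=stream_name)["StreamDescription"]
        if description["StreamStatus"] == "ACTIVE":
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{stream_name} did not become ACTIVE")
        time.sleep(1)
    shard_id = description["Shards"][0]["ShardId"]

    # put-record: the partition key decides which shard receives the record.
    put_result = client.put_record(
        StreamName=stream_name, Data=payload, PartitionKey=partition_key
    )

    # get-shard-iterator: TRIM_HORIZON starts at the oldest record in the shard.
    iterator = client.get_shard_iterator(
        StreamName=stream_name, ShardId=shard_id, ShardIteratorType="TRIM_HORIZON"
    )["ShardIterator"]

    # get-records: fetch up to 10 records from that position.
    records = client.get_records(ShardIterator=iterator, Limit=10)["Records"]

    # delete-stream: clean up so the demo stream does not keep incurring charges.
    client.delete_stream(StreamName=stream_name)

    return put_result["SequenceNumber"], [record["Data"] for record in records]
```

A single get_records call made right after a write can come back empty, so a real consumer would keep calling it with the NextShardIterator value from each response instead of reading once.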

