A quick introduction to AWS Kinesis

A Devoxx Belgium 2015 talk about AWS Kinesis.

Published in: Data & Analytics

  1. A quick introduction to AWS Kinesis Streams, by Oliver Geisser (@ogeisser, #Devoxx #Kinesis)
  2. Kinesis Platform Family
       •  Kinesis Streams: Build your own custom applications that process or analyze streaming data. Available since 2014.
       •  Kinesis Firehose: Load massive volumes of streaming data into Amazon S3 and Redshift. New in October 2015.
       •  Kinesis Analytics: Analyze data streams using SQL queries. Announced for 2016.
  4. Kinesis Streams – Example Use Case
  5. High-Level Architecture
  6. Concepts (I)
       Stream
       •  Named event stream of Data Records
       •  Data is stored for 24 hours (default), up to 168 hours (7 days)
       •  Data is partitioned into Shards
       Data Record
       •  Unit of data stored in a Stream
       •  Data Record = Data Blob + Partition Key + Sequence Number
  7. Concepts (II)
       Partition Key
       •  Assigned to the Data Record by the data producer
       •  Used for partitioning data across Shards
       •  The MD5 hash of the Partition Key determines the Shard
       Sequence Number
       •  Unique identifier of a Data Record
       •  Assigned by Kinesis on write
  8. Concepts (III)
       Shard
       •  A Shard is a group of Data Records in a Stream
       •  A Stream is composed of one or more Shards
       •  You scale Kinesis Streams by adding or removing Shards
       •  Each Shard provides a fixed unit of capacity
       •  Each Shard ingests up to 1 MB/sec of data and up to 1,000 records/sec
  9. Demo
  10. Closing Remarks
       •  Understand the consequences of the limits: Shards (= capacity), number of consumers, latency, etc.
       •  Trade-off: vendor lock-in vs. managed service. Alternative: manage your own Kafka cluster.
       •  Choose the right access library for your use case: HTTP, SDK, Client, Producer, Connector, third party
  11. Thank you. Oliver Geisser, Twitter: @ogeisser
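
The Concepts (II) slide says that an MD5 hash of the Partition Key determines the Shard. The sketch below is one way to picture that routing; it is not code from the talk. It assumes each Shard owns a contiguous range of the 128-bit hash space and that the ranges are split evenly. The names `Shard`, `even_shards`, `hash_key`, and `shard_for`, and the `shard-0000` style IDs, are hypothetical and chosen for illustration only.

```python
import hashlib
from dataclasses import dataclass
from typing import List

HASH_SPACE = 2 ** 128  # an MD5 digest is 128 bits wide


@dataclass(frozen=True)
class Shard:
    shard_id: str    # hypothetical ID, for illustration only
    start_hash: int  # first hash key owned by this shard (inclusive)
    end_hash: int    # last hash key owned by this shard (inclusive)


def even_shards(count: int) -> List[Shard]:
    """Split the hash space into `count` contiguous ranges (assumed even split)."""
    if count < 1:
        raise ValueError("a stream needs at least one shard")
    size = HASH_SPACE // count
    shards = []
    for i in range(count):
        start = i * size
        # The last shard absorbs any remainder so the ranges cover everything.
        end = HASH_SPACE - 1 if i == count - 1 else (i + 1) * size - 1
        shards.append(Shard(f"shard-{i:04d}", start, end))
    return shards


def hash_key(partition_key: str) -> int:
    """Interpret the MD5 digest of the partition key as a 128-bit integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, shards: List[Shard]) -> Shard:
    """Return the shard whose hash range contains the key's hash."""
    h = hash_key(partition_key)
    for shard in shards:
        if shard.start_hash <= h <= shard.end_hash:
            return shard
    raise ValueError("hash key is outside every shard range")


if __name__ == "__main__":
    shards = even_shards(4)
    for key in ("user-1", "user-2", "user-3"):
        print(key, "->", shard_for(key, shards).shard_id)
```

Because the mapping depends only on the key, the same Partition Key always lands on the same Shard. A producer that sends most records under a handful of keys may therefore load a few Shards much more heavily than the rest.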
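
The Concepts (III) slide gives two write limits per Shard: 1 MB/sec and 1,000 records/sec. As a rough back-of-the-envelope sketch, the number of Shards needed for ingestion is set by whichever limit is hit first. The function name `shards_needed` is hypothetical, and treating 1 MB as 10^6 bytes is an assumption. Only the two write limits named on the slide are modeled, not consumer or read limits.

```python
import math

# Per-shard write limits as stated on the slide.
SHARD_BYTES_PER_SEC = 1_000_000  # "1 MB/sec", taken as 10^6 bytes (assumption)
SHARD_RECORDS_PER_SEC = 1_000    # "1000 records/sec"


def shards_needed(records_per_sec: float, avg_record_bytes: float) -> int:
    """Estimate the shards required to ingest a given write load."""
    if records_per_sec < 0 or avg_record_bytes < 0:
        raise ValueError("load figures must be non-negative")
    # Shards required by the throughput limit.
    by_bytes = math.ceil(records_per_sec * avg_record_bytes / SHARD_BYTES_PER_SEC)
    # Shards required by the record-count limit.
    by_count = math.ceil(records_per_sec / SHARD_RECORDS_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_bytes, by_count)


if __name__ == "__main__":
    # Many small records: the records/sec limit dominates.
    print(shards_needed(records_per_sec=5_000, avg_record_bytes=100))
    # Few large records: the MB/sec limit dominates.
    print(shards_needed(records_per_sec=500, avg_record_bytes=10_000))
```

An estimate like this assumes the load is spread evenly across Shards, which in turn depends on how well the Partition Keys are distributed.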
