SlideShare a Scribd company logo
1 of 23
Download to read offline
Realtime Data Analysis with .Net,
AWS Kinesis, Glue, and Athena
SLC .Net User Group

May 14, 2020
Tim Collinson
TEKsystems Global Services

Practice Architect

DevOps and Cloud Practice

tcollinson@teksystems.com

https://linkedin.com/in/tcollinson
Important Links
Code Samples and These Slides:
https://bit.ly/dotnetug-kinesis
https://gitlab.com/trcollinson/dotnet-kinesis-example
What is AWS Kinesis?
AWS Kinesis
• Kinesis Video Streams — Capture, Store, and Process Streaming Media
for Playback and Analytics.

• Kinesis Data Streams — Capture, Store, Process, and Analyze Streaming
Data.

• Kinesis Data Analytics — Actionable Insights from Steaming Data in Real-
Time

• Kinesis Data Firehose —Collect, Transform, and Load Streaming Data!
AWS Kinesis
• Kinesis Video Streams — Camera and image feeds captured in real time,
stored, processed, and analytics available. Auto scaling, no infrastructure
to manage.

• Kinesis Data Streams — A competitor to Kafka for steaming data. Allow
for real time production and consumption of data in a streaming flow.

• Kinesis Data Analytics —Built on top of Data Streams and helps to make
analytics available with sub-second latency. No infrastructure to manage.

• Kinesis Data Firehose —This talk is going to go into that!
Let’s create a Kinesis Firehose!
AWS Kinesis Firehose
• Great for receiving a huge amount of data and storing it quickly and
effectively.

• Powerful transformation abilities.

• Low cost: only pay for data ingestion cost and storage.

• Low barrier to entry!
A note on S3 and Permissions
What is AWS Glue?
AWS Glue
• AWS Glue: Fully managed ETL tool.

• ETL: Extract, Transform, Load.

• Prepare your data for Analytics!
AWS Glue
• Create a Database from the Data we have loaded.

• Transform that data into another format: Parquet.

• Prepare to use SQL to get our data back.
Let’s create a Glue Database!
What is Parquet?
Parquet
Parquet
Parquet
• Apache Parquet!

• A column oriented data storage format.

• Part of the Apache Hadoop echo system.
A note on Glue and Costs
What is AWS Athena?
AWS Athena
• Allows you to query and analyze data from S3 using Standard SQL!

• No new skills to learn.

• Out of the box integration with Glue Databases!
Let’s query using Athena!
Question Tim…
Question Time!
More to come!
Ask on LinkedIn, watch code repo, or
email me!
https://linkedin.com/in/tcollinson
tcollinson@teksystems.com
https://bit.ly/dotnetug-kinesis

More Related Content

What's hot

What's hot (20)

Presto: Fast SQL-on-Anything Across Data Lakes, DBMS, and NoSQL Data Stores
Presto: Fast SQL-on-Anything Across Data Lakes, DBMS, and NoSQL Data StoresPresto: Fast SQL-on-Anything Across Data Lakes, DBMS, and NoSQL Data Stores
Presto: Fast SQL-on-Anything Across Data Lakes, DBMS, and NoSQL Data Stores
 
Security sizing meetup
Security sizing meetupSecurity sizing meetup
Security sizing meetup
 
Building a data warehouse with AWS Redshift, Matillion and Yellowfin
Building a data warehouse with AWS Redshift, Matillion and YellowfinBuilding a data warehouse with AWS Redshift, Matillion and Yellowfin
Building a data warehouse with AWS Redshift, Matillion and Yellowfin
 
Elastic at Procter & Gamble: A Network Story
Elastic at Procter & Gamble: A Network StoryElastic at Procter & Gamble: A Network Story
Elastic at Procter & Gamble: A Network Story
 
Kibana Tutorial | Kibana Dashboard Tutorial | Kibana Elasticsearch | ELK Stac...
Kibana Tutorial | Kibana Dashboard Tutorial | Kibana Elasticsearch | ELK Stac...Kibana Tutorial | Kibana Dashboard Tutorial | Kibana Elasticsearch | ELK Stac...
Kibana Tutorial | Kibana Dashboard Tutorial | Kibana Elasticsearch | ELK Stac...
 
Introduction to aws data pipeline services
Introduction to aws data pipeline servicesIntroduction to aws data pipeline services
Introduction to aws data pipeline services
 
Análisis de las novedades del Elastic Stack
Análisis de las novedades del Elastic StackAnálisis de las novedades del Elastic Stack
Análisis de las novedades del Elastic Stack
 
Wikipedia Cloud Search Webinar
Wikipedia Cloud Search WebinarWikipedia Cloud Search Webinar
Wikipedia Cloud Search Webinar
 
Replicate Elasticsearch Data with Cross-Cluster Replication (CCR)
Replicate Elasticsearch Data with Cross-Cluster Replication (CCR)Replicate Elasticsearch Data with Cross-Cluster Replication (CCR)
Replicate Elasticsearch Data with Cross-Cluster Replication (CCR)
 
Keeping Up with the ELK Stack: Elasticsearch, Kibana, Beats, and Logstash
Keeping Up with the ELK Stack: Elasticsearch, Kibana, Beats, and LogstashKeeping Up with the ELK Stack: Elasticsearch, Kibana, Beats, and Logstash
Keeping Up with the ELK Stack: Elasticsearch, Kibana, Beats, and Logstash
 
Building big data applications on AWS by Ran Tessler
Building big data applications on AWS by Ran TesslerBuilding big data applications on AWS by Ran Tessler
Building big data applications on AWS by Ran Tessler
 
Architecture Best Practices to Master + Pitfalls to Avoid
Architecture Best Practices to Master + Pitfalls to AvoidArchitecture Best Practices to Master + Pitfalls to Avoid
Architecture Best Practices to Master + Pitfalls to Avoid
 
AWS for the SQL Server Pro
AWS for the SQL Server ProAWS for the SQL Server Pro
AWS for the SQL Server Pro
 
IronSource Atom - Redshift - Lessons Learned
IronSource Atom -  Redshift - Lessons LearnedIronSource Atom -  Redshift - Lessons Learned
IronSource Atom - Redshift - Lessons Learned
 
Ejecución del Elastic Stack en Kubernetes
Ejecución del Elastic Stack en KubernetesEjecución del Elastic Stack en Kubernetes
Ejecución del Elastic Stack en Kubernetes
 
AWS for the Data Professional
AWS for the Data ProfessionalAWS for the Data Professional
AWS for the Data Professional
 
Speeding Up Atlas Deep Learning Platform with Alluxio + Fluid
Speeding Up Atlas Deep Learning Platform with Alluxio + FluidSpeeding Up Atlas Deep Learning Platform with Alluxio + Fluid
Speeding Up Atlas Deep Learning Platform with Alluxio + Fluid
 
Hurix case study
Hurix case study Hurix case study
Hurix case study
 
Building near real-time HTAP solutions using Synapse Link for Azure Cosmos DB
Building near real-time HTAP solutions using Synapse Link for Azure Cosmos DBBuilding near real-time HTAP solutions using Synapse Link for Azure Cosmos DB
Building near real-time HTAP solutions using Synapse Link for Azure Cosmos DB
 
AWS for Big Data Experts
AWS for Big Data ExpertsAWS for Big Data Experts
AWS for Big Data Experts
 

Similar to SLC .Net User Group -- .Net, Kinesis Firehose, Glue, Athena

Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technology
Domino Data Lab
 

Similar to SLC .Net User Group -- .Net, Kinesis Firehose, Glue, Athena (20)

Aws meetup 20190427
Aws meetup 20190427Aws meetup 20190427
Aws meetup 20190427
 
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
 
AWS Summit Singapore - Architecting a Serverless Data Lake on AWS
AWS Summit Singapore - Architecting a Serverless Data Lake on AWSAWS Summit Singapore - Architecting a Serverless Data Lake on AWS
AWS Summit Singapore - Architecting a Serverless Data Lake on AWS
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
 
Fast Track to Your Data Lake on AWS
Fast Track to Your Data Lake on AWSFast Track to Your Data Lake on AWS
Fast Track to Your Data Lake on AWS
 
BDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSBDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWS
 
Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...
Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...
Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...
 
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftData warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
 
Moving Quickly with Data Services in the Cloud
Moving Quickly with Data Services in the CloudMoving Quickly with Data Services in the Cloud
Moving Quickly with Data Services in the Cloud
 
Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301
 
Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technology
 
Build Data Lakes and Analytics on AWS: Patterns & Best Practices - BDA305 - A...
Build Data Lakes and Analytics on AWS: Patterns & Best Practices - BDA305 - A...Build Data Lakes and Analytics on AWS: Patterns & Best Practices - BDA305 - A...
Build Data Lakes and Analytics on AWS: Patterns & Best Practices - BDA305 - A...
 
Database and Analytics on the AWS Cloud - AWS Innovate Toronto
Database and Analytics on the AWS Cloud - AWS Innovate TorontoDatabase and Analytics on the AWS Cloud - AWS Innovate Toronto
Database and Analytics on the AWS Cloud - AWS Innovate Toronto
 
Community day ppt_kinesisv1.0
Community day ppt_kinesisv1.0Community day ppt_kinesisv1.0
Community day ppt_kinesisv1.0
 
AWS Big Data Solution Days
AWS Big Data Solution DaysAWS Big Data Solution Days
AWS Big Data Solution Days
 
Aws centralized logs
Aws centralized logsAws centralized logs
Aws centralized logs
 
Big Data, Ingeniería de datos, y Data Lakes en AWS
Big Data, Ingeniería de datos, y Data Lakes en AWSBig Data, Ingeniería de datos, y Data Lakes en AWS
Big Data, Ingeniería de datos, y Data Lakes en AWS
 
BDA303 Serverless big data architectures: Design patterns and best practices
BDA303 Serverless big data architectures: Design patterns and best practicesBDA303 Serverless big data architectures: Design patterns and best practices
BDA303 Serverless big data architectures: Design patterns and best practices
 
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Recently uploaded (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

SLC .Net User Group -- .Net, Kinesis Firehose, Glue, Athena

  • 1. Realtime Data Analysis with .Net, AWS Kinesis, Glue, and Athena SLC .Net User Group May 14, 2020
  • 2. Tim Collinson TEKsystems Global Services Practice Architect DevOps and Cloud Practice tcollinson@teksystems.com https://linkedin.com/in/tcollinson
  • 3. Important Links Code Samples and These Slides: https://bit.ly/dotnetug-kinesis https://gitlab.com/trcollinson/dotnet-kinesis-example
  • 4. What is AWS Kinesis?
  • 5. AWS Kinesis • Kinesis Video Streams — Capture, Store, and Process Streaming Media for Playback and Analytics. • Kinesis Data Streams — Capture, Store, Process, and Analyze Streaming Data. • Kinesis Data Analytics — Actionable Insights from Steaming Data in Real- Time • Kinesis Data Firehose —Collect, Transform, and Load Streaming Data!
  • 6. AWS Kinesis • Kinesis Video Streams — Camera and image feeds captured in real time, stored, processed, and analytics available. Auto scaling, no infrastructure to manage. • Kinesis Data Streams — A competitor to Kafka for steaming data. Allow for real time production and consumption of data in a streaming flow. • Kinesis Data Analytics —Built on top of Data Streams and helps to make analytics available with sub-second latency. No infrastructure to manage. • Kinesis Data Firehose —This talk is going to go into that!
  • 7. Let’s create a Kinesis Firehose!
  • 8. AWS Kinesis Firehose • Great for receiving a huge amount of data and storing it quickly and effectively. • Powerful transformation abilities. • Low cost: only pay for data ingestion cost and storage. • Low barrier to entry!
  • 9. A note on S3 and Permissions
  • 10. What is AWS Glue?
  • 11. AWS Glue • AWS Glue: Fully managed ETL tool. • ETL: Extract, Transform, Load. • Prepare your data for Analytics!
  • 12. AWS Glue • Create a Database from the Data we have loaded. • Transform that data into another format: Parquet. • Prepare to use SQL to get our data back.
  • 13. Let’s create a Glue Database!
  • 17. Parquet • Apache Parquet! • A column oriented data storage format. • Part of the Apache Hadoop echo system.
  • 18. A note on Glue and Costs
  • 19. What is AWS Athena?
  • 20. AWS Athena • Allows you to query and analyze data from S3 using Standard SQL! • No new skills to learn. • Out of the box integration with Glue Databases!
  • 23. More to come! Ask on LinkedIn, watch code repo, or email me! https://linkedin.com/in/tcollinson tcollinson@teksystems.com https://bit.ly/dotnetug-kinesis