© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Rahul Bhartia, Principal Solutions Architect, AWS
Ryan Nienhuis, Senior Product Manager, AWS
Tony Gibbs, Solutions Architect, AWS
Radhika Ravirala, Solutions Architect, AWS
November 29, 2016
BDM202
Workshop
Building Your First Big Data Application on AWS
Objectives for Today
Build an end-to-end big data application that captures, stores, processes, and
analyzes web logs.
At the end of the workshop, you will:
1. Understand how a common scenario is implemented using AWS big data
tools
2. Know how to use:
1. Amazon Kinesis for real-time data
2. Amazon Redshift for data warehousing
3. Amazon EMR for data processing
4. Amazon QuickSight for data visualization
Your Big Data Application Architecture
Amazon
Kinesis
Producer UI
Amazon
Kinesis
Firehose
Amazon
Kinesis
Analytics
Amazon Kinesis
Firehose
Amazon
EMR
Amazon
Redshift
Amazon
QuickSight
Generate web
logs
Collect web logs
and deliver to S3
Process & compute
aggregate metrics
Deliver processed web
logs to Amazon Redshift
Raw web logs from
Firehose
Run SQL queries on
processed web logs
Visualize web logs to
discover insights
Amazon S3
Bucket
Ad hoc analysis
of web logs
Getting started
What is qwikLABS?
Access to AWS services for this bootcamp
No need to provide a credit card
Automatically deleted when you’re finished
http://events-aws.qwiklab.com
Create an account with the same email that you
used to register for this bootcamp
Sign in and start course
Once the course is started, you will see “Create in Progress” in the upper
right corner.
Navigating qwikLABS
Connect tab: Access and login information
Addl Info tab: Links to interfaces
Lab Instructions tab: Scripts for your labs
Everything you need for the lab
Open the AWS Management Console, log in, and verify that you have:
• Two Amazon Kinesis Firehose delivery streams
• One Amazon EMR cluster
• One Amazon Redshift cluster
Then sign up for:
• Amazon QuickSight
Real-time data
with Amazon Kinesis
Amazon Kinesis: Streaming Data Made Easy
These services make it easy to capture, deliver, and process streams on AWS
Amazon Kinesis
Streams
Amazon Kinesis
Analytics
Amazon Kinesis
Firehose
Amazon Kinesis Streams
• Easy administration
• Build real-time applications with the framework of your choice
• Low cost
Amazon Kinesis Firehose
• Zero administration
• Direct-to-data store integration
• Seamless elasticity
Amazon Kinesis - Firehose vs. Streams
Amazon Kinesis Streams is for use cases that require custom
processing, per incoming record, with sub-1 second processing
latency, and a choice of stream processing frameworks.
Amazon Kinesis Firehose is for use cases that require zero
administration, ability to use existing analytics tools based on
Amazon S3, Amazon Redshift, and Amazon Elasticsearch
Service, and a data latency of 60 seconds or higher.
Activity 1
Collect web logs using
a Firehose delivery stream
Capture data using a Firehose delivery stream
Time: 5 minutes
We are going to:
A. Use a simple producer application that writes web logs into a
Firehose delivery stream configured to write data into
Amazon S3.
75.35.230.210 - - [20/Jul/2009:22:22:42 -0700]
"GET /images/pigtrihawk.jpg " 200 29236
"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.11) Gecko/2009060215
Firefox/3.0.11 (.NET CLR 3.5.30729)"
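As a sketch of what the delivery stream will carry, here is how a log line in this format can be parsed into named fields with a regular expression. The regex and the field names (host, timestamp, method, path, status, size, referrer, user_agent) are illustrative choices, not part of the workshop materials:

```python
import re

# Parse a generated web-log line (Common Log Format style) into named
# fields. The referrer field is optional, since some sample lines omit it.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) - - \[(?P<timestamp>[^\]]+)\]\s+'
    r'"(?P<method>[A-Z]+) (?P<path>\S+)\s*"\s+'
    r'(?P<status>\d{3}) (?P<size>\d+)\s+'
    r'(?:"(?P<referrer>[^"]*)"\s+)?'
    r'"(?P<user_agent>[^"]*)"'
)

line = ('192.168.0.1 - - [20/Jul/2009:22:22:42 -0700] '
        '"GET /images/pigtrihawk.jpg " 200 29236 '
        '"-" "Mozilla/5.0 (Windows; U; Windows NT 5.1)"')

m = LOG_PATTERN.match(line)
fields = m.groupdict()  # e.g. fields["method"] == "GET"
```

A parser like this is the Python analogue of what the Kinesis Analytics SQL will do with string functions later in the workshop.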
Activity 1A: Writing to a Firehose delivery stream
To simplify this, we are going to use a simple tool that creates sample data and writes
it to an Amazon Kinesis Firehose delivery stream.
1. Go to the simple data producer UI http://bit.ly/kinesis-producer
2. Copy and paste your credentials into the form. The credentials do not leave your
client (the site is hosted locally), and the site uses the AWS JavaScript SDK over
HTTPS. In normal situations, you should not use your credentials in this way
unless you trust the provider.
3. Specify region (US West (Oregon)) and delivery stream name. (qls-
somerandomnumber-FirehoseDeliveryStream-somerandom-11111111111)
4. Specify a data rate between 100 and 500; pick a unique number. This controls
the rate at which records are sent to the stream.
Activity 1A: Writing to a Firehose delivery stream
5. Copy and paste this into the text editor
{{internet.ip}} - - [{{date.now("DD/MMM/YYYY:HH:mm:ss Z")}}]
"{{random.weightedArrayElement({"weights":[0.6,0.1,0.1,0.2],"data":["GET","POST","DEL
ETE","PUT"]})}} {{random.arrayElement(["/list","/wp-content","/wp-
admin","/explore","/search/tag/list","/app/main/posts","/posts/posts/explore"])}}"
{{random.weightedArrayElement({"weights": [0.9,0.04,0.02,0.04],
"data":["200","404","500","301"]})}} {{random.number(10000)}} "-"
"{{internet.userAgent}}"
Make sure there is a new line after your record when you copy and paste
The tool will generate records based on the above format. It generates
hundreds of these records and sends them in a single call using
PutRecords. Note that the tool will drain your battery; if you don't have a
power cord, keep it running only while we are completing the activities.
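The batching behavior can be sketched in a few lines of Python. The slide mentions PutRecords; for a Firehose delivery stream the corresponding batch API is PutRecordBatch (up to 500 records per call). The boto3 call below is commented out since it needs AWS credentials, and `stream_name` / `log_lines` are assumed placeholders:

```python
# Group newline-terminated records into batches suitable for Firehose's
# PutRecordBatch API (limit: 500 records per call). Pure Python; only the
# commented-out boto3 call would actually talk to AWS.
def make_batches(lines, batch_size=500):
    batch = []
    for line in lines:
        batch.append({"Data": (line + "\n").encode("utf-8")})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush any partial final batch
        yield batch

# import boto3
# firehose = boto3.client("firehose", region_name="us-west-2")
# for records in make_batches(log_lines):
#     firehose.put_record_batch(DeliveryStreamName=stream_name,
#                               Records=records)
```

Note the trailing newline appended to each record: Firehose concatenates records as-is, so the producer must supply the delimiter (the same reason the slide above tells you to keep a new line after your pasted record).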
Activity 1A: Writing to a Firehose delivery stream
Review: Monitoring a Delivery Stream
Go to the Amazon CloudWatch Metrics Console and search
“IncomingRecords”. Select this metric for your Firehose delivery stream and
choose a 1 Minute SUM.
What are the most important metrics to monitor?
Activity 2
Real-time processing using
Amazon Kinesis Analytics
Amazon Kinesis Analytics
• Apply SQL on streams
• Build real time, stream processing applications
• Easy scalability
Use SQL to build real-time applications
Easily write SQL code to process
streaming data
Connect to streaming source
Continuously deliver SQL results
Process Data using Amazon Kinesis Analytics
Time: 20 minutes
We are going to:
A. Create an Amazon Kinesis Analytics application that reads from
the delivery stream with log data
B. Write a SQL query to parse and transform the log
C. Add another query to compute an aggregate metric for an
interesting statistic on the incoming data
Activity 2A: Create an Amazon Kinesis Analytics
Application
1. Go to the Amazon Kinesis Analytics Console
1. https://us-west-2.console.aws.amazon.com/kinesisanalytics/home?region=us-
west-2
2. Click “Create new application” and provide a name
3. Click “Connect to a source” and select the stream you created in the
previous exercise (qls-randomnumber-FirehoseDeliveryStream-
randomnumber)
4. Amazon Kinesis Analytics will discover the schema for you if your data
is UTF-8 encoded CSV or JSON. In this case, we have a custom log
format, so discovery incorrectly assumes a delimiter
Activity 2A: Create an Amazon Kinesis Analytics
Application
Amazon Kinesis Analytics comes with powerful string manipulation and log parsing
functions. So we are going to define our own schema to handle this format and then
parse it in SQL code.
5. Click “Edit schema”
6. Verify that the format is CSV, the row delimiter is '\n', and change the column delimiter
to '\t'
7. Remove all the existing columns, and create a single column named DATA of type
VARCHAR(5000).
8. Click “Save schema and update stream samples”. This will run your application
(takes about a minute).
Activity 2A: Create an Amazon Kinesis Analytics
Application
There are two system columns added to each record.
• ROWTIME represents the time the application read the record, and is a special column used for time-series analytics. This
is also known as the processing time.
• APPROXIMATE_ARRIVAL_TIME is the time the delivery stream received the record. This is also known as the ingest time.
Activity 2A: Create an Amazon Kinesis Analytics
Application
10. Click “Exit (done)” on Schema editing
page.
11. Click “Cancel” on Source
configuration page.
12. Click “Go To SQL Editor”.
You now have a running
stream processing
application!
Activity 2B: Writing SQL over Streaming Data
Writing SQL over streaming data using Amazon Kinesis Analytics follows a two-part
model:
1. Create an in-application stream for storing intermediate SQL results. An in-
application stream is like a SQL table, but is continuously updated.
2. Create a PUMP which will continuously read FROM one in-application stream
and INSERT INTO a target in-application stream
DESTINATION_SQL_STREAM
Part 1 (Activity 2-B)
AGGREGATE_STREAM
Part 2 (Activity 2-C)
SOURCE_SQL_STREAM_001
Source Stream
TRANSFORM_PUMP
AGGREGATE_PUMP
Send to Redshift
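The two-part model above can be sketched in Kinesis Analytics SQL roughly as follows. This is a minimal sketch, not the lab's actual code (which you download from qwikLABS); the column names and types are illustrative placeholders:

```sql
-- Part 1: an in-application stream to hold intermediate SQL results
-- (column names here are placeholders, not the lab's schema).
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
    host_address   VARCHAR(32),
    request_method VARCHAR(8)
);

-- Part 2: a pump that continuously reads FROM the source stream
-- and INSERTs INTO the target in-application stream.
CREATE OR REPLACE PUMP "TRANSFORM_PUMP" AS
INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM "column1", "column2"
FROM "SOURCE_SQL_STREAM_001";
```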
Activity 2B: Add your first query
Our first query will use a special function called W3C_LOG_PARSE to separate the log into specific
columns. Remember, this is a two-part process:
1. We will create the in-application stream to store intermediate SQL results.
2. We will create a pump that selects data from the source stream and inserts into the target in-
application stream.
Your application code is replaced each time you click “Save and Run”. As you add queries, append
them to the bottom of the SQL editor.
Note: You can download all of the application code from qwikLABS. Click on the Lab
instructions tab in qwikLABS and then download the Kinesis Analytics SQL file.
Activity 2C: Calculate an aggregate metric
Calculate a count using a tumbling window and a GROUP BY clause.
A tumbling window is similar to
a periodic report: you specify
your query and a time range,
and results are emitted at the
end of the range. (For example,
COUNT the number of items by
key every 10 seconds.)
Activity 2C: Calculate an aggregate metric
The window is defined by the following expression in the SELECT statement.
Note that the ROWTIME column is implicitly included in every stream query,
and represents the processing time of the application.
FLOOR("SOURCE_SQL_STREAM_001".ROWTIME TO SECOND)
(DO NOT INSERT THIS INTO YOUR SQL EDITOR)
This is known as a tumbling window.
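The tumbling-window count can be simulated in plain Python: floor each record's timestamp to the window boundary and count per (window, key), which is what the GROUP BY over FLOOR(ROWTIME TO SECOND) does. The timestamps and status codes below are made up for illustration:

```python
from collections import Counter

# Simulate a tumbling window: each record belongs to exactly one window,
# and counts are emitted per (window_start, key) pair.
def tumbling_counts(events, window_seconds=1):
    """events: iterable of (epoch_seconds, key) pairs."""
    counts = Counter()
    for ts, key in events:
        # Floor the timestamp to the start of its window.
        window_start = int(ts // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return counts

# Illustrative records: (processing time, response code)
events = [(100.2, "200"), (100.7, "200"), (100.9, "404"), (101.1, "200")]
result = tumbling_counts(events)
# result[(100, "200")] counts the two "200" records in window [100, 101)
```

Unlike a sliding window, each record contributes to exactly one result row, which is why the output looks like a periodic report.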
Review – In-Application SQL streams
Your application has multiple in-application SQL streams, including
TRANSFORMED_STREAM and AGGREGATE_STREAM. These in-application
streams are like SQL tables that are continuously updated.
What else is unique about an in-application stream aside from its continuous
nature?
Data Warehousing
with Amazon Redshift
Amazon Redshift: Columnar, MPP, OLAP
Amazon Redshift Cluster Architecture
Massively parallel, shared nothing
Leader node
• SQL endpoint
• Stores metadata
• Coordinates parallel SQL processing
Compute nodes
• Local, columnar storage
• Executes queries in parallel
• Load, backup, restore
[Diagram: SQL clients and BI tools connect to the leader node via JDBC/ODBC; the leader node coordinates compute nodes (each with 128 GB RAM, 16 TB disk, and 16 cores) over 10 GigE (HPC); ingestion, backup, and restore move data between the compute nodes and S3, EMR, DynamoDB, or SSH sources.]
Designed for I/O Reduction
Columnar storage
Data compression
Zone maps
aid loc dt
1 SFO 2016-09-01
2 JFK 2016-09-14
3 SFO 2017-04-01
4 JFK 2017-05-14
• Accessing dt with row storage:
– Need to read everything
– Unnecessary I/O
CREATE TABLE loft_deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
);
Designed for I/O Reduction
Columnar storage
Data compression
Zone maps
aid loc dt
1 SFO 2016-09-01
2 JFK 2016-09-14
3 SFO 2017-04-01
4 JFK 2017-05-14
• Accessing dt with columnar storage:
– Only scan blocks for relevant
column
CREATE TABLE loft_deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
);
Designed for I/O Reduction
Columnar storage
Data compression
Zone maps
aid loc dt
1 SFO 2016-09-01
2 JFK 2016-09-14
3 SFO 2017-04-01
4 JFK 2017-05-14
• Columns grow and shrink independently
• Effective compression ratios due to like
data
• Reduces storage requirements
• Reduces I/O
CREATE TABLE loft_deep_dive (
aid INT ENCODE LZO
,loc CHAR(3) ENCODE BYTEDICT
,dt DATE ENCODE RUNLENGTH
);
Designed for I/O Reduction
Columnar storage
Data compression
Zone maps
aid loc dt
1 SFO 2016-09-01
2 JFK 2016-09-14
3 SFO 2017-04-01
4 JFK 2017-05-14
CREATE TABLE loft_deep_dive (
aid INT --audience_id
,loc CHAR(3) --location
,dt DATE --date
);
• In-memory block metadata
• Contains per-block MIN and MAX value
• Effectively prunes blocks which cannot
contain data for a given query
• Eliminates unnecessary I/O
SELECT COUNT(*) FROM LOGS WHERE DATE = '09-JUNE-2013'
MIN: 01-JUNE-2013
MAX: 20-JUNE-2013
MIN: 08-JUNE-2013
MAX: 30-JUNE-2013
MIN: 12-JUNE-2013
MAX: 20-JUNE-2013
MIN: 02-JUNE-2013
MAX: 25-JUNE-2013
Unsorted Table
MIN: 01-JUNE-2013
MAX: 06-JUNE-2013
MIN: 07-JUNE-2013
MAX: 12-JUNE-2013
MIN: 13-JUNE-2013
MAX: 18-JUNE-2013
MIN: 19-JUNE-2013
MAX: 24-JUNE-2013
Sorted By Date
Zone Maps
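The pruning behavior can be sketched in Python: keep a MIN/MAX pair per block and scan only blocks whose range could contain the predicate value. The block ranges below mirror the June dates on the slide; the block layout itself is illustrative:

```python
from datetime import date

# Per-block (MIN, MAX) zone maps, as on the slide.
sorted_blocks = [
    (date(2013, 6, 1),  date(2013, 6, 6)),
    (date(2013, 6, 7),  date(2013, 6, 12)),
    (date(2013, 6, 13), date(2013, 6, 18)),
    (date(2013, 6, 19), date(2013, 6, 24)),
]
unsorted_blocks = [
    (date(2013, 6, 1),  date(2013, 6, 20)),
    (date(2013, 6, 8),  date(2013, 6, 30)),
    (date(2013, 6, 12), date(2013, 6, 20)),
    (date(2013, 6, 2),  date(2013, 6, 25)),
]

def blocks_to_scan(blocks, target):
    """Indexes of blocks whose [MIN, MAX] range could contain target."""
    return [i for i, (lo, hi) in enumerate(blocks) if lo <= target <= hi]

# WHERE DATE = '09-JUNE-2013': the sorted table touches one block,
# the unsorted table touches three overlapping blocks.
scanned = blocks_to_scan(sorted_blocks, date(2013, 6, 9))
scanned_unsorted = blocks_to_scan(unsorted_blocks, date(2013, 6, 9))
```

This is why the SORTKEY chosen in the next activity matters: sorting keeps block ranges narrow and disjoint, so zone maps can eliminate most of the I/O.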
Activity 4
Deliver streaming results to
Amazon Redshift
Activity 4: Deliver data to Amazon Redshift using
Firehose
Time: 5 minutes
We are going to:
A. Connect to an Amazon Redshift cluster and create a table to
hold weblogs data.
B. Update the Amazon Kinesis Analytics application to send
data to Amazon Redshift, via the Firehose delivery stream.
Activity 4A: Connect to Amazon Redshift
You can connect with pgweb
• Installed and configured for the cluster
• Just navigate to pgWeb and start interacting
Note: Click on the Addl. Info tab in qwikLABS and then open the
pgWeb link in a new window.
Or, Use any JDBC/ODBC/libpq client
• Aginity Workbench for Amazon Redshift
• SQL Workbench/J
• DBeaver
• Datagrip
Activity 4B: Create table in Amazon Redshift
Create a table named weblogs to capture the incoming data from the Firehose delivery stream
CREATE TABLE weblogs (
row_time timestamp encode raw,
host_address varchar(512) encode lzo,
request_time timestamp encode raw,
request_method varchar(5) encode lzo,
request_path varchar(1024) encode lzo,
request_protocol varchar(10) encode lzo,
response_code int encode delta,
response_size int encode delta,
referrer_host varchar(1024) encode lzo,
user_agent varchar(512) encode lzo
)
DISTSTYLE EVEN
SORTKEY(request_time);
Note: You can download all of the application code from qwikLabs. Click on the lab
instructions tab in qwikLABS and then download the Amazon Redshift SQL file.
Activity 4C: Deliver Data to Amazon Redshift using
Firehose
Update Amazon Kinesis Analytics application to send data
to a Firehose delivery stream. Firehose delivers the
streaming data to Amazon Redshift.
1. Go to the Amazon Kinesis Analytics console
2. Select your application and choose Application details
3. Choose Connect to a destination
Activity 4C: Deliver Data to Amazon Redshift using
Firehose
4. Configure your destination
1. Choose the Firehose “qls-xxxxxxx-RedshiftDeliveryStream-
xxxxxxxx” delivery stream.
2. Choose CSV as the “Output format”
3. Choose “Create/update <app name> role”
4. Click “Save and continue”
5. It will take about 1 – 2 minutes for everything to be updated and for
data to start appearing in Amazon Redshift.
Review: Amazon Redshift Test Queries
• Find the distribution of response codes over days
SELECT TRUNC(request_time), response_code, COUNT(1) FROM weblogs
GROUP BY 1,2 ORDER BY 1,3 DESC;
• Find the 404 status codes
SELECT COUNT(1) FROM weblogs WHERE response_code = 404;
• Show the most-requested path that returned 404 (PAGE NOT FOUND)
SELECT TOP 1 request_path, COUNT(1) FROM weblogs WHERE
response_code = 404 GROUP BY 1 ORDER BY 2 DESC;
Data processing
with Amazon EMR
Why Amazon EMR?
Easy to Use
Launch a cluster in minutes
Low Cost
Pay an hourly rate
Elastic
Easily add or remove capacity
Reliable
Spend less time monitoring
Secure
Manage firewalls
Flexible
Control the cluster
The Hadoop ecosystem can run in Amazon EMR
Spot Instances for task nodes: up to 90% off Amazon EC2 on-demand pricing
On-Demand Instances for core nodes: standard Amazon EC2 on-demand pricing
Easy to use Spot Instances: meet your SLA at predictable cost with On-Demand core nodes, and exceed it at lower cost with Spot task nodes
Amazon S3 as your persistent data store
Separate compute and storage
Resize and shut down Amazon EMR
clusters with no data loss
Point multiple Amazon EMR clusters at
same data in Amazon S3
EMRFS makes it easier to leverage S3
Better performance and error handling options
Transparent to applications – Use “s3://”
Consistent view: consistent listing and read-after-write for new puts
Support for Amazon S3 server-side and client-side encryption
Faster listing using EMRFS metadata
Apache Spark
• Fast, general-purpose engine
for large-scale data processing
• Write applications quickly in
Java, Scala, or Python
• Combine SQL, streaming, and
complex analytics
Apache Zeppelin
• Web-based notebook for interactive analytics
• Multiple language backends
• Apache Spark integration
• Data visualization
• Collaboration
https://zeppelin.incubator.apache.org/
Activity 5
Ad hoc analysis
using Amazon EMR
Activity 5: Process and query data using Amazon EMR
Time: 20 minutes
We are going to:
A. Use a Zeppelin Notebook to interact with the Amazon EMR cluster
B. Process the data delivered to Amazon S3 by Firehose using
Apache Spark
C. Query the data processed in the earlier stage and create simple
charts
Activity 5A: Open the Zeppelin interface
1. Click on the Lab Instructions tab
in qwikLABS and then download
the Zeppelin Notebook
2. Click on the Addl. Info tab in
qwikLABS and then open the
Zeppelin link in a new window.
3. Import the Notebook using the
Import Note link on Zeppelin
interface
Using Zeppelin interface
Run the Notebook
Run the paragraph
Activity 5B: Run the notebook
Enter the S3 bucket name created by qwikLabs in the
Notebook.
• Execute Step #1 (logs-##########-us-west-2)
• Execute Step #2 to create an RDD from the dataset
delivered by Firehose
• Execute Step #3 to print the first row of the output
• Notice the broken columns. We’re going to fix that next.
Activity 5B: Run the notebook
• Execute Step #4 to process the data
• Combine fields that were split inside quotes (for example, the quoted user agent string)
• Define a schema for the data
• Map the RDD to a data frame
• Execute Step #5 to print the weblogsDF
• See how it differs from the previous output
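The "combine fields" step exists because a naive whitespace split breaks quoted fields like the user agent apart. A quote-aware split keeps them whole; here is a plain-Python sketch using shlex (the sample line is illustrative, not the lab data):

```python
import shlex

# A shortened log line: the request and the user agent are quoted fields.
line = '10.0.0.1 - - "GET /list" 200 1234 "-" "Mozilla/5.0 (Windows NT)"'

naive = line.split()          # quoted strings are broken into pieces
combined = shlex.split(line)  # quoted strings stay as single fields
# combined[3] is the whole request "GET /list" as one field
```

In the notebook, Step #4 does the equivalent recombination before defining a schema and mapping the RDD to a DataFrame.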
Activity 5B: Run the notebook
• Execute Step #6 to register the data frame as a
temporary table
• Now you can run SQL queries on the temporary tables.
• Execute the next 3 steps and observe the charts
created.
• What did you learn about the dataset?
Review: Ad hoc analysis using Amazon EMR
• You just learned how to process and query data using
Amazon EMR with Apache Spark
• Amazon EMR has many other frameworks available for
you to use
• Hive, Presto, MapReduce, Pig
• Hue, Oozie, HBase
Data Visualization with
Amazon QuickSight
Fast, Easy Ad-Hoc Analytics
for Anyone, Everywhere
• Ease of use targeted at business users.
• Blazing fast performance powered by
SPICE.
• Broad connectivity with AWS data
services, on-premises data, files and
business applications.
• Cloud-native solution that scales
automatically.
• 1/10th the cost of traditional BI solutions.
• Create, share and collaborate with anyone
in your organization, on the web or on
mobile.
Connect, SPICE, Analyze
QuickSight lets you connect to data from a wide variety of AWS, third-party, and on-
premises sources and either import it into SPICE or query it directly. Users can then
easily explore, analyze, and share their insights with anyone.
Amazon RDS
Amazon S3
Amazon Redshift
Activity 6
Visualize results in QuickSight
Activity 6: Visualization with QuickSight
We are going to:
A. Register for a QuickSight account
B. Connect to the Amazon Redshift cluster
C. Create visualizations to answer questions like:
A. What are the most common HTTP requests, and how
successful are they?
B. Which URIs are requested the most?
Activity 6A: QuickSight Registration
• Go to the console and choose QuickSight
from the Analytics section.
• Choose Sign up for QuickSight.
• On the registration screen, choose the
US West (Oregon) region.
• Enter the AWS account number and
your email.
Note: You can get the AWS account
number from qwikLABS in the Connect
tab
Activity 6B: Connect to Amazon Redshift
• Click Manage Data
to create a new
data set in
QuickSight.
• Choose Redshift
(Auto-discovered)
as the data source.
Activity 6B: Connect to Amazon Redshift
Note: You can get the password from qwikLABS by navigating to the
Custom Connection Details section on the Connect tab
Activity 6C: Ingest data into SPICE
Activity 6C: Ingest data into SPICE
• SPICE is Amazon QuickSight's
in-memory optimized
calculation engine, designed
specifically for fast, ad-hoc data
visualization.
• For faster analytics, you can
import the data into SPICE
instead of using a direct query
to the database.
Activity 6D: Creating your first analysis
What are the most common HTTP requests made?
• Simply select request_type and response_code, and let
AUTOGRAPH create the optimal visualization
Review – Creating your Analysis
Exercise: Create a visual to show which URIs are the most
requested
Congratulations!!!
Your Big Data Application Architecture
Amazon
Kinesis
Producer UI
Amazon
Kinesis
Firehose
Amazon
Kinesis
Analytics
Amazon Kinesis
Firehose
Amazon
EMR
Amazon
Redshift
Amazon
QuickSight
Generate web
logs
Collect web logs
and deliver to S3
Process & compute
aggregate metrics
Deliver processed web
logs to Amazon Redshift
Raw web logs from
Firehose
Run SQL queries on
processed web logs
Visualize web logs to
discover insights
Amazon S3
Bucket
Ad-hoc analysis
of web logs
Thank you!
Remember to complete
your evaluations!
 
AWS re:Invent 2016: Effective Application Data Analytics for Modern Applicati...
AWS re:Invent 2016: Effective Application Data Analytics for Modern Applicati...AWS re:Invent 2016: Effective Application Data Analytics for Modern Applicati...
AWS re:Invent 2016: Effective Application Data Analytics for Modern Applicati...Amazon Web Services
 
Creating a Data Driven Culture with Amazon QuickSight - Technical 201
Creating a Data Driven Culture with Amazon QuickSight - Technical 201Creating a Data Driven Culture with Amazon QuickSight - Technical 201
Creating a Data Driven Culture with Amazon QuickSight - Technical 201Amazon Web Services
 
Apache Spark and the Hadoop Ecosystem on AWS
Apache Spark and the Hadoop Ecosystem on AWSApache Spark and the Hadoop Ecosystem on AWS
Apache Spark and the Hadoop Ecosystem on AWSAmazon Web Services
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...Amazon Web Services
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightAmazon Web Services
 
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)Amazon Web Services
 
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...Amazon Web Services
 
NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.
NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.
NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.Amazon Web Services
 

Viewers also liked (20)

AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
 
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
 
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
 
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
 
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)
 
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
 
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
 
Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301
 
AWS re:Invent 2016: Visualizing Big Data Insights with Amazon QuickSight (BDM...
AWS re:Invent 2016: Visualizing Big Data Insights with Amazon QuickSight (BDM...AWS re:Invent 2016: Visualizing Big Data Insights with Amazon QuickSight (BDM...
AWS re:Invent 2016: Visualizing Big Data Insights with Amazon QuickSight (BDM...
 
AWS re:Invent 2016: Serverless Architectural Patterns and Best Practices (ARC...
AWS re:Invent 2016: Serverless Architectural Patterns and Best Practices (ARC...AWS re:Invent 2016: Serverless Architectural Patterns and Best Practices (ARC...
AWS re:Invent 2016: Serverless Architectural Patterns and Best Practices (ARC...
 
AWS October Webinar Series - Introducing Amazon QuickSight
AWS October Webinar Series - Introducing Amazon QuickSightAWS October Webinar Series - Introducing Amazon QuickSight
AWS October Webinar Series - Introducing Amazon QuickSight
 
AWS re:Invent 2016: Effective Application Data Analytics for Modern Applicati...
AWS re:Invent 2016: Effective Application Data Analytics for Modern Applicati...AWS re:Invent 2016: Effective Application Data Analytics for Modern Applicati...
AWS re:Invent 2016: Effective Application Data Analytics for Modern Applicati...
 
Creating a Data Driven Culture with Amazon QuickSight - Technical 201
Creating a Data Driven Culture with Amazon QuickSight - Technical 201Creating a Data Driven Culture with Amazon QuickSight - Technical 201
Creating a Data Driven Culture with Amazon QuickSight - Technical 201
 
Apache Spark and the Hadoop Ecosystem on AWS
Apache Spark and the Hadoop Ecosystem on AWSApache Spark and the Hadoop Ecosystem on AWS
Apache Spark and the Hadoop Ecosystem on AWS
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSight
 
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)
 
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...
 
NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.
NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.
NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.
 

Similar to AWS re:Invent 2016: Workshop: Building Your First Big Data Application with AWS (BDM202)

Workshop: Building a streaming data platform on AWS
Workshop: Building a streaming data platform on AWSWorkshop: Building a streaming data platform on AWS
Workshop: Building a streaming data platform on AWSAmazon Web Services
 
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...Amazon Web Services
 
Building a Streaming Data Platform on AWS - Workshop
Building a Streaming Data Platform on AWS - WorkshopBuilding a Streaming Data Platform on AWS - Workshop
Building a Streaming Data Platform on AWS - WorkshopAmazon Web Services
 
Getting started with Amazon Kinesis
Getting started with Amazon KinesisGetting started with Amazon Kinesis
Getting started with Amazon KinesisAmazon Web Services
 
Getting started with amazon kinesis
Getting started with amazon kinesisGetting started with amazon kinesis
Getting started with amazon kinesisJampp
 
Build Your First Big Data Application on AWS (ANT213-R1) - AWS re:Invent 2018
Build Your First Big Data Application on AWS (ANT213-R1) - AWS re:Invent 2018Build Your First Big Data Application on AWS (ANT213-R1) - AWS re:Invent 2018
Build Your First Big Data Application on AWS (ANT213-R1) - AWS re:Invent 2018Amazon Web Services
 
Deep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsDeep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsAmazon Web Services
 
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Amazon Web Services
 
Easy Analytics with AWS - AWS Summit Bahrain 2017
Easy Analytics with AWS - AWS Summit Bahrain 2017Easy Analytics with AWS - AWS Summit Bahrain 2017
Easy Analytics with AWS - AWS Summit Bahrain 2017Amazon Web Services
 
Getting Started with Amazon Kinesis
Getting Started with Amazon KinesisGetting Started with Amazon Kinesis
Getting Started with Amazon KinesisAmazon Web Services
 
Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016 Webi...
Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016  Webi...Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016  Webi...
Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016 Webi...Amazon Web Services
 
Running Your SQL Server Database on Amazon RDS (DAT329) - AWS re:Invent 2018
Running Your SQL Server Database on Amazon RDS (DAT329) - AWS re:Invent 2018Running Your SQL Server Database on Amazon RDS (DAT329) - AWS re:Invent 2018
Running Your SQL Server Database on Amazon RDS (DAT329) - AWS re:Invent 2018Amazon Web Services
 
Considerations for Building Your First Streaming Application (ANT359) - AWS r...
Considerations for Building Your First Streaming Application (ANT359) - AWS r...Considerations for Building Your First Streaming Application (ANT359) - AWS r...
Considerations for Building Your First Streaming Application (ANT359) - AWS r...Amazon Web Services
 
Automate Your Big Data Workflows (SVC201) | AWS re:Invent 2013
Automate Your Big Data Workflows (SVC201) | AWS re:Invent 2013Automate Your Big Data Workflows (SVC201) | AWS re:Invent 2013
Automate Your Big Data Workflows (SVC201) | AWS re:Invent 2013Amazon Web Services
 
London Redshift Meetup - July 2017
London Redshift Meetup - July 2017London Redshift Meetup - July 2017
London Redshift Meetup - July 2017Pratim Das
 
AWS re:Invent 2016: IoT Visualizations and Analytics (IOT306)
AWS re:Invent 2016: IoT Visualizations and Analytics (IOT306)AWS re:Invent 2016: IoT Visualizations and Analytics (IOT306)
AWS re:Invent 2016: IoT Visualizations and Analytics (IOT306)Amazon Web Services
 
Querying and Analyzing Data in Amazon S3
Querying and Analyzing Data in Amazon S3Querying and Analyzing Data in Amazon S3
Querying and Analyzing Data in Amazon S3Amazon Web Services
 
AWS Summit Stockholm 2014 – B4 – Business intelligence on AWS
AWS Summit Stockholm 2014 – B4 – Business intelligence on AWSAWS Summit Stockholm 2014 – B4 – Business intelligence on AWS
AWS Summit Stockholm 2014 – B4 – Business intelligence on AWSAmazon Web Services
 
BDA309 Build Your First Big Data Application on AWS
BDA309 Build Your First Big Data Application on AWSBDA309 Build Your First Big Data Application on AWS
BDA309 Build Your First Big Data Application on AWSAmazon Web Services
 

Similar to AWS re:Invent 2016: Workshop: Building Your First Big Data Application with AWS (BDM202) (20)

Workshop: Building a streaming data platform on AWS
Workshop: Building a streaming data platform on AWSWorkshop: Building a streaming data platform on AWS
Workshop: Building a streaming data platform on AWS
 
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
 
Building a Streaming Data Platform on AWS - Workshop
Building a Streaming Data Platform on AWS - WorkshopBuilding a Streaming Data Platform on AWS - Workshop
Building a Streaming Data Platform on AWS - Workshop
 
Getting started with Amazon Kinesis
Getting started with Amazon KinesisGetting started with Amazon Kinesis
Getting started with Amazon Kinesis
 
Getting started with amazon kinesis
Getting started with amazon kinesisGetting started with amazon kinesis
Getting started with amazon kinesis
 
Build Your First Big Data Application on AWS (ANT213-R1) - AWS re:Invent 2018
Build Your First Big Data Application on AWS (ANT213-R1) - AWS re:Invent 2018Build Your First Big Data Application on AWS (ANT213-R1) - AWS re:Invent 2018
Build Your First Big Data Application on AWS (ANT213-R1) - AWS re:Invent 2018
 
Deep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming ApplicationsDeep Dive and Best Practices for Real Time Streaming Applications
Deep Dive and Best Practices for Real Time Streaming Applications
 
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
 
Easy Analytics with AWS - AWS Summit Bahrain 2017
Easy Analytics with AWS - AWS Summit Bahrain 2017Easy Analytics with AWS - AWS Summit Bahrain 2017
Easy Analytics with AWS - AWS Summit Bahrain 2017
 
Getting Started with Amazon Kinesis
Getting Started with Amazon KinesisGetting Started with Amazon Kinesis
Getting Started with Amazon Kinesis
 
Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016 Webi...
Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016  Webi...Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016  Webi...
Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016 Webi...
 
Running Your SQL Server Database on Amazon RDS (DAT329) - AWS re:Invent 2018
Running Your SQL Server Database on Amazon RDS (DAT329) - AWS re:Invent 2018Running Your SQL Server Database on Amazon RDS (DAT329) - AWS re:Invent 2018
Running Your SQL Server Database on Amazon RDS (DAT329) - AWS re:Invent 2018
 
Considerations for Building Your First Streaming Application (ANT359) - AWS r...
Considerations for Building Your First Streaming Application (ANT359) - AWS r...Considerations for Building Your First Streaming Application (ANT359) - AWS r...
Considerations for Building Your First Streaming Application (ANT359) - AWS r...
 
Automate Your Big Data Workflows (SVC201) | AWS re:Invent 2013
Automate Your Big Data Workflows (SVC201) | AWS re:Invent 2013Automate Your Big Data Workflows (SVC201) | AWS re:Invent 2013
Automate Your Big Data Workflows (SVC201) | AWS re:Invent 2013
 
London Redshift Meetup - July 2017
London Redshift Meetup - July 2017London Redshift Meetup - July 2017
London Redshift Meetup - July 2017
 
Real-Time Streaming Data on AWS
Real-Time Streaming Data on AWSReal-Time Streaming Data on AWS
Real-Time Streaming Data on AWS
 
AWS re:Invent 2016: IoT Visualizations and Analytics (IOT306)
AWS re:Invent 2016: IoT Visualizations and Analytics (IOT306)AWS re:Invent 2016: IoT Visualizations and Analytics (IOT306)
AWS re:Invent 2016: IoT Visualizations and Analytics (IOT306)
 
Querying and Analyzing Data in Amazon S3
Querying and Analyzing Data in Amazon S3Querying and Analyzing Data in Amazon S3
Querying and Analyzing Data in Amazon S3
 
AWS Summit Stockholm 2014 – B4 – Business intelligence on AWS
AWS Summit Stockholm 2014 – B4 – Business intelligence on AWSAWS Summit Stockholm 2014 – B4 – Business intelligence on AWS
AWS Summit Stockholm 2014 – B4 – Business intelligence on AWS
 
BDA309 Build Your First Big Data Application on AWS
BDA309 Build Your First Big Data Application on AWSBDA309 Build Your First Big Data Application on AWS
BDA309 Build Your First Big Data Application on AWS
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
AWS re:Invent 2016: Workshop: Building Your First Big Data Application with AWS (BDM202)

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Rahul Bhartia, Principal Solutions Architect, AWS
Ryan Nienhuis, Senior Product Manager, AWS
Tony Gibbs, Solutions Architect, AWS
Radhika Ravirala, Solutions Architect, AWS
November 29, 2016
BDM202 Workshop: Building Your First Big Data Application on AWS

  • 2. Objectives for Today
Build an end-to-end big data application that captures, stores, processes, and analyzes web logs.
At the end of the workshop, you will:
1. Understand how a common scenario is implemented using AWS big data tools
2. Understand how to use:
   - Amazon Kinesis for real-time data
   - Amazon Redshift for data warehousing
   - Amazon EMR for data processing
   - Amazon QuickSight for data visualization

  • 3. Your Big Data Application Architecture
- Amazon Kinesis Producer UI: generate web logs
- Amazon Kinesis Firehose: collect web logs and deliver to an Amazon S3 bucket
- Amazon Kinesis Analytics: process and compute aggregate metrics
- Amazon Kinesis Firehose: deliver processed web logs to Amazon Redshift
- Amazon EMR: ad hoc analysis of the raw web logs from Firehose
- Amazon Redshift: run SQL queries on processed web logs
- Amazon QuickSight: visualize web logs to discover insights

  • 5. What is qwikLABS?
- Access to AWS services for this bootcamp
- No need to provide a credit card
- Automatically deleted when you're finished
Go to http://events-aws.qwiklab.com and create an account with the same email that you used to register for this bootcamp.

  • 6. Sign in and start the course
Once the course is started, you will see "Create in Progress" in the upper right corner.

  • 7. Navigating qwikLABS
- Connect tab: access and login information
- Addl. Info tab: links to interfaces
- Lab Instructions tab: scripts for your labs

  • 8. Everything you need for the lab
Open the AWS Management Console, log in, and verify:
- Two Amazon Kinesis Firehose delivery streams
- One Amazon EMR cluster
- One Amazon Redshift cluster
Sign up for:
- Amazon QuickSight
  • 10. Amazon Kinesis: Streaming Data Made Easy
Services that make it easy to capture, deliver, and process streams on AWS:
- Amazon Kinesis Streams
- Amazon Kinesis Firehose
- Amazon Kinesis Analytics

  • 11. Amazon Kinesis Streams
- Easy administration
- Build real-time applications with the framework of your choice
- Low cost

  • 12. Amazon Kinesis Firehose
- Zero administration
- Direct-to-data-store integration
- Seamless elasticity

  • 13. Amazon Kinesis: Firehose vs. Streams
Amazon Kinesis Streams is for use cases that require custom, per-record processing with sub-second latency and a choice of stream processing frameworks.
Amazon Kinesis Firehose is for use cases that require zero administration, work with existing analytics tools based on Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service, and can tolerate a data latency of 60 seconds or higher.

  • 14. Activity 1: Collect web logs using a Firehose delivery stream

  • 15. Capture data using a Firehose delivery stream
Time: 5 minutes
We are going to:
A. Use a simple producer application that writes web logs into a Firehose delivery stream configured to write data into Amazon S3.
Sample log line:
75.35.230.210 - - [20/Jul/2009:22:22:42 -0700] "GET /images/pigtrihawk.jpg " 200 29236 "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11 (.NET CLR 3.5.30729)"

  • 16. Activity 1A: Writing to a Firehose delivery stream
To simplify this, we are going to use a simple tool that creates sample data and writes it to an Amazon Kinesis Firehose delivery stream.
1. Go to the simple data producer UI: http://bit.ly/kinesis-producer
2. Copy and paste your credentials into the form. The credentials do not leave your client (the site is hosted locally), and the site uses the AWS JavaScript SDK over HTTPS. In normal situations, do not use your credentials this way unless you trust the provider.
3. Specify the region (US West (Oregon)) and the delivery stream name (qls-somerandomnumber-FirehoseDeliveryStream-somerandom-11111111111).
4. Specify a data rate between 100 and 500; pick a unique number. This controls the rate at which records are sent to the stream.

  • 17. Activity 1A: Writing to a Firehose delivery stream
5. Copy and paste this template into the text editor (make sure there is a new line after the record when you paste):
{{internet.ip}} - - [{{date.now("DD/MMM/YYYY:HH:mm:ss Z")}}] "{{random.weightedArrayElement({"weights":[0.6,0.1,0.1,0.2],"data":["GET","POST","DELETE","PUT"]})}} {{random.arrayElement(["/list","/wp-content","/wp-admin","/explore","/search/tag/list","/app/main/posts","/posts/posts/explore"])}}" {{random.weightedArrayElement({"weights": [0.9,0.04,0.02,0.04], "data":["200","404","500","301"]})}} {{random.number(10000)}} "-" "{{internet.userAgent}}"
The tool generates records based on this format, batching hundreds of them into a single put using PutRecords. Note: the tool will drain your battery, so if you don't have a power cord, only keep it running while we are completing the activities.
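If you want to generate the same kind of test records outside the browser tool, the template above can be approximated with Python's standard library. This is a minimal sketch, not the tool's actual implementation: the weighted choices mirror the template's weights via list repetition, and the user agent is a fixed placeholder rather than a randomized one.

```python
import random
from datetime import datetime, timezone

# Weighted pools approximating the template's weightedArrayElement calls.
METHODS = ["GET"] * 60 + ["POST"] * 10 + ["DELETE"] * 10 + ["PUT"] * 20
PATHS = ["/list", "/wp-content", "/wp-admin", "/explore",
         "/search/tag/list", "/app/main/posts", "/posts/posts/explore"]
CODES = ["200"] * 90 + ["404"] * 4 + ["500"] * 2 + ["301"] * 4

def make_log_line():
    """Build one Apache-style log record matching the producer template."""
    ip = ".".join(str(random.randint(1, 254)) for _ in range(4))
    ts = datetime.now(timezone.utc).strftime("%d/%b/%Y:%H:%M:%S +0000")
    return '%s - - [%s] "%s %s" %s %s "-" "Mozilla/5.0"' % (
        ip, ts, random.choice(METHODS), random.choice(PATHS),
        random.choice(CODES), random.randint(0, 10000))

# The producer tool batches hundreds of lines like this
# (each followed by a newline) into a single batched put.
```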
  • 18. Activity 1A: Writing to a Firehose delivery stream

  • 19. Review: Monitoring a Delivery Stream
Go to the Amazon CloudWatch Metrics console and search for "IncomingRecords". Select this metric for your Firehose delivery stream and choose a 1-minute SUM.
What are the most important metrics to monitor?

  • 20. Activity 2: Real-time processing using Amazon Kinesis Analytics

  • 21. Amazon Kinesis Analytics
- Apply SQL on streams
- Build real-time, stream processing applications
- Easy scalability

  • 22. Use SQL to build real-time applications
Easily write SQL code to process streaming data: connect to a streaming source, then continuously deliver SQL results.

  • 23. Process Data using Amazon Kinesis Analytics
Time: 20 minutes
We are going to:
A. Create an Amazon Kinesis Analytics application that reads from the delivery stream with log data
B. Write a SQL query to parse and transform the log
C. Add another query to compute an aggregate metric for an interesting statistic on the incoming data

  • 24. Activity 2A: Create an Amazon Kinesis Analytics Application
1. Go to the Amazon Kinesis Analytics console: https://us-west-2.console.aws.amazon.com/kinesisanalytics/home?region=us-west-2
2. Click "Create new application" and provide a name.
3. Click "Connect to a source" and select the stream you created in the previous exercise (qls-randomnumber-FirehoseDeliveryStream-randomnumber).
4. Amazon Kinesis Analytics will discover the schema for you if your data is UTF-8 encoded CSV or JSON. In this case, we have a custom log format, so discovery incorrectly assumes a delimiter.

  • 25. Activity 2A: Create an Amazon Kinesis Analytics Application
Amazon Kinesis Analytics comes with powerful string manipulation and log parsing functions, so we are going to define our own schema to handle this format and then parse it in SQL code.
5. Click "Edit schema".
6. Verify that the format is CSV and the row delimiter is '\n', and change the column delimiter to '\t'.
7. Remove all the existing columns and create a single column named DATA of type VARCHAR(5000).
8. Click "Save schema and update stream samples". This will run your application (takes about a minute).

  • 26. Activity 2A: Create an Amazon Kinesis Analytics Application
There are two system columns added to each record:
- ROWTIME represents the time the application read the record and is a special column used for time series analytics. This is also known as the processing time.
- APPROXIMATE_ARRIVAL_TIME is the time the delivery stream received the record. This is also known as the ingest time.

  • 27. Activity 2A: Create an Amazon Kinesis Analytics Application
10. Click "Exit (done)" on the schema editing page.
11. Click "Cancel" on the source configuration page.
12. Click "Go To SQL Editor". You now have a running stream processing application!

  • 28. Activity 2B: Writing SQL over Streaming Data
Writing SQL over streaming data using Amazon Kinesis Analytics follows a two-part model:
1. Create an in-application stream for storing intermediate SQL results. An in-application stream is like a SQL table, but is continuously updated.
2. Create a PUMP, which continuously reads FROM one in-application stream and INSERTs INTO a target in-application stream.
Data flow: SOURCE_SQL_STREAM_001 (source stream) → TRANSFORM_PUMP → DESTINATION_SQL_STREAM (Part 1, Activity 2-B) → AGGREGATE_PUMP → AGGREGATE_STREAM (Part 2, Activity 2-C) → send to Redshift
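In application code, the two-part model looks like the sketch below. This is a minimal pass-through example using the single DATA column defined in Activity 2A; the lab's actual transform (downloadable from qwikLABS) parses the raw column into fields rather than copying it through.

```sql
-- Part 1: in-application stream for intermediate results.
CREATE OR REPLACE STREAM "TRANSFORMED_STREAM" ("DATA" VARCHAR(5000));

-- Part 2: pump that continuously reads the source stream
-- and inserts into the target in-application stream.
CREATE OR REPLACE PUMP "TRANSFORM_PUMP" AS
INSERT INTO "TRANSFORMED_STREAM"
SELECT STREAM "DATA"
FROM "SOURCE_SQL_STREAM_001";
```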
  • 29. Activity 2B: Add your first query
Our first query uses a special function called W3C_LOG_PARSE to separate the log into specific columns. Remember, this is a two-part process:
1. Create the in-application stream to store intermediate SQL results.
2. Create a pump that selects data from the source stream and inserts it into the target in-application stream.
Your application code is replaced each time you click "Save and Run". As you add queries, append them to the bottom of the SQL editor.
Note: You can download all of the application code from qwikLABS. Click on the Lab Instructions tab in qwikLABS and then download the Kinesis Analytics SQL file.

  • 30. Activity 2C: Calculate an aggregate metric
Calculate a count using a tumbling window and a GROUP BY clause. A tumbling window is similar to a periodic report: you specify a query and a time range, and results are emitted at the end of the range (e.g., COUNT the number of items by key every 10 seconds).

  • 31. Activity 2C: Calculate an aggregate metric
The window is defined by the following expression in the SELECT statement (do not insert this into your SQL editor by itself). Note that the ROWTIME column is implicitly included in every stream query and represents the processing time of the application:
FLOOR("SOURCE_SQL_STREAM_001".ROWTIME TO SECOND)
This is known as a tumbling window.
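The tumbling-window semantics of grouping by FLOOR(ROWTIME TO ...) can be simulated in plain Python. This sketch is an illustration of the idea, not Kinesis Analytics code: each record falls into exactly one fixed-width window, and one aggregate row is emitted per window and key.

```python
from collections import Counter
from datetime import datetime, timezone

def tumbling_counts(records, window_seconds):
    """Count records per (window, key). Each record belongs to exactly one
    fixed-width window -- the semantics of GROUP BY key, FLOOR(ROWTIME TO <unit>)."""
    counts = Counter()
    for ts, key in records:
        # Floor the timestamp to the start of its window.
        window_start = int(ts.timestamp()) // window_seconds * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)

records = [
    (datetime(2016, 11, 29, 10, 0, 1, tzinfo=timezone.utc), "GET"),
    (datetime(2016, 11, 29, 10, 0, 3, tzinfo=timezone.utc), "GET"),
    (datetime(2016, 11, 29, 10, 0, 4, tzinfo=timezone.utc), "POST"),
    (datetime(2016, 11, 29, 10, 0, 12, tzinfo=timezone.utc), "GET"),
]
# Two 10-second windows: the first holds 2 GETs and 1 POST, the second 1 GET.
by_window = tumbling_counts(records, 10)
```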
  • 32. Review: In-Application SQL Streams
Your application has multiple in-application SQL streams, including TRANSFORMED_STREAM and AGGREGATE_STREAM. These in-application streams are like SQL tables that are continuously updated.
What else is unique about an in-application stream aside from its continuous nature?

  • 35. Amazon Redshift Cluster Architecture
Massively parallel, shared-nothing architecture.
Leader node:
- SQL endpoint (JDBC/ODBC for SQL clients and BI tools)
- Stores metadata
- Coordinates parallel SQL processing
Compute nodes (e.g., 128 GB RAM, 16 TB disk, 16 cores each, connected over 10 GigE):
- Local, columnar storage
- Execute queries in parallel
- Load, backup, and restore via S3 / EMR / DynamoDB / SSH

  • 36. Designed for I/O Reduction: Columnar Storage
CREATE TABLE loft_deep_dive (
  aid INT      --audience_id
  ,loc CHAR(3) --location
  ,dt DATE     --date
);
Sample rows (aid, loc, dt): (1, SFO, 2016-09-01), (2, JFK, 2016-09-14), (3, SFO, 2017-04-01), (4, JFK, 2017-05-14)
Accessing dt with row storage: you need to read everything, which is unnecessary I/O.

  • 37. Designed for I/O Reduction: Columnar Storage
Accessing dt with columnar storage: only scan blocks for the relevant column.

  • 38. Designed for I/O Reduction: Data Compression
CREATE TABLE loft_deep_dive (
  aid INT ENCODE LZO
  ,loc CHAR(3) ENCODE BYTEDICT
  ,dt DATE ENCODE RUNLENGTH
);
- Columns grow and shrink independently
- Effective compression ratios due to like data
- Reduces storage requirements
- Reduces I/O

  • 39. Designed for I/O Reduction: Zone Maps
- In-memory block metadata
- Contains per-block MIN and MAX values
- Effectively prunes blocks which cannot contain data for a given query
- Eliminates unnecessary I/O

  • 40. Zone Maps
SELECT COUNT(*) FROM LOGS WHERE DATE = '09-JUNE-2013'
Unsorted table (block MIN..MAX): [01-JUNE .. 20-JUNE], [08-JUNE .. 30-JUNE], [12-JUNE .. 20-JUNE], [02-JUNE .. 25-JUNE]
Sorted by date (block MIN..MAX): [01-JUNE .. 06-JUNE], [07-JUNE .. 12-JUNE], [13-JUNE .. 18-JUNE], [19-JUNE .. 24-JUNE]
Sorting narrows each block's range, so the zone maps can prune all but one block.
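The pruning decision from the slide above can be sketched in a few lines of Python. This is an illustration of the idea, not Redshift internals: a block is read only if its [MIN, MAX] range could contain the predicate value.

```python
def blocks_to_scan(blocks, value):
    """Return only the blocks whose [min, max] range could contain `value`;
    all other blocks are pruned without ever being read from disk."""
    return [b for b in blocks if b["min"] <= value <= b["max"]]

# Block metadata from the slide, written as ISO dates so the
# string comparisons order correctly.
unsorted_blocks = [
    {"min": "2013-06-01", "max": "2013-06-20"},
    {"min": "2013-06-08", "max": "2013-06-30"},
    {"min": "2013-06-12", "max": "2013-06-20"},
    {"min": "2013-06-02", "max": "2013-06-25"},
]
sorted_blocks = [
    {"min": "2013-06-01", "max": "2013-06-06"},
    {"min": "2013-06-07", "max": "2013-06-12"},
    {"min": "2013-06-13", "max": "2013-06-18"},
    {"min": "2013-06-19", "max": "2013-06-24"},
]

# WHERE date = '2013-06-09': the unsorted table must scan 3 of 4 blocks,
# while the sorted table scans only 1.
```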
  • 41. Activity 4: Deliver streaming results to Amazon Redshift

  • 42. Activity 4: Deliver data to Amazon Redshift using Firehose
Time: 5 minutes
We are going to:
A. Connect to an Amazon Redshift cluster and create a table to hold web log data.
B. Update the Amazon Kinesis Analytics application to send data to Amazon Redshift via the Firehose delivery stream.

  • 43. Activity 4A: Connect to Amazon Redshift
You can connect with pgweb:
- Installed and configured for the cluster
- Just navigate to pgweb and start interacting
Note: Click on the Addl. Info tab in qwikLABS and then open the pgweb link in a new window.
Or use any JDBC/ODBC/libpq client:
- Aginity Workbench for Amazon Redshift
- SQL Workbench/J
- DBeaver
- DataGrip

  • 44. Activity 4B: Create a table in Amazon Redshift
Create a table named weblogs to capture the incoming data from the Firehose delivery stream:
CREATE TABLE weblogs (
  row_time timestamp encode raw,
  host_address varchar(512) encode lzo,
  request_time timestamp encode raw,
  request_method varchar(5) encode lzo,
  request_path varchar(1024) encode lzo,
  request_protocol varchar(10) encode lzo,
  response_code int encode delta,
  response_size int encode delta,
  referrer_host varchar(1024) encode lzo,
  user_agent varchar(512) encode lzo
)
DISTSTYLE EVEN
SORTKEY(request_time);
Note: You can download all of the application code from qwikLABS. Click on the Lab Instructions tab in qwikLABS and then download the Amazon Redshift SQL file.

  • 45. Activity 4C: Deliver Data to Amazon Redshift using Firehose
Update the Amazon Kinesis Analytics application to send data to a Firehose delivery stream; Firehose then delivers the streaming data to Amazon Redshift.
1. Go to the Amazon Kinesis Analytics console.
2. Select your application and choose Application details.
3. Choose Connect to a destination.

  • 46. Activity 4C: Deliver Data to Amazon Redshift using Firehose
4. Configure your destination:
   1. Choose the "qls-xxxxxxx-RedshiftDeliveryStream-xxxxxxxx" Firehose delivery stream.
   2. Choose CSV as the output format.
   3. Choose "Create/update <app name> role".
   4. Click "Save and continue".
5. It will take about 1-2 minutes for everything to be updated and for data to start appearing in Amazon Redshift.

  • 47. Review: Amazon Redshift Test Queries
Find the distribution of response codes over days:
SELECT TRUNC(request_time), response_code, COUNT(1) FROM weblogs GROUP BY 1,2 ORDER BY 1,3 DESC;
Count the requests that returned a 404 status code:
SELECT COUNT(1) FROM weblogs WHERE response_code = 404;
Find the most requested path that returned 404 (page not found):
SELECT TOP 1 request_path, COUNT(1) FROM weblogs WHERE response_code = 404 GROUP BY 1 ORDER BY 2 DESC;
  • 49. Why Amazon EMR?
- Easy to use: launch a cluster in minutes
- Low cost: pay an hourly rate
- Elastic: easily add or remove capacity
- Reliable: spend less time monitoring
- Secure: manage firewalls
- Flexible: control the cluster

  • 50. The Hadoop ecosystem can run in Amazon EMR

  • 51. Spot Instances for task nodes: up to 90% off Amazon EC2 on-demand pricing.
On-Demand for core nodes: standard Amazon EC2 pricing for on-demand capacity.
Spot Instances are easy to use: meet your SLA at predictable cost, or exceed your SLA at lower cost.

  • 52. Amazon S3 as your persistent data store
- Separate compute and storage
- Resize and shut down Amazon EMR clusters with no data loss
- Point multiple Amazon EMR clusters at the same data in Amazon S3

  • 53. EMRFS makes it easier to leverage S3
- Better performance and error handling options
- Transparent to applications: use "s3://"
- Consistent view: consistent list and read-after-write for new puts
- Support for Amazon S3 server-side and client-side encryption
- Faster listing using EMRFS metadata

  • 54. Apache Spark
- Fast, general-purpose engine for large-scale data processing
- Write applications quickly in Java, Scala, or Python
- Combine SQL, streaming, and complex analytics

  • 55. Apache Zeppelin
- Web-based notebook for interactive analytics
- Multiple language back ends
- Apache Spark integration
- Data visualization
- Collaboration
https://zeppelin.incubator.apache.org/

  • 56. Activity 5: Ad hoc analysis using Amazon EMR

  • 57. Activity 5: Process and query data using Amazon EMR
Time: 20 minutes
We are going to:
A. Use a Zeppelin notebook to interact with the Amazon EMR cluster
B. Process the data delivered to Amazon S3 by Firehose using Apache Spark
C. Query the data processed in the earlier stage and create simple charts

  • 58. Activity 5A: Open the Zeppelin interface
1. Click on the Lab Instructions tab in qwikLABS and then download the Zeppelin notebook.
2. Click on the Addl. Info tab in qwikLABS and then open the Zeppelin link in a new window.
3. Import the notebook using the Import Note link in the Zeppelin interface.

  • 59. Using the Zeppelin interface
You can run the whole notebook, or run an individual paragraph.

  • 60. Activity 5B: Run the notebook
Enter the S3 bucket name created by qwikLABS in the notebook (logs-##########-us-west-2).
- Execute Step #1.
- Execute Step #2 to create an RDD from the dataset delivered by Firehose.
- Execute Step #3 to print the first row of the output. Notice the broken columns; we're going to fix that next.

  • 61. Activity 5B: Run the notebook
- Execute Step #4 to process the data: recombine fields that the delimiter split apart, define a schema for the data, and map the RDD to a DataFrame.
- Execute Step #5 to print weblogsDF and see how different it is from the previous output.
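The parsing the notebook performs with Spark in Step #4 can be sketched in plain Python. The regex below is an assumption that approximates the workshop's log format (the actual notebook defines its own schema and maps the result to a DataFrame):

```python
import re

# Common Log Format with referrer and user agent, as produced in Activity 1.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<code>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Split one raw web-log line into named fields, or None if malformed."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None
```

Once lines are parsed into records like this, the per-field queries and charts in the following steps become straightforward.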
  • 62. Activity 5B: Run the notebook
- Execute Step #6 to register the DataFrame as a temporary table. Now you can run SQL queries on temporary tables.
- Execute the next three steps and observe the charts created.
- What did you learn about the dataset?

  • 63. Review: Ad hoc analysis using Amazon EMR
- You just learned how to process and query data using Amazon EMR with Apache Spark.
- Amazon EMR has many other frameworks available for you to use: Hive, Presto, MapReduce, Pig, Hue, Oozie, HBase.

  • 65. Fast, Easy Ad Hoc Analytics for Anyone, Everywhere
- Ease of use targeted at business users
- Blazing-fast performance powered by SPICE
- Broad connectivity with AWS data services, on-premises data, files, and business applications
- Cloud-native solution that scales automatically
- 1/10th the cost of traditional BI solutions
- Create, share, and collaborate with anyone in your organization, on the web or on mobile

  • 66. Connect, SPICE, Analyze
QuickSight allows you to connect to data from a wide variety of AWS, third-party, and on-premises sources (e.g., Amazon RDS, Amazon S3, Amazon Redshift) and either import it into SPICE or query it directly. Users can then easily explore, analyze, and share their insights with anyone.

  • 68. Activity 6: Visualization with QuickSight
We are going to:
A. Register for a QuickSight account
B. Connect to the Amazon Redshift cluster
C. Create visualizations to answer questions like:
   - What are the most common HTTP requests, and how successful are they?
   - Which are the most requested URIs?

  • 69. Activity 6A: QuickSight Registration
- Go to the console and choose QuickSight from the Analytics section.
- Choose Sign up for QuickSight.
- On the registration screen, choose the US West (Oregon) region.
- Enter the AWS account number and your email.
Note: You can get the AWS account number from qwikLABS in the Connect tab.

  • 70. Activity 6B: Connect to Amazon Redshift
- Click Manage Data to create a new data set in QuickSight.
- Choose Redshift (Auto-discovered) as the data source.

  • 71. Activity 6B: Connect to Amazon Redshift
Note: You can get the password from qwikLABS by navigating to the Custom Connection Details section on the Connect tab.

  • 72. Activity 6C: Ingest data into SPICE

  • 73. Activity 6C: Ingest data into SPICE
- SPICE is Amazon QuickSight's in-memory optimized calculation engine, designed specifically for fast, ad hoc data visualization.
- For faster analytics, you can import the data into SPICE instead of using a direct query to the database.

  • 74. Activity 6D: Creating your first analysis
What are the most common HTTP requests made? Simply select request_type and response_code, and let AUTOGRAPH create the optimal visualization.

  • 75. Review: Creating your Analysis
Exercise: Create a visual to show which URIs are the most requested.

  • 77. Your Big Data Application Architecture
- Amazon Kinesis Producer UI: generate web logs
- Amazon Kinesis Firehose: collect web logs and deliver to an Amazon S3 bucket
- Amazon Kinesis Analytics: process and compute aggregate metrics
- Amazon Kinesis Firehose: deliver processed web logs to Amazon Redshift
- Amazon EMR: ad hoc analysis of the raw web logs from Firehose
- Amazon Redshift: run SQL queries on processed web logs
- Amazon QuickSight: visualize web logs to discover insights