Learn the fundamentals of Amazon DynamoDB and see the DynamoDB console first-hand as we walk through a demo of building a serverless web application using this high-performance key-value and JSON document store.
Getting started with Amazon DynamoDB
1. Getting Started with NoSQL on AWS
Padma Malligarjunan
Sr. Technical Account Manager
AWS Enterprise Support
September 14, 2016
2. Agenda
• Brief history of data processing
• Relational (SQL) vs. nonrelational (NoSQL)
• NoSQL solutions on AWS
• Introduction to MongoDB on AWS
• Amazon DynamoDB’s fully managed features
• Demo – serverless applications
3. Data volume since 2010
• 90% of stored data generated in last 2 years
• 1 terabyte of data in 2010 equals 6.5 petabytes today
• Linear correlation between data pressure and technical innovation
• No reason these trends will not continue over time
7. Relational vs. nonrelational databases
[Diagram: Traditional SQL scales up (a primary DB with a secondary); NoSQL scales out (many DB nodes).]
8. SQL vs. NoSQL schema design
NoSQL design optimizes for compute instead of storage
9. Why NoSQL?
SQL | NoSQL
Optimized for storage | Optimized for compute
Normalized/relational | Denormalized/hierarchical
Ad hoc queries | Instantiated views
Scale vertically | Scale horizontally
Good for OLAP | Built for OLTP at scale
10. NoSQL solutions on AWS
• Bring your own NoSQL (or) use Amazon DynamoDB
• The widest range of NoSQL options: MongoDB, Cassandra, Couchbase, MarkLogic, Amazon DynamoDB
• Avoid the overhead of provisioning hardware
• Visit https://aws.amazon.com/nosql/document/
11. NoSQL solutions using Amazon EC2 and EBS
DB hosted on-premises DB hosted on Amazon EC2
15. MongoDB named a leader in The Forrester Wave™: Document Stores, Q3 2016
The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave are trademarks of Forrester Research, Inc. The Forrester Wave is a graphical representation of Forrester's call on a market and is
plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources.
Opinions reflect judgment at the time and are subject to change.
16. MongoDB – Deployment Best Practices
Packages
• Always use 64-bit builds for production. 32-bit builds support systems that have only 2 GB of memory
• Use the latest version of MongoDB
Networking
• Limit exposure by using network rules that prevent access from unknown machines, systems, & networks
Storage
• When using the default WiredTiger storage engine, the use of XFS file system is strongly recommended
• Turn off atime and diratime when you mount the data volume
• For improved performance, consider separating your data, journal and logs onto separate storage devices
• Assign swap space for your system
• Use a NOOP scheduler for best performance
Operating system
• Raise file descriptor limits. The default limit of 1,024 open files on most systems won’t work for most production-scale workloads
• Disable transparent huge pages. MongoDB performs better with standard (4,096) virtual memory pages
Visit: https://d0.awsstatic.com/whitepapers/AWS_NoSQL_MongoDB.pdf
17. MongoDB – High Availability and Horizontal Scale Out
[Diagram: one AWS Region with Availability Zones A, B, and C. Shard A and Shard B each consist of one primary and two secondaries spread across the Availability Zones.]
You can also get an optimally configured cluster with MongoDB Atlas, the new database as a service for MongoDB available on AWS. Visit mongodb.com/atlas to learn more.
23. High availability and durability
WRITES
• Replicated continuously to 3 AZs
• Persisted to disk (custom SSD)
READS
• Strongly or eventually consistent
• No latency trade-off
Designed to support 99.99% availability
Built for high durability
26. Major League Baseball Fields Big Data, Excitement with Amazon DynamoDB
MLBAM (MLB Advanced Media) is a full service solutions provider, operating a powerful content delivery platform.
“For the first time, we can measure things we’ve never been able to measure before.”
Joe Inzerillo, Executive Vice President and CTO, MLBAM
• MLBAM can scale to support many games on a single day.
• Amazon DynamoDB powers queries and supports the fast data retrieval required.
• MLBAM distributes 25,000 live events annually and 10 million streams daily.
27. Redfin Is Revolutionizing Home Buying and Selling with Amazon DynamoDB
Redfin is a full-service real estate company with local agents and online tools to help people buy & sell homes.
“We have billions of records on DynamoDB being refreshed daily or hourly or even by seconds.”
Yong Huang, Director, Big Data Analytics, Redfin
• Redfin provides property and agent details and ratings through its websites and apps.
• With DynamoDB, latency for “similar” properties improved from 2 seconds to just 12 milliseconds.
• Redfin stores and processes five billion items in DynamoDB.
28. Expedia’s Real-time Analytics Application Uses Amazon DynamoDB
Expedia is a leader in the $1 trillion travel industry, with an extensive portfolio that includes some of the world’s most trusted travel brands.
“With DynamoDB we were up and running in less than a day, and there is no need for a team to maintain.”
Kuldeep Chowhan, Principal Engineer, Expedia
• Expedia’s real-time analytics application collects data for its “test & learn” experiments on Expedia sites.
• The analytics application processes ~200 million messages daily.
• Ease of setup, monitoring, and scaling were key factors in choosing Amazon DynamoDB.
29. Nexon Scales Mobile Gaming with Amazon DynamoDB
Nexon is a leading South Korean video game developer and a pioneer in the world of interactive entertainment.
“By using AWS, we decreased our initial investment costs, and only pay for what we use.”
Chunghoon Ryu, Department Manager, Nexon
• Nexon used Amazon DynamoDB as its primary game database for a new blockbuster mobile game, HIT
• HIT became the #1 Mobile Game in Korea within the first day of launch and has > 2M registered users
• Nexon’s HIT leverages DynamoDB to deliver steady latency of less than 10ms to deliver a fantastic mobile gaming experience for 170,000 concurrent players
30. Scaling high-velocity use cases with DynamoDB
Ad Tech, Gaming, Mobile, IoT, Web
35. Local secondary index (LSI)
Alternate sort key attribute
Index is local to a partition key
Table: A1 (partition), A2 (sort), A3, A4, A5
LSIs:
• KEYS_ONLY: A1 (partition), A3 (sort), A2 (item key)
• INCLUDE A3: A1 (partition), A4 (sort), A2 (item key), A3 (projected)
• ALL: A1 (partition), A5 (sort), A2 (item key), A3 (projected), A4 (projected)
10 GB maximum per partition key; LSIs limit the number of range keys!
36. Global secondary index (GSI)
Alternate partition and/or sort key
Index is across all partition keys
Table: A1 (partition), A2, A3, A4, A5
GSIs:
• INCLUDE A3: A5 (partition), A4 (sort), A1 (item key), A3 (projected)
• ALL: A4 (partition), A5 (sort), A1 (item key), A2 (projected), A3 (projected)
• KEYS_ONLY: A2 (partition), A1 (item key)
Online indexing
Read capacity units (RCUs) and write capacity units (WCUs) are provisioned separately for GSIs
37. How do GSI updates work?
[Diagram: the client writes to the table (primary table partitions); the global secondary index then receives an asynchronous update (step 2, in progress).]
If GSIs don’t have enough write capacity, table writes will be throttled!
38. LSI or GSI?
LSI can be modeled as a GSI
If data size in an item collection > 10 GB, use GSI
If eventual consistency is okay for your scenario, use GSI!
39. Advanced topics in DynamoDB
• Design patterns and best practices
• Data modeling
• Understanding Partitions
• DynamoDB Scaling
40. To learn more, please attend:
Deep Dive on Amazon DynamoDB
Sean Shriver, AWS NoSQL Solutions Architect
50. DynamoDB Pricing & Free Tier
• Free Tier
25 GB of storage
25 Reads per second
25 Writes per second
• Pricing for additional usage in US East (N. Virginia)
$0.25 per GB per month
Write throughput: $0.0065 per hour for every 10 units of Write Capacity
Read throughput: $0.0065 per hour for every 50 units of Read Capacity
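The pricing figures on this slide can be turned into a rough monthly estimate. The sketch below is only an illustration of the arithmetic: it assumes a 730-hour month, treats the free tier's 25 reads and 25 writes per second as 25 capacity units each, and prorates partial blocks of capacity, none of which the slide states. The rates are the slide's 2016 US East figures and will not match current pricing.

```python
# Rates copied from the slide (US East, N. Virginia, at the time of the talk).
STORAGE_PER_GB_MONTH = 0.25
WRITE_PRICE_PER_HOUR = 0.0065  # per 10 write capacity units
READ_PRICE_PER_HOUR = 0.0065   # per 50 read capacity units
FREE_STORAGE_GB = 25
FREE_READ_UNITS = 25           # assumption: "25 reads per second" = 25 units
FREE_WRITE_UNITS = 25          # assumption: "25 writes per second" = 25 units
HOURS_PER_MONTH = 730          # assumption: average month length in hours


def monthly_cost(storage_gb, read_units, write_units):
    """Estimate a monthly bill in USD after subtracting the free tier."""
    billable_gb = max(0, storage_gb - FREE_STORAGE_GB)
    billable_reads = max(0, read_units - FREE_READ_UNITS)
    billable_writes = max(0, write_units - FREE_WRITE_UNITS)

    storage = billable_gb * STORAGE_PER_GB_MONTH
    reads = billable_reads / 50 * READ_PRICE_PER_HOUR * HOURS_PER_MONTH
    writes = billable_writes / 10 * WRITE_PRICE_PER_HOUR * HOURS_PER_MONTH
    return round(storage + reads + writes, 2)
```

Under these assumptions, a table with 125 GB of data, 525 read units, and 125 write units comes to roughly $119.90 per month, while anything at or below the free tier limits costs nothing.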
51. Resources
Padma Malligarjunan | pmalli@amazon.com
Amazon DynamoDB: https://aws.amazon.com/dynamodb/
NoSQL on AWS: https://aws.amazon.com/nosql/document/
Upcoming session: Deep Dive on Amazon DynamoDB
We will look at the history of databases, and we’ll discuss relational database and non-relational databases, and the differences.
I’ll introduce Amazon DynamoDB and we’ll look at customer references who have built scalable applications using this technology.
To fully appreciate the need for NoSQL… Let’s start by looking into how much data volume has grown in the last 5 years. 90% of data was generated in the last 2 years.
1 TB vs 6.5 PB .. To put that into perspective…
We are starting to see businesses whose multi-TB databases have exploded into multi-PB databases.
As data volume increased, we started innovating data processing systems that would scale to process the large volume of data
We started by remembering everything (human brain) and advanced to writing things down (for centuries). As data pressure increased we saw Magnetic storage, File systems, and then finally Relational Databases. 40 years. Table normalization was designed to eliminate duplicates and save storage costs. Multiple tables – Complex SQL joins – Resource intensive. Optimize for the costlier asset. AGNOSTIC TO ACCESS PATTERNS -- Great for adhoc queries –NOT optimized. Business are seeing the limitations in relational databases. Switching to NoSQL.
Every time there is a new technology - initial excitement with early adopters… They may run into roadblocks.
It’s the same with NoSQL. The goal of this presentation is to explain the difference between relational and NoSQL databases. As you gain more experience with this technology, you will start to realize the benefits of NoSQL for your application. And that will help you cross the chasm in getting started with DynamoDB.
Let’s deep dive into the differences between relational and non-relational databases.
Why? Databases are a crucial part of your application and your choice of database technology will determine how your application scales.
To understand the benefits of NoSQL…
Relational - Data is normalized. To enable joins, you are tied to a single partition and a single system. Performance depends on the hardware specs of the primary server. To improve performance, optimize: move to a bigger box. You may still run out of headroom. Create read replicas. You will still run out. Scale UP.
NoSQL -- NoSQL databases were designed specifically to overcome scalability issues. Scale “out” data using distributed clusters, low-cost hardware, throughput + low latency
Therefore, Using NoSQL, businesses can scale virtually without limit.
Generic product catalog. Table relationships in normalized.
A product could be a book – say the Harry Potter Series. There’s a 1:1 relationship. Or it could be a movie..
You can imagine the types of queries that you’d have to execute. 1. Show me all the movies starring. 2. the entire product catalog. This is Resource intensive – perform complex join
** NoSQL you have to ask – how will the application access the data?
optimize for the costlier asset. No joins. Just a select. Hierarchical structures. Designed by keeping in mind Access patterns.
Via duplication of data (storage), optimized for compute, it is fast.
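The trade-off in these notes (joins at read time versus a pre-built view stored under one key) can be sketched with plain Python dictionaries. This is a toy illustration, not how either kind of database is implemented; the product, actor names, and ids are made up.

```python
# Normalized (relational) layout: three "tables" linked by ids.
products = {1: {"title": "Example Movie", "type": "movie"}}
actors = {10: {"name": "Actor A"}, 11: {"name": "Actor B"}}
movie_actors = [(1, 10), (1, 11)]  # join table: (product_id, actor_id)


def movie_with_cast_normalized(product_id):
    """Rebuild one view by joining three tables at read time."""
    cast = [actors[a]["name"] for p, a in movie_actors if p == product_id]
    return {**products[product_id], "cast": cast}


# Denormalized (NoSQL) layout: the view is stored ready-made under one key.
# The cast names are duplicated here, which costs storage but saves compute.
catalog = {
    1: {"title": "Example Movie", "type": "movie",
        "cast": ["Actor A", "Actor B"]},
}


def movie_with_cast_denormalized(product_id):
    """A single key lookup; no join work at read time."""
    return catalog[product_id]
```

Both functions return the same record. The normalized version does the join work on every read, while the denormalized version pays for that work once, at write time, by storing the data the way the application reads it.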
Businesses are starting to see scalability problems with relational databases. I once had a customer say they top out with relational at around 3,000 requests per second and had to scale up by moving to bigger hardware. With NoSQL, we have a technology that can easily scale to 100s of nodes, or even 1000s, and the scalability bottleneck goes away.
Excellent for OLTP applications that scale, real time data access, fast, low latency, user cannot wait.
==
They store data in a denormalized hierarchical view, that makes it faster and easier to access the data.
Using AWS, you can easily get started with a variety of NoSQL solutions. For those customers who want full control over their NoSQL databases but who don’t want to manage hardware infrastructure you can run your database on AWS and choose from a variety of database engines – Cassandra, Couchbase, MarkLogic or MongoDB. And you will use Amazon EC2 and Amazon EBS and have to think about availability and scalability. If instead, you just want to focus on building your application, then you can use the fully managed Amazon DynamoDB.
All our solutions offer flexible, pay-as-you-go pricing, so you can quickly and easily scale at a low cost. You can download the AWS whitepaper to getting started with these NoSQL technologies.
The key takeaway from this slide is that we offer the widest range of NoSQL options and no matter which one you choose, you don’t have to worry about provisioning hardware and you will get the benefits of the underlying AWS global Cloud infrastructure.
In the next few slides, I will give an overview of getting started with MongoDB on AWS and then we will discuss Amazon DynamoDB.
==
Cassandra – distributed open source, handles large amounts of data providing high availability with no single point of failure.
Couchbase - a high-performance distributed key-value store.
MongoDB - open source, high performance document database.
Those of you who are involved in spinning up and managing your own servers surely realize how resource intensive it is to manage your own infrastructure. It is easy to underestimate the cost and complexity of maintaining…. You have to think about power, cooling, OS maintenance and patching. Now imagine managing a 1000 node cluster; this can become very resource intensive.
Amazon EC2 is an AWS service that provides resizable compute capacity in the cloud. A database hosted on an EC2 instance takes away some of the overhead. But you still need to think about scalability and availability.
Game might be designed in a relational database.
I would intro to this slide explaining the need for a more flexible storage format as the variety of the data being ingested increases. Documents within a MongoDB collection can have different fields, which makes it much more flexible than the relational data model.
Emphasize, if possible, that MongoDB began as a document database.
What sets it apart from other implementations of storing data in documents is its expressive query language and full support for secondary indexing. Each field in a document can be indexed for fast access.
This is how it looks in MongoDB. Stored in a single doc. Flexibility in adding fields and data types: arrays, sub-documents within arrays.
Emphasize: started as a document database, whereas a lot of other vendors are adding documents.
Expressive query language and full support for secondary indexing. Each field in a document can be indexed for fast access.
Recently released by Forrester.
Prompt: take a pic
Allocating swap space can avoid issues with memory contention .
The guest operating system should use a noop scheduler for best performance.
You can find more on the best practices guide.
TODO: EC2, EBS types
MongoDB offers a DBaaS named Atlas, which is available on AWS. You get an “optimally configured cluster” as a service. MongoDB Atlas makes it easy to set up, operate, and scale your MongoDB deployments in the cloud.
Covers availability and horizontal scalability aspects. (This is one way of setting up MongoDB in AWS.) Talk to Jay Gordon from MongoDB / Leo Zheng about questions.
1P, 2S. HA / failover. Across AZs. Scaling – sharding data.
So, this brings us to Amazon DynamoDB, which is what we are going to discuss today. Let’s take a closer look.
Fully managed – With just a few clicks on the AWS console, create a table that is highly scalable, highly available, and gives you fast, consistent, predictable performance.
No need to launch or maintain any servers.
Tell DynamoDB read/write – DynamoDB will scale to meet your application’s requirements
Only pay for what you use. You get all of this with just a few clicks.
Key take away: Using DynamoDB customers get consistent, single-digit millisecond latency at any scale.
==
DynamoDB supports both document and key-value store models, and offers a range of features including global secondary indexes, fine-grained access control via AWS Identity and Access Management, support for event-driven programming, and more.
==
Fully Managed
Amazon DynamoDB is a fully managed cloud NoSQL database service – you simply create a database table, set your throughput, and let the service handle the rest. You no longer need to worry about database management tasks such as hardware or software provisioning, setup and configuration, software patching, operating a reliable, distributed database cluster, or partitioning data over multiple instances as you scale.
Fast, Consistent Performance
Amazon DynamoDB is designed to deliver consistent, fast performance at any scale for all applications. Average service-side latencies are typically single-digit milliseconds. As your data volumes grow and application performance demands increase, Amazon DynamoDB uses automatic partitioning and SSD technologies to meet your throughput requirements and deliver low latencies at any scale.
Highly Scalable
When creating a table, simply specify how much request capacity you require. If your throughput requirements change, simply update your table's request capacity using the AWS Management Console or the Amazon DynamoDB APIs. Amazon DynamoDB manages all the scaling behind the scenes, and you are still able to achieve your prior throughput levels while scaling is underway.
Flexible
Amazon DynamoDB supports both document and key-value data structures, giving you the flexibility to design the best architecture that is optimal for your application.
Event Driven Programming
Amazon DynamoDB integrates with AWS Lambda to provide Triggers which enables you to architect applications that automatically react to data changes.
Fine-grained Access Control
Amazon DynamoDB integrates with AWS Identity and Access Management (IAM) for fine-grained access control for users within your organization. You can assign unique security credentials to each user and control each user's access to services and resources.
http://aws.amazon.com/dynamodb
This is the value that is built into DynamoDB. With DynamoDB, you get an easy-to-use database. You don’t have to spin up any servers. You can easily design serverless scalable applications with DynamoDB. You get scalability and multi-AZ replication without designing a distributed system. You get ongoing security upgrades, software improvements, cost reduction efforts, monitoring…without any effort at all.
DynamoDB is a fully managed service, so you have all of that benefit built into it. We built DynamoDB to just work so you can focus on your app.
In any business, as your business scales up, you need a way to easily scale to meet the traffic, and be able to get consistent, predictable latency at any scale.
You need a way to scale down as your business needs change. DynamoDB was designed to offer consistent and predictable single-digit millisecond latency at any scale. And you only pay for what you use. No limit on throughput. No limit on size: petabytes of data, any number of items.
The latency characteristics of DynamoDB are under 10 milliseconds and highly consistent.
Most importantly, the data is durable in DynamoDB, constantly replicated across multiple data centers and persisted to SSD storage.
Predictable Performance
This is obviously something that’s important and valuable in any industry, whether it’s powering the New York Times recommendation engine, storing and retrieving game data for the game Fruit Ninja, or powering queries and fast data retrieval for Major League Baseball Advanced Media. Predictable performance at scale is a must-have for many web apps, and DynamoDB was designed specifically to deliver on this.
13/35. 4 more regions. DynamoDB is highly durable. AWS has a concept of Regions and Availability Zones. An AWS Region is a geographic area. Each Region has multiple Availability Zones. Each AZ has one or more physical data centers. They have redundant power and cooling, and are interconnected via high-speed, low-latency fiber. Take for example the AWS Region in N. Virginia. It has 4 AZs.
When you create a DynamoDB table in N. Virginia, we will replicate the data to 3 AZs. All the data is stored on SSDs.
A lot of value built into DynamoDB– a few clicks.
Growing number of customers in the Mobile, IoT, Gaming space are using DynamoDB.
Amazon’s path from Relational Databases to NoSQL reflects the journey many customers are now taking.
Amazon.com, the online retail business, runs on one of the world’s largest web infrastructures. Back in 2004, Amazon.com was using relational Oracle databases and was unable to scale them. Maintenance and administration. To keep Amazon.com highly scalable to support all the incoming traffic, an internal project investigated options: “If availability, durability, and scalability are the priority, what would the database look like?” This resulted in a whitepaper that described what the database should look like. This paper paved the way for many NoSQL technologies out there today. This was also the beginning of DynamoDB.
Database as a Swiss Army Knife - Hundreds of applications built on RDBMS, Poor Scalability (Q4 was a pain), Poor availability, Exorbitantly high costs for h/w, software, admin
Dynamo = replicated DHT with consistency management
Specialist tool with limited query and simpler consistency
Problem: required significant effort to maintain
DynamoDB was designed to deliver consistently high performance at any scale:
Predictable Performance
Massively Scalable
Fully Managed
Low Cost
Major League Baseball – A great example of a customer using DynamoDB to build IoT solution.
Amazon DynamoDB powers queries required to support many games on a single day. When there are only a few games, it dials down throughput to save money; MLBAM only pays for the capacity it uses.
===
STORY BACKGROUND
MLBAM (MLB Advanced Media) is a full service solutions provider, operating a powerful content delivery platform.
Amazon DynamoDB powers queries and supports the fast data retrieval required to support many games on a single day.
MLBAM distributes 25,000 live events annually and 10 million streams daily.
SOLUTION AND BENEFITS
MLBAM only pays for the capacity it uses.
When there are only a few games, it dials down throughput to save money.
MLBAM can focus on what it does best, rather than spending resources managing clusters of non-relational (NoSQL) databases.
On big game days, MLBAM can quickly scale up DynamoDB read and write capacity to meet its demand without increased latency.
ADDITIONAL INFORMATION
https://aws.amazon.com/solutions/case-studies/major-league-baseball-mlbam/
A customer who is using DynamoDB to power their Mobile applications -- Redfin – people use this application for searching buying and selling homes. More than 10,000 customers buy or sell homes with Redfin each year.
==
STORY BACKGROUND
Redfin offers full-service real estate brokerage services with local agents and online tools to help people buy & sell homes.
Redfin built technology to make customers smarter and faster when buying and selling homes.
More than 10,000 customers buy or sell homes with Redfin each year.
SOLUTION AND BENEFITS
Redfin connects users with properties and agents.
Redfin uses DynamoDB to deliver insights to its website and apps.
DynamoDB stores property scores, recommendations, property data (e.g., sold, est. value), agent scoring (i.e., how the agent is performing).
Redfin websites and apps consume these data from DynamoDB.
Using Amazon DynamoDB, Amazon Redshift, Amazon EMR, Amazon S3
ADDITIONAL INFORMATION
[Coming December 2015]
Expedia says: “With DynamoDB we were up and running in less than a day, and there is no need for a team to maintain.” They are an example of a customer powering web applications using DynamoDB.
==
STORY BACKGROUND
Expedia initially started out setting up a Cassandra cluster to support an application that collects data for its test & learn experiments that run on Expedia sites, then processes and stores them
The application processes ~200 million messages a day
The team spent more than a week setting up a 3-node Cassandra ring and was nowhere close to setting up Cassandra in AWS with clustering and monitoring for scaling out
With DynamoDB we were up and running in less than a day
SOLUTION AND BENEFITS
Setup, monitoring, ease to scale made us choose DynamoDB over Cassandra
With DynamoDB there is no need for a team to maintain
The application uses Apache Storm, Amazon DynamoDB, and Amazon ElastiCache redis
ADDITIONAL INFORMATION
https://www.youtube.com/watch?v=ie4dWGT76LM
Nexon is a leading South Korean video game developer. Their blockbuster game titled HIT attracts over 2 million players. They were ranked #1 Mobile Game in Korea on the day of its launch. They used Amazon DynamoDB to scale and to provide a reliable user experience.
===
STORY BACKGROUND
Nexon is a leading South Korean video game developer and a pioneer in the world of interactive entertainment.
Nexon provides 150 games to 150 countries, including FIFA Online 3, MapleStory, and Sudden Attack.
As of 2014, sales reached $1.6 billion, with 60% from overseas business
Nexon used DynamoDB as its primary game database for a new blockbuster Mobile Game, HIT
SOLUTION AND BENEFITS
DynamoDB serves as the primary game database, offering low latency and scale to match player demand
Despite a steady increase in the size of the data, DynamoDB delivered steady latency of less than 10ms.
This enabled Nexon to provide a reliable service to users of HIT, which was the foundation for the success of HIT.
ADDITIONAL INFORMATION
https://aws.amazon.com/solutions/case-studies/nexon/
Here are just a few examples of customers achieving tremendous scale with DynamoDB:
And what do customers want? They want Predictable consistent low latency performance at scale; and DynamoDB was designed specifically to deliver on this.
==
Ad Tech
AdRoll http://aws.amazon.com/solutions/case-studies/adroll/
DataXu http://info.qubole.com/how-dataxu-manages-big-data
AdBrain http://www.adbrain.com/careers-generalapp/
DoApp https://aws.amazon.com/solutions/case-studies/doapp/
VidRoll https://aws.amazon.com/solutions/case-studies/vidroll/
Fiksu https://aws.amazon.com/solutions/case-studies/fiksu/
TubeMogul https://www.tubemogul.com/engineering/using-contextual-information-in-programmatic-advertising/
TCC https://github.com/TheClimateCorporation/mandolin
Gaming
Supercell http://aws.amazon.com/solutions/case-studies/supercell/
Zynga https://aws.amazon.com/solutions/case-studies/zynga/
Nexon http://aws.amazon.com/solutions/case-studies/nexon
PennyPop http://aws.amazon.com/solutions/case-studies/battle-camp/
Frontier http://aws.amazon.com/solutions/case-studies/frontier-games/
scopely https://aws.amazon.com/solutions/case-studies/scopely/
Unalis https://aws.amazon.com/solutions/case-studies/unalis/
IoT
MLBAM http://aws.amazon.com/solutions/case-studies/major-league-baseball-mlbam/
ACTi https://aws.amazon.com/solutions/case-studies/acti-case-study/
Canary https://aws.amazon.com/solutions/case-studies/canary/
Dropcam https://aws.amazon.com/solutions/case-studies/dropcam/
MediaTek https://aws.amazon.com/solutions/case-studies/mediatek/
Devicescape https://aws.amazon.com/solutions/case-studies/devicescape/
Mobile
Duolingo http://aws.amazon.com/solutions/case-studies/duolingo-case-study-dynamodb/
Mapbox https://www.mapbox.com/blog/scaling-the-mapbox-infrastructure-with-dynamodb-streams/
Redfin http://aws.amazon.com/solutions/case-studies/redfin/ and https://www.youtube.com/watch?v=YiaPjILR9zw
Remind https://aws.amazon.com/solutions/case-studies/remind/
Infraware http://aws.amazon.com/solutions/case-studies/infraware/
Myriad http://aws.amazon.com/solutions/case-studies/myriad-group/
Peak http://aws.amazon.com/solutions/case-studies/peak/
Web
Expedia https://aws.amazon.com/solutions/case-studies/expedia/
Nordstrom https://aws.amazon.com/solutions/case-studies/nordstrom/
JustGiving http://aws.amazon.com/solutions/case-studies/justgiving/
Tokyu Hands https://aws.amazon.com/blogs/aws/how-tokyu-hands-architected-a-cost-effective-shopping-system-with-amazon-dynamodb/
jobandtalent https://aws.amazon.com/solutions/case-studies/jobandtalent/
Tigerspike http://aws.amazon.com/solutions/case-studies/tigerspike/
Amazon DynamoDB is a fully Managed Service. So, to get started with Amazon DynamoDB you simply have to create a table.
After you log on to the AWS console, select DynamoDB, and click Create table, here’s what the screen looks like.
Specify a table name and a “partition key”. It’s like a primary key, and uniquely identifies a row.
Next, if required change the value for amount of reads / writes the table should support. Or accept the defaults and Click Create.
And you’ve created your table – this table which you’ve created in just a few clicks is highly scalable, highly available, and is designed to provide consistent low ms latency at scale.
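The same console steps map onto the parameters of a CreateTable request. The sketch below only builds that parameter dictionary so it runs without AWS access; the table name `SuperHeroes`, the key name `HeroName`, and the default of 5 read and 5 write units are hypothetical choices for illustration, not values from the talk.

```python
def create_table_params(table_name, partition_key, read_units=5, write_units=5):
    """Build the arguments for a DynamoDB CreateTable request.

    The default capacity of 5 reads and 5 writes is an assumption
    made for this sketch.
    """
    return {
        "TableName": table_name,
        # "S" declares the key attribute as a string.
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},
        ],
        # "HASH" is the API's name for the partition key.
        "KeySchema": [{"AttributeName": partition_key, "KeyType": "HASH"}],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Hypothetical table and key names.
params = create_table_params("SuperHeroes", "HeroName")
# With boto3 installed and AWS credentials configured, this could be sent as:
#   boto3.client("dynamodb").create_table(**params)
```

Only the key attribute is declared up front; every other attribute is schemaless and can differ from item to item.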
Attributes can vary between items. Each item can have a different set of attributes than the other items (as with any NoSQL database).
Partition key – Primary key – uniquely identifies each item. It also determines how data is partitioned and stored.
Optional Sort key – you have a composite key; Sort keys help to create 1:many relationships, and useful in range queries.
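A toy in-memory model can make the two roles concrete: the partition key decides where an item lives, and the sort key orders items that share a partition key so range queries are cheap. This is a sketch under stated assumptions, not DynamoDB's implementation: the SHA-256 hash, the fixed count of four partitions, and the customer/date data are all made up for illustration.

```python
import hashlib

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages the real partition count

# One dict per partition: partition key -> list of (sort key, attributes).
partitions = [dict() for _ in range(NUM_PARTITIONS)]


def partition_for(partition_key):
    """Hash the partition key to choose where the item is stored."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


def put_item(partition_key, sort_key, attributes):
    """Store an item and keep its item collection ordered by sort key."""
    bucket = partitions[partition_for(partition_key)]
    collection = bucket.setdefault(partition_key, [])
    collection.append((sort_key, attributes))
    collection.sort(key=lambda pair: pair[0])


def query(partition_key, low, high):
    """Return items for one partition key with sort key in [low, high]."""
    bucket = partitions[partition_for(partition_key)]
    return [attrs for sort_key, attrs in bucket.get(partition_key, [])
            if low <= sort_key <= high]


# One customer (partition key) with many orders (sort key = order date).
put_item("customer-1", "2016-09-10", {"order": "B"})
put_item("customer-1", "2016-08-01", {"order": "A"})
put_item("customer-1", "2016-10-02", {"order": "C"})
put_item("customer-2", "2016-09-05", {"order": "D"})

september = query("customer-1", "2016-09-01", "2016-09-30")
```

The query touches only the one partition that holds `customer-1`, which is why a 1:many lookup by partition key plus a sort key range stays fast as the table grows.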
Some applications only need to query data using the table's primary key; however, there may be situations where an alternate sort key would be helpful. You can use LSIs.
LSI is collocated on the same partition as the item in the table, so this gives us consistency. When an item is updated, LSI is updated, and then ack’d.
LSI is partitioned by the same primary key as the parent table. Different Sort key.
Say, there is a table containing Customers, Orders, date range. Customers and Orders. LSI can have sort key on a “date range”. A local secondary index maintains an alternate sort key for a given partition key value.
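The customers-and-orders example can be sketched in memory: the table is keyed by customer and order id, and the local index regroups the same items under the same partition key but orders them by date. This is an illustrative model only; the attribute names and sample orders are hypothetical.

```python
from collections import defaultdict

# Table: partition key = customer_id, sort key = order_id.
orders = [
    {"customer_id": "c1", "order_id": "o-003", "order_date": "2016-07-04"},
    {"customer_id": "c1", "order_id": "o-001", "order_date": "2016-09-01"},
    {"customer_id": "c1", "order_id": "o-002", "order_date": "2016-08-15"},
    {"customer_id": "c2", "order_id": "o-004", "order_date": "2016-08-20"},
]


def build_local_index(items, partition_key, alternate_sort_key):
    """Group items by the table's partition key, sorted by another attribute."""
    index = defaultdict(list)
    for item in items:
        index[item[partition_key]].append(item)
    for collection in index.values():
        collection.sort(key=lambda item: item[alternate_sort_key])
    return index


# Same partition key as the table, alternate sort key (order_date).
by_date = build_local_index(orders, "customer_id", "order_date")


def orders_between(customer_id, start, end):
    """Range query on the alternate sort key within one partition key."""
    return [o["order_id"] for o in by_date[customer_id]
            if start <= o["order_date"] <= end]
```

For example, `orders_between("c1", "2016-08-01", "2016-09-30")` returns the two orders placed in that window, in date order, without looking at any other customer's data.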
Some applications might need to perform many kinds of queries, using a variety of different attributes as query criteria.
Global Secondary Indexes – Parallel tables or secondary tables.
GSI can have a partition key that is different from the Table. They can also have an alternate sort key.
Customers, Orders, Date Range. Partition by Order Id and query for a date range.
Note: When you create a GSI, you must specify read and write capacity units for the expected workload on that index.
Customers often ask if LSI should be used or GSI. Think of this as a parallel table asynchronously populated by DynamoDB. Eventually consistent. GSI updates typically happen within a second.
Throughput for the GSI is important: it determines how soon the GSI will be updated.
1 Table update = 0, 1 or 2 GSI updates
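The "0, 1 or 2" rule can be written out as a small function. This is a simplified sketch of the reasoning, assuming the index projects ALL attributes and is keyed on a single attribute; it models the count of index writes only, not the asynchronous propagation. The `status` attribute in the usage note is hypothetical.

```python
def gsi_writes_for_update(old_item, new_item, index_key):
    """Count the index writes caused by one table write.

    old_item and new_item are dicts, or None when the item does not exist
    before or after the write. index_key is the attribute that serves as
    the GSI partition key. Assumes the index projects ALL attributes.
    """
    old_value = old_item.get(index_key) if old_item else None
    new_value = new_item.get(index_key) if new_item else None

    if old_value is None and new_value is None:
        return 0  # the item is not in the index before or after
    if old_value is None or new_value is None:
        return 1  # one put (item enters) or one delete (item leaves)
    if old_value != new_value:
        return 2  # delete the old index entry, then put the new one
    if old_item == new_item:
        return 0  # nothing changed, so nothing to propagate
    return 1      # same index key; the projected copy is rewritten
```

Changing an order's `status` from `open` to `closed` on an index keyed by `status` costs two index writes, which is one reason GSI write capacity has to be sized for the workload on the index rather than copied from the table.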
More flexibility with GSI.
You can have only 5 LSI and 5 GSI, however, with GSI, you have the flexibility to create them after the table is created. LSI must be created when the table is defined.
LSI can be modeled as a GSI
If data size in an item collection > 10 GB (for example, many orders for one customer ID), a GSI is the only choice, because an LSI limits the data size for a particular partition key.
If eventual consistency is okay for your scenario, use GSI – it works for 99% of the scenarios out there.
For those of you who want to learn more, there is a session later today that will cover advanced topics.
Three years exp with DynamoDB.
http://heroesmission24.s3-website-us-east-1.amazonaws.com/apiGateway-js-sdk/index.html
http://superheroes-24.s3-website-us-east-1.amazonaws.com/apiGateway-js-sdk/index.html
https://isengard.amazon.com/federate?account=696964988608&role=IsengardAdministrator
I’ll show you a demo of building a serverless web app, and we’ll also look at the integration capabilities of DynamoDB with AWS services.
DynamoDB is a managed NoSQL offering from AWS, and we are looking for talented engineers to help build the next generation of this service. Contact Raja for more details.
We will build a web application that will ask you for your Super Hero name and show you your Mission Dossier. The website is a simple HTML/JavaScript web interface. All the data, the names of the Super Heroes and the mission details, is stored in Amazon DynamoDB.
When I created this application, I had two main objectives. One, I did not want to spin up or have to manage any servers. Two, I wanted to take advantage of the high availability, scalability, and durability features of AWS services.
*** Amazon S3 is secure, durable, highly scalable cloud storage, where you can store and retrieve any amount of data.
In this demo, I will access a website over the internet. The website is a simple HTML/JavaScript web interface. The website is stored in Amazon S3.
The application is a simple web interface that will retrieve flight schedules, flight number, wait list, etc stored in a DynamoDB table. In order to set this up, I did not have to spin up any servers, so no servers to maintain. I am taking advantage of all the fully managed capabilities of AWS services to securely access my data. All I did is create my application and let AWS handle the infrastructure and the scaling.
The web client needs an easy way to access my business logic. I am using Amazon API Gateway as the “front door to manage the traffic that is hitting your business logic running in backends”. Amazon API Gateway is an easy-to-use, fully managed service. What is the advantage? If the traffic to my website increases, then API Gateway will help me scale to withstand traffic spikes.
So we said that API Gateway acts as a front door to the “business logic”. My business logic is running on AWS Lambda. AWS Lambda is a fully managed compute service: you just write the code and upload it. In this example, the APIs created in the API Gateway front door will call the business logic running in AWS Lambda functions.
As in any business or application, you want to specify permissions that govern who has access to what resources. AWS IAM allows you to securely control access to AWS services and resources. Using IAM, you can create and manage permissions to allow and deny access to AWS resources at a very fine-grained level.
And all the data is stored in DynamoDB.
DynamoDB is a fully managed NoSQL database service that provides consistent, single-digit millisecond latency at any scale.
[CLICK] So if you put this together, I will show you a demo where I will access a website hosted in an S3 bucket, which uses API Gateway calls to send requests to Lambda backends to query the DynamoDB data.
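The Lambda piece of this flow might look roughly like the sketch below. It is not the demo's actual code: the table name SuperMission, the key attribute SuperHero, and the event field superhero are all assumptions, and the event shape depends on how the API Gateway integration is configured. The handler is built around any object with a `get_item(Key=...)` method, which matches the boto3 Table interface, so it can be exercised locally with a stand-in table.

```python
import json


def make_handler(table):
    """Build a Lambda-style handler around an object exposing get_item(Key=...)."""
    def handler(event, context):
        hero = (event or {}).get("superhero")  # assumed request field
        if not hero:
            return {"statusCode": 400,
                    "body": json.dumps({"error": "superhero is required"})}
        # Look the item up by its partition key (attribute name is hypothetical).
        response = table.get_item(Key={"SuperHero": hero})
        item = response.get("Item")
        if item is None:
            return {"statusCode": 404,
                    "body": json.dumps({"error": "unknown superhero"})}
        # default=str covers non-JSON types such as the Decimal values boto3 returns.
        return {"statusCode": 200, "body": json.dumps(item, default=str)}
    return handler


class FakeTable:
    """In-memory stand-in for a DynamoDB table, for local testing only."""
    def __init__(self, items):
        self._items = {i["SuperHero"]: i for i in items}

    def get_item(self, Key):
        item = self._items.get(Key["SuperHero"])
        return {"Item": item} if item is not None else {}


# Local run against the fake table. In Lambda, the table could instead be
#   boto3.resource("dynamodb").Table("SuperMission")
handler = make_handler(FakeTable([{"SuperHero": "Batman", "Mission": "Gotham"}]))
result = handler({"superhero": "Batman"}, None)
```

Injecting the table keeps the business logic testable without AWS credentials; the IAM role attached to the real function would still need read permission on the table.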
You can get started with creating your first serverless web application in AWS by taking advantage of the DynamoDB free tier, which can handle up to 200 million requests per month for free.
==
As part of the AWS Free Tier, DynamoDB customers get 25 GB of storage, 25 writes per second, and 25 reads per second. This lets you handle up to 200 million requests per month so you can deploy a proof-of-concept and begin testing the live cloud service. The DynamoDB free tier does not expire at the end of your 12-month AWS Free Tier term.
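One way to arrive at roughly 200 million requests per month from those numbers is sketched below. The derivation is an assumption, not something stated in the talk: it treats the 25 reads and 25 writes per second as capacity units, counts two eventually consistent reads per read unit, and assumes small items (up to 4 KB per read and 1 KB per write).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def free_tier_requests_per_month(days=31, write_units=25, read_units=25,
                                 reads_per_read_unit=2):
    """Requests per month if every free-tier capacity unit is used every second.

    reads_per_read_unit=2 assumes eventually consistent reads; use 1 for
    strongly consistent reads.
    """
    per_second = write_units + read_units * reads_per_read_unit
    return per_second * SECONDS_PER_DAY * days


month_31 = free_tier_requests_per_month(days=31)  # 75 req/s for 31 days
month_30 = free_tier_requests_per_month(days=30)  # 75 req/s for 30 days
```

Under these assumptions a 31-day month comes to 200,880,000 requests and a 30-day month to 194,400,000.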
http://aws.amazon.com/free