The document discusses architecting web applications to deliver fast responses even over slow connections. It covers technologies like request-response handling, load balancing, auto-scaling, databases, and content delivery to build scalable and reliable infrastructure. The goal is to elastically scale infrastructure so costs track fluctuating customer traffic.
The document discusses several Amazon Web Services related to databases and data warehousing. It describes Amazon Redshift, a fully managed data warehouse service; the purpose of data warehousing; Amazon ElastiCache, a web service for deploying Redis or Memcached in the cloud to improve application performance; Amazon DynamoDB Accelerator (DAX) which provides an in-memory cache for DynamoDB; AWS Database Migration Service which helps migrate databases to AWS easily and securely; and benefits of AWS DMS like simplicity, zero downtime, support for many databases, low cost, and reliability.
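To make the caching idea concrete, here is a minimal cache-aside sketch against an ElastiCache Redis endpoint, assuming the redis-py client is installed; the endpoint, key scheme, and load_product_from_db helper are hypothetical, not taken from the document:

    import json
    import redis  # redis-py client, assumed to be installed

    # Hypothetical ElastiCache Redis endpoint; replace with your cluster's address.
    cache = redis.Redis(host="my-cache.abc123.use1.cache.amazonaws.com", port=6379)

    def load_product_from_db(product_id):
        # Hypothetical stand-in for the real database query.
        return {"id": product_id, "name": "example"}

    def get_product(product_id, ttl_seconds=300):
        # Cache-aside: serve from Redis when possible, else load and cache.
        key = f"product:{product_id}"
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)
        record = load_product_from_db(product_id)
        cache.setex(key, ttl_seconds, json.dumps(record))
        return record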
Amazon Aurora Relational Database Built for the AWS Cloud, Version 1 Series - DataLeader.io
DOWNLOAD THE PRESENTATION TO SEE THE ANIMATIONS PROPERLY.
Amazon Aurora has been the fastest growing service in AWS history since 2016!
Amazon Aurora is a cloud relational database built from the ground up with a new, ingenious architecture. This video is part of a series.
Section 1.0 here on Amazon Aurora has 16 videos! Skip over the quizzes if you'd like. Amazon Aurora has been the fastest-growing service in AWS history since September 2016, and it still is today (2/9/2019)! I cover what makes Amazon Aurora so unique and perfect for analytics that must use a relational database. I describe how it came to be, its features, its business value, and some comparisons between Amazon Aurora and Amazon RDS for MySQL (Aurora now supports PostgreSQL, and there's also a Serverless version!). I cover its high performance and why/how it accomplishes that, a high-level view of Amazon Aurora's architecture, its ability to scale both up and out, its high availability and durability and how that's achieved, how to secure it, and a few ways to take advantage of different pricing options. It also covers database storage and input/output (IO), backups, AWS' "Simple Monthly Calculator" (which has been updated since making this video), and how Aurora's pricing compares to SQL Server.
A closer look at the MySQL and PostgreSQL compatible relational database built for the cloud that combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. We’ll explore how Aurora uses the AWS cloud to provide high reliability, high durability, and high throughput.
Speakers:
Steve Abraham - Principal Database Specialist Solutions Architect, AWS
Peter Dachnowicz - Sr. Technical Account Manager, AWS
The document discusses Amazon Aurora, a database service from AWS that is compatible with PostgreSQL and MySQL. It provides summaries of Aurora's architecture, performance advantages, and customer benefits compared to traditional databases. Specifically, the document notes that Aurora achieves higher performance and availability than PostgreSQL by using a distributed, scalable storage system and replicating data across Availability Zones. It shares performance test results showing that Aurora can be up to 3x faster than PostgreSQL for various workloads. Customers have also cited lower costs and easier management with Aurora compared to commercial databases.
Getting Maximum Performance from Amazon Redshift: Complex Queries - timonk
This document discusses how to get maximum performance from Amazon Redshift for complex queries over large datasets. It recommends writing optimized SQL queries using techniques like common table expressions and window functions, organizing data in Redshift's columnar format by partitioning and sorting, and optimizing for fast query times by leveraging Redshift's massively parallel processing capabilities and provisioning additional clusters when needed. The goal is to enable complex, multi-stage queries and reports over web-scale impression and conversion data without creating bottlenecks for operations.
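As an illustration of the query techniques the summary names, here is a sketch of a common table expression plus a window function over hypothetical impression data; the table and column names are invented, and the statement can be submitted to a Redshift cluster with any PostgreSQL-compatible driver:

    # Hypothetical impression data. The CTE pre-aggregates per campaign and
    # day; the window function then ranks campaigns within each day without
    # a second scan, which suits Redshift's columnar, MPP execution.
    QUERY = """
    WITH daily AS (
        SELECT campaign_id,
               TRUNC(event_time) AS day,
               COUNT(*)          AS impressions
        FROM   impressions
        GROUP  BY campaign_id, TRUNC(event_time)
    )
    SELECT campaign_id, day, impressions,
           RANK() OVER (PARTITION BY day ORDER BY impressions DESC) AS day_rank
    FROM   daily;
    """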
Amazon Relational Database Service (RDS) provides a managed relational database in the cloud. It supports several database engines including Amazon Aurora, MariaDB, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL. Key features of RDS include automated backups, manual snapshots, multi-AZ deployment for high availability, read replicas for scaling reads, and encryption options. DynamoDB is AWS's key-value and document database that delivers single-digit millisecond performance at any scale. It is a fully managed NoSQL database and supports both document and key-value data models. Redshift is a data warehouse service and is used for analytics workloads requiring fast queries against large datasets.
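A minimal boto3 sketch of the RDS features just listed (a Multi-AZ deployment plus a read replica); all identifiers, the instance class, and the credentials are hypothetical:

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # Multi-AZ provisions a synchronous standby in another Availability Zone;
    # identifiers and credentials here are placeholders.
    rds.create_db_instance(
        DBInstanceIdentifier="app-db",
        Engine="mysql",
        DBInstanceClass="db.m5.large",
        AllocatedStorage=100,
        MasterUsername="admin",
        MasterUserPassword="change-me",  # prefer AWS Secrets Manager in practice
        MultiAZ=True,
    )

    # A read replica offloads read traffic from the primary.
    rds.create_db_instance_read_replica(
        DBInstanceIdentifier="app-db-replica-1",
        SourceDBInstanceIdentifier="app-db",
    )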
Redshift is a petabyte-scale data warehouse that is a lot faster, a lot less expensive, and a whole lot simpler to use. How can you get your data into Amazon Redshift? In this webinar, hear from representatives of Attunity (an Amazon Redshift Partner) and AWS as they present many of the options available for data integration. Whether your data is on an on-premises platform or in a cloud-based database like DynamoDB, we will show you how you can easily load your data into Redshift.
Reasons to attend:
- Learn about best practices to efficiently integrate data into Redshift.
- Attend a Q&A session with Redshift experts.
Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r... - Amazon Web Services
Amazon DynamoDB is a fully managed NoSQL database service provided by AWS that provides fast and predictable performance with seamless scalability. It offers a flexible data model and reliable access patterns. With DynamoDB, users do not need to provision, operate, or scale their own database clusters and can instead pay only for the storage and throughput capacity they need.
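For a sense of the programming model, a minimal boto3 sketch of a DynamoDB write and key lookup; the table and attribute names are hypothetical:

    import boto3

    dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
    table = dynamodb.Table("Players")  # hypothetical table name

    # Single-item write and key lookup; DynamoDB serves both with fast,
    # predictable latency regardless of table size.
    table.put_item(Item={"player_id": "p-123", "level": 7, "guild": "red"})
    resp = table.get_item(Key={"player_id": "p-123"})
    print(resp.get("Item"))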
Amazon Aurora Getting Started Guide - Level 0 - kartraj
Introduction to Amazon Aurora. Topics:
- Applying a service-oriented architecture to the database
- Aurora makes it easy to run your databases
- Aurora simplifies storage management
- Aurora simplifies data security
- Aurora is highly available
by Rich Alberth, Solutions Architect, AWS
Modernizing your database environment can bring many benefits, from avoiding technical debt to reducing expenses. AWS Database Migration Service makes modernization easy, letting you change database versions (and even database engines) and schema topologies while avoiding downtime. We’ll look at some models for modernization, then do a hands-on exercise to migrate and consolidate MySQL databases to Amazon Aurora. You’ll need a laptop with a Firefox or Chrome browser.
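A hedged sketch of wiring up such a migration through the AWS DMS API via boto3; the ARNs, schema name, and task settings are placeholders, and the endpoints and replication instance are assumed to exist already:

    import json
    import boto3

    dms = boto3.client("dms", region_name="us-east-1")

    # Placeholder ARNs for resources created beforehand via create_endpoint
    # and create_replication_instance.
    SOURCE_ARN = "arn:aws:dms:us-east-1:123456789012:endpoint:source"
    TARGET_ARN = "arn:aws:dms:us-east-1:123456789012:endpoint:target"
    INSTANCE_ARN = "arn:aws:dms:us-east-1:123456789012:rep:instance"

    # Include every table in a hypothetical "app" schema.
    table_mappings = {"rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-app-schema",
        "object-locator": {"schema-name": "app", "table-name": "%"},
        "rule-action": "include",
    }]}

    # Full load plus ongoing change capture keeps the source usable until
    # cutover, which is how DMS avoids downtime.
    dms.create_replication_task(
        ReplicationTaskIdentifier="mysql-to-aurora",
        SourceEndpointArn=SOURCE_ARN,
        TargetEndpointArn=TARGET_ARN,
        ReplicationInstanceArn=INSTANCE_ARN,
        MigrationType="full-load-and-cdc",
        TableMappings=json.dumps(table_mappings),
    )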
Deep Dive on Amazon Aurora MySQL Performance Tuning (DAT429-R1) - AWS re:Inve... - Amazon Web Services
Amazon Aurora offers several options for monitoring and optimizing MySQL database performance. These include Enhanced Monitoring and Performance Insights, an easy-to-use tool for assessing the load on your database and identifying slow-performing queries. In this session, learn how to tune the performance of your Aurora database with MySQL compatibility, whether your application is in development or in production.
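Performance Insights can be switched on for an existing instance through the RDS API; a minimal boto3 sketch, with a hypothetical instance identifier:

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # Enable Performance Insights on an existing Aurora MySQL instance.
    rds.modify_db_instance(
        DBInstanceIdentifier="aurora-mysql-instance-1",  # hypothetical
        EnablePerformanceInsights=True,
        PerformanceInsightsRetentionPeriod=7,  # days of retained history
        ApplyImmediately=True,
    )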
This document discusses high availability website design. It recommends hosting static assets on Amazon S3 for high durability and redundancy. Content delivery can be improved with Amazon CloudFront. Dynamic applications can be built on Amazon EC2 across Availability Zones and auto-scaled for failure recovery. Databases can be managed with Amazon RDS. Multi-tier designs with load balancing, caching, and auto-scaling provide tolerance to instance and Availability Zone failures.
RDS Postgres and Aurora Postgres | AWS Public Sector Summit 2017 - Amazon Web Services
Attend this session for a technical deep dive about RDS Postgres and Aurora Postgres. Come hear from Mark Porter, the General Manager of Aurora PostgreSQL and RDS at AWS, as he covers service specific use cases and applications within the AWS worldwide public sector community. Learn More: https://aws.amazon.com/government-education/
Building with AWS Databases: Match Your Workload to the Right Database (DAT30... - Amazon Web Services
We have recently seen some convergence of different database technologies. Many customers are evaluating heterogeneous migrations as their database needs have evolved or changed. Evaluating the best database to use for a job isn't as clear as it was ten years ago. We'll discuss the ideal use cases for relational and nonrelational data services, including Amazon ElastiCache for Redis, Amazon DynamoDB, Amazon Aurora, Amazon Neptune, and Amazon Redshift. This session digs into how to evaluate a new workload for the best managed database option. Please join us for a speaker meet-and-greet following this session at the Speaker Lounge (ARIA East, Level 1, Willow Lounge). The meet-and-greet starts 15 minutes after the session and runs for half an hour.
by Rajeev Srinivasan, Sr. Solutions Architect, AWS
Amazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed cloud database and supports both document and key-value store models. Its flexible data model, reliable performance, and automatic scaling of throughput capacity make it a great fit for mobile, web, gaming, ad tech, IoT, and many other applications. We’ll take a look at how DynamoDB works and how it can be accelerated by DAX, the DynamoDB Accelerator.
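A sketch of reading through DAX rather than DynamoDB directly, assuming the amazondax package's resource-style interface; the cluster endpoint and table name are hypothetical:

    from amazondax import AmazonDaxClient  # Amazon's Python DAX client, assumed installed

    # DAX exposes the DynamoDB API, so table code is unchanged apart from
    # the client; the endpoint below is a placeholder.
    dax = AmazonDaxClient.resource(
        endpoint_url="dax://my-dax.abc123.dax-clusters.us-east-1.amazonaws.com:8111"
    )
    table = dax.Table("Players")  # hypothetical table
    item = table.get_item(Key={"player_id": "p-123"}).get("Item")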
Deep Dive on PostgreSQL Databases on Amazon RDS (DAT324) - AWS re:Invent 2018 - Amazon Web Services
In this session, we provide an overview of the PostgreSQL options available on AWS, and do a deep dive on Amazon Relational Database Service (Amazon RDS) for PostgreSQL, a fully managed PostgreSQL service, and Amazon Aurora, a PostgreSQL-compatible database with up to 3x the performance of standard PostgreSQL. Learn about the features, functionality, and many innovations in Amazon RDS and Aurora, which give you the background to choose the right service to solve different technical challenges, and the knowledge to easily move between services as your requirements change over time.
Amazon RDS makes it easy to set up, operate, and scale a relational database in the cloud. We’ll look at what RDS does (and does not) do to manage the “muck” of database operations.
This document provides an overview and use cases for Amazon Redshift, a fast, fully managed, petabyte-scale data warehouse service from Amazon Web Services. It summarizes Redshift's features including columnar storage, data compression, and massively parallel query processing. It also provides examples of how Redshift is used by companies to reduce costs, improve query performance, and scale their data warehousing needs. Specific use cases and customers of Redshift are highlighted.
This document provides a summary of a presentation on Amazon Aurora by Dickson Yue. It discusses Aurora fundamentals like its scale-out distributed architecture and 6 copies of data for fault tolerance. Recent improvements discussed include fast database cloning, backup and restore capabilities, and backtrack for point-in-time recovery. Coming soon features outlined are asynchronous key prefetch, batched scans, hash joins, and Aurora Serverless for automatic scaling.
Database Week at the San Francisco Loft
Amazon Aurora
A closer look at the MySQL and PostgreSQL compatible relational database built for the cloud that combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. We’ll explore how Aurora uses the AWS cloud to provide high reliability, high durability, and high throughput.
Level: 200
Speakers:
Mahesh Pakala - Solutions Architect, AWS
Arabinda Pani - Partner Solutions Architect, Database Specialist, AWS
The document discusses Amazon Aurora, Amazon's cloud-optimized relational database. It provides an overview of Aurora's architecture, which breaks apart the traditional monolithic database stack into separate services for improved scalability. The document announces that Amazon Aurora now provides compatibility with PostgreSQL in addition to MySQL. It describes Aurora's high performance and availability compared to open source databases like PostgreSQL through its use of Amazon's cloud-optimized storage.
This document provides an overview of Amazon Aurora including:
- Aurora is a database service that provides the performance and availability of high-end commercial databases at a lower cost.
- It uses a distributed, fault-tolerant storage system across 3 Availability Zones for data durability.
- Aurora provides up to 5x better performance than MySQL and 3x better than PostgreSQL through optimized storage, caching, processing and parallelism.
- It offers high availability with zero downtime and the ability to survive the loss of up to 2 Availability Zones through six-way data replication (a minimal provisioning sketch follows below).
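As referenced in the list above, a minimal provisioning sketch with boto3; identifiers, the instance class, and credentials are hypothetical, and the six-way storage replication is handled by the service itself, not by these calls:

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # The cluster owns the distributed storage volume; instances attach to it.
    rds.create_db_cluster(
        DBClusterIdentifier="app-aurora",
        Engine="aurora-mysql",
        MasterUsername="admin",
        MasterUserPassword="change-me",  # prefer AWS Secrets Manager
    )

    # A writer instance for the cluster; readers can be added the same way.
    rds.create_db_instance(
        DBInstanceIdentifier="app-aurora-writer",
        DBClusterIdentifier="app-aurora",
        Engine="aurora-mysql",
        DBInstanceClass="db.r5.large",
    )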
by Darin Briskman, Technical Evangelist, AWS
Microsoft SQL Server is a commonly-used commercial relational database, especially for organizations that use Microsoft development tools. We’ll look at how to run SQL Server on the AWS Cloud, with examples of organizations using it.
Introducing Amazon Aurora with PostgreSQL Compatibility - AWS Online Tech Talks - Amazon Web Services
Learning Objectives:
- Learn about optimizing relational databases for the cloud
- Learn about Amazon Aurora scalability and high availability
- Learn about Amazon Aurora compatibility with PostgreSQL
This document describes an architecture for analyzing large-scale web logs generated by applications hosted on Amazon Web Services. The architecture utilizes Amazon Elastic MapReduce to process log files stored in Amazon S3 using Hadoop. Results are stored in an Amazon RDS database. EC2 spot instances provide additional flexible capacity for log analysis jobs. Amazon CloudFront also generates logs that can be analyzed in this system.
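A hedged boto3 sketch of launching a transient EMR cluster for such log analysis; the bucket, roles, and sizing are hypothetical, and Spot capacity could be requested per instance group via its Market field:

    import boto3

    emr = boto3.client("emr", region_name="us-east-1")

    # Transient cluster: it terminates once its steps finish.
    emr.run_job_flow(
        Name="weblog-analysis",
        ReleaseLabel="emr-6.15.0",
        Instances={
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": 4},
            ],
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        LogUri="s3://my-logs-bucket/emr/",  # hypothetical bucket
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )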
The document discusses building data lakes and analytics on AWS. It provides an overview of challenges with big data like increasing data variety and growth. It then describes how AWS services like S3, Glue, Athena, EMR, and Redshift can be used to address these challenges by enabling quick ingestion of diverse data types, metadata management, and running analytics tools on curated datasets. The document emphasizes storing raw data immutably and using tiered storage for cost optimization. It outlines using the right AWS service based on user roles and discusses how data lakes and data warehouses are complementary.
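To illustrate the metadata-management point, a minimal boto3 sketch that crawls a raw S3 zone into the AWS Glue Data Catalog; the crawler name, role, database, and path are hypothetical:

    import boto3

    glue = boto3.client("glue", region_name="us-east-1")

    # A crawler infers schemas from raw objects in S3 and registers tables
    # in the Data Catalog for query engines like Athena and Redshift Spectrum.
    glue.create_crawler(
        Name="raw-zone-crawler",
        Role="GlueServiceRole",  # hypothetical IAM role
        DatabaseName="datalake_raw",
        Targets={"S3Targets": [{"Path": "s3://my-datalake/raw/"}]},
    )
    glue.start_crawler(Name="raw-zone-crawler")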
The document discusses building data lakes and analytics on AWS. It provides an overview of challenges posed by big data including volume, velocity, variety and veracity of data. It then describes how AWS services like S3, Glue and Athena can help address these challenges by allowing quick ingestion and storage of raw data in its original format. The document also discusses best practices for preparing and analyzing data in the lake using services like EMR, Redshift and SageMaker to derive insights and drive machine learning models.
Build Data Lakes and Analytics on AWS: Patterns & Best Practices - Amazon Web Services
With over 90% of today’s data generated in the last two years, the rate of data growth is showing no sign of slowing down. In this session, we step through the challenges and best practices for capturing data, understanding what data you own, driving insights, and predicting the future using AWS services. We frame the session and demonstrations around common pitfalls of building data lakes and how to successfully drive analytics and insights from data. We also discuss the architecture patterns that bring together key AWS services, including Amazon S3, AWS Glue, Amazon Athena, Amazon Kinesis, and Amazon Machine Learning. Discover the real-world application of data lakes for roles including data scientists and business users.
Stephen Moon, Sr. Solutions Architect, Amazon Web Services
James Juniper, Solution Architect for the Geo-Community Cloud, Natural Resources Canada
Build Data Lakes & Analytics on AWS: Patterns & Best Practices - Amazon Web Services
This document discusses building data lakes and analytics on AWS. It covers challenges with big data like volume, velocity, and variety. An AWS data lake can quickly ingest and store any type of data. The data lake includes analytics, machine learning, real-time data movement, and traditional data movement. Metadata management is important for data lakes. AWS Glue crawlers can discover data in various formats and populate the data catalog. Different tools like Amazon Athena, Amazon EMR, and Amazon Redshift can be used for analytics depending on the user and use case. Machine learning benefits from big data, and a data lake supports agility in machine learning.
This document discusses big data analytics and provides examples of how organizations are leveraging big data. It begins by defining big data as large datasets that require innovative collection, storage, organization, analysis and sharing due to their size. Common sources of big data include human-generated data from activities like social media usage and machine-generated data from sensors and IoT devices. The document then discusses challenges of big data like storage, analytics and collaboration. It provides examples of how AWS services help address these challenges through scalable, flexible and cost-effective solutions.
Optimizing data lakes with Amazon S3 - STG302 - New York AWS Summit - Amazon Web Services
The document discusses optimizing data lakes with Amazon S3. It describes how Epic Games uses Amazon S3 as a data lake to collect telemetry data from Fortnite players with Amazon Kinesis and performs real-time analytics with Spark on EMR and queries with DynamoDB. Game designers then use the data to inform decisions about improving gamer engagement.
ABD327_Migrating Your Traditional Data Warehouse to a Modern Data Lake - Amazon Web Services
In this session, we discuss the latest features of Amazon Redshift and Redshift Spectrum, and take a deep dive into its architecture and inner workings. We share many of the recent availability, performance, and management enhancements and how they improve your end user experience. You also hear from 21st Century Fox, who presents a case study of their fast migration from an on-premises data warehouse to Amazon Redshift. Learn how they are expanding their data warehouse to a data lake that encompasses multiple data sources and data formats. This architecture helps them tie together siloed business units and get actionable 360-degree insights across their consumer base.
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30... - Amazon Web Services
This document provides a summary of a presentation on building data lakes and analytics on AWS. It discusses:
- The challenges of big data including volume, velocity, variety and veracity.
- How an AWS data lake can address these challenges by quickly ingesting and storing any type of data while providing insights, security and the ability to run the right analytics tools without data movement.
- Key components of a data lake on AWS including storage, data catalog, analytics, machine learning capabilities, and tools for real-time and traditional data movement.
The document discusses data lake architectures on AWS. It defines a data lake as a centralized storage platform capable of storing heterogeneous data sets at virtually limitless scale. It describes how AWS services like S3, Glue, Athena, EMR, Redshift, and Kinesis can be used to build data lakes for storing, cataloging, processing, analyzing and gaining insights from large volumes of diverse data. Examples of using these services for clickstream analytics, real-time analytics, machine learning, and reducing total cost of ownership are also provided.
1) The document discusses serverless real-time analytics using AWS services like Kinesis, Kinesis Firehose, Kinesis Analytics and Athena. It provides examples of how to ingest raw JSON data from sensors in real-time, convert to CSV, aggregate data and generate alerts.
2) The document compares Kinesis Streams and Firehose for ingesting streaming data and explains how to process and analyze real-time data using Kinesis Analytics with SQL. It also discusses visualizing results using QuickSight.
3) The examples show how to model streaming data with schemas, apply windowing functions to aggregate data, detect anomalies with Lambda/SNS, and enable real-time predictions with Machine Learning. (A minimal ingestion sketch follows below.)
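The ingestion sketch referenced above, using boto3 against a hypothetical Kinesis stream; the stream name and record shape are invented:

    import json
    import boto3

    kinesis = boto3.client("kinesis", region_name="us-east-1")

    # Raw JSON sensor reading pushed onto the stream; the partition key
    # spreads load across shards.
    reading = {"sensor_id": "s-42", "temp_c": 21.7, "ts": "2019-01-01T00:00:00Z"}
    kinesis.put_record(
        StreamName="sensor-ingest",
        Data=json.dumps(reading).encode("utf-8"),
        PartitionKey=reading["sensor_id"],
    )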
The document discusses building data lakes and analytics on AWS. It describes how data lakes extend the traditional approach of data warehousing by allowing storage and analysis of structured, semi-structured, and unstructured data at massive scales cost effectively. It provides an overview of various AWS services that can be used for data ingestion, storage, processing, analysis and machine learning with data lakes.
In this session, we show you how to understand what data you have, how to drive insights, and how to make predictions using purpose-built AWS services. Learn about the common pitfalls of building data lakes and discover how to successfully drive analytics and insights from your data. Also learn how services such as Amazon S3, AWS Glue, Amazon Redshift, Amazon Athena, Amazon EMR, Amazon Kinesis, and Amazon ML services work together to build a successful data lake for various roles, including data scientists and business users.
AWS re:Invent 2023 recaps from Chicago AWS user group - AWS Chicago
Chicago AWS Solutions Architect Scott Hewitt recaps the non-GenAI updates from AWS re:Invent 2023. Updates span storage, networking, compute, and developer tools.
Best Practices for Building a Data Lake in Amazon S3 and Amazon Glacier, with... - Amazon Web Services
Learn how to build a data lake for analytics in Amazon S3 and Amazon Glacier. In this session, we discuss best practices for data curation, normalization, and analysis on Amazon object storage services. We examine ways to reduce or eliminate costly extract, transform, and load (ETL) processes using query-in-place technology, such as Amazon Athena and Amazon Redshift Spectrum. We also review custom analytics integration using Apache Spark, Apache Hive, Presto, and other technologies in Amazon EMR. You'll also get a chance to hear from Airbnb & Viber about their solutions for Big Data analytics using S3 as a data lake.
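A minimal boto3 sketch of the query-in-place idea with Athena; the database, table, and result bucket are hypothetical:

    import boto3

    athena = boto3.client("athena", region_name="us-east-1")

    # Query data in S3 directly via the catalog, with no ETL into a
    # separate store; results land in the output bucket.
    athena.start_query_execution(
        QueryString="SELECT status, COUNT(*) FROM access_logs GROUP BY status",
        QueryExecutionContext={"Database": "datalake_raw"},
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )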
The document discusses big data analytics and machine learning on AWS. It describes what big data is and the 3Vs of big data - variety, velocity, and volume. It provides examples of AWS services that can be used for big data analytics like S3, Redshift, EMR, Athena, and Kinesis. It also provides examples of customers like Sysco, FINRA, and Nasdaq that are using AWS services to build data lakes and leverage big data analytics.
The document discusses building data lakes on AWS. It describes how data lakes extend the traditional data warehouse approach by allowing storage of both structured and unstructured data at massive scales. Amazon S3 provides durable, available, scalable, and easy-to-use storage for the data lake. AWS Glue crawls data to create a data catalog and can automate ETL processes. Amazon Athena and Amazon EMR enable interactive analysis and big data processing through SQL and Spark. The data lake architecture on AWS supports a variety of analytical use cases.
Value of Data Beyond Analytics by Darin Briskman - Sameer Kenkare
The document discusses analytics capabilities provided by Amazon Web Services (AWS). It describes how AWS offers a variety of services for building data lakes, loading and querying data, and performing analytics. These services include Amazon S3, Amazon Redshift, Amazon Athena, Amazon EMR, and Amazon QuickSight. It also provides examples of how customers like Epic Games and a large media company use these AWS analytics services.
inQuba Webinar: Mastering Customer Journey Management with Dr Graham Hill - LizaNolte
HERE IS YOUR WEBINAR CONTENT! 'Mastering Customer Journey Management with Dr. Graham Hill'. We hope you find the webinar recording both insightful and enjoyable.
In this webinar, we explored essential aspects of Customer Journey Management and personalization. Here’s a summary of the key insights and topics discussed:
Key Takeaways:
Understanding the Customer Journey: Dr. Hill emphasized the importance of mapping and understanding the complete customer journey to identify touchpoints and opportunities for improvement.
Personalization Strategies: We discussed how to leverage data and insights to create personalized experiences that resonate with customers.
Technology Integration: Insights were shared on how inQuba’s advanced technology can streamline customer interactions and drive operational efficiency.
[O'Reilly Superstream] Occupy the Space: A grassroots guide to engineering (an... - Jason Yip
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor IvaniukFwdays
At this talk we will discuss DDoS protection tools and best practices, discuss network architectures, and see what AWS has to offer. Also, we will look into one of the largest DDoS attacks on Ukrainian infrastructure, which happened in February 2022. We'll see what techniques helped to keep the web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on the Ukraine experience.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers - akankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Introduction of Cybersecurity with OSS at Code Europe 2024 - Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
"Choosing proper type of scaling", Olena SyrotaFwdays
Imagine an IoT processing system that is already quite mature and production-ready, whose client coverage is growing, and for which scaling and performance are life-and-death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, we will first analyze scaling approaches and then select the proper ones for our system.
Taking AI to the Next Level in Manufacturing.pdf - ssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
5. Ideas and approaches to help build your organization's AI strategy.
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way that breaks data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is repaid by taking even bigger "loans", resulting in ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
5th LF Energy Power Grid Model Meet-up Slides - DanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e, located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/temporal-event-neural-networks-a-more-efficient-alternative-to-the-transformer-a-presentation-from-brainchip/
Chris Jones, Director of Product Management at BrainChip , presents the “Temporal Event Neural Networks: A More Efficient Alternative to the Transformer” tutorial at the May 2024 Embedded Vision Summit.
The expansion of AI services necessitates enhanced computational capabilities on edge devices. Temporal Event Neural Networks (TENNs), developed by BrainChip, represent a novel and highly efficient state-space network. TENNs demonstrate exceptional proficiency in handling multi-dimensional streaming data, facilitating advancements in object detection, action recognition, speech enhancement and language model/sequence generation. Through the utilization of polynomial-based continuous convolutions, TENNs streamline models, expedite training processes and significantly diminish memory requirements, achieving notable reductions of up to 50x in parameters and 5,000x in energy consumption compared to prevailing methodologies like transformers.
Integration with BrainChip’s Akida neuromorphic hardware IP further enhances TENNs’ capabilities, enabling the realization of highly capable, portable and passively cooled edge devices. This presentation delves into the technical innovations underlying TENNs, presents real-world benchmarks, and elucidates how this cutting-edge approach is positioned to revolutionize edge AI across diverse applications.
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency - ScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc... - DanBrown980551
This LF Energy webinar took place June 20, 2024. It featured:
-Alex Thornton, LF Energy
-Hallie Cramer, Google
-Daniel Roesler, UtilityAPI
-Henry Richardson, WattTime
In response to the urgency and scale required to effectively address climate change, open source solutions offer significant potential for driving innovation and progress. Currently, there is a growing demand for standardization and interoperability in energy data and modeling. Open source standards and specifications within the energy sector can also alleviate challenges associated with data fragmentation, transparency, and accessibility. At the same time, it is crucial to consider privacy and security concerns throughout the development of open source platforms.
This webinar will delve into the motivations behind establishing LF Energy’s Carbon Data Specification Consortium. It will provide an overview of the draft specifications and the ongoing progress made by the respective working groups.
Three primary specifications will be discussed:
-Discovery and client registration, emphasizing transparent processes and secure and private access
-Customer data, centering around customer tariffs, bills, energy usage, and full consumption disclosure
-Power systems data, focusing on grid data, inclusive of transmission and distribution networks, generation, intergrid power flows, and market settlement data
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...Fwdays
Direct losses from one minute of downtime are $5-$10 thousand. Reputation is priceless.
As part of the talk, we will consider the architectural strategies necessary for developing highly loaded fintech solutions. We will focus on using queues and streaming to efficiently process and manage large amounts of data in real time and to minimize latency.
We will pay special attention to the architectural patterns used in the design of the fintech system, microservices and event-driven architecture, which ensure scalability, fault tolerance, and consistency of the entire system.
Session 1 - Intro to Robotic Process Automation.pdf - UiPathCommunity
👉 Check out our full 'Africa Series - Automation Student Developers (EN)' page to register for the full program:
https://bit.ly/Automation_Student_Kickstart
In this session, we shall introduce you to the world of automation and the UiPath Platform, and guide you on how to install and set up UiPath Studio on your Windows PC.
📕 Detailed agenda:
What is RPA? Benefits of RPA?
RPA Applications
The UiPath End-to-End Automation Platform
UiPath Studio CE Installation and Setup
💻 Extra training through UiPath Academy:
Introduction to Automation
UiPath Business Automation Platform
Explore automation development with UiPath Studio
👉 Register here for our upcoming Session 2 on June 20: Introduction to UiPath Studio Fundamentals: https://community.uipath.com/events/details/uipath-lagos-presents-session-2-introduction-to-uipath-studio-fundamentals/
Northern Engraving | Nameplate Manufacturing Process - 2024 - Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
20. [Architecture diagram, Web Application Hosting (AWS Reference Architectures): Amazon Route 53 (DNS resolution), Amazon CloudFront (content delivery network), Elastic Load Balancing A and B, Auto Scaling groups of Amazon EC2 web servers and application servers, database servers on Amazon RDS (master with Multi-AZ standby, synchronous replication), and Amazon S3 for resources and static content.]
Highly available and scalable web hosting can be complex and expensive. Dense peak periods and wild swings in traffic patterns result in low utilization of expensive hardware. Amazon Web Services provides the reliable, scalable, secure, and high-performance infrastructure required for web applications while enabling an elastic, scale-out and scale-down infrastructure to match IT costs in real time as customer traffic fluctuates.
System Overview: WEB APPLICATION HOSTING
1. The user's DNS requests are served by Amazon Route 53, a highly available Domain Name System (DNS) service. Network traffic is routed to infrastructure running in Amazon Web Services.
2. Static, streaming, and dynamic content is delivered by Amazon CloudFront, a global network of edge locations. Requests are automatically routed to the nearest edge location, so content is delivered with the best possible performance.
3. HTTP requests are first handled by Elastic Load Balancing, which automatically distributes incoming application traffic among multiple Amazon Elastic Compute Cloud (EC2) instances across Availability Zones (AZs). It enables even greater fault tolerance in your applications, seamlessly providing the amount of load balancing capacity needed in response to incoming application traffic.
4. Web servers and application servers are deployed on Amazon EC2 instances. Most organizations will select an Amazon Machine Image (AMI) and then customize it to their needs. This custom AMI will then become the starting point for future web development.
5. Web servers and application servers are deployed in an Auto Scaling group. Auto Scaling automatically adjusts your capacity up or down according to conditions you define. With Auto Scaling, you can ensure that the number of Amazon EC2 instances you're using increases seamlessly during demand spikes to maintain performance and decreases automatically during demand lulls to minimize costs (a minimal sketch follows after this list).
6. To provide high availability, the relational database that contains the application's data is hosted redundantly on a Multi-AZ (multiple Availability Zones, zones A and B here) deployment of Amazon Relational Database Service (Amazon RDS).
7. Resources and static content used by the web application are stored on Amazon Simple Storage Service (S3), a highly durable storage infrastructure designed for mission-critical and primary data storage.
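To make step 5 concrete, a minimal boto3 sketch of an Auto Scaling group spanning two Availability Zones with a target-tracking policy; the group name, launch template, and AZs are hypothetical:

    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")

    # Web tier spanning two AZs; the launch template (built from the custom
    # AMI in step 4) is assumed to exist already.
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="web-tier",
        LaunchTemplate={"LaunchTemplateName": "web-ami-template"},
        MinSize=2,
        MaxSize=10,
        AvailabilityZones=["us-east-1a", "us-east-1b"],
    )

    # Track average CPU at 50%: scale out on spikes, back in on lulls.
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-tier",
        PolicyName="cpu-target-50",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"},
            "TargetValue": 50.0,
        },
    )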
21. Amazon Web Services provides different tools and services
that can be used for building online games that scale under
high-usage traffic patterns.
This document presents a cost-effective online game architecture
featuring automatic capacity adjustment, a highly available and
high-speed database, and a data processing cluster for player
behavior analysis.
System
Overview
ONLINE
GAMES
Online games back-end infrastructures can be challenging to
maintain and operate. Peak usage periods, multiple players, and
high volumes of write operations are some of the most common
problems that operations teams face.
But the most difficult challenge is ensuring flexibility in the scale of
that system. A popular game might suddenly receive millions of
users in a matter of hours, yet it must continue to provide a
satisfactory player experience.
1 Browser games can be represented as client-server
applications. The client generally consists of static files,
such as images, sounds, flash applications, or Java applets.
Those files are hosted on Amazon Simple Storage Service
(Amazon S3), a highly available and reliable data store.
5 Log files generated by each web server are pushed
back into Amazon S3 for long-term storage.
2 As the user base grows and becomes more
geographically distributed, a high-performance cache
like Amazon CloudFront can provide substantial
improvements in latency, fault tolerance, and cost. By using
Amazon S3 as the origin server for the Amazon CloudFront
distribution, the game infrastructure benefits from fast
network data transfer rates and a simple publishing/caching
workflow.
3 Requests from the game application are distributed by
Elastic Load Balancing to a group of web servers
running on Amazon Elastic Compute Cloud (Amazon EC2)
instances. Auto Scaling automatically adjusts the size of this
group, depending on rules like network load, CPU usage, and
so on.
4 Player data is persisted on Amazon DynamoDB, a
fully managed NoSQL database service. As the player
population grows, Amazon DynamoDB provides predictable
performance with seamless scalability.
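To make step 4 concrete, here is a minimal sketch (again Python with boto3; the table name and key schema are hypothetical) of writing and reading player state:

import boto3

players = boto3.resource("dynamodb").Table("PlayerState")  # hypothetical table

# Persist a player's current game state; DynamoDB scales the write
# path transparently as the player population grows.
players.put_item(
    Item={
        "player_id": "player-42",  # assumed partition key
        "level": 7,
        "score": 31250,
        "inventory": ["sword", "potion"],
    }
)

# Read it back with a strongly consistent get.
resp = players.get_item(Key={"player_id": "player-42"}, ConsistentRead=True)
print(resp["Item"])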
6 Managing and analyzing high data volumes produced
by online games platforms can be challenging. Amazon
Elastic MapReduce (Amazon EMR) is a service that
processes vast amounts of data easily. Input data can be
retrieved from web server logs stored on Amazon S3 or from
player data stored in Amazon DynamoDB tables to run
analytics on player behavior, usage patterns, etc. Those
results can be stored again on Amazon S3, or inserted in a
relational database for further analysis with classic business
intelligence tools.
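A sketch of step 6 in the same vein (Python with boto3; bucket names, the analysis script, instance types, and cluster size are all hypothetical): it launches a transient EMR cluster that runs one Spark step over the logs in Amazon S3 and terminates when done.

import boto3

emr = boto3.client("emr")

# Transient cluster: run one analysis step over the S3 logs, then shut down.
emr.run_job_flow(
    Name="player-behavior-analysis",
    ReleaseLabel="emr-6.15.0",
    LogUri="s3://mygame-logs/emr/",
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    Steps=[{
        "Name": "aggregate-usage",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "s3://mygame-code/aggregate_usage.py"],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)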
7 Based on the needs of the game, Amazon Simple
Email Service (Amazon SES) can be used to send
email to players in a cost-effective and scalable way.
[Architecture diagram: players resolve www.mygame.com through Amazon Route 53 (DNS resolution); Amazon CloudFront (content delivery network) fronts Amazon S3, the files repository for game client files (flash, applet, ...); Elastic Load Balancing spreads game traffic across Auto Scaling web servers; game interaction (status, JSON, ...) is stored in Amazon DynamoDB (game database); log files flow to Amazon S3 and on to Amazon Elastic MapReduce for game analysis; Amazon SES is the email emitter to players.]
22. Customers want to find the products they are interested in quickly,
and they expect pages to load quickly. Worldwide customers want
to be able to make purchases at any time, so the website should
be highly available. Meeting these challenges becomes harder as
your catalog and customer base grow.
With the tools that AWS provides, you can build a compelling,
scalable website with a searchable product catalog that is
accessible with very low latency.
System
Overview
E-COMMERCE
WEB SITE
PART 1: WEB FRONT-END
With Amazon Web Services, you can build a highly available e-
commerce website with a flexible product catalog that scales with
your business.
Maintaining an e-commerce website with a large product catalog
and global customer base can be challenging. The catalog should
be searchable, and individual product pages should contain a rich
information set that includes, for example, images, a PDF manual,
and customer reviews.
1 DNS requests to the e-commerce website are handled
by Amazon Route 53, a highly available Domain Name
System (DNS) service.
5 Amazon DynamoDB is a fully-managed, high
performance, NoSQL database service that is easy to
set up, operate, and scale. It is used both as a session store
for persistent session data, such as the shopping cart, and as
the product database. Because DynamoDB does not have a
schema, we have a great deal of flexibility in adding new
product categories and attributes to the catalog.
2 Amazon CloudFront is a content distribution network
(CDN) with edge locations around the globe. It can
cache static and streaming content and deliver dynamic
content with low latency from locations close to the customer.
3 The e-commerce application is deployed by AWS
Elastic Beanstalk, which automatically handles the
details of capacity provisioning, load balancing, auto scaling,
and application health monitoring.
4 Amazon Simple Storage Service (Amazon S3) stores
all static catalog content, such as product images,
manuals, and videos, as well as all log files and clickstream
information from Amazon CloudFront and the e-commerce
application.
6 Amazon ElastiCache is used as a session store for
volatile data and as a caching layer for the product
catalog to reduce I/O (and cost) on DynamoDB.
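Step 6 is the classic cache-aside pattern. A minimal sketch, assuming Python with the redis-py client pointed at a hypothetical ElastiCache endpoint and a hypothetical DynamoDB catalog table:

import json

import boto3
import redis

cache = redis.Redis(host="my-cache.abc123.cache.amazonaws.com", port=6379)  # hypothetical endpoint
catalog = boto3.resource("dynamodb").Table("ProductCatalog")                # hypothetical table

def get_product(product_id):
    # Try the cache first; on a miss, read DynamoDB and cache the result
    # for 5 minutes to reduce read I/O (and cost) on the product table.
    cached = cache.get("product:" + product_id)
    if cached:
        return json.loads(cached)
    item = catalog.get_item(Key={"product_id": product_id})["Item"]
    cache.setex("product:" + product_id, 300, json.dumps(item, default=str))
    return item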
7 Product catalog data is loaded into Amazon
CloudSearch, a fully managed search service that
provides fast and highly scalable search functionality.
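A sketch of querying that index (Python with boto3; the search domain endpoint is hypothetical, and this assumes the catalog fields have already been uploaded to the domain):

import boto3

# The cloudsearchdomain client talks to one specific search domain endpoint.
search = boto3.client(
    "cloudsearchdomain",
    endpoint_url="https://search-catalog-abc123.us-east-1.cloudsearch.amazonaws.com",
)

results = search.search(query="running shoes", queryParser="simple", size=10)
for hit in results["hits"]["hit"]:
    print(hit["id"], hit.get("fields"))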
8 When customers check out their products, they are
redirected to an SSL-encrypted checkout service.
9 A marketing and recommendation service consumes
log data stored on Amazon S3 to provide the customer
with product recommendations.
[Architecture diagram: customer DNS resolution via Amazon Route 53; Amazon CloudFront in front of the e-commerce application on AWS Elastic Beanstalk; Amazon ElastiCache as catalog cache and transient session store; Amazon DynamoDB as product catalog and persistent session store; Amazon CloudSearch as search engine; Amazon S3 as log-file repository and static catalog content; secure connections lead to the checkout service (Part 2); the marketing and recommendation service (Part 3) consumes the logs.]
23. Customers expect their private data, such as their purchase
history and their credit card information, to be managed on a
secure infrastructure and application stack. AWS has achieved
multiple security certifications relevant to e-commerce business,
including the Payment Card Industry (PCI) Data Security
Standard (DSS).
With the tools that AWS provides, you can build a secure checkout
service that manages the purchasing workflow from order to
fulfillment.
System
Overview
With Amazon Web Services, you can build a secure and highly
available checkout service for your e-commerce website that
scales with your business. Managing the checkout process
involves many steps, which have to be coordinated. Some steps,
such as credit card transactions, are subject to specific regulatory
requirements. Other parts of the process involve manual labor,
such as picking, packing, and shipping items from a warehouse.
1 The e-commerce web front end redirects the customer
to an SSL-encrypted checkout application to
authenticate the customer and execute a purchase.
5 SWF Workers are deployed on Amazon EC2
instances within a private subnet. The EC2 instances
are part of an Auto Scaling group, which can scale in and
out according to demand. The Workers manage the different
steps of the checkout pipeline, such as validating the order,
reserving and charging the credit card, and triggering the
sending of order and shipping confirmation emails.
2 The checkout application, which is deployed by AWS
Elastic Beanstalk, uses Amazon Simple Workflow
Service (Amazon SWF) to authenticate the customer and
trigger a new order workflow.
3 Amazon SWF coordinates all running order workflows
by using SWF Deciders and SWF Workers.
4 The SWF Decider implements the workflow logic. It
runs on an Amazon Elastic Compute Cloud (Amazon
EC2) instance within a private subnet that is isolated from the
public Internet.
6 SWF Workers can also be implemented on mobile
devices, such as tablets or smartphones, in order to
integrate pick, pack, and ship steps into the overall order
workflow.
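Steps 3 through 6 follow SWF's poll-based model. Below is a minimal sketch of one activity worker loop (Python with boto3; the domain, task list, and charge_credit_card helper are hypothetical):

import boto3

swf = boto3.client("swf")

def charge_credit_card(order_input):
    pass  # hypothetical payment step

# Workers long-poll a task list; Amazon SWF hands out one activity task
# at a time and records its outcome in the running order workflow.
while True:
    task = swf.poll_for_activity_task(
        domain="ecommerce",                       # hypothetical domain
        taskList={"name": "checkout-activities"}, # hypothetical task list
    )
    if not task.get("taskToken"):
        continue  # long poll timed out with no work
    charge_credit_card(task.get("input"))
    swf.respond_activity_task_completed(
        taskToken=task["taskToken"], result="charged"
    )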
7 Amazon Simple Email Service (Amazon SES) is used
to send transactional email, such as order and shipping
confirmations, to the customer.
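Step 7 is a single API call per message; a sketch with hypothetical addresses, assuming the sender identity has been verified in SES:

import boto3

ses = boto3.client("ses")

# The Source address must be verified in Amazon SES beforehand.
ses.send_email(
    Source="orders@example.com",
    Destination={"ToAddresses": ["customer@example.com"]},
    Message={
        "Subject": {"Data": "Your order has shipped"},
        "Body": {"Text": {"Data": "Order #1234 is on its way."}},
    },
)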
8 To provide high availability, the customer and orders
databases are hosted redundantly on a multi-AZ (multi
Availability Zone) deployment of Amazon Relational
Database Service (Amazon RDS) within private subnets that
are isolated from the public Internet.
E-COMMERCE
WEB SITE
PART 2: CHECKOUT SERVICE
[Architecture diagram: the customer moves from the web front end (Part 1) to the checkout application on AWS Elastic Beanstalk; Amazon SWF (workflow service) coordinates Deciders and Workers running in Auto Scaling groups, plus mobile workers in the warehouse; Amazon SES (email service) sends order emails; the customers & orders database runs on an Amazon RDS Master with a Multi-AZ Standby.]
24. [Architecture diagram: the web front end (Part 1) and checkout service (Part 2) feed a log-file repository on Amazon S3 and the customers & orders database on Amazon RDS (Master plus Read Replica); Amazon Elastic MapReduce computes user profiles into Amazon DynamoDB; a recommendation web service and a marketing management app run on AWS Elastic Beanstalk; Amazon SES, driven by the marketing manager's campaigns, sends marketing emails to customers.]
The insights that you gain about your customers can also be used
to manage personalized marketing campaigns targeted at specific
customer segments.
With the tools that AWS provides, you can build highly scalable
recommendation services that can be consumed by different
channels, such as dynamic product recommendations on the
e-commerce website or targeted email campaigns for your
customers.
System
Overview
E-COMMERCE
WEBSITE
PART 3: MARKETING & RECOMMENDATIONS
With Amazon Web Services, you can build a recommendation and
marketing service to manage targeted marketing campaigns and
offer personalized product recommendations to customers who
are browsing your e-commerce site.
In order to build such a service, you have to process very large
amounts of data from multiple data sources. The resulting user
profile information has to be available to deliver real-time product
recommendations on your e-commerce website.
1 Amazon Elastic MapReduce (Amazon EMR) is a
hosted Hadoop framework that runs on Amazon Elastic
Compute Cloud (Amazon EC2) instances. It aggregates and
processes user data from server log files and from the
customer's purchase history.
5 A recommendation web service used by the web front
end is deployed by AWS Elastic Beanstalk. This
service uses the profile information stored on Amazon
DynamoDB to provide personalized recommendations to be
shown on the e-commerce web front end.
2 An Amazon Relational Database Service (Amazon
RDS) Read Replica of the customer and order databases is
used by Amazon EMR to compute user profiles and by
Amazon Simple Email Service (Amazon SES) to send
targeted marketing emails to customers.
3 Log files produced by the e-commerce web front end
are stored on Amazon Simple Storage Service
(Amazon S3) and are consumed by the Amazon EMR cluster
to compute user profiles.
4 User profile information generated by the Amazon EMR
cluster is stored in Amazon DynamoDB, a scalable,
high-performance managed NoSQL database that can serve
recommendations with low latency.
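The read path of steps 4 and 5 can be sketched in a few lines (Python with boto3; the table and attribute names are hypothetical):

import boto3

profiles = boto3.resource("dynamodb").Table("UserProfiles")  # hypothetical table

def recommendations_for(user_id):
    # Low-latency lookup of the EMR-computed profile for one user.
    resp = profiles.get_item(Key={"user_id": user_id})
    return resp.get("Item", {}).get("recommended_products", [])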
6 A marketing administration application deployed by
AWS Elastic Beanstalk is used by marketing
managers to send targeted email campaigns to customers
with specific user profiles. The application reads customer
email addresses from an Amazon RDS Read Replica of the
customer database.
7 Amazon SES is used to send marketing emails to
customers. Amazon SES is based on the scalable
technology used by Amazon web sites around the world to
send billions of messages a year.
25. This elasticity is achieved by using Auto Scaling groups for ingest
processing, AWS Data Pipeline for scheduled Amazon Elastic
MapReduce jobs, AWS Data Pipeline for intersystem data
orchestration, and Amazon Redshift for potentially massive-scale
analysis. Key architectural throttle points involving Amazon SQS
for sensor message buffering and less frequent AWS Data
Pipeline scheduling keep the overall solution costs predictable and
controlled.
System
Overview
TIME SERIES
PROCESSING
When data arrives as a succession of regular measurements, it is
known as time series information. Processing of time series
information poses systems scaling challenges that the elasticity of
AWS services is uniquely positioned to address.
2 Send messages to an Amazon Simple Queue Service
queue for processing into Amazon DynamoDB using
autoscaled Amazon EC2 workers. Or, if the sensor source
can do so, post sensor samples directly to Amazon
DynamoDB. Try starting with a DynamoDB table that is a
week-oriented, time-based table structure.
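A sketch of that worker loop (Python with boto3; the queue URL, the weekly table name, and the CSV message payload are all assumptions):

import boto3

sqs = boto3.client("sqs")
samples = boto3.resource("dynamodb").Table("samples-2019-w06")  # hypothetical weekly table
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/sensor-samples"

# SQS buffers bursts of sensor messages; autoscaled workers drain the
# queue at a steady, cost-controlled rate into DynamoDB.
while True:
    batch = sqs.receive_message(
        QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    for msg in batch.get("Messages", []):
        sensor_id, ts, value = msg["Body"].split(",")  # assumed payload format
        samples.put_item(Item={"sensor_id": sensor_id, "ts": ts, "value": value})
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])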
3 If a Supervisory Control and Data Acquisition (SCADA)
system exists, create a flow of samples to or from
Amazon DynamoDB to support additional cloud processing
or other existing systems, respectively.
4 Using AWS Data Pipeline, create a pipeline with a
regular Amazon Elastic MapReduce job that both
calculates expensive sample processing and delivers
samples and results.
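The scheduling in step 4 can be sketched with the Data Pipeline API (Python with boto3; the definition below is heavily abbreviated, and all names are hypothetical):

import boto3

dp = boto3.client("datapipeline")

# Create a pipeline shell, push an (abbreviated) definition, then activate it.
pipeline_id = dp.create_pipeline(
    name="weekly-sample-processing", uniqueId="weekly-sample-processing-v1"
)["pipelineId"]

dp.put_pipeline_definition(
    pipelineId=pipeline_id,
    pipelineObjects=[{
        "id": "Default",
        "name": "Default",
        "fields": [
            {"key": "scheduleType", "stringValue": "cron"},
            # A real definition also declares a Schedule, an EmrActivity,
            # EmrCluster resources, and S3/DynamoDB data nodes (omitted).
        ],
    }],
)
dp.activate_pipeline(pipelineId=pipeline_id)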
7 The pipeline also optionally exports results in a
format custom applications can accept.
[Architecture diagram: remote sensor messages arrive over HTTP(S) into Amazon SQS; Auto Scaling worker nodes write the sensor sample data into Amazon DynamoDB, which also exchanges samples with a SCADA system in the corporate data center; Amazon Elastic MapReduce (with EC2 Spot Instances) processes exports staged on Amazon S3; results flow into Amazon Redshift for custom applications.]
5 The pipeline places results into Amazon Redshift for
additional analysis.
8 Amazon Redshift optionally imports historic
samples to reside with calculated results.
9 Using in-house or Amazon partner business
intelligence solutions, Amazon Redshift supports
additional analysis on a potentially massive scale.
1 Remote devices such as power meters, mobile clients,
ad-network clients, industrial meters, satellites, and
environmental meters measure the world around them and
send sampled sensor data as messages via HTTP(S) for
processing.
6 The pipeline exports historical week-oriented
sample tables from Amazon DynamoDB to
Amazon Simple Storage Service (Amazon S3).
26. PROBLEMS SUMMARISED
• So many types of clients
• So many users
• Low latency expectations
• So many servers
• Many data-centers
• So many services
• Inter-service communication
• Inter-dc communication
• Service consumer to dc
routing
• Low latency expectations!
WHAT ABOUT DEPLOYMENT
27. DEVELOPMENT
• Developers develop
• They need to develop together
• They need to see how their code works together
• Customers need to see what is happening
• Staging, testing
42. SOME NUMBERS
6 digits ($) hosting invoice
~200k RPM
more than 500 servers
You can compare us with others:
https://aws.amazon.com/solutions/case-studies/all/
ik@metglobal.com