In this session, Darin Briskman dives deep into what databases to use for which components of your application. Learn how to evaluate a new workload for the best managed database option based on specific application needs related to data shape, data size at limit, computational requirements, programmability, throughput and latency needs, and more. This session explains the ideal use cases for relational and non-relational database services, including Amazon Aurora, Amazon DynamoDB, Amazon ElastiCache for Redis, Amazon Neptune, and Amazon Redshift.
Darin Briskman, Chief Evangelist, Database, Analytics, & Machine Learning, Amazon Web Services
In this session, we show you how to understand what data you have, how to drive insights, and how to make predictions using purpose-built AWS services. Learn about the common pitfalls of building data lakes and discover how to successfully drive analytics and insights from your data. Also learn how services such as Amazon S3, AWS Glue, Amazon Redshift, Amazon Athena, Amazon EMR, Amazon Kinesis, and Amazon ML services work together to build a successful data lake for various roles, including data scientists and business users.
Data preparation and transformation - Spin your straw into gold - Tel Aviv Su... | Amazon Web Services
Data preparation is always a challenge. Why care about infrastructure?
Come learn how to deploy your Spark jobs in minutes using our managed services, Amazon EMR and AWS Glue, and focus on your business needs.
AWS Summit Singapore - Architecting a Serverless Data Lake on AWS | Amazon Web Services
Unni Pillai, Specialist Solution Architect, ASEAN, AWS.
Daniel Muller, Head of Cloud Infrastructure, Spuul.
As the volume and types of data continue to grow, customers often have valuable data that is not easily discoverable or available for analytics. A common challenge for data engineering teams is architecting a data lake that can cater to the needs of diverse users - from developers to business analysts to data scientists.
In this session, we will dive deep into building a data lake using Amazon S3, Amazon Kinesis, Amazon Athena and AWS Glue. We will also see how AWS Glue crawlers can automatically discover your data, extracting and cataloguing relevant metadata to reduce operations in preparing your data for downstream consumers.
Furthermore, learn from our customer Spuul how they moved from data warehouse-based analytics to a serverless data lake. Why and how did Spuul undertake this journey? Hear about the benefits and challenges they encountered.
Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018 | Amazon Web Services
We’re witnessing an unprecedented growth in the amount of data collected and stored in the cloud. Getting insights from this data requires database and analytics services that scale and perform in ways not possible before. AWS offers the broadest set of database and analytics services to process, store, manage, and analyze all your data. In this session, we provide an overview of the database and analytics services at AWS, new services and features we launched this year, how customers are using these services, and our vision for continued innovation in this space.
Your data has value for multiple business functions in your organization. Shorten your time to analytics and make faster, better decisions based on data.
In this session, you will learn how you can access your data from a myriad of tools, such as multiple EMR clusters, Athena, and Redshift.
The Open Data Lake Platform Brief - Data Sheets | Whitepaper | Vasu S
An open data lake platform provides a robust and future-proof data management paradigm to support a wide range of data processing needs, including data exploration, ad-hoc analytics, streaming analytics, and machine learning.
AWS-powered services for analytics can handle the scale, agility, and flexibility required to combine different types of data and analytics approaches that will allow you to transform your data into a valuable corporate asset. In this session, AWS will provide an overview of the different AWS services available for your data analytics needs. You can combine these blocks to build data flows that will extend your organization’s agility, ability to derive more insights and value from its data, and capability to adopt more sophisticated analytics tools and processes as your needs evolve. In the second part of the session, Paddy Power Betfair’s Data team will discuss the adoption and large-scale operation of a broad range of AWS services that make up PPB’s scalable, mixed-workload, multi-brand data platform. The data capabilities developed by PPB and powered by AWS were implemented to enable low-latency, high-volume, and near real-time advanced analytics use cases in the highly regulated and fast-paced betting industry. This was only possible through a focus on automation, innovation, and continuous improvement.
Modern data is massive, quickly evolving, unstructured, and increasingly hard to catalog and understand for multiple consumers and applications. This session will guide you through the best practices for designing a robust data architecture, highlighting the benefits and typical challenges of data lakes and data warehouses. We will build a scalable solution based on managed services such as Amazon Athena, AWS Glue, and AWS Lake Formation.
Come along and learn about the enhancements that we have made to Amazon Redshift, including features around performance, scalability and cluster management. We will also explore why a cloud native Data Warehouse solution allows AWS to innovate faster and deliver customer outcomes that are not possible in a more traditional on premises solution.
by Mamoon Chowdry, Solutions Architect
AWS Data & Analytics Week is an opportunity to learn about Amazon’s family of managed analytics services. These services provide easy, scalable, reliable, and cost-effective ways to manage your data in the cloud. We explain the fundamentals and take a technical deep dive into the Amazon Redshift data warehouse; data lake services, including Amazon EMR, Amazon Athena, and Amazon Redshift Spectrum; log analytics with Amazon Elasticsearch Service; and data preparation and placement services with AWS Glue and Amazon Kinesis. You'll learn how to get started, how to support applications, and how to scale.
A data lake is an architectural approach that allows you to store massive amounts of data in a central location, so it's readily available to be categorized, processed, analyzed, and consumed by diverse groups within an organization. In this session, we will introduce the data lake concept and its implementation on AWS. We will explain the different roles our services play and how they fit into the data lake picture.
Today’s organisations require a data storage and analytics solution that offers more agility and flexibility than traditional data management systems. Data Lake is a new and increasingly popular way to store all of your data, structured and unstructured, in one, centralised repository. Since data can be stored as-is, there is no need to convert it to a predefined schema and you no longer need to know what questions you want to ask of your data beforehand.
In this webinar, you will discover how AWS gives you fast access to flexible and low-cost IT resources, so you can rapidly build and scale a data lake that can power any kind of analytics, such as data warehousing, clickstream analytics, fraud detection, recommendation engines, event-driven ETL, serverless computing, and internet-of-things processing, regardless of the volume, velocity, and variety of your data.
Learning Objectives:
• Discover how you can rapidly scale and build your data lake with AWS.
• Explore the key pillars behind a successful data lake implementation.
• Learn how to use the Amazon Simple Storage Service (S3) as the basis for your data lake.
• Learn about the recently launched AWS services, Amazon Athena and Amazon Redshift Spectrum, that help customers query the data lake directly (see the sketch below).
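As a concrete illustration of the objectives above, here is a minimal boto3 sketch that runs one Athena query over a cataloged data-lake table. The database, table, and result-bucket names are hypothetical placeholders.

```python
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Database, table, and result bucket below are hypothetical placeholders.
query_id = athena.start_query_execution(
    QueryString="SELECT page, COUNT(*) AS hits FROM clickstream GROUP BY page LIMIT 10",
    QueryExecutionContext={"Database": "my_datalake_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results-bucket/"},
)["QueryExecutionId"]

# Poll until the query finishes, then print the result rows.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)[
        "QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:  # the first row holds the column headers
        print([col.get("VarCharValue") for col in row["Data"]])
```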
Visualize your data in Data Lake with AWS Athena and AWS Quicksight Hands-on ... | Amazon Web Services
Level 200: Visualize Your Data in Data Lake with AWS Athena and AWS Quicksight
Nowadays, enterprises are building data lakes that store large volumes of structured and unstructured data for analysis. But building the required data models and infrastructure takes a lot of time. How to run fast data queries without servers and databases is the next big question for every enterprise.
In this workshop, eCloudvalley, the first and only Premier Consulting Partner in GCR, will demonstrate how to use serverless architecture to visualize your data using Amazon Athena and Amazon QuickSight.
You can easily query and visualize the data in your S3 buckets and get business insights by combining these two services. You can also build business reports with other tools such as AWS IoT and Amazon Kinesis Firehose.
Reason to Attend:
Learn how to quickly query thousands of data objects on S3 with the serverless Amazon Athena
Learn how to use Amazon QuickSight to retrieve information from your database quickly and create detailed reports
The AWS Big Data services are inherently built to run at scale. In this session, you will learn how to develop an enterprise-scale big data application using AWS services such as Amazon EMR, Amazon Redshift & Redshift Spectrum, Amazon Athena, Amazon Elasticsearch Service, Amazon Kinesis, Amazon QuickSight, and AWS Glue. This session will also cover different architectural patterns and customer use cases.
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20... | Amazon Web Services
Most companies are overrun with data, yet they lack critical insights to make timely and accurate business decisions. They are missing the opportunity to combine large amounts of new, unstructured big data that resides outside their data warehouse with trusted, structured data inside their data warehouse. In this session, we discuss the most common use cases with Amazon Redshift, and we take an in-depth look at how modern data warehousing blends and analyzes all your data to give you deeper insights to run your business. Intuit joins us to share their experience modernizing their analytics pipeline.
What's New with Amazon Redshift ft. Dow Jones (ANT350-R) - AWS re:Invent 2018 | Amazon Web Services
Learn about the latest and hottest features of Amazon Redshift. We’ll dive deep into the architecture and inner workings of Amazon Redshift and discuss how the recent availability, performance, and manageability improvements we’ve made can significantly enhance your user experience. We’ll also share a glimpse of what we are working on and our plans for the future. Dow Jones will join us to share how they leverage a data lake powered by Redshift, Redshift Spectrum, and Athena to get fast time to insights.
by Rajeev Srinivasan, Sr. Solutions Architect and Gautam Srinivasan, Solutions Architect, AWS
While a data lake can support completely unstructured data, getting performant analytics at scale requires some data preparation. We'll look at how to use Amazon Kinesis, AWS Glue, and Amazon EMR to make raw data ready for high-performance analytics.
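To make the ingestion step concrete, here is a minimal boto3 sketch that pushes one raw record into a Kinesis stream; the stream name and event shape are hypothetical, and downstream Glue or EMR jobs would pick these records up for preparation.

```python
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

# Stream name and event shape are hypothetical placeholders.
event = {"device_id": "sensor-42", "temperature": 21.7}
kinesis.put_record(
    StreamName="raw-events",
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["device_id"],  # same key -> same shard, preserving per-device order
)
```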
Build a High-Performance, Cloud-Native, Open-Source Platform on AWS & Save Mi... | Amazon Web Services
As a leader in agriculture technologies and services, Bayer is using technologies such as unmanned aerial vehicles (UAV), satellite imagery, and sensor data from multiple sources to generate real time insights. Over 300 data sources are ingested into their open source HPC geospatial platform to generate on average 100M API calls per day. The platform is used to provide real-time visualization and computational analysis to Bayer’s internal research community, partners, and is licensed to third-party applications to provide insights relevant to high-yield production of crops. In this session, Mendez-Costabel discusses how Bayer transitioned from on-premises packaged software architecture to open-source software and cloud services from AWS to build a modern, scalable, high-performance, open-source app on AWS. Learn about the open-source application architecture and AWS services used. Learn how the computing environment has changed the way that Bayer is performing R&D projects, and how the move to a modern architecture has enabled Bayer’s customers to gain insights that are transforming their businesses.
The introductory morning session will discuss big data challenges and provide an overview of the AWS Big Data Platform. We will also cover:
• How AWS customers leverage the platform to manage massive volumes of data from a variety of sources while containing costs.
• Reference architectures for popular use cases, including: connected devices (IoT), log streaming, real-time intelligence, and analytics.
• The AWS big data portfolio of services, including Amazon S3, Kinesis, DynamoDB, Elastic MapReduce (EMR) and Redshift.
• The latest relational database engine, Amazon Aurora - a MySQL-compatible, highly available relational database engine that provides up to five times better performance than MySQL at one-tenth the price of a commercial database.
• Amazon Machine Learning – the latest big data service from AWS provides visualization tools and wizards that guide you through the process of creating machine learning (ML) models without having to learn complex ML algorithms and technology.
ABD206 - Building Visualizations and Dashboards with Amazon QuickSight | Amazon Web Services
Just as a picture is worth a thousand words, a visual is worth a thousand data points. A key aspect of our ability to gain insights from our data is to look for patterns, and these patterns are often not evident when we simply look at data in tables. The right visualization will help you gain a deeper understanding in a much quicker timeframe. In this session, we will show you how to quickly and easily visualize your data using Amazon QuickSight. We will show you how you can connect to data sources, generate custom metrics and calculations, create comprehensive business dashboards with various chart types, and set up filters and drill-downs to slice and dice the data.
A data lake allows an organisation to store all of its data, structured and unstructured, in one centralised repository. Since data can be stored as-is, there is no need to convert it to a predefined schema, and you no longer need to know what questions you want to ask of your data beforehand. In this session we will explore the architecture of a data lake on AWS and cover topics such as storage, processing, and security.
Speakers:
Tom McMeekin, Associate Solutions Architect, Amazon Web Services
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30... | Amazon Web Services
In this session, we show you how to understand what data you have, how to drive insights, and how to make predictions using purpose-built AWS services. Learn about the common pitfalls of building data lakes and discover how to successfully drive analytics and insights from your data. Also learn how services such as Amazon S3, AWS Glue, Amazon Redshift, Amazon Athena, Amazon EMR, Amazon Kinesis, and Amazon ML services work together to build a successful data lake for various roles, including data scientists and business users.
AWS delivers an integrated suite of services that provide everything needed to quickly and easily build and manage a data lake for analytics. AWS-powered data lakes can handle the scale, agility, and flexibility required to combine different types of data and analytics approaches to gain deeper insights, in ways that traditional data silos and data warehouses cannot. In this session, we will show you how you can quickly build a data lake on AWS that ingests, catalogs, and processes incoming data and makes it ready for analysis. Using a live demo, we demonstrate the capabilities of AWS-provided analytics services such as AWS Glue, Amazon Athena, and Amazon EMR, and how to build a data lake on AWS step by step.
Build Data Engineering Platforms with Amazon EMR (ANT204) - AWS re:Invent 2018Amazon Web Services
Amazon EMR provides a flexible range of service customization options, enabling customers to use it as a building block for their data platforms. In this session, AWS customers Salesforce.com and Vanguard discuss in detail how they use Amazon EMR to build a self-service, secure, and auditable data engineering platform. Customers who want to optimize their design and configurations should attend this session to learn best practices from customer experts. Topics include achieving cost-efficient scale, using notebooks, processing streaming data, rapid prototyping of applications and data pipelines, architecting for both transient and persistent clusters, setting up advanced security and authorization controls, and enabling easy self service for users.
Using AWS Purpose-Built Databases to Modernize your Applications | Amazon Web Services
As you look to modernize your applications, you will need to consider your database options to meet the new application requirements. AWS offers a series of purpose-built databases that cover relational, key-value, document, graph, and cache use cases to help you deliver new and enhanced functionality. In this webinar session, we share the different modern application architectures and how to combine different database services to meet your requirements. Understand how to modernize your relational databases through easy upgrades with Amazon Relational Database Service, and learn how to migrate from one database to another with AWS Database Migration Service and AWS Schema Conversion Tool.
Speaker:
Blair Layton, Business Development Manager, Amazon Web Services
In this session, we discuss the evolution of database and analytics services in AWS, the new database and analytics services and features we launched this year, and our vision for continued innovation in this space. We are witnessing an unprecedented growth in the amount of data collected, in many different forms. Storage, management, and analysis of this data require database services that scale and perform in ways not possible before. AWS offers a collection of database and other data services—including Amazon Aurora, Amazon DynamoDB, Amazon RDS, Amazon Redshift, Amazon ElastiCache, Amazon Kinesis, and Amazon EMR—to process, store, manage, and analyze data. In this session, we provide an overview of AWS database and analytics services and discuss how customers are using these services today.
As the volume and types of data continue to grow, customers often have valuable data that is not easily discoverable or available for analytics. A common challenge for data engineering teams is architecting a data lake that can cater to the needs of diverse users - from developers to business analysts to data scientists. In this session, dive deep into building a data lake using Amazon S3, Amazon Kinesis, Amazon Athena, and AWS Glue. Learn how AWS Glue crawlers can automatically discover your data, extracting and cataloguing relevant metadata to reduce the work of preparing your data for downstream consumers.
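A sketch of the crawler flow just described, using boto3; the crawler name, IAM role, database, and S3 path are all hypothetical placeholders.

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")

# Create a crawler that scans an S3 prefix and writes table definitions
# into the Glue Data Catalog. Names, role, and path are hypothetical.
glue.create_crawler(
    Name="datalake-crawler",
    Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    DatabaseName="my_datalake_db",
    Targets={"S3Targets": [{"Path": "s3://my-datalake-bucket/raw/"}]},
)

# Run it on demand; discovered tables become queryable from Athena,
# EMR, or Redshift Spectrum.
glue.start_crawler(Name="datalake-crawler")
```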
Technology Trends in Data Processing - DAT311 - re:Invent 2017 | Amazon Web Services
In this talk, Anurag Gupta, VP for AWS Analytic and Transactional Database Services, talks about some of the key trends we see in data processing and how they shape the services we offer at AWS. Specific trends include the rise of machine-generated logs as the dominant source of data, the move toward serverless, API-centric computing, and the growing need for local access to data from users around the world.
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta... | Amazon Web Services
• Overview of database services to elevate your applications, analytic services to engage your data, and migration services to help you reach database freedom.
• Survey of how Canadian and other organizations are using the cloud to make data scalable, reliable, and secure.
AWS re:Invent is Amazon Web Services' annual global conference. Every year we present more than 1,000 technical sessions, workshops, and hackathons covering key AWS topics and showcasing the technologies AWS develops and introduces. In this webinar we recap the announcements presented in Las Vegas and walk through use cases for the main services introduced.
Building with Purpose-Built Databases: Match Your Workload to the Right Database | AWS Summits
Learn how to evaluate a new workload for the best managed database option based on specific application needs related to data shape, data size at limit, computational requirements, programmability, throughput and latency needs, and more. This session explains the ideal use cases for relational and non-relational database services, including Amazon Aurora, Amazon DynamoDB, Amazon ElastiCache for Redis, Amazon Neptune, and Amazon Redshift.
Laura Caicedo, Solutions Architect, Amazon Web Services
A growing number of organizations today need to deploy and operate Internet-scale applications, which requires Internet-scale database services. Join us to learn about the broad and deep AWS portfolio of database services, with solutions that provide the scalability, flexibility, resilience, security, and regulatory compliance to help enable you to achieve your mission, no matter how small or how large your needs might be. You’ll learn about how to manage data to meet a wide range of needs – from different data sizes, to varieties of data types, to differing requirements for speed and complexity. In this session you will also learn how you can achieve both cost savings and increase agility through AWS innovation that helps you move beyond legacy commercial databases.
Understanding AWS Managed Database and Analytics Services | AWS Public Sector... | Amazon Web Services
The world is creating more data in more ways than ever before. The average internet user in 2017 generates 1.5GB of data per day, with the rate doubling every 18 months. A single autonomous vehicle can generate 4TB per day. Each smart manufacturing plant generates 1PB per day. Storing, managing, and analyzing this data requires integrated database and analytic services that provide reliability and security at scale. AWS offers a range of managed data services that let customers focus on making data useful, including Amazon Aurora, RDS, DynamoDB, Redshift, Spectrum, ElastiCache, Kinesis, EMR, Elasticsearch Service, and Glue. In this session, we discuss these services, share our vision for innovation, and show how our customers use these services today. Learn More: https://aws.amazon.com/government-education/
Amazon Relational Database Service (RDS) makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient, resizable capacity while automating time-consuming tasks such as hardware provisioning, database setup, patching, and backups. There are multiple database engines to choose from, including Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server. Amazon Aurora is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. It is designed to be compatible with MySQL and PostgreSQL so that existing applications and tools can run without modification.
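A minimal boto3 sketch of what "easy to set up" looks like in practice, provisioning a small MySQL instance with automated backups; the identifier, instance class, and credentials are hypothetical placeholders.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Provision a small MySQL instance; identifier and credentials are placeholders.
rds.create_db_instance(
    DBInstanceIdentifier="demo-mysql",
    DBInstanceClass="db.t3.micro",
    Engine="mysql",
    AllocatedStorage=20,          # GiB
    MasterUsername="admin",
    MasterUserPassword="change-me-now",
    BackupRetentionPeriod=7,      # automated backups, one of the tasks RDS handles
)

# Wait until the instance is available, then read its endpoint.
waiter = rds.get_waiter("db_instance_available")
waiter.wait(DBInstanceIdentifier="demo-mysql")
desc = rds.describe_db_instances(DBInstanceIdentifier="demo-mysql")
print(desc["DBInstances"][0]["Endpoint"]["Address"])
```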
Build Data Lakes and Analytics on AWS: Patterns & Best Practices | Amazon Web Services
With over 90% of today’s data generated in the last two years, the rate of data growth shows no sign of slowing down. In this session, we step through the challenges and best practices for capturing data, understanding what data you own, driving insights, and predicting the future using AWS services. We frame the session and demonstrations around common pitfalls of building data lakes and how to successfully drive analytics and insights from data. We also discuss the architecture patterns that bring together key AWS services, including Amazon S3, AWS Glue, Amazon Athena, Amazon Kinesis, and Amazon Machine Learning. Discover the real-world application of data lakes for roles including data scientists and business users.
Stephen Moon, Sr. Solutions Architect, Amazon Web Services
James Juniper, Solution Architect for the Geo-Community Cloud, Natural Resources Canada
Building low latency apps with a serverless architecture and in-memory data I... | AWS Germany
In-memory data stores such as Amazon ElastiCache for Redis enable applications with response times in microseconds. Using Aurora, DynamoDB, DAX, Lambda, and ElastiCache, we explored how to design and deploy high-performance applications. Learn more here: https://aws.amazon.com/products/databases/
Similar to Building with Purpose-Built Databases: Match Your Workload to the Right Database (20)
How to build Forecasting services using ML and deep learning algorithms... | Amazon Web Services
Forecasting is an important process for many companies and is used in many areas to try to accurately predict the growth and distribution of a product, the resources needed on production lines, financial presentations, and much more. Amazon uses advanced forecasting techniques, and some of these services have been made available to all AWS customers.
In this session we show how to pre-process data that contains a temporal component and then use an algorithm that, based on the type of data analyzed, produces an accurate forecast.
Big Data for Startups: how to build Big Data applications in serverless mode... | Amazon Web Services
The variety and amount of data created every day is accelerating ever faster and represents a unique opportunity to innovate and create new startups.
However, managing large amounts of data can seem complex: building large-scale Big Data clusters looks like an investment accessible only to established companies. But the elasticity of the cloud and, in particular, serverless services let us break through these limits.
We will see how to develop Big Data applications quickly, without worrying about infrastructure, dedicating all our resources to developing the ideas behind innovative products.
You can now use Amazon Elastic Kubernetes Service (EKS) to run Kubernetes pods on AWS Fargate, the serverless compute engine built for containers on AWS. This makes it easier than ever to build and run your Kubernetes applications in the AWS cloud. In this session we present the main features of the service and how to deploy your application in a few steps.
Twenty years ago, Amazon went through a radical transformation aimed at increasing the pace of innovation. Over this period we learned how changing our approach to application development allowed us to greatly increase agility and release velocity and, ultimately, to build more reliable and scalable applications. In this session we explain how we define modern applications and how building modern apps affects not only application architecture but also organizational structure, development release pipelines, and even the operating model. We also describe common approaches to modernization, including the approach used by Amazon.com itself.
How to spend up to 90% less with containers and Spot Instances | Amazon Web Services
The use of containers keeps growing.
When designed correctly, container-based applications are very often stateless and flexible.
AWS ECS, EKS, and Kubernetes on EC2 can take advantage of Spot Instances, leading to average savings of 70% compared to On-Demand Instances. In this session we explore the characteristics of Spot Instances and how they can easily be used on AWS. We also learn how Spreaker uses Spot Instances to run applications of different kinds, in production, at a fraction of the on-demand cost!
In recent months, many customers have been asking us how to monetise Open APIs, simplify Fintech integrations, and accelerate the adoption of various Open Banking business models. Therefore, AWS and FinConecta would like to invite you to the Open Finance marketplace presentation on October 20th.
Event Agenda:
Open banking so far (short recap)
• PSD2, OB UK, OB Australia, OB LATAM, OB Israel
Intro to Open Finance marketplace
• Scope
• Features
• Tech overview and Demo
The role of the Cloud
The Future of APIs
• Complying with regulation
• Monetizing data / APIs
• Business models
• Time to market
One platform for all: a Strategic approach
Q&A
Make your startup's market offering unique with Machine Learning services... | Amazon Web Services
To create value and build a differentiated, recognizable offering, successful startups know how to combine established technologies with innovative components built ad hoc.
AWS provides ready-to-use services and, at the same time, lets you customize and build the differentiating elements of your own offering.
Focusing on Machine Learning technologies, we will see how to select the artificial intelligence services offered by AWS and, including through a demo, how to build custom Machine Learning models using SageMaker Studio.
OpsWorks Configuration Management: automate the management and deployment of... | Amazon Web Services
With the traditional approach to IT, implementing DevOps techniques was difficult for many years; they often involved manual activities, occasionally leading to application downtime and interrupting user operations. With the advent of the cloud, DevOps techniques are now within everyone's reach at low cost for any kind of workload, guaranteeing greater system reliability and yielding significant improvements in business continuity.
AWS provides AWS OpsWorks as a Configuration Management tool that aims to automate and simplify the management and deployment of EC2 instances by means of Chef and Puppet workloads.
Learn how to use AWS OpsWorks to guarantee the reliability of your application running on EC2 instances.
Microsoft Active Directory on AWS to support your Windows Workloads | Amazon Web Services
Want to know your options for running Microsoft Active Directory on AWS? When moving Microsoft workloads to AWS, it is important to consider how to deploy Microsoft Active Directory to support group policy management, authentication, and authorization. In this session, we discuss options for deploying Microsoft Active Directory on AWS, including AWS Directory Service for Microsoft Active Directory and deploying Active Directory on Windows on Amazon Elastic Compute Cloud (Amazon EC2). We cover topics such as integrating your on-premises Microsoft Active Directory environment into the cloud and using SaaS applications, such as Office 365, with AWS Single Sign-On.
From facial recognition to detecting fraud or manufacturing defects, image and video analysis powered by artificial intelligence techniques is evolving and being refined at a rapid pace. In this webinar we explore what AWS services make possible when applying state-of-the-art computer vision techniques to real-world scenarios.
Amazon Web Services and VMware are holding a free virtual event next Wednesday, October 14th, from 12:00 to 13:00 dedicated to VMware Cloud™ on AWS, the on-demand service that lets you run applications in cloud environments based on VMware vSphere® and access a wide range of AWS services, taking full advantage of the AWS cloud while protecting existing VMware investments.
Build your first serverless ledger-based app with QLDB and NodeJS | Amazon Web Services
Many companies today build applications with ledger-style functionality, for example to verify the history of credits and debits in banking transactions or to track the supply chain flow of their products.
At the heart of these solutions are ledger databases, which provide a transparent, immutable, and cryptographically verifiable transaction log, but they are complex and costly tools to manage.
Amazon QLDB removes the need to build complex custom systems by providing a fully managed, serverless ledger database.
In this session we will see how to build a complete serverless application that uses QLDB's capabilities.
With the rise of microservice architectures and rich mobile and web applications, APIs are more important than ever for delivering a great experience to end users. In this session we learn how to tackle modern API design challenges with GraphQL, an open-source API query language used by Facebook, Amazon, and others, and how to use AWS AppSync, a managed serverless GraphQL service on AWS. We will dig into several scenarios, understanding how AppSync can help solve these use cases by building modern APIs with real-time and offline data update capabilities.
We will also learn how Sky Italia uses AWS AppSync to deliver real-time sports updates to users of its web portal.
Oracle Databases and VMware Cloud™ on AWS: myths to debunk | Amazon Web Services
Many organizations take advantage of the cloud by migrating their Oracle workloads, securing significant gains in agility and cost efficiency.
Migrating these workloads can create complexity during application modernization and refactoring, and performance risks can be introduced when moving applications out of on-premises data centers.
In these slides, AWS and VMware experts present simple, practical tips to ease and simplify the migration of Oracle workloads while accelerating the transformation to the cloud; they dig into the architecture and show how to take full advantage of VMware Cloud™ on AWS.
Amazon Elastic Container Service (Amazon ECS) is a highly scalable container management service that simplifies running Docker containers through an orchestration layer controlling deployment and lifecycle. In this session we present the main characteristics of the service, reference architectures for different workloads, and the simple steps needed to quickly migrate one or more of your containers.
2. Managed services transform operations
Operating databases in the old world, you own the whole stack: power, HVAC, and networking; rack and stack; server maintenance; OS installation and patches; DB software installs and patches; database backups; high availability; scaling; and app optimization.
Operating databases in AWS, the managed service takes care of power, HVAC, and networking; rack and stack; server maintenance; OS installation and patches; DB software installs and patches; database backups; high availability; and scaling, leaving you to focus on app optimization.
3. A one-size-fits-all database doesn’t fit anyone
Modern Applications Need Purpose-Built Databases
Users: 1M+
Data volume: TB–PB–EB
Locality: Global
Performance: Milliseconds–microseconds
Request Rate: Millions
Access: Mobile, IoT, devices
Scale: Up-out-in
Economics: Pay as you go
Developer Access: Instant API access
Data models: Relational | Key-value | Document | In-memory | Graph | Search
4. AWS purpose-built strategy
The right tool for the right job
Relational: Amazon Aurora | Amazon RDS
Non-relational:
• Key-value & document: Amazon DynamoDB
• In-memory: Amazon ElastiCache
• Graph: Amazon Neptune
5. Data models and common use cases
Relational: referential integrity, ACID transactions, schema-on-write. Use cases: ERP, medical records, CRM, finance. Services: Amazon Aurora, Amazon RDS, Amazon Redshift.
Key-value: low-latency key look-ups with high throughput and fast ingestion of data. Use cases: real-time bidding, shopping cart, IoT device tracking. Service: Amazon DynamoDB.
Document: indexing and storing documents with support for query on any attribute. Use cases: content management, personalization, mobile. Service: Amazon DynamoDB.
In-memory: microsecond latency, key-based queries, and specialized data structures. Use cases: leaderboards, real-time analytics, caching. Service: Amazon ElastiCache for Redis & Memcached.
Graph: creating and navigating relations between data easily and quickly. Use cases: fraud detection, social networking, recommendation engine. Service: Amazon Neptune.
Search: indexing and searching semistructured logs and data. Use cases: product catalog, help/FAQs, full-text search. Service: Amazon Elasticsearch Service.
6. AWS databases and analytics
Broad and deep portfolio, purpose-built for builders
Relational Databases: RDS, Aurora
Non-Relational Databases: DynamoDB, ElastiCache (Redis, Memcached), Neptune (Graph)
Data Lake: S3/Glacier, Glue (ETL & Data Catalog)
Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams
Analytics (DW | Big Data Processing | Interactive | Real-time): Redshift, EMR, Athena, Kinesis Analytics, Elasticsearch Service
Business Intelligence & Machine Learning: QuickSight, SageMaker, Comprehend
10. Large relational databases with Amazon Aurora
Scale-out, distributed, multi-tenant architecture
Fully compatible with PostgreSQL and MySQL, with 3x–5x the throughput
Storage volume striped across hundreds of storage nodes distributed over 3 different Availability Zones
Six copies of data on SSD, two copies in each Availability Zone, to protect against AZ+1 failures
Continuous backup to Amazon S3 (built for 99.999999999% durability)
Diagram: one master with replicas spread across Availability Zones 1, 2, and 3
12. Amazon DynamoDB
Fully managed nonrelational database for any scale
Secure: encryption at rest and in transit; fine-grained access control; PCI, HIPAA, and FIPS 140-2 eligible
High performance: fast, consistent performance; virtually unlimited throughput; virtually unlimited storage
Fully managed: maintenance-free, serverless, auto scaling, backup and restore
Global Tables: high-performance, globally distributed applications with multi-region redundancy and resiliency; easy to set up, with no application rewrites required
13. Managed services for open source software
Redis, Memcached, Elasticsearch, Apache Hadoop, etc.
Amazon ElastiCache: fully managed (AWS manages all hardware and software setup, configuration, and monitoring); extreme performance (in-memory data store and cache for sub-millisecond response times); easily scalable (non-disruptive scaling up and down to meet changing demands)
Amazon Elasticsearch Service: open and secure, with direct access to open-source APIs and secure access with VPC
Amazon EMR: Apache Hadoop ecosystem with 19 open-source frameworks and low costs with S3 storage and Spot
14. Highly connected data best represented in a graph
Relational model
Foreign keys used to represent relationships
Queries can involve nesting & complex joins
Performance can degrade as datasets grow
Graph model
Relationships are first-class citizens
Write queries that navigate the graph
Results returned quickly, even on large datasets
15. Amazon Neptune
Fully managed graph database
Fast & scalable: store billions of relationships; query with millisecond latency
Reliable: six replicas of your data across three AZs, with full backup and restore
Flexible: build powerful queries with Gremlin and SPARQL
Open standards: supports Apache TinkerPop & W3C RDF graph models
18. Amazon Redshift Spectrum
Extend the data warehouse to exabytes of data in the S3 data lake
• Exabyte-scale Redshift SQL queries against Amazon S3
• Join data across Redshift and S3
• Scale compute and storage separately
• Stable query performance and unlimited concurrency
• CSV, ORC, Grok, Avro, & Parquet data formats
• Pay only for the amount of data scanned
Diagram: the Redshift Spectrum query engine sits between Redshift data and the S3 data lake
19. Amazon Elasticsearch Service
Managed service to deploy, secure, operate, and scale Elasticsearch
Fully managed: deploy production-ready clusters in minutes
Open: direct access to Amazon ES open-source APIs; supports Logstash and Kibana
Secure: secure access with VPC to keep all traffic within the AWS network
Available: zone awareness replicates data between two AZs; automatically monitors & replaces failed nodes
Customers use Amazon ES for log analytics, full-text search & application monitoring
22. Put machine learning in the hands of every developer
ML frameworks
ML services: Amazon SageMaker
AI services: Rekognition, Rekognition Video, Polly, Transcribe, Translate, Comprehend, Lex
(Vision | Speech | Language | Chatbots & Contact Centers)
The Cloud is a fully managed environment
Using managed services frees you to focus on your mission instead of the minutiae of operational details
AWS customers use a broad and deep selection of fully managed services to support work at any scale
Using a single database for every purpose doesn’t work in today’s world of large scale
Developers choose relational databases, nonrelational databases, analytical databases, machine learning, visualization and other tools to do the job
AWS customers need flexibility, scale, and performance
Application requirements are changing, and a one-size-fits-all approach of using a relational database as the only data store for your application no longer works. An increasing number of developers now choose relational and nonrelational databases that are purpose-built to meet their application’s specific needs, like storing key-value pairs and documents.
If you are building an online retail site, you might choose a relational database to help ensure financial transactions related to an order are 100% correct.
If you want a shopping cart that can provide consistent single-digit-millisecond latency with virtually limitless scale to handle the likes of Amazon Prime Day, you can choose a key-value database.
If you want to show more personalized recommendations like accessories that friends of your users purchased, you can choose a graph database.
The characteristics of cloud applications are driving why different database services exist today. Developers are always looking for the right tool for the job, and because these services are so easy to access, developers enjoy rich development flexibility without sacrificing scale and performance.
No one truck is right for every job, which is why there are tractor-trailers, pickup trucks, earth movers, and delivery trucks
No one data tool is right for every job, which is why there are AWS services for both relational and nonrelational data
This is one way to think about the different database choices developers have, as they think about using the right tool for the job often considering speed, scale, & programmability.
If I were standing here today saying we have one database that can literally do everything, it would be like saying you can use one vehicle as a utility vehicle, earth mover, delivery truck, and long-haul cargo mover that is equally efficient in every aspect of the job it's being used for.
Relational data is important for many customers
A lot of new development uses nonrelational data
The key is using the right tool for the job
AWS offers services across the full range of data tools
Business Intelligence and Machine Learning tools help make sense of data
Database services are the right tools for relational, nonrelational, and analytic jobs
The data lake combines data tools with scalable storage and data governance
Data movement tools let you get data between different formats and places
Many customers are still trapped using old-guard databases such as Oracle, Microsoft SQL Server, or IBM Db2
These databases are expensive, with proprietary lock-in and punitive licensing
Old-guard vendors will conduct audits (“you’ve got mail – audit coming!”) whenever they want to force extra payments
AWS helps customers escape from all of these limitations
The relational database world has been an unpleasant place for most customers. These customers have had to deal with old-guard database providers that are expensive, proprietary, have high lock-in, and impose punitive licensing terms. And, you occasionally get an email that says you’re being audited!
Open Source relational databases are widely-used and well supported
AWS customers want the low cost and community support of Open Source and the high performance and reliability of commercial databases
You can get fully managed Open Source with performance and reliability with Amazon RDS and Amazon Aurora
However, getting the same performance on open source databases as you get on commercial-grade databases is difficult. We have done this at Amazon.com, but it has required a lot of tuning. Customers that are moving to open source databases have asked us for the performance of commercial-grade databases with the pricing, freedom, and flexibility of open source databases. That's why we spent a few years building Amazon Aurora.
RDS is fully managed, automating patching, backup, high availability, encryption, and security.
With up to 16 TB per database instance, you can run hundreds or thousands of DB instances without large staff commitments.
You can use both Open Source (MySQL, MariaDB, PostgreSQL) and Commercial (Oracle, Microsoft SQL Server) databases
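As a rough sketch of what "fully managed" looks like in practice, here is how you might provision a managed PostgreSQL instance with boto3; the identifier, instance class, and sizes are illustrative placeholders, not prescriptions:

```python
import boto3

rds = boto3.client("rds")

# Provision a managed PostgreSQL instance; names and sizes are illustrative.
rds.create_db_instance(
    DBInstanceIdentifier="orders-db",
    Engine="postgres",
    DBInstanceClass="db.m5.large",
    AllocatedStorage=100,           # GiB
    MasterUsername="dbadmin",
    MasterUserPassword="replace-me",
    MultiAZ=True,                   # managed high availability
    StorageEncrypted=True,          # managed encryption at rest
)
```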
Aurora combines Open Source interfaces of MySQL and PostgreSQL with enterprise-class scalability, performance, and reliability
All data is stored in six copies across three independent physical facilities (we call these Availability Zones)
Aurora is high performance, with 3x – 5x the throughput of MySQL or PostgreSQL
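Because Aurora keeps the MySQL and PostgreSQL wire protocols, existing client libraries work unchanged. A minimal sketch with PyMySQL, assuming a hypothetical cluster endpoint, credentials, and an orders table:

```python
import pymysql

# Aurora is wire-compatible with MySQL, so the stock client just works;
# the cluster endpoint and schema below are placeholders.
conn = pymysql.connect(
    host="my-aurora.cluster-xxxx.us-east-1.rds.amazonaws.com",
    user="dbadmin", password="replace-me", database="orders", port=3306)

with conn.cursor() as cur:
    # Parameterized query against an assumed orders table.
    cur.execute("SELECT id, total FROM orders WHERE status = %s", ("shipped",))
    for row in cur.fetchall():
        print(row)
conn.close()
```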
DMS enables customers to copy and move databases without downtime
No lock-in at AWS: you can move data to the Cloud, off of the Cloud, and between Clouds
AWS customers have used DMS to migrate over 90,000 databases
In addition to offering a broad portfolio of purpose-built database services, AWS makes it easy for you to migrate your database to the cloud. The AWS Database Migration Service (DMS) helps customers securely migrate their databases to AWS with minimal or no downtime. The source database remains fully operational during the migration, causing no interruption to applications that rely on that database. DMS can migrate your data from most widely used commercial and open-source databases. DMS supports migrations such as Oracle to Oracle migrations, as well as migrations between different database platforms, such as Oracle to Amazon Aurora. DMS offers six months of free usage for migrations to Amazon Aurora, Amazon DynamoDB, and Amazon Redshift. For large databases, where terabytes of data need to be migrated, you can use AWS Snowball, a petabyte-scale data transport service that uses secure appliances to transfer data into and out of AWS.
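A hedged sketch of kicking off a DMS task with boto3; the ARNs are placeholders you would get from previously created endpoints and a replication instance, and the table mapping simply includes every table:

```python
import json

import boto3

dms = boto3.client("dms")

# Selection rule that includes every schema and table.
mappings = {"rules": [{
    "rule-type": "selection", "rule-id": "1", "rule-name": "include-all",
    "object-locator": {"schema-name": "%", "table-name": "%"},
    "rule-action": "include",
}]}

# ARNs below are placeholders for resources created beforehand.
dms.create_replication_task(
    ReplicationTaskIdentifier="oracle-to-aurora",
    SourceEndpointArn="arn:aws:dms:...:endpoint:SOURCE",
    TargetEndpointArn="arn:aws:dms:...:endpoint:TARGET",
    ReplicationInstanceArn="arn:aws:dms:...:rep:INSTANCE",
    MigrationType="full-load-and-cdc",  # initial copy plus ongoing replication
    TableMappings=json.dumps(mappings),
)
```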
Amazon.com runs our own business largely with DynamoDB
DynamoDB is fully managed, with consistent high performance at any scale, with some customers storing over 1 PB in a single DynamoDB table
Global Tables enable true active-active databases across the world
Amazon DynamoDB is a fully managed NoSQL database service running in the AWS Cloud. The complexity of running a massively scalable, distributed NoSQL database is managed by the service itself, allowing software developers to focus on building applications rather than managing infrastructure. NoSQL databases are designed for scale, but their architectures are sophisticated, and there can be significant operational overhead in running a large NoSQL cluster. Instead of having to become experts in advanced distributed computing concepts, developers need only learn DynamoDB's straightforward API using the SDK for the programming language of their choice. In addition to being easy to use, DynamoDB is also cost effective: you pay for the storage you are consuming and the I/O throughput you have provisioned. It is designed to scale elastically while maintaining high performance. When the storage and throughput requirements of an application are low, only a small amount of capacity needs to be provisioned in the DynamoDB service. As the number of users of an application grows and the required I/O throughput increases, additional capacity can be provisioned on the fly. This enables an application to seamlessly grow to support millions of users making thousands of concurrent requests to the database every second. Finally, DynamoDB is secure, with support for end-to-end encryption and fine-grained access control.
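To make the shopping-cart use case concrete, a minimal boto3 sketch; the Carts table, its key schema, and the item attributes are assumptions for illustration:

```python
import boto3

# Assumed table: "Carts" with partition key customer_id and sort key sku.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Carts")

# Write a cart line item; DynamoDB serves such writes at any scale.
table.put_item(Item={
    "customer_id": "c-123",
    "sku": "B000123",
    "quantity": 2,
})

# Read it back with a key-based GetItem.
resp = table.get_item(Key={"customer_id": "c-123", "sku": "B000123"})
print(resp.get("Item"))
```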
AWS also has fully managed solutions for other popular Open Source packages
ElastiCache provides managed Redis and Memcached for sub-millisecond in-memory data
Amazon Elasticsearch Service provides a managed search engine, both for text search and for log analytics
Nineteen Apache Hadoop ecosystem packages are managed with Amazon EMR, including HBase, Spark, Presto, Hive, and others
While millisecond latency works for many applications, microsecond latency is required by real-time, data-intensive applications. For example, gaming leaderboards capture the scores of millions of online players every time their scores change and re-rank the players in real-time. A common solution for this is an in-memory data store where millions of data records can be written and accessed in microseconds. In-memory data stores can also function as stand-alone databases for transient data such as website user authentication tokens that expire at the end of the user session. Redis and Memcached are two popular choices for in-memory data stores. Redis is an open-source, in-memory, key-value store that offers a variety of built-in data structures such as sorted sets, lists, and geospatial data, making it faster to develop applications. Memcached is an open-source in-memory caching system that is easy to use. However, Redis and Memcached lack enterprise features such as scalability and reliability, and that's why we built Amazon ElastiCache.
Amazon ElastiCache offers Redis and Memcached as fully managed services. It automates management tasks such as hardware provisioning, software patching, setup, configuration, monitoring, and backups, making it easy to run Redis and Memcached on AWS. ElastiCache can scale out, scale in, and scale up to meet changing application demands. ElastiCache for Redis allows you to add up to five read replicas across multiple Availability Zones, enabling you to easily scale read capacity. And, if the primary read/write node fails, ElastiCache for Redis automatically promotes one of the read replicas to be the primary node, making your application more reliable. For scaling write capacity, ElastiCache for Redis lets you partition your data across multiple primary nodes and distributes write requests across these nodes. ElastiCache for Redis provides encryption at rest and encryption in transit, helping you secure your data.
Key benefits of Amazon ElastiCache include:
Redis and Memcached Compatible
With Amazon ElastiCache, you get native access to Redis or Memcached in-memory environments. This enables compatibility with your existing tools and applications.
Extreme Performance
Amazon ElastiCache works as an in-memory data store and cache to support the most demanding applications requiring sub-millisecond response times. By utilizing an end-to-end optimized stack running on customer-dedicated nodes, Amazon ElastiCache provides secure, blazing-fast performance.
Fully Managed
You no longer need to perform management tasks such as hardware provisioning, software patching, setup, configuration, monitoring, failure recovery, and backups. ElastiCache continuously monitors your clusters to keep your workloads up and running so that you can focus on higher value application development.
Easily Scalable
Amazon ElastiCache can scale-out, scale-in, and scale-up to meet fluctuating application demands. Write and memory scaling is supported with sharding. Replicas provide read scaling.
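For example, the leaderboard use case maps directly onto a Redis sorted set. A minimal sketch with the redis-py client, assuming a placeholder ElastiCache endpoint and made-up player names:

```python
import redis

# Connect to the ElastiCache for Redis primary endpoint (placeholder host).
r = redis.Redis(host="my-cluster.xxxxxx.use1.cache.amazonaws.com", port=6379)

# Record scores in a sorted set; updates and re-ranking are O(log N).
r.zadd("leaderboard", {"alice": 4200, "bob": 3100, "carol": 5300})

# Top 10 players, highest score first, with their scores.
top10 = r.zrevrange("leaderboard", 0, 9, withscores=True)
print(top10)
```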
Relational models, ironically, are not great for representing the relationships between data
Graph models are great for highly connected data, such as recommendation engines and social networks
The mathematics behind graph databases go back to the 1700s, but actually implementing them has been difficult and expensive
Now consider an app for recommendations, where someone wants to recommend organizations, entities, or sites of a certain type, in a particular city, that, for example, some of their connections also liked.
To do this, you need to put together a lot of connected datasets: you need to know the users, their connections, and their likes.
You also need to know the organizations, entities, and their attributes, such as museums, or schools, or places to eat.
In a relational model, you end up with multiple tables and multiple foreign keys; soon queries slow down and maintenance becomes difficult.
Alternatively, you can use an open-source graph database, which is hard to scale and lacks enterprise capabilities such as high availability.
Or commercial graph databases, which are expensive, often proprietary, and force you to choose between graph models.
What we want is a graph database compatible with the leading graph models and open APIs that is also fast, reliable, scalable, and cost-effective.
Amazon Neptune enables very large graph databases at low cost, with high performance and reliability
Just like the other databases, Neptune is fully managed, with AWS providing patching, backup, high availability, and high performance
Amazon Neptune is a fast, reliable, fully managed graph database. It makes it easy to build and run applications that work with highly connected datasets.
It has a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with millisecond latency.
Neptune supports the popular graph models, Property Graph and W3C's RDF
And their respective query languages, Apache TinkerPop Gremlin and SPARQL.
Neptune is fully managed, with high availability, read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across Availability Zones.
Neptune is secure, with support for encryption at rest and in transit.
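As an illustration of a recommendation-style Gremlin traversal against Neptune, a sketch using the gremlinpython client; the endpoint, vertex labels, and edge names are all assumptions:

```python
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

# Connect to the Neptune cluster endpoint (placeholder host).
conn = DriverRemoteConnection(
    "wss://my-neptune.cluster-xxxx.us-east-1.neptune.amazonaws.com:8182/gremlin",
    "g")
g = traversal().withRemote(conn)

# "Places my friends liked in Seattle" -- labels and properties are assumed.
places = (g.V().has("person", "name", "alice")
           .out("friend").out("liked")
           .has("place", "city", "Seattle")
           .values("name").dedup().toList())
print(places)
conn.close()
```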
As data sizes grow, so does the need for analytics
AWS provides services for the full range of analytics, from visualization to queries to storage to security
Customers like NETFLIX, Zillow, NASDAQ, Yelp, iRobot, and FINRA trust AWS to run their analytics workloads.
AWS Big Data and Analytics services enable customers to easily run any analytic workload (batch, ad hoc, real-time, IoT, and predictive analytics) at any scale (GB to TB to PB to EB), in a secure fashion, at the lowest possible cost. AWS provides a highly scalable, available, secure, and cost-effective data store that lets you store data in its native format and easily extract value from your data (what people call a Data Lake). This is particularly true now that many customers see much of their new data created directly in the cloud, with Amazon S3 being home to the vast majority of it. With much more operating experience and scale, and a broader set of analytics services available than anywhere else, S3 and our portfolio of Big Data & Analytics services are the clear number one choice for building your data lake and analytics solutions.
Data Lakes extend analytics to any scale, from Gigabytes to Exabytes
With a Data Lake, you can use any analytical approach, from dashboards to reporting to predictive analytics powered by machine learning
These enable customers to build cloud data lakes to analyze all their data with the broadest set of analytical approaches, including machine learning.
As a result, there are more organizations running their data lakes and analytics on AWS than anywhere else.
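For instance, querying data in place in an S3 data lake might look like the following with Athena and boto3; the database, table, and results bucket are placeholders:

```python
import time

import boto3

athena = boto3.client("athena")

# Run SQL directly against files cataloged in Glue; names are placeholders.
qid = athena.start_query_execution(
    QueryString="SELECT page, COUNT(*) AS hits FROM web_logs "
                "GROUP BY page ORDER BY hits DESC LIMIT 10",
    QueryExecutionContext={"Database": "my_data_lake"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)["QueryExecutionId"]

# Poll until the query completes, then fetch the result rows.
while True:
    state = athena.get_query_execution(
        QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
    if state not in ("QUEUED", "RUNNING"):
        break
    time.sleep(1)

rows = athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]
print(rows)
```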
Redshift includes Spectrum, a feature that enables queries against data stored in files
Spectrum allows queries of very large data sets (Petabytes and Exabytes) at a low cost
Amazon Redshift Spectrum lets you run Amazon Redshift SQL queries against exabytes of data in Amazon S3,
extending the analytic power of Amazon Redshift beyond data stored on local disks in your data warehouse.
You can query vast amounts of unstructured data in your Amazon S3 "Data Lake" without having to load or transform any data.
It uses sophisticated query optimization across thousands of nodes so results are fast, even with large data sets and complex queries.
It directly queries data in S3 using open data formats such as CSV, Grok, ORC, Parquet, RCFile, SequenceFile, TextFile, TSV, and others.
It supports the SQL syntax of Amazon Redshift so you can run sophisticated queries using the same BI tools you use today.
You can also run queries that span data stored locally in Amazon Redshift and your full data sets in Amazon S3.
You pay only for the queries you run, with S3 rates for data storage and Amazon Redshift instance rates for the clusters used.
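A sketch of such a spanning query from Python via psycopg2, assuming a placeholder cluster endpoint and an external schema named spectrum already defined over files in S3:

```python
import psycopg2

# Connect to the Redshift cluster endpoint (placeholder host and credentials).
conn = psycopg2.connect(
    host="my-cluster.xxxx.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="dev", user="awsuser", password="replace-me")
cur = conn.cursor()

# Join hot data stored locally in Redshift with cold data in S3;
# "spectrum.clickstream" is assumed to be an external table over S3 files.
cur.execute("""
    SELECT c.customer_id, COUNT(*) AS clicks
    FROM spectrum.clickstream s
    JOIN customers c ON c.customer_id = s.customer_id
    GROUP BY c.customer_id
    ORDER BY clicks DESC
    LIMIT 10;
""")
print(cur.fetchall())
conn.close()
```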
Elasticsearch is an Open Source search engine that is popular, but hard to install and maintain
Amazon Elasticsearch service is fully managed, making it easy to deploy, secure, operate and scale Elasticsearch
Customers use Elasticsearch both for text search and for log analytics
Amazon Elasticsearch Service makes it easy to deploy, secure, operate, and scale Elasticsearch
This is for log analytics, full text search, application monitoring, and more.
It is a fully managed service that delivers Elasticsearch's easy-to-use APIs and real-time analytics capabilities,
but with the availability, scalability, and security that production workloads require.
It has built-in integrations with Kibana, Logstash, and AWS so you can go from raw data to actionable insights quickly and securely.
These AWS integrations include Amazon Virtual Private Cloud (VPC), Amazon Kinesis Firehose, AWS Lambda, and Amazon CloudWatch
You get direct access to the Elasticsearch open-source API so existing Elasticsearch environments work seamlessly.
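A minimal sketch of hitting that open-source search API with the requests library; the domain endpoint and index name are placeholders, and a locked-down domain would additionally require request signing (e.g. SigV4):

```python
import requests

# Amazon ES domain endpoint and index are placeholders.
endpoint = "https://search-my-domain.us-east-1.es.amazonaws.com"

# Standard Elasticsearch query DSL: find log lines mentioning "timeout".
resp = requests.post(
    f"{endpoint}/app-logs/_search",
    json={"query": {"match": {"message": "timeout"}}, "size": 5},
)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"])
```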
Humans use data better with pictures. QuickSight makes it easy to make data understandable for everyone
QuickSight can connect to data from almost any source, from AWS services to traditional BI services off-the-Cloud, to Excel spreadsheets
QuickSight is low cost and serverless, so you only pay for what you use, as you use it
Amazon QuickSight is a fast, cloud-powered business analytics service that makes it easy to build visualizations, perform ad-hoc analysis, and quickly get business insights from your data. Using our cloud-based service you can easily connect to your data, perform advanced analysis, and create stunning visualizations and rich dashboards that can be accessed from any browser or mobile device.
Insights for everyone: QuickSight enables self-serve, decentralized analytics in your organization better than any other system out there. A business analyst can take an analysis from concept to reality without depending on data engineers. You can create and prepare datasets, build your analysis, and share and collaborate with little to no intervention from IT or data engineers.
Seamless connectivity: Connecting to a data source (especially an AWS data source) doesn't involve back-end coding or complex setups. You simply click on the data source and enter your credentials, and QuickSight will auto-discover tables that you can select from, taking the guesswork out of selecting the right table to create your datasets. Then there is Schedule Refresh: if you set up your datasets to refresh every day or every week, you are assured of the latest data when you look at your analysis.
Fast analysis: Fast interactions with your charts and graphs. Charts and graphs built on a SPICE dataset are highly responsive: you can zoom in and out, drill through, and add filters on the fly with little to no delay.
Serverless: QuickSight is completely serverless. Not only is there nothing to install or deploy for QuickSight, but in combination with S3 and Athena, you can have an end-to-end analytics solution without ever starting or managing servers.
Over 400,000 customers use AWS database and analytics services.
While the database and analytics markets have been around for a while, with many mature offerings for customers to choose from, we continue to see customers move to the cloud for a number of reasons, and our recent growth in the database market is evidence of how rapidly the landscape is changing.
Customers move to the cloud to minimize time spent managing infrastructure
Customers are choosing the cloud and migrating more and more of their workloads to it. In the next 10 to 15 years, the majority of computing is going to be done in the cloud. In the fullness of time, very few companies will want to own their own data centers, manage infrastructure whether it is compute, storage, databases or analytics.
Customers move to the cloud for performance, scale, reliability, and cost. Increasingly, new applications need to be globally distributed, support millions of users and devices, work with petabytes of data, run 24/7, and be responsive.
As customers move to the cloud and to microservice architectures, developers are increasingly the ones making technology decisions. As customers move from monolithic apps to microservice architectures with loosely coupled components and DevOps cultures, developers increasingly decide, as part of their application development lifecycle, which frameworks and components to use.
For Developers, AWS offers a number of AI services
AI services don’t require any knowledge of AI or ML
Now, at the next level up, we have a set of AI services. These are really designed for application developers who don't want to get into the weeds of how machine learning operates. They don't want to have to become deep learning experts. They don't want to have to go and label a whole bunch of data. They just want to get stuff done. And you can see right off the bat you get this broad set of capabilities available to you.
For computer vision, using services such as Amazon Rekognition, which provides image analysis and facial detection; Rekognition Video, which provides video analysis, [pathing], and face identification; speech, using Polly, which is a text-to-speech service, the same service that we use to generate the voice of Alexa; or Transcribe, which does it the other way around: it takes speech and turns it into text.
Or in terms of languages, Translate, which takes text in one language and translates it into another; or Comprehend, which looks inside the document at all of the text, takes all of the context, and then allows you to derive better insights and understanding of what that natural language text looks like, for completely unstructured information. All the way through to Amazon Lex, which is a natural language understanding and speech recognition system that a lot of customers are using to build chatbots or use in contact centers as IVR systems.
And Lex is the exact same beating heart that we use to run Alexa for the Echo and other Alexa-enabled devices. So whilst all of these services are designed to work independently and have a broad set of capabilities themselves, the real magic comes as customers start to pull them together. And they can be joined together for some common use cases.
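As a sketch of pulling a couple of these services together with boto3 (the review text is made up), you might extract sentiment and key phrases with Comprehend and then hand the same text to Translate:

```python
import boto3

comprehend = boto3.client("comprehend")
translate = boto3.client("translate")

review = "The checkout flow was slow, but support resolved my issue quickly."

# Sentiment and key phrases from unstructured text, no ML expertise needed.
sentiment = comprehend.detect_sentiment(
    Text=review, LanguageCode="en")["Sentiment"]
phrases = [p["Text"] for p in comprehend.detect_key_phrases(
    Text=review, LanguageCode="en")["KeyPhrases"]]

# Chain services together: translate the same review into Spanish.
spanish = translate.translate_text(
    Text=review, SourceLanguageCode="en",
    TargetLanguageCode="es")["TranslatedText"]

print(sentiment, phrases, spanish)
```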