With hundreds of new and sometimes disparate tools, it’s hard to keep pace. Amazon Web Services provides a broad and fully integrated portfolio of cloud computing services to help you build, secure and deploy your big data applications.
Attend this webinar to get an overview of the different big data options available in the AWS Cloud – including popular big data frameworks such as Hadoop, Spark, NoSQL databases, and more. Learn about ideal use cases, cases to avoid, performance, interfaces, and more. Finally, learn how you can build valuable applications with a real-life example.
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...Amazon Web Services
The world is producing an ever increasing volume, velocity, and variety of big data. Consumers and businesses are demanding up-to-the-second (or even millisecond) analytics on their fast-moving data, in addition to classic batch processing. AWS delivers many technologies for solving big data problems. But what services should you use, why, when, and how? In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
Database Migration Using AWS DMS and AWS SCT (GPSCT307) - AWS re:Invent 2018Amazon Web Services
Database migrations are an important step in any journey to AWS. In this session, we show you how to get started with AWS Database Migration Service (AWS DMS) and AWS Schema Conversion Tool (AWS SCT) to quickly and securely migrate your databases to AWS. Learn how to simplify your database migrations by using this service to migrate your data to and from commercial and open-source databases. We also explain how you can perform homogenous migrations such as MySQL to MySQL, as well as heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora.
발표영상 다시보기: https://youtu.be/eQjkwhyOOmI
대규모 데이터 레이크 구성 및 관리는 복잡하고 시간이 많이 걸리는 작업입니다. AWS Lake Formation은 수일만에 안전한 데이터 레이크를 구성할 수 있는 완전 관리 서비스입니다. 본 세션에서는 데이터 수집, 분류, 정리, 변환 및 보안을 위해 AWS Lake Formation을 통해 Amazon S3, EMR, Redshift 및 Athena와 같은 분석 도구를 쉽게 구성하는 방법을 알아봅니다. (2019년 11월 서울 리전 출시)
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...Amazon Web Services
Most companies are overrun with data, yet they lack critical insights to make timely and accurate business decisions. They are missing the opportunity to combine large amounts of new, unstructured big data that resides outside their data warehouse with trusted, structured data inside their data warehouse. In this session, we discuss the most common use cases with Amazon Redshift, and we take an in-depth look at how modern data warehousing blends and analyzes all your data to give you deeper insights to run your business. Intuit joins us to share their experience modernizing their analytics pipeline.
효율적인 빅데이터 분석 및 처리를 위한 Glue, EMR 활용 - 김태현 솔루션즈 아키텍트, AWS :: AWS Summit Seoul 2019Amazon Web Services Korea
효율적인 빅데이터 분석 및 처리를 위한 Glue, EMR 활용
김태현 솔루션즈 아키텍트, AWS
AWS에서는 Big Data 분석 및 처리를 위해 분석 목적에 맞는 다양한 Big Data Framework 서비스를 지원합니다. 이 세션에서는 시간이 지날수록 증가하는 데이터의 분석 및 처리를 위해 사용되는 AWS Glue와 Amazon EMR 같은 AWS Big Data Framework의 내부구조를 살펴보고 머신러닝을 포함한 다양한 분석 및 ETL을 위해 효율적으로 사용할 수 있는 방법들을 소개합니다.
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...Amazon Web Services
The world is producing an ever increasing volume, velocity, and variety of big data. Consumers and businesses are demanding up-to-the-second (or even millisecond) analytics on their fast-moving data, in addition to classic batch processing. AWS delivers many technologies for solving big data problems. But what services should you use, why, when, and how? In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
Database Migration Using AWS DMS and AWS SCT (GPSCT307) - AWS re:Invent 2018Amazon Web Services
Database migrations are an important step in any journey to AWS. In this session, we show you how to get started with AWS Database Migration Service (AWS DMS) and AWS Schema Conversion Tool (AWS SCT) to quickly and securely migrate your databases to AWS. Learn how to simplify your database migrations by using this service to migrate your data to and from commercial and open-source databases. We also explain how you can perform homogenous migrations such as MySQL to MySQL, as well as heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora.
발표영상 다시보기: https://youtu.be/eQjkwhyOOmI
대규모 데이터 레이크 구성 및 관리는 복잡하고 시간이 많이 걸리는 작업입니다. AWS Lake Formation은 수일만에 안전한 데이터 레이크를 구성할 수 있는 완전 관리 서비스입니다. 본 세션에서는 데이터 수집, 분류, 정리, 변환 및 보안을 위해 AWS Lake Formation을 통해 Amazon S3, EMR, Redshift 및 Athena와 같은 분석 도구를 쉽게 구성하는 방법을 알아봅니다. (2019년 11월 서울 리전 출시)
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...Amazon Web Services
Most companies are overrun with data, yet they lack critical insights to make timely and accurate business decisions. They are missing the opportunity to combine large amounts of new, unstructured big data that resides outside their data warehouse with trusted, structured data inside their data warehouse. In this session, we discuss the most common use cases with Amazon Redshift, and we take an in-depth look at how modern data warehousing blends and analyzes all your data to give you deeper insights to run your business. Intuit joins us to share their experience modernizing their analytics pipeline.
효율적인 빅데이터 분석 및 처리를 위한 Glue, EMR 활용 - 김태현 솔루션즈 아키텍트, AWS :: AWS Summit Seoul 2019Amazon Web Services Korea
효율적인 빅데이터 분석 및 처리를 위한 Glue, EMR 활용
김태현 솔루션즈 아키텍트, AWS
AWS에서는 Big Data 분석 및 처리를 위해 분석 목적에 맞는 다양한 Big Data Framework 서비스를 지원합니다. 이 세션에서는 시간이 지날수록 증가하는 데이터의 분석 및 처리를 위해 사용되는 AWS Glue와 Amazon EMR 같은 AWS Big Data Framework의 내부구조를 살펴보고 머신러닝을 포함한 다양한 분석 및 ETL을 위해 효율적으로 사용할 수 있는 방법들을 소개합니다.
A closer look at the MySQL and PostgreSQL compatible relational database built for the cloud that combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. We’ll explore how Aurora uses the AWS cloud to provide high reliability, high durability, and high throughput.
Speakers:
Steve Abraham - Principal Database Specialist Solutions Architect, AWS
Peter Dachnowicz - Sr. Technical Account Manager, AWS
Amazon EMR enables fast processing of large structured or unstructured datasets, and in this presentation we'll show you how to setup an Amazon EMR job flow to analyse application logs, and perform Hive queries against it. We also review best practices around data file organisation on Amazon Simple Storage Service (S3), how clusters can be started from the AWS web console and command line, and how to monitor the status of a Map/Reduce job.
Finally we take a look at Hadoop ecosystem tools you can use with Amazon EMR and the additional features of the service.
See a recording of the webinar based on this presentation on YouTube here:
Check out the rest of the Masterclass webinars for 2015 here: http://aws.amazon.com/campaigns/emea/masterclass/
See the Journey Through the Cloud webinar series here: http://aws.amazon.com/campaigns/emea/journey/
DAT316_Report from the field on Aurora PostgreSQL PerformanceAmazon Web Services
Tatsuo Ishii from SRA OSS has done extensive testing to compare the Aurora PostgreSQL-compatible Edition with standard PostgreSQL. In this session, he will present his performance testing results, and his work on Pgpool-II with Aurora; Pgpool-II is an open source tool which provides load balancing, connection pooling, and connection management for PostgreSQL.
Amazon RDS allows you to launch an optimally configured, secure and highly available database with just a few clicks. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks, freeing you up to focus on your applications and business. We’ll discuss Amazon RDS fundamentals, learn about the six available database engines (with the seventh on the way), and examine customer success stories.
Cloud computing gives you a number of advantages, such as the ability to scale your web application or website on demand. If you have a new web application and want to use cloud computing, you might be asking yourself, "Where do I start?" Join us in this session to understand best practices for scaling your resources from zero to millions of users. We show you how to best combine different AWS services, how to make smarter decisions for architecting your application, and how to scale your infrastructure in the cloud.
How will my on-premises data migrate to the cloud? How can I make it transparent to my users? Afterwards, how will on-premises and cloud data interact? In this session you will learn about the AWS Database Migration Service (DMS) and the AWS Schema Migration Tool (SCT). You can use these tools to convert your commercial database and database warehouse to open-source engines or AWS-native services, such as Amazon Aurora and Redshift.
Speakers:
Saurabh Saxena - Principal Technical Account Manager, AWS
Chris England - Sr. Technical Account Manager, AWS
Scaling Up to Your First 10 Million Users (ARC205-R1) - AWS re:Invent 2018Amazon Web Services
Cloud computing provides a number of advantages, such as the ability to scale your web application or website on demand. If you have a new web application and want to use cloud computing, you might be asking yourself, "Where do I start?" Join us in this session for best practices on scaling your resources from one to millions of users. We show you how to best combine different AWS services, how to make smarter decisions for architecting your application, and how to scale your infrastructure in the cloud.
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...Dremio Corporation
Essentially every successful analytical DBMS in the market today makes use of column-oriented data structures. In the Hadoop ecosystem, Apache Parquet (and Apache ORC) provide similar advantages in terms of processing and storage efficiency. Apache Arrow is the in-memory counterpart to these formats and has been been embraced by over a dozen open source projects as the de facto standard for in-memory processing. In this session the PMC Chair for Apache Arrow and the PMC Chair for Apache Parquet discuss the future of column-oriented processing.
A quick tour in 16 slides of Amazon's Redshift clustered, massively parallel database.
Find out what differentiates it from the other database products Amazon has, including SimpleDB, DynamoDB and RDS (MySQL, SQL Server and Oracle).
Learn how it stores data on disk in a columnar format and how this relates to performance and interesting compression techniques.
Contrast the difference between Redshift and a MySQL instance and discover how the clustered architecture may help to dramatically reduce query time.
Mario Molina, Software Engineer
CDC systems are usually used to identify changes in data sources, capture and replicate those changes to other systems. Companies are using CDC to sync data across systems, cloud migration or even applying stream processing, among others.
In this presentation we’ll see CDC patterns, how to use it in Apache Kafka, and do a live demo!
https://www.meetup.com/Mexico-Kafka/events/277309497/
OpenSearch는 배포형 오픈 소스 검색과 분석 제품군으로 실시간 애플리케이션 모니터링, 로그 분석 및 웹 사이트 검색과 같이 다양한 사용 사례에 사용됩니다. OpenSearch는 데이터 탐색을 쉽게 도와주는 통합 시각화 도구 OpenSearch와 함께 뛰어난 확장성을 지닌 시스템을 제공하여 대량 데이터 볼륨에 빠르게 액세스 및 응답합니다. 이 세션에서는 실제 동작 구조에 대한 설명을 바탕으로 최적화를 하기 위한 방법과 운영상에 발생할 수 있는 이슈에 대해서 알아봅니다.
The webinar based on this presentation discussed strategies that you can adopt to help you save money in the AWS Cloud. From turning systems off at night, to implementing bidding strategies on the spot market, there are many ways in which you can manage and reduce your costs with AWS.
Dive into the differences between instance types; explain how you can reduce costs with Reserved Instances, the spot market and by architecting to reduce costs. We'll discuss how to combine on-demand pricing with spot pricing to perform cost effective big data analysis, and introduce customer examples to illustrate how AWS customers gain the most from AWS whilst at the same time managing their spend.
Topics include:
• Understand different cost optimisation strategies you can employ in the AWS Cloud
• Learn how to take advantage of different instance types
• Discover architectural principles behind cost optimisation in AWS
• Learn about tools to help you keep on top of your AWS spend
You can find a recording of this webinar on YouTube here: http://youtu.be/kId90Q7b6kY
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...Amazon Web Services
In this session, we show you how to set the source Oracle database environment, the target PostgreSQL environment, and parameter group configuration. We also recommended database parameters to disable foreign keys and triggers. Finally, we discuss best practices for using AWS Database Migration Service (AWS DMS) and AWS Schema Conversion Tool (AWS SCT) and show you how to choose the instance type and configure AWS DMS.
데이터 센터에서 호스팅하고 있는 기존의 SAP ERP 시스템을 어떻게 AWS로 이관 하는지 고객의 사례를 통해 설명드릴 예정입니다. 이 세션에서는 기존의 SAP ECC 버전을 그대로 Lift and Shift로 마이그레이션 하는 방법과 S/4HANA 버전으로 업그레이드 작업을 병행하면서 마이그레이션하는 방법을 모두 다룰 예정입니다. SAP ERP시스템의 서비스 다운 타임을 최소화하기 위한 대량 파일 전송 방법, 네트웍 구성 전략, 3rd Party 솔루션들을 소개해드리며 AWS 환경을 이용하여 어떻게 마이그레이션 프로젝트의 기간과 비용을 줄일 수 있는지 설명드릴 예정입니다.
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...Amazon Web Services
"Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop/Spark to AWS in order to save costs, increase availability, and improve performance. In this session, AWS customers Airbnb and Guardian Life discuss how they migrated their workload to Amazon EMR. This session focuses on key motivations to move to the cloud. It details key architectural changes and the benefits of migrating Hadoop/Spark workloads to the cloud.
"
Amazon VPC: Security at the Speed Of Light (NET313) - AWS re:Invent 2018Amazon Web Services
With Amazon Virtual Private Cloud (Amazon VPC) you can build your own virtual data center networks in seconds. Every VPC is free, but it comes with enterprise-grade capabilities that would cost millions of dollars in a traditional data center. How is this possible? Come hear how Amazon VPC works under the hood. We uncover how we use Amazon-designed hardware to deliver high-assurance security and ultra-fast performance that makes the speed of light feel slow. Leave with insights and tips for how to optimize your own applications, and even whole organizations, to deliver faster than ever.
사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...Amazon Web Services Korea
Database Migration Service(DMS)는 RDBMS 이외에도 다양한 데이터베이스 이관을 지원합니다. 실제 고객사 사례를 통해 DMS가 데이터베이스 이관, 통합, 분리를 수행하는 데 어떻게 활용되는지 알아보고, 동시에 데이터 분석을 위한 데이터 수집(Data Ingest)에도 어떤 역할을 하는지 살펴보겠습니다.
Amazon Aurora Storage Demystified: How It All Works (DAT363) - AWS re:Invent ...Amazon Web Services
Amazon Aurora is a high performance, highly scalable database service with MySQL- and PostgreSQL-compatibility. One of its key components is an innovative storage system that is optimized for database workloads and specifically designed to take advantage of modern cloud technology. Hear from the team that built Amazon Aurora's storage system on how the system is designed, how it works, and what you need to know to get the most out of it.
A closer look at the MySQL and PostgreSQL compatible relational database built for the cloud that combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. We’ll explore how Aurora uses the AWS cloud to provide high reliability, high durability, and high throughput.
Speakers:
Steve Abraham - Principal Database Specialist Solutions Architect, AWS
Peter Dachnowicz - Sr. Technical Account Manager, AWS
Amazon EMR enables fast processing of large structured or unstructured datasets, and in this presentation we'll show you how to setup an Amazon EMR job flow to analyse application logs, and perform Hive queries against it. We also review best practices around data file organisation on Amazon Simple Storage Service (S3), how clusters can be started from the AWS web console and command line, and how to monitor the status of a Map/Reduce job.
Finally we take a look at Hadoop ecosystem tools you can use with Amazon EMR and the additional features of the service.
See a recording of the webinar based on this presentation on YouTube here:
Check out the rest of the Masterclass webinars for 2015 here: http://aws.amazon.com/campaigns/emea/masterclass/
See the Journey Through the Cloud webinar series here: http://aws.amazon.com/campaigns/emea/journey/
DAT316_Report from the field on Aurora PostgreSQL PerformanceAmazon Web Services
Tatsuo Ishii from SRA OSS has done extensive testing to compare the Aurora PostgreSQL-compatible Edition with standard PostgreSQL. In this session, he will present his performance testing results, and his work on Pgpool-II with Aurora; Pgpool-II is an open source tool which provides load balancing, connection pooling, and connection management for PostgreSQL.
Amazon RDS allows you to launch an optimally configured, secure and highly available database with just a few clicks. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks, freeing you up to focus on your applications and business. We’ll discuss Amazon RDS fundamentals, learn about the six available database engines (with the seventh on the way), and examine customer success stories.
Cloud computing gives you a number of advantages, such as the ability to scale your web application or website on demand. If you have a new web application and want to use cloud computing, you might be asking yourself, "Where do I start?" Join us in this session to understand best practices for scaling your resources from zero to millions of users. We show you how to best combine different AWS services, how to make smarter decisions for architecting your application, and how to scale your infrastructure in the cloud.
How will my on-premises data migrate to the cloud? How can I make it transparent to my users? Afterwards, how will on-premises and cloud data interact? In this session you will learn about the AWS Database Migration Service (DMS) and the AWS Schema Migration Tool (SCT). You can use these tools to convert your commercial database and database warehouse to open-source engines or AWS-native services, such as Amazon Aurora and Redshift.
Speakers:
Saurabh Saxena - Principal Technical Account Manager, AWS
Chris England - Sr. Technical Account Manager, AWS
Scaling Up to Your First 10 Million Users (ARC205-R1) - AWS re:Invent 2018Amazon Web Services
Cloud computing provides a number of advantages, such as the ability to scale your web application or website on demand. If you have a new web application and want to use cloud computing, you might be asking yourself, "Where do I start?" Join us in this session for best practices on scaling your resources from one to millions of users. We show you how to best combine different AWS services, how to make smarter decisions for architecting your application, and how to scale your infrastructure in the cloud.
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...Dremio Corporation
Essentially every successful analytical DBMS in the market today makes use of column-oriented data structures. In the Hadoop ecosystem, Apache Parquet (and Apache ORC) provide similar advantages in terms of processing and storage efficiency. Apache Arrow is the in-memory counterpart to these formats and has been been embraced by over a dozen open source projects as the de facto standard for in-memory processing. In this session the PMC Chair for Apache Arrow and the PMC Chair for Apache Parquet discuss the future of column-oriented processing.
A quick tour in 16 slides of Amazon's Redshift clustered, massively parallel database.
Find out what differentiates it from the other database products Amazon has, including SimpleDB, DynamoDB and RDS (MySQL, SQL Server and Oracle).
Learn how it stores data on disk in a columnar format and how this relates to performance and interesting compression techniques.
Contrast the difference between Redshift and a MySQL instance and discover how the clustered architecture may help to dramatically reduce query time.
Mario Molina, Software Engineer
CDC systems are usually used to identify changes in data sources, capture and replicate those changes to other systems. Companies are using CDC to sync data across systems, cloud migration or even applying stream processing, among others.
In this presentation we’ll see CDC patterns, how to use it in Apache Kafka, and do a live demo!
https://www.meetup.com/Mexico-Kafka/events/277309497/
OpenSearch는 배포형 오픈 소스 검색과 분석 제품군으로 실시간 애플리케이션 모니터링, 로그 분석 및 웹 사이트 검색과 같이 다양한 사용 사례에 사용됩니다. OpenSearch는 데이터 탐색을 쉽게 도와주는 통합 시각화 도구 OpenSearch와 함께 뛰어난 확장성을 지닌 시스템을 제공하여 대량 데이터 볼륨에 빠르게 액세스 및 응답합니다. 이 세션에서는 실제 동작 구조에 대한 설명을 바탕으로 최적화를 하기 위한 방법과 운영상에 발생할 수 있는 이슈에 대해서 알아봅니다.
The webinar based on this presentation discussed strategies that you can adopt to help you save money in the AWS Cloud. From turning systems off at night, to implementing bidding strategies on the spot market, there are many ways in which you can manage and reduce your costs with AWS.
Dive into the differences between instance types; explain how you can reduce costs with Reserved Instances, the spot market and by architecting to reduce costs. We'll discuss how to combine on-demand pricing with spot pricing to perform cost effective big data analysis, and introduce customer examples to illustrate how AWS customers gain the most from AWS whilst at the same time managing their spend.
Topics include:
• Understand different cost optimisation strategies you can employ in the AWS Cloud
• Learn how to take advantage of different instance types
• Discover architectural principles behind cost optimisation in AWS
• Learn about tools to help you keep on top of your AWS spend
You can find a recording of this webinar on YouTube here: http://youtu.be/kId90Q7b6kY
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...Amazon Web Services
In this session, we show you how to set the source Oracle database environment, the target PostgreSQL environment, and parameter group configuration. We also recommended database parameters to disable foreign keys and triggers. Finally, we discuss best practices for using AWS Database Migration Service (AWS DMS) and AWS Schema Conversion Tool (AWS SCT) and show you how to choose the instance type and configure AWS DMS.
데이터 센터에서 호스팅하고 있는 기존의 SAP ERP 시스템을 어떻게 AWS로 이관 하는지 고객의 사례를 통해 설명드릴 예정입니다. 이 세션에서는 기존의 SAP ECC 버전을 그대로 Lift and Shift로 마이그레이션 하는 방법과 S/4HANA 버전으로 업그레이드 작업을 병행하면서 마이그레이션하는 방법을 모두 다룰 예정입니다. SAP ERP시스템의 서비스 다운 타임을 최소화하기 위한 대량 파일 전송 방법, 네트웍 구성 전략, 3rd Party 솔루션들을 소개해드리며 AWS 환경을 이용하여 어떻게 마이그레이션 프로젝트의 기간과 비용을 줄일 수 있는지 설명드릴 예정입니다.
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...Amazon Web Services
"Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop/Spark to AWS in order to save costs, increase availability, and improve performance. In this session, AWS customers Airbnb and Guardian Life discuss how they migrated their workload to Amazon EMR. This session focuses on key motivations to move to the cloud. It details key architectural changes and the benefits of migrating Hadoop/Spark workloads to the cloud.
"
Amazon VPC: Security at the Speed Of Light (NET313) - AWS re:Invent 2018Amazon Web Services
With Amazon Virtual Private Cloud (Amazon VPC) you can build your own virtual data center networks in seconds. Every VPC is free, but it comes with enterprise-grade capabilities that would cost millions of dollars in a traditional data center. How is this possible? Come hear how Amazon VPC works under the hood. We uncover how we use Amazon-designed hardware to deliver high-assurance security and ultra-fast performance that makes the speed of light feel slow. Leave with insights and tips for how to optimize your own applications, and even whole organizations, to deliver faster than ever.
사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...Amazon Web Services Korea
Database Migration Service(DMS)는 RDBMS 이외에도 다양한 데이터베이스 이관을 지원합니다. 실제 고객사 사례를 통해 DMS가 데이터베이스 이관, 통합, 분리를 수행하는 데 어떻게 활용되는지 알아보고, 동시에 데이터 분석을 위한 데이터 수집(Data Ingest)에도 어떤 역할을 하는지 살펴보겠습니다.
Amazon Aurora Storage Demystified: How It All Works (DAT363) - AWS re:Invent ...Amazon Web Services
Amazon Aurora is a high performance, highly scalable database service with MySQL- and PostgreSQL-compatibility. One of its key components is an innovative storage system that is optimized for database workloads and specifically designed to take advantage of modern cloud technology. Hear from the team that built Amazon Aurora's storage system on how the system is designed, how it works, and what you need to know to get the most out of it.
Pegasus: Designing a Distributed Key Value System (Arch summit beijing-2016)涛 吴
This slide delivered by Zuoyan Qin, Chief engineer from XiaoMi Cloud Storage Team, was for talk at Arch summit Beijing-2016 regarding how Pegasus was designed.
面對日新月異的大數據工具,有時候很難跟上這節奏。有鑑於此,Amazon Web Services提供了廣泛而完善的雲端運算服務組合,幫助您構建、維護和部署大數據應用程式。
這場線上研討會,將為各位深入淺出介紹AWS 雲端平台提供的各種大數據選項,包括現正流行的大數據框架,如Hadoop、Spark、NoSQL數據庫等,同時透過使用案例來瞭解最佳實踐方式。最後,您將了解如何應用這些工具服務,將大數據導入您的現實應用程式中。
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
Il Forecasting è un processo importante per tantissime aziende e viene utilizzato in vari ambiti per cercare di prevedere in modo accurato la crescita e distribuzione di un prodotto, l’utilizzo delle risorse necessarie nelle linee produttive, presentazioni finanziarie e tanto altro. Amazon utilizza delle tecniche avanzate di forecasting, in parte questi servizi sono stati messi a disposizione di tutti i clienti AWS.
In questa sessione illustreremo come pre-processare i dati che contengono una componente temporale e successivamente utilizzare un algoritmo che a partire dal tipo di dato analizzato produce un forecasting accurato.
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
La varietà e la quantità di dati che si crea ogni giorno accelera sempre più velocemente e rappresenta una opportunità irripetibile per innovare e creare nuove startup.
Tuttavia gestire grandi quantità di dati può apparire complesso: creare cluster Big Data su larga scala sembra essere un investimento accessibile solo ad aziende consolidate. Ma l’elasticità del Cloud e, in particolare, i servizi Serverless ci permettono di rompere questi limiti.
Vediamo quindi come è possibile sviluppare applicazioni Big Data rapidamente, senza preoccuparci dell’infrastruttura, ma dedicando tutte le risorse allo sviluppo delle nostre le nostre idee per creare prodotti innovativi.
Ora puoi utilizzare Amazon Elastic Kubernetes Service (EKS) per eseguire pod Kubernetes su AWS Fargate, il motore di elaborazione serverless creato per container su AWS. Questo rende più semplice che mai costruire ed eseguire le tue applicazioni Kubernetes nel cloud AWS.In questa sessione presenteremo le caratteristiche principali del servizio e come distribuire la tua applicazione in pochi passaggi
Vent'anni fa Amazon ha attraversato una trasformazione radicale con l'obiettivo di aumentare il ritmo dell'innovazione. In questo periodo abbiamo imparato come cambiare il nostro approccio allo sviluppo delle applicazioni ci ha permesso di aumentare notevolmente l'agilità, la velocità di rilascio e, in definitiva, ci ha consentito di creare applicazioni più affidabili e scalabili. In questa sessione illustreremo come definiamo le applicazioni moderne e come la creazione di app moderne influisce non solo sull'architettura dell'applicazione, ma sulla struttura organizzativa, sulle pipeline di rilascio dello sviluppo e persino sul modello operativo. Descriveremo anche approcci comuni alla modernizzazione, compreso l'approccio utilizzato dalla stessa Amazon.com.
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
L’utilizzo dei container è in continua crescita.
Se correttamente disegnate, le applicazioni basate su Container sono molto spesso stateless e flessibili.
I servizi AWS ECS, EKS e Kubernetes su EC2 possono sfruttare le istanze Spot, portando ad un risparmio medio del 70% rispetto alle istanze On Demand. In questa sessione scopriremo insieme quali sono le caratteristiche delle istanze Spot e come possono essere utilizzate facilmente su AWS. Impareremo inoltre come Spreaker sfrutta le istanze spot per eseguire applicazioni di diverso tipo, in produzione, ad una frazione del costo on-demand!
In recent months, many customers have been asking us the question – how to monetise Open APIs, simplify Fintech integrations and accelerate adoption of various Open Banking business models. Therefore, AWS and FinConecta would like to invite you to Open Finance marketplace presentation on October 20th.
Event Agenda :
Open banking so far (short recap)
• PSD2, OB UK, OB Australia, OB LATAM, OB Israel
Intro to Open Finance marketplace
• Scope
• Features
• Tech overview and Demo
The role of the Cloud
The Future of APIs
• Complying with regulation
• Monetizing data / APIs
• Business models
• Time to market
One platform for all: a Strategic approach
Q&A
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
Per creare valore e costruire una propria offerta differenziante e riconoscibile, le startup di successo sanno come combinare tecnologie consolidate con componenti innovativi creati ad hoc.
AWS fornisce servizi pronti all'utilizzo e, allo stesso tempo, permette di personalizzare e creare gli elementi differenzianti della propria offerta.
Concentrandoci sulle tecnologie di Machine Learning, vedremo come selezionare i servizi di intelligenza artificiale offerti da AWS e, anche attraverso una demo, come costruire modelli di Machine Learning personalizzati utilizzando SageMaker Studio.
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
Con l'approccio tradizionale al mondo IT per molti anni è stato difficile implementare tecniche di DevOps, che finora spesso hanno previsto attività manuali portando di tanto in tanto a dei downtime degli applicativi interrompendo l'operatività dell'utente. Con l'avvento del cloud, le tecniche di DevOps sono ormai a portata di tutti a basso costo per qualsiasi genere di workload, garantendo maggiore affidabilità del sistema e risultando in dei significativi miglioramenti della business continuity.
AWS mette a disposizione AWS OpsWork come strumento di Configuration Management che mira ad automatizzare e semplificare la gestione e i deployment delle istanze EC2 per mezzo di workload Chef e Puppet.
Scopri come sfruttare AWS OpsWork a garanzia e affidabilità del tuo applicativo installato su Instanze EC2.
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
Vuoi conoscere le opzioni per eseguire Microsoft Active Directory su AWS? Quando si spostano carichi di lavoro Microsoft in AWS, è importante considerare come distribuire Microsoft Active Directory per supportare la gestione, l'autenticazione e l'autorizzazione dei criteri di gruppo. In questa sessione, discuteremo le opzioni per la distribuzione di Microsoft Active Directory su AWS, incluso AWS Directory Service per Microsoft Active Directory e la distribuzione di Active Directory su Windows su Amazon Elastic Compute Cloud (Amazon EC2). Trattiamo argomenti quali l'integrazione del tuo ambiente Microsoft Active Directory locale nel cloud e l'utilizzo di applicazioni SaaS, come Office 365, con AWS Single Sign-On.
Dal riconoscimento facciale al riconoscimento di frodi o difetti di fabbricazione, l'analisi di immagini e video che sfruttano tecniche di intelligenza artificiale, si stanno evolvendo e raffinando a ritmi elevati. In questo webinar esploreremo le possibilità messe a disposizione dai servizi AWS per applicare lo stato dell'arte delle tecniche di computer vision a scenari reali.
Amazon Web Services e VMware organizzano un evento virtuale gratuito il prossimo mercoledì 14 Ottobre dalle 12:00 alle 13:00 dedicato a VMware Cloud ™ on AWS, il servizio on demand che consente di eseguire applicazioni in ambienti cloud basati su VMware vSphere® e di accedere ad una vasta gamma di servizi AWS, sfruttando a pieno le potenzialità del cloud AWS e tutelando gli investimenti VMware esistenti.
Molte organizzazioni sfruttano i vantaggi del cloud migrando i propri carichi di lavoro Oracle e assicurandosi notevoli vantaggi in termini di agilità ed efficienza dei costi.
La migrazione di questi carichi di lavoro, può creare complessità durante la modernizzazione e il refactoring delle applicazioni e a questo si possono aggiungere rischi di prestazione che possono essere introdotti quando si spostano le applicazioni dai data center locali.
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
Molte aziende oggi, costruiscono applicazioni con funzionalità di tipo ledger ad esempio per verificare lo storico di accrediti o addebiti nelle transazioni bancarie o ancora per tenere traccia del flusso supply chain dei propri prodotti.
Alla base di queste soluzioni ci sono i database ledger che permettono di avere un log delle transazioni trasparente, immutabile e crittograficamente verificabile, ma sono strumenti complessi e onerosi da gestire.
Amazon QLDB elimina la necessità di costruire sistemi personalizzati e complessi fornendo un database ledger serverless completamente gestito.
In questa sessione scopriremo come realizzare un'applicazione serverless completa che utilizzi le funzionalità di QLDB.
Con l’ascesa delle architetture di microservizi e delle ricche applicazioni mobili e Web, le API sono più importanti che mai per offrire agli utenti finali una user experience eccezionale. In questa sessione impareremo come affrontare le moderne sfide di progettazione delle API con GraphQL, un linguaggio di query API open source utilizzato da Facebook, Amazon e altro e come utilizzare AWS AppSync, un servizio GraphQL serverless gestito su AWS. Approfondiremo diversi scenari, comprendendo come AppSync può aiutare a risolvere questi casi d’uso creando API moderne con funzionalità di aggiornamento dati in tempo reale e offline.
Inoltre, impareremo come Sky Italia utilizza AWS AppSync per fornire aggiornamenti sportivi in tempo reale agli utenti del proprio portale web.
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
Molte organizzazioni sfruttano i vantaggi del cloud migrando i propri carichi di lavoro Oracle e assicurandosi notevoli vantaggi in termini di agilità ed efficienza dei costi.
La migrazione di questi carichi di lavoro, può creare complessità durante la modernizzazione e il refactoring delle applicazioni e a questo si possono aggiungere rischi di prestazione che possono essere introdotti quando si spostano le applicazioni dai data center locali.
In queste slide, gli esperti AWS e VMware presentano semplici e pratici accorgimenti per facilitare e semplificare la migrazione dei carichi di lavoro Oracle accelerando la trasformazione verso il cloud, approfondiranno l’architettura e dimostreranno come sfruttare a pieno le potenzialità di VMware Cloud ™ on AWS.
Amazon Elastic Container Service (Amazon ECS) è un servizio di gestione dei container altamente scalabile, che semplifica la gestione dei contenitori Docker attraverso un layer di orchestrazione per il controllo del deployment e del relativo lifecycle. In questa sessione presenteremo le principali caratteristiche del servizio, le architetture di riferimento per i differenti carichi di lavoro e i semplici passi necessari per poter velocemente migrare uno o più dei tuo container.
8. Simplify Big Data Processing
收集
COLLECT
儲存
STORE
處理/分析
PROCESS /
ANALYZE
使用
CONSUME
數據
data
答案
answers
延遲時間 Time to answer (Latency)
吞吐量 Throughput
成本 Cost
從這裡開始
WITH A BUSINESS CASE
簡化大數據處理
11. 數據類型收集
COLLECT
Mobile apps
Web apps
Data centers
AWS Direct
Connect
RECORDS
Applications
In-memory data structures
Database records
AWS Import/Export
Snowball
Logging
Amazon
CloudWatch
AWS
CloudTrail
DOCUMENTS
FILES
LoggingTransport
Search documents
Log files
Messaging
Message MESSAGES
Messaging
Messages
Devices
Sensors &
IoT platforms
AWS IoT STREAMS
IoT
Data streams
交易記錄 Transactions
檔案 Files
事件 Events
12. Hot Warm Cold
Volume MB–GB GB–TB PB–EB
Item size B–KB KB–MB KB–TB
Latency ms ms, sec min, hrs
Durability Low–high High Very high
Request rate Very high High Low
Cost/GB $$-$ $-¢¢ ¢
Hot data Warm data Cold data
數據特點:熱,暖,冷
28. 數據結構和訪問模式
存取模式 Access 應該用什麼 What to use?
Put/Get (key, value) In-memory, NoSQL
Simple relationships → 1:N, M:N NoSQL
Multi-table joins, transaction,
SQL
SQL
Faceting, search Search
數據結構 Data Structure 應該用什麼 What to use?
Fixed schema SQL, NoSQL
Schema-free (JSON) NoSQL, Search
(Key, value) In-memory, NoSQL
29. Amazon ElastiCache Amazon
DynamoDB
Amazon
RDS/Aurora
Amazon
ES
Amazon S3 Amazon Glacier
延遲時間
Average
latency
ms ms ms, sec ms, sec ms, sec,min
(~ size)
hrs
總儲量
Typical
data stored
GB GB–TBs
(no limit)
GB–TB
(64 TB max)
GB–TB MB–PB
(no limit)
GB–PB
(no limit)
物件體積
Typical
item size
B-KB KB
(400 KB max)
KB
(64 KB max)
B-KB
(2 GB max)
KB-TB
(5 TB max)
GB
(40 TB max)
提取频率
Request Rate
High – very high Very high
(no limit)
High High Low – high
(no limit)
Very low
成本Storage
cost GB/month
$$ ¢¢ ¢¢ ¢¢ ¢ ¢4/10
耐久力
Durability
Low - moderate Very high Very high High Very high Very high
可用性
Availability
High
2 AZ
Very high
3 AZ
Very high
3 AZ
High
2 AZ
Very high
3 AZ
Very high
3 AZ
Hot data Warm data Cold data
應該使用哪個數據存儲?