이제 빅데이터란 개념은 익숙한 것이 되었지만 이를 비지니스에 적용하고 최대의 효과를 얻는 방법에 대한 고찰은 여전히 필요합니다. 소중한 데이터를 쉽게 저장 및 분석하고 시각화하는 것은 비즈니스에 대한 통찰을 얻기 위한 중요한 과정입니다.
이 강연에서는 AWS Elastic MapReduce, Amazon Redshift, Amazon Kinesis 등 AWS가 제공하는 다양한 데이터 분석 도구를 활용해 보다 간편하고 빠른 빅데이터 분석 서비스를 구축하는 방법에 대해 소개합니다.
이 강연에서는 NoSQL 데이터베이스 서비스인 Amazon DynamoDB 서비스를 간단하게 소개하고, 새롭게 발표된 신규 시간 기반 (TTL) 데이터 관리 기능 및 인메모리 캐시 신규 기능 (Amazon DynamoDB Accelerator) 등에 대해 함께 설명해 드릴 예정입니다.
연사: Pranav Nambiar, 아마존 웹서비스 Amazon DynamoDB 총괄 프로덕트 매니저
이 강연에서는 NoSQL 데이터베이스 서비스인 Amazon DynamoDB 서비스를 간단하게 소개하고, 새롭게 발표된 신규 시간 기반 (TTL) 데이터 관리 기능 및 인메모리 캐시 신규 기능 (Amazon DynamoDB Accelerator) 등에 대해 함께 설명해 드릴 예정입니다.
연사: Pranav Nambiar, 아마존 웹서비스 Amazon DynamoDB 총괄 프로덕트 매니저
Nesta sessão faremos uma demonstração de controle e defesa de tráfego aéreo utilizando processamento em tempo real. Trataremos das boas práticas para ingestão, armazenamento, processamento e visualização de dados através de serviços da AWS como Kinesis, DynamoDB, Lambda, Redshift, Quicksight e Amazon Machine Learning.
이 강연에서는 AWS Big Data 분석 아키텍처 모범 사례를 살펴보고 표준 SQL을 사용해 Amazon S3에 저장된 데이터를 간편하게 분석할 수 있는 대화식 쿼리 서비스인 Amazon Athena의 특징과 최신 기능들에 대하여 고객 사례와 함께 소개드립니다.
연사: Greg Khairallah, 아마존 웹서비스 Amazon Big Data 및 Athena 총괄 사업 개발 매니저
클라우드에서 보안은 매우 중요한 요소로서 클라우드 내에서 실행중인 애플리케이션에 대한 보안 인증 정책과 접근 제어 및 변경 사항 추적 및 알림 등의 기능이 필수적입니다. 본 온라인 세미나에서는 AWS 클라우드의 보안에 대한 기초 지식과 아울러 서비스 규모의 확장에 따른 AWS 아키텍처 변화에 맞는 보안 서비스 활용 방법과 모범 사례 등을 소개합니다.
Amazon EKS 그리고 Service Mesh
Kubernetes는 컨테이너 서비스를 도입하는 기업들에게 가장 있기있는 Orchestration 플랫폼입니다. 이 세션에서는 아마존에서 6월 정식 출시한 managed Kubenetes서비스인 EKS를 소개해드리며, 오픈소스 버전과의 차이점 및 장점 등에 대해 설명하고, 진보한 마이크로 서비스인 Service Mesh를 구현하는 Linkerd 소개 및 데모를 진행하고자 합니다.
Amazon EC2 changes the economics of computing and provides you with complete control of your computing resources. It is designed to make web-scale cloud computing easier for developers. In this session, we will take you on a journey, starting with the basics of key management and security groups and ending with an explanation of Auto Scaling and how you can use it to match capacity and costs to demand using dynamic policies. We will also discuss tools and best practices that will help you build failure resilient applications that take advantage of the scale and robustness of AWS regions.
Amazon Elastic Compute Cloud (Amazon EC2) provides a broad selection of instance types to accommodate a diverse mix of workloads. In this technical session, we provide an overview of the Amazon EC2 instance platform, key platform features, and the concept of instance generations. We dive into the current-generation design choices of the different instance families, including the General Purpose, Compute Optimized, Storage Optimized, Memory Optimized, and GPU instance families. We also detail best practices and share performance tips for getting the most out of your Amazon EC2 instances. Other Compute options such as ECS and Lambda for processing in the cloud will be introduced and explained at a high level.
Path to the future #4 - Ingestão, processamento e análise de dados em tempo realAmazon Web Services LATAM
Nesta sessão faremos uma demonstração de controle e defesa de tráfego aéreo utilizando processamento em tempo real.
Trataremos das boas práticas para ingestão, armazenamento, processamento e visualização de dados através de serviços da AWS como Kinesis, DynamoDB, Lambda, Redshift, Quicksight e Amazon Machine Learning.
Database migration simple, cross-engine and cross-platform migrations with ...Amazon Web Services
Learn how you can migrate databases with minimal downtime from on-premises and Amazon EC2 environments to Amazon RDS, Amazon Redshift, Amazon Aurora and EC2 databases using AWS Database Migration Service. We'll discuss homogeneous (e.g. Oracle-to-Oracle, PostgreSQL-to-PostgreSQL, etc.) and heterogeneous (e.g. Oracle to Aurora, SQL Server to MariaDB) database migrations. We'll also talk about the new AWS Schema Conversion Tool that saves you development time when migrating your Oracle and SQL Server database schemas, including PL/SQL and T-SQL procedural code, to their MySQL, MariaDB and Aurora equivalents. Best of all, we'll spend most of the time demonstrating the product and showing use cases designed to help your business.
(BDT209) Launch: Amazon Elasticsearch For Real-Time Data AnalyticsAmazon Web Services
Organizations are collecting an ever-increasing amount of data from numerous sources such as log systems, click streams, and connected devices. Launched in 2009, Elasticsearch —an open-source analytics and search engine— has emerged as a popular tool for real-time analytics and visualization of data. Some of the most common use cases include risk assessment, error detection, and sentiment analysis. However, as data volumes and applications grow, managing Elasticsearch clusters can consume significant IT resources while adding little or no differentiated value to the organization. Amazon Elasticsearch Service (Amazon ES) is a managed service that makes it easy to deploy, operate, and scale Elasticsearch clusters in the AWS Cloud. Amazon ES offers the benefits of a managed service, including cluster provisioning, easy configuration, replication for high availability, scaling options, data durability, security, and node monitoring. This session presents a technical deep dive on Amazon ES. Attendees learn: Common challenges with real-time data analytics and visualization and how to address them; the benefits, reference architecture, and best practices for using Amazon ES; and data ingestion options with Amazon DynamoDB, AWS Lambda, and Amazon Kinesis.
Saiba como Amazon Redshift, o nosso dataware house totalmente gerenciados, pode ajudá-lo de forma rápida e rentável analisar todos os seus dados utilizando suas ferramentas de BI. Também será abordado introdução ao serviço, o qual utiliza MPP, arquitetura scale-out e armazenamento de forma colunar.
Nesta sessão faremos uma demonstração de controle e defesa de tráfego aéreo utilizando processamento em tempo real. Trataremos das boas práticas para ingestão, armazenamento, processamento e visualização de dados através de serviços da AWS como Kinesis, DynamoDB, Lambda, Redshift, Quicksight e Amazon Machine Learning.
이 강연에서는 AWS Big Data 분석 아키텍처 모범 사례를 살펴보고 표준 SQL을 사용해 Amazon S3에 저장된 데이터를 간편하게 분석할 수 있는 대화식 쿼리 서비스인 Amazon Athena의 특징과 최신 기능들에 대하여 고객 사례와 함께 소개드립니다.
연사: Greg Khairallah, 아마존 웹서비스 Amazon Big Data 및 Athena 총괄 사업 개발 매니저
클라우드에서 보안은 매우 중요한 요소로서 클라우드 내에서 실행중인 애플리케이션에 대한 보안 인증 정책과 접근 제어 및 변경 사항 추적 및 알림 등의 기능이 필수적입니다. 본 온라인 세미나에서는 AWS 클라우드의 보안에 대한 기초 지식과 아울러 서비스 규모의 확장에 따른 AWS 아키텍처 변화에 맞는 보안 서비스 활용 방법과 모범 사례 등을 소개합니다.
Amazon EKS 그리고 Service Mesh
Kubernetes는 컨테이너 서비스를 도입하는 기업들에게 가장 있기있는 Orchestration 플랫폼입니다. 이 세션에서는 아마존에서 6월 정식 출시한 managed Kubenetes서비스인 EKS를 소개해드리며, 오픈소스 버전과의 차이점 및 장점 등에 대해 설명하고, 진보한 마이크로 서비스인 Service Mesh를 구현하는 Linkerd 소개 및 데모를 진행하고자 합니다.
Amazon EC2 changes the economics of computing and provides you with complete control of your computing resources. It is designed to make web-scale cloud computing easier for developers. In this session, we will take you on a journey, starting with the basics of key management and security groups and ending with an explanation of Auto Scaling and how you can use it to match capacity and costs to demand using dynamic policies. We will also discuss tools and best practices that will help you build failure resilient applications that take advantage of the scale and robustness of AWS regions.
Amazon Elastic Compute Cloud (Amazon EC2) provides a broad selection of instance types to accommodate a diverse mix of workloads. In this technical session, we provide an overview of the Amazon EC2 instance platform, key platform features, and the concept of instance generations. We dive into the current-generation design choices of the different instance families, including the General Purpose, Compute Optimized, Storage Optimized, Memory Optimized, and GPU instance families. We also detail best practices and share performance tips for getting the most out of your Amazon EC2 instances. Other Compute options such as ECS and Lambda for processing in the cloud will be introduced and explained at a high level.
Path to the future #4 - Ingestão, processamento e análise de dados em tempo realAmazon Web Services LATAM
Nesta sessão faremos uma demonstração de controle e defesa de tráfego aéreo utilizando processamento em tempo real.
Trataremos das boas práticas para ingestão, armazenamento, processamento e visualização de dados através de serviços da AWS como Kinesis, DynamoDB, Lambda, Redshift, Quicksight e Amazon Machine Learning.
Database migration simple, cross-engine and cross-platform migrations with ...Amazon Web Services
Learn how you can migrate databases with minimal downtime from on-premises and Amazon EC2 environments to Amazon RDS, Amazon Redshift, Amazon Aurora and EC2 databases using AWS Database Migration Service. We'll discuss homogeneous (e.g. Oracle-to-Oracle, PostgreSQL-to-PostgreSQL, etc.) and heterogeneous (e.g. Oracle to Aurora, SQL Server to MariaDB) database migrations. We'll also talk about the new AWS Schema Conversion Tool that saves you development time when migrating your Oracle and SQL Server database schemas, including PL/SQL and T-SQL procedural code, to their MySQL, MariaDB and Aurora equivalents. Best of all, we'll spend most of the time demonstrating the product and showing use cases designed to help your business.
(BDT209) Launch: Amazon Elasticsearch For Real-Time Data AnalyticsAmazon Web Services
Organizations are collecting an ever-increasing amount of data from numerous sources such as log systems, click streams, and connected devices. Launched in 2009, Elasticsearch —an open-source analytics and search engine— has emerged as a popular tool for real-time analytics and visualization of data. Some of the most common use cases include risk assessment, error detection, and sentiment analysis. However, as data volumes and applications grow, managing Elasticsearch clusters can consume significant IT resources while adding little or no differentiated value to the organization. Amazon Elasticsearch Service (Amazon ES) is a managed service that makes it easy to deploy, operate, and scale Elasticsearch clusters in the AWS Cloud. Amazon ES offers the benefits of a managed service, including cluster provisioning, easy configuration, replication for high availability, scaling options, data durability, security, and node monitoring. This session presents a technical deep dive on Amazon ES. Attendees learn: Common challenges with real-time data analytics and visualization and how to address them; the benefits, reference architecture, and best practices for using Amazon ES; and data ingestion options with Amazon DynamoDB, AWS Lambda, and Amazon Kinesis.
Saiba como Amazon Redshift, o nosso dataware house totalmente gerenciados, pode ajudá-lo de forma rápida e rentável analisar todos os seus dados utilizando suas ferramentas de BI. Também será abordado introdução ao serviço, o qual utiliza MPP, arquitetura scale-out e armazenamento de forma colunar.
오토스케일링(Auto-scaling)은 AWS 클라우드를 통해 고확장성 서비스와 아키텍처를 구성하는 데 필요한 가장 중요한 요소 중 하나입니다. 이 강연에서는 효과적인 클라우드 인프라 구축을 위해 오토 스케일링을 활용하는 다양한 방법에 대해 자세히 소개해 드립니다.
오토 스케일링 그룹의 구성과 확장 계획에 따른 설정 방법, 오토 스케일링 라이프 사이클과 CloudWatch 및 알림을 이용한 관리 방법, 각종 오토스케일링 모범사례 등을 알아보실 수 있습니다.
이번 월간 웨비나에서는 AWS 클라우드를 통해 어떻게 손쉽게 소프트웨어를 개발하고, 배포하는 과정을 자동화 할 수 있는지를 알아 봅니다. 이를 위해 Amazon.com의 소프트웨어 개발 과정 상의 경험과 이를 토대로 만들어진 AWS CodeDeploy와 CodePipeline 서비스를 소개해 드리고, 이를 통해 EC2 인스턴스 뿐만 아니라 기존 서버에 손쉽게 배포하는 방법을 알려드립니다. 본 세션을 통해 클라우드를 통한 민첩하고 빠른 개발 및 배포를 통해 진화된 데브옵스(DevOps) 프로세스를 정립할 수 있는 방법을 안내해 드립니다.
Join us for a for a Amazon Kinesis tutorial webinar. In this session we will provide a reference architecture and instructions for building a system that performs real-time sliding-windows analysis over streaming clickstream data. We will use Amazon Kinesis for managed ingestion of streaming data at scale with the ability to replay past data, and run sliding-window computation using Apache Storm. We’ll demonstrate in the webinar on how to build the system and deploy on AWS and walkthrough all the steps from ingestion, processing, and storing to visualizing of the data in real-time.
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...Amazon Web Services
It is becoming increasingly important to analyze real time streaming data. It allows organizations to remain competitive by uncovering relevant, actionable insights. AWS makes it easy to capture, store, and analyze real-time streaming data.
In this webinar, we will guide you through some of the proven architectures for processing streaming data, using a combination of tools including Amazon Kinesis Streams, AWS Lambda, and Spark Streaming on Amazon Elastic MapReduce (EMR). We will then talk about common use cases and best practices for real-time data analysis on AWS.
Learning Objectives:
Understand how you can analyze real-time data streams using Amazon Kinesis, AWS Lambda, and Spark running on Amazon EMR
Learn use cases and best practices for streaming data applications on AWS
Deep Dive and Best Practices for Real Time Streaming ApplicationsAmazon Web Services
Get answers to technical questions, frequently asked by those starting to work with streaming data. Learn best practices for building a real-time streaming data architecture on AWS with Amazon Kinesis, Spark Streaming, AWS Lambda, and Amazon EMR. First, we will focus on building a scalable, durable streaming data ingestion workflow from data producers like mobile devices, servers, or even web browsers. We will provide guidelines to minimize duplicates and achieve exactly-once processing semantics in your stream-processing applications. Then, we will show some of the proven architectures for processing streaming data using a combination of tools including Amazon Kinesis Stream, AWS Lambda, and Spark Streaming running on Amazon EMR.
AWS Webcast - Managing Big Data in the AWS Cloud_20140924Amazon Web Services
This presentation deck will cover specific services such as Amazon S3, Kinesis, Redshift, Elastic MapReduce, and DynamoDB, including their features and performance characteristics. It will also cover architectural designs for the optimal use of these services based on dimensions of your data source (structured or unstructured data, volume, item size and transfer rates) and application considerations - for latency, cost and durability. It will also share customer success stories and resources to help you get started.
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of ThingsAmazon Web Services
Big Data is everywhere these days. But what is it and how can you use it to fuel your business? Data is as important to organizations as labour and capital, and if organizations can effectively capture, analyze, visualize and apply big data insights to their business goals, they can differentiate themselves from their competitors and outperform them in terms of operational efficiency and the bottom line.
Join this session to understand the different AWS Big Data and Analytics services such as Amazon Elastic MapReduce (Hadoop), Amazon Redshift (Data Warehouse) and Amazon Kinesis (Streaming), when to use them and how they work together.
Reasons to attend:
- Learn how AWS can help you process and make better use of your data with meaningful insights.
- Learn about Amazon Elastic MapReduce and Amazon Redshift, fully managed petabyte-scale data warehouse solutions.
- Learn about real time data processing with Amazon Kinesis.
Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale. In this webinar, developers will learn how to build and deploy a streaming data processing application with Amazon Kinesis. We will cover the following: - A brief overview of Amazon Kinesis and drill down on key technical concepts. - Amazon Kinesis Client Library capabilities that enable customers to build fault tolerant, continuous processing applications that scale elastically. - The role of the supporting connector library for moving data into stores like S3 and Redshift. - Best practices for streaming data ingestion and processing with Amazon Kinesis.
Traditional data warehouses become expensive and slow down as the volume of your data grows. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it easy to analyze all of your data using existing business intelligence tools for 1/10th the traditional cost. This session will provide an introduction to Amazon Redshift and cover the essentials you need to deploy your data warehouse in the cloud so that you can achieve faster analytics and save costs.
Day 5 - Real-time Data Processing/Internet of Things (IoT) with Amazon KinesisAmazon Web Services
Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale. Amazon Kinesis can collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources, allowing you to easily write applications that process information in real-time, from sources such as web site click-streams, marketing and financial information, manufacturing instrumentation and social media, and operational logs and metering data.
Reasons to attend:
- This session, will provide you with an overview of Amazon Kinesis.
- Learn about sample use cases and real life case studies.
- Learn how Amazon Kinesis can be integrated into your own applications.
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Amazon Web Services
Originally, Hadoop was used as a batch analytics tool; however, this is rapidly changing, as applications move towards real-time processing and streaming. Amazon Elastic MapReduce has made running Hadoop in the cloud easier and more accessible than ever. Each day, tens of thousands of Hadoop clusters are run on the Amazon Elastic MapReduce infrastructure by users of every size — from university students to Fortune 50 companies. We recently launched Amazon Kinesis – a managed service for real-time processing of high volume, streaming data. Amazon Kinesis enables a new class of big data applications which can continuously analyze data at any volume and throughput, in real-time. Adi will discuss each service, dive into how customers are adopting the services for different use cases, and share emerging best practices. Learn how you can architect Amazon Kinesis and Amazon Elastic MapReduce together to create a highly scalable real-time analytics solution which can ingest and process terabytes of data per hour from hundreds of thousands of different concurrent sources. Forever change how you process web site click-streams, marketing and financial transactions, social media feeds, logs and metering data, and location-tracking events.
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesAmazon Web Services
In this session, you will learn best practices for implementing simple to advanced real-time streaming data use cases on AWS. First, we’ll review decision points on near real-time versus real time scenarios. Next, we will take a look at streaming data architecture patterns that include Amazon Kinesis Analytics, Amazon Kinesis Firehose, Amazon Kinesis Streams, Spark Streaming on Amazon EMR, and other open source libraries. Finally, we will dive deep into the most common of these patterns and cover design and implementation considerations.
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceAmazon Web Services
Everything generates logs. Applications, infrastructure, security ... everything. Keeping track of the flood of log data is a big challenge, yet critical to your ability to understand your systems and troubleshoot (or prevent) issues. In this session, we will use both Amazon CloudWatch and application logs to show you how to build an end-to-end log analytics solution. First, we cover how to configure an Amazon Elaticsearch Service domain and ingest data into it using Amazon Kinesis Firehose, demonstrating how easy it is to transform data with Firehose. We look at best practices for choosing instance types, storage options, shard counts, and index rotations based on the throughput of incoming data and configure a secure analytics environment. We demonstrate how to set up a Kibana dashboard and build custom dashboard widgets. Finally, we dive deep into the Elasticsearch query DSL and review approaches for generating custom, ad-hoc reports.
Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale. Amazon Kinesis can collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources, allowing you to easily write applications that process information in real-time, from sources such as web site click-streams, marketing and financial information, manufacturing instrumentation and social media, and operational logs and metering data.
This introductory webinar, presented by Adi Krishnan, Senior Product Manager for Amazon Kinesis, will provide you with an overview of the service, sample use cases, and some examples of customer experiences with the service so you can better understand its capabilities and see how it might be integrated into your own applications.
Amazon Kinesis is the AWS service for real-time streaming big data ingestion and processing. This talk gives a detailed exploration of Kinesis stream processing. We'll discuss in detail techniques for building, and scaling Kinesis processing applications, including data filtration and transformation. Finally we'll address tips and techniques to emitting data into S3, DynamoDB, and Redshift.
Working with big volumes of data is a complicated task, but it's even harder if you have to do everything in real time and try to figure it all out yourself. Over the past decades many open-source projects helped solve problems within the data analytics lifecycle around ingestion, storage, processing and visualisation of data. This session will use practical examples to discuss architectural best practices and lessons learned when solving real-time analytics and data visualisation decision-making problems with open-source at scale with the power of Amazon Web Services. It furthermore dives into a demo, using source code from the AWS Labs to visualise live data streams at scale.
Olivier Klein, Solutions Architect, Amazon Web Services, Greater China
Data & Analytics - Session 2 - Introducing Amazon RedshiftAmazon Web Services
Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. This presentation will give an introduction to the service and its pricing before diving into how it delivers fast query performance on data sets ranging from hundreds of gigabytes to a petabyte or more.
Steffen Krause, Technical Evangelist, AWS
Padraic Mulligan, Architect and Lead Developer and Mike McCarthy, CTO, Skillspage
This session is recommended for anyone interested in understanding how to use AWS big data services to develop real-time analytics applications. In this session, you will get an overview of a number of Amazon's big data and analytics services that enable you to build highly scaleable cloud applications that immediately and continuously analyze large sets of distributed data. We'll explain how services like Amazon Kinesis, EMR and Redshift can be used for data ingestion, processing and storage to enable real-time insights and analysis into customer, operational and machine generated data and log files. We'll explore system requirements, design considerations, and walk through a specific customer use case to illustrate the power of real-time insights on their business.
Similar to AWS를 활용한 첫 빅데이터 프로젝트 시작하기(김일호)- AWS 웨비나 시리즈 2015 (20)
사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...Amazon Web Services Korea
Database Migration Service(DMS)는 RDBMS 이외에도 다양한 데이터베이스 이관을 지원합니다. 실제 고객사 사례를 통해 DMS가 데이터베이스 이관, 통합, 분리를 수행하는 데 어떻게 활용되는지 알아보고, 동시에 데이터 분석을 위한 데이터 수집(Data Ingest)에도 어떤 역할을 하는지 살펴보겠습니다.
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...Amazon Web Services Korea
Amazon ElastiCache는 Redis 및 MemCached와 호환되는 완전관리형 서비스로서 현대적 애플리케이션의 성능을 최적의 비용으로 실시간으로 개선해 줍니다. ElastiCache의 Best Practice를 통해 최적의 성능과 서비스 최적화 방법에 대해 알아봅니다.
Internal Architecture of Amazon Aurora (Level 400) - 발표자: 정달영, APAC RDS Speci...Amazon Web Services Korea
ccAmazon Aurora 데이터베이스는 클라우드용으로 구축된 관계형 데이터베이스입니다. Aurora는 상용 데이터베이스의 성능과 가용성, 그리고 오픈소스 데이터베이스의 단순성과 비용 효율성을 모두 제공합니다. 이 세션은 Aurora의 고급 사용자들을 위한 세션으로써 Aurora의 내부 구조와 성능 최적화에 대해 알아봅니다.
[Keynote] 슬기로운 AWS 데이터베이스 선택하기 - 발표자: 강민석, Korea Database SA Manager, WWSO, A...Amazon Web Services Korea
오랫동안 관계형 데이터베이스가 가장 많이 사용되었으며 거의 모든 애플리케이션에서 널리 사용되었습니다. 따라서 애플리케이션 아키텍처에서 데이터베이스를 선택하기가 더 쉬웠지만, 구축할 수 있는 애플리케이션의 유형이 제한적이었습니다. 관계형 데이터베이스는 스위스 군용 칼과 같아서 많은 일을 할 수 있지만 특정 업무에는 완벽하게 적합하지는 않습니다. 클라우드 컴퓨팅의 등장으로 경제적인 방식으로 더욱 탄력적이고 확장 가능한 애플리케이션을 구축할 수 있게 되면서 기술적으로 가능한 일이 달라졌습니다. 이러한 변화는 전용 데이터베이스의 부상으로 이어졌습니다. 개발자는 더 이상 기본 관계형 데이터베이스를 사용할 필요가 없습니다. 개발자는 애플리케이션의 요구 사항을 신중하게 고려하고 이러한 요구 사항에 맞는 데이터베이스를 선택할 수 있습니다.
Demystify Streaming on AWS - 발표자: 이종혁, Sr Analytics Specialist, WWSO, AWS :::...Amazon Web Services Korea
실시간 분석은 AWS 고객의 사용 사례가 점점 늘어나고 있습니다. 이 세션에 참여하여 스트리밍 데이터 기술이 어떻게 데이터를 즉시 분석하고, 시스템 간에 데이터를 실시간으로 이동하고, 실행 가능한 통찰력을 더 빠르게 얻을 수 있는지 알아보십시오. 일반적인 스트리밍 데이터 사용 사례, 비즈니스에서 실시간 분석을 쉽게 활성화하는 단계, AWS가 Amazon Kinesis와 같은 AWS 스트리밍 데이터 서비스를 사용하도록 지원하는 방법을 다룹니다.
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...Amazon Web Services Korea
Amazon EMR은 Apache Spark, Hive, Presto, Trino, HBase 및 Flink와 같은 오픈 소스 프레임워크를 사용하여 분석 애플리케이션을 쉽게 실행할 수 있는 관리형 서비스를 제공합니다. Spark 및 Presto용 Amazon EMR 런타임에는 오픈 소스 Apache Spark 및 Presto에 비해 두 배 이상의 성능 향상을 제공하는 최적화 기능이 포함되어 있습니다. Amazon EMR Serverless는 Amazon EMR의 새로운 배포 옵션이지만 데이터 엔지니어와 분석가는 클라우드에서 페타바이트 규모의 데이터 분석을 쉽고 비용 효율적으로 실행할 수 있습니다. 이 세션에 참여하여 개념, 설계 패턴, 라이브 데모를 사용하여 Amazon EMR/EMR 서버리스를 살펴보고 Spark 및 Hive 워크로드, Amazon EMR 스튜디오 및 Amazon SageMaker Studio와의 Amazon EMR 통합을 실행하는 것이 얼마나 쉬운지 알아보십시오.
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...Amazon Web Services Korea
로그 및 지표 데이터를 쉽게 가져오고, OpenSearch 검색 API를 사용하고, OpenSearch 대시보드를 사용하여 시각화를 구축하는 등 Amazon OpenSearch의 새로운 기능과 기능에 대해 자세히 알아보십시오. 애플리케이션 문제를 디버깅할 수 있는 OpenSearch의 Observability 기능에 대해 알아보세요. Amazon OpenSearch Service를 통해 인프라 관리에 대해 걱정하지 않고 검색 또는 모니터링 문제에 집중할 수 있는 방법을 알아보십시오.
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...Amazon Web Services Korea
데이터 거버넌스는 전체 프로세스에서 데이터를 관리하여 데이터의 정확성과 완전성을 보장하고 필요한 사람들이 데이터에 액세스할 수 있도록 하는 프로세스입니다. 이 세션에 참여하여 AWS가 어떻게 분석 서비스 전반에서 데이터 준비 및 통합부터 데이터 액세스, 데이터 품질 및 메타데이터 관리에 이르기까지 포괄적인 데이터 거버넌스를 제공하는지 알아보십시오. AWS에서의 스트리밍에 대해 자세히 알아보십시오.
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...Amazon Web Services Korea
이 세션에 참여하여 Amazon Redshift의 새로운 기능을 자세히 살펴보십시오. Amazon Data Sharing, Amazon Redshift Serverless, Redshift Streaming, Redshift ML 및 자동 복사 등에 대한 자세한 내용과 데모를 통해 Amazon Redshift의 새로운 기능을 알고 싶은 사용자에게 적합합니다.
From Insights to Action, How to build and maintain a Data Driven Organization...Amazon Web Services Korea
데이터는 혁신과 변혁의 토대입니다. 비즈니스 혁신을 이끄는 혁신은 특정 시점의 전략이나 솔루션이 아니라 성장을 위한 반복적이고 집단적인 계획입니다. 혁신에 이러한 접근 방식을 채택하는 기업은 전략과 비즈니스 문화에서 데이터를 기반으로 하는 경우가 많습니다. 이러한 접근 방식을 개발하려면 리더가 데이터를 조직의 자산처럼 취급하고 조직이 더 나은 비즈니스 성과를 위해 데이터를 활용할 수 있도록 권한을 부여해야 합니다. AWS와 Amazon이 어떻게 데이터와 분석을 활용하여 확장 가능한 비즈니스 효율성을 창출하고 고객의 가장 복잡한 문제를 해결하는 메커니즘을 개발했는지 알아보십시오.
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...Amazon Web Services Korea
데이터는 최종 소비자의 성공에 초점을 맞춘 디지털 혁신에서 중추적인 역할을 하고 있습니다. 모든 기업들은 데이터를 자산으로 사용하여 사례 제공을 추진하고 까다로운 결과를 해결하고 있습니다. AWS 클라우드 기술과 분석 솔루션의 강력한 성능을 통해 고객은 혁신 여정을 가속화할 수 있습니다. 이 세션에서는 기업 고객들이 클라우드에서 데이터의 힘을 활용하여 혁신 목표를 달성하고 필요한 결과를 제공하는 방법에 대해 다룹니다.
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...Amazon Web Services Korea
LG ThinQ는 LG전자의 가전제품과 서비스를 아우르는 플랫폼 브랜드로서 앱 하나로 간편한 컨트롤, 똑똑한 케어, 스마트한 쇼핑까지 한번에 가능한 플랫폼입니다. ThinQ 플랫폼은 글로벌 서비스로 제공되고 있어, 작업 시간을 최소화하고, 서비스의 영향을 최소화 할 필요가 있었습니다. 따라서 DB 버전 업그레이드 작업 시 애플리케이션 배포가 필요없는 Blue/Green Deployment 방식은 최선의 선택이 되었습니다.
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...Amazon Web Services Korea
온프레미스 분석 플랫폼에는 자원 증설 비용, 자원 관리 비용, 신규 자원 도입 및 환경 설정의 리드타임 등 다양한 측면에서의 한계가 존재합니다. 이에 KB국민카드에서는 기존 분석 플랫폼의 한계를 극복함과 동시에 시너지를 낼 수 있는 클라우드 기반 분석 플랫폼을 설계 및 도입하였습니다. 본 사례 소개는 KB국민카드의 데이터 혁신 여정과 노하우를 소개합니다.
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...Amazon Web Services Korea
SK Telecom의 망관리 프로젝트인 TANGO에서는 오라클을 기반으로 시스템을 구축하여 운영해 왔습니다. 하지만 늘어나는 사용자와 데이터로 인해 유연하고 비용 효율적인 인프라가 필요하게 되었고, 이에 클라우드 도입을 검토 및 실행에 옮기게 되었습니다. TANGO 프로젝트의 클라우드 도입을 위한 검토부터 준비, 실행 및 이를 통해 얻게 된 교훈과 향후 계획에 대해 소개합니다.
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...Amazon Web Services Korea
2022년 코리안리는 핵심업무시스템(기간계/정보계 시스템)을 AWS 클라우드로 전환하는 사업과 AWS 클라우드 기반에서 손익분석을 위한 어플리케이션 구축 사업을 동시에 진행하고 있었습니다. 이에 따라 클라우드 전환 이후 시스템 간 상호운용성과 호환성을갖춘 데이터 분석 플랫폼 또한 필요하게 되었습니다. 코리안리 IT 환경에 적합한 플랫폼 선정을 위하여 AWS Native Analytics Platform, 3rd Party Analytics Platform (클라우데라, 데이터브릭스)과의 PoC를 진행하고, 최종적으로 AWS Native Analytics Platform 으로 확정하였습니다. 코리안리는 메가존클라우드와 함께 2022년 10월부터 4개월(구축 3개월, 안정화 및 교육 1개월) 동안 AWS 기반 데이터 분석 플랫폼을 구축하고 활용 범위를 지속적으로 확대하고 있습니다.
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...Amazon Web Services Korea
LG 이노텍은 세계 시장을 선도하는 글로벌 소재·부품기업으로, Amazon Redshift 을 데이터 분석 플랫폼의 핵심 서비스로 활용하고 있습니다.지속적인 데이터 증가와 업무 확대에 따른 유연한 아키텍처 개선의 필요성에 대처하기 위해, 2022년에 AWS 에서 발표된 Redshift Serverless 를 활용한, 비용 최적화된 아키텍처 개선 과정의 실사례를 엿볼수 있는 기회가 됩니다.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
3. 이번 웨비나 에서 들으실 내용..
이 강연에서는 AWS Elastic MapReduce, Amazon Redshift,
Amazon Kinesis 등 AWS가 제공하는 다양한 데이터 분석 도
구를 활용해 보다 간편하고 빠른 빅데이터 분석 서비스를 구
축하는 방법에 대해 소개합니다.
4. v
Agenda
• AWS Big data building blocks
• AWS Big data platform
• Log data collection & storage
• Introducing Amazon Kinesis
• Data Analytics & Computation
• Collaboration & sharing
• Netflix Use-case
21. v
Collection of Data
Sources
Aggrega8on
Tool
Data
Sink
Web
Servers
Applica8on
servers
Connected
Devices
Mobile
Phones
Etc
Scalable
method
to
collect
and
aggregate
Flume,
KaGa,
Kinesis,
Queue
Reliable
and
durable
des8na8on
OR
Des8na8ons
23. Run your own log collector
Your
applica0on
Amazon S3
DynamoDB
Any
other
data
store
Amazon S3
Amazon
EC2
24. Use a Queue
Amazon
Simple
Queue
Service
(SQS)
Amazon S3
DynamoDB
Any
other
data
store
25. Agency Customer: Video Analytics on AWS
Elastic Load
Balancer
Edge Servers
on EC2
Workers on
EC2
Logs Reports
HDFS Cluster
Amazon Simple Queue Service
(SQS)
Amazon Simple Storage Service (S3)
Amazon Elastic MapReduce
26. Use a Tool like FLUME, KAFKA, HONU etc
Flume running
on EC2
Amazon S3
Any
other
data
store
HDFS
27. v
Choice of tools
• (+) Pros / (-) Cons
• (+) Flexibility: Customers select the most appropriate software and underlying
infrastructure
• (+) Control: Software and hardware can be tuned to meet specific business and
scenario needs.
• (-) Ongoing Operational Complexity: Deploy, and manage an end-to-end system
• (-) Infrastructure planning and maintenance: Managing a reliable, scalable
infrastructure
• (-) Developer/ IT staff overhead: Developers, Devops and IT staff time and
energy expended
• (-) Unsupported Software: deprecated and/ pre-version 1 open source software
• Future – Need for to stream data for real time
32. Data
Sources
App.4
[Machine
Learning]
AWS
Endpoint
App.1
[Aggregate
&
De-‐Duplicate]
Data
Sources
Data
Sources
Data
Sources
App.2
[Metric
Extrac0on]
S3
DynamoDB
Redshift
App.3
[Sliding
Window
Analysis]
Data
Sources
Availability
Zone
Shard
1
Shard
2
Shard
N
Availability
Zone
Availability
Zone
Introducing Amazon Kinesis
Managed Service for Real-Time Processing of Big Data
EMR
33. Kinesis Architecture
Amazon Web Services
AZ AZ AZ
Durable, highly consistent storage replicates data
across three data centers (availability zones)
Aggregate and
archive to S3
Millions of
sources producing
100s of terabytes
per hour
Front
End
Authentication
Authorization
Ordered stream
of events supports
multiple readers
Real-time
dashboards
and alarms
Machine learning
algorithms or
sliding window
analytics
Aggregate analysis
in Hadoop or a
data warehouse
Inexpensive: $0.028 per million puts
34. Putting data into Kinesis
Managed Service for Ingesting Fast Moving Data
• Streams
are
made
of
Shards
⁻ A
Kinesis
Stream
is
composed
of
mul8ple
Shards
⁻ Each
Shard
ingests
up
to
1MB/sec
of
data,
and
up
to
1000
TPS
⁻ Each
Shard
emits
up
to
2
MB/sec
of
data
⁻ All
data
is
stored
for
24
hours
⁻ You
scale
Kinesis
streams
by
adding
or
removing
Shards
• Simple
PUT
interface
to
store
data
in
Kinesis
⁻ Producers
use
a
PUT
call
to
store
data
in
a
Stream
⁻ A
Par00on
Key
is
used
to
distribute
the
PUTs
across
Shards
⁻ A
unique
Sequence
#
is
returned
to
the
Producer
upon
a
successful
PUT
call
Producer
Shard 1
Shard 2
Shard 3
Shard n
Shard 4
Producer
Producer
Producer
Producer
Producer
Producer
Producer
Producer
Kinesis
36. v
Shard 1
Shard 2
Shard 3
Shard n
Shard 4
KCL Worker 1
KCL Worker 2
EC2 Instance
KCL Worker 3
KCL Worker 4
EC2 Instance
KCL Worker n
EC2 Instance
Kinesis
Building Kinesis Apps
Client library for fault-tolerant, at least-once, real-time
processing
• Key streaming application attributes:
• Be distributed, to handle multiple shards
• Be fault tolerant, to handle failures in hardware or software
• Scale up and down as the number of shards increase or decrease
• Kinesis Client Library (KCL) helps with distributed processing:
• Automatically starts a Kinesis Worker for each shard
• Simplifies reading from the stream by abstracting individual shards
• Increases / Decreases Kinesis Workers as # of shards changes
• Checkpoints to keep track of a Worker’s location in the stream
• Restarts Workers if they fail
• Use the KCL with Auto Scaling Groups
• Auto Scaling policies will restart EC2 instances if they fail
• Automatically add EC2 instances when load increases
• KCL will redistributes Workers to use the new EC2 instances
OR
• Use the Get APIs for raw reads of Kinesis data streams
37. 37
Easy
Administra0on
Managed
service
for
real-‐8me
streaming
data
collec8on,
processing
and
analysis.
Simply
create
a
new
stream,
set
the
desired
level
of
capacity,
and
let
the
service
handle
the
rest.
Real-‐0me
Performance
Perform
con8nual
processing
on
streaming
big
data.
Processing
latencies
fall
to
a
few
seconds,
compared
with
the
minutes
or
hours
associated
with
batch
processing.
High
Throughput.
Elas0c
Seamlessly
scale
to
match
your
data
throughput
rate
and
volume.
You
can
easily
scale
up
to
gigabytes
per
second.
The
service
will
scale
up
or
down
based
on
your
opera8onal
or
business
needs.
S3,
EMR,
Storm,
RedshiY,
&
DynamoDB
Integra0on
Reliably
collect,
process,
and
transform
all
of
your
data
in
real-‐8me
&
deliver
to
AWS
data
stores
of
choice,
with
Connectors
for
S3,
Redshi],
and
DynamoDB.
Build
Real-‐0me
Applica0ons
Client
libraries
that
enable
developers
to
design
and
operate
real-‐8me
streaming
data
processing
applica8ons.
Low
Cost
Cost-‐efficient
for
workloads
of
any
scale.
You
can
get
started
by
provisioning
a
small
stream,
and
pay
low
hourly
rates
only
for
what
you
use.
Amazon Kinesis: Key Developer Benefits
38. Customers using Amazon Kinesis
Mobile/
Social
Gaming
Digital
Adver0sing
Tech.
Deliver
con8nuous/
real-‐8me
delivery
of
game
insight
data
by
100’s
of
game
servers
Generate
real-‐8me
metrics,
KPIs
for
online
ad
performance
for
adver8sers/
publishers
Custom-‐built
solu8ons
opera8onally
complex
to
manage,
&
not
scalable
Store
+
Forward
fleet
of
log
servers,
and
Hadoop
based
processing
pipeline
• Delay
with
cri8cal
business
data
delivery
• Developer
burden
in
building
reliable,
scalable
pladorm
for
real-‐8me
data
inges8on/
processing
• Slow-‐down
of
real-‐8me
customer
insights
• Lost
data
with
Store/
Forward
layer
• Opera8onal
burden
in
managing
reliable,
scalable
pladorm
for
real-‐8me
data
inges8on/
processing
• Batch-‐driven
real-‐8me
customer
insights
Accelerate
8me
to
market
of
elas8c,
real-‐8me
applica8ons
–
while
minimizing
opera8onal
overhead
Generate
freshest
analy8cs
on
adver8ser
performance
to
op8mize
marke8ng
spend,
and
increase
responsiveness
to
clients
39. Digital Ad. Tech Metering with Kinesis
Con0nuous
Ad
Metrics
Extrac0on
Incremental
Ad.
Sta0s0cs
Computa0on
Metering
Record
Archive
Ad
Analy0cs
Dashboard
40. v
Collection of Data
Sources
Aggrega8on
Tool
Data
Sink
Web
Servers
Applica8on
servers
Connected
Devices
Mobile
Phones
Etc
Scalable
method
to
collect
and
aggregate
Flume,
KaGa,
Kinesis,
Queue
Reliable
and
durable
des8na8on
OR
Des8na8ons
43. Cloud Database and Storage Tier — Use the Right
Tool for the Job!
App/Web
Tier
Client
Tier
Data
Tier
Database
&
Storage
Tier
Search
Hadoop/HDFS
Cache
Blob
Store
SQL
NoSQL
44. App/Web
Tier
Client
Tier
Database
&
Storage
Tier
Amazon
RDS
Amazon
DynamoDB
Amazon
Elas0Cache
Amazon
S3
Amazon
Glacier
Amazon
CloudSearch
HDFS
on
Amazon
EMR
Cloud Database and Storage Tier — Use the Right
Tool for the Job!
45. v
What Database and Storage Should I
Use?
• Data structure
• Query complexity
• Data characteristics: hot, warm, cold
48. Amazon
RDS
Amazon
S3
Request
Rate
High
Low
Cost/GB
High
Low
Latency
Low
High
Data
Volume
Low
High
Amazon
Glacier
Amazon
CloudSearch
Structure
Low
High
Amazon
DynamoDB
Amazon
Elas8Cache
HDFS
49. What Data Store Should I Use?
Amazon
Elas0Cache
Amazon
DynamoDB
Amazon
RDS
Amazon
CloudSearch
Amazon
EMR
(HDFS)
Amazon
S3
Amazon
Glacier
Average
latency
ms
ms
ms,
sec
ms,sec
sec,min,hrs
ms,sec,min
(~
size)
hrs
Data
volume
GB
GB–TBs
(no
limit)
GB–TB
(3
TB
Max)
GB–TB
GB–PB
(~nodes)
GB–PB
(no
limit)
GB–PB
(no
limit)
Item
size
B-‐KB
KB
(64
KB
max)
KB
(~rowsize)
KB
(1
MB
max)
MB-‐GB
KB-‐GB
(5
TB
max)
GB
(40
TB
max)
Request
rate
Very
High
Very
High
High
High
Low
–
Very
High
Low–
Very
High
(no
limit)
Very
Low
(no
limit)
Storage
cost
$/GB/month
$$
¢¢
¢¢
$
¢
¢
¢
Durability
Low
-‐
Moderate
Very
High
High
High
High
Very
High
Very
High
Hot
Data
Warm
Data
Cold
Data
50. Decouple your storage and analysis engine
1. Single Version of Truth
2. Choice of multiple analytics Tools
3. Parallel execution from different teams
4. Lower cost
Learning
from
Nealix
51. v
S3 as a “single source of truth”
Courtesy http://techblog.netflix.com/2013/01/hadoop-platform-as-service-in-cloud.html
S3
52. Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Kinesis
Choose depending upon design
56. v
Batch Processing
• Take large amount of cold data and ask
questions
• Takes minutes or hours to get answers back
Example:
Genera-ng
hourly,
daily,
weekly
reports
58. v
Stream Processing (AKA Real Time)
• Take small amount of hot data and ask
questions
• Takes short amount of time to get your
answer back
Example:
1min
metrics
60. Amplab Big Data Benchmark
Scan
query
Aggregate
query
Join
query
hops://amplab.cs.berkeley.edu/benchmark/
61. v
What Batch Processing Technology Should I Use?
RedshiY
Impala
Presto
Spark
Hive
Query
Latency
Low
Low
Low
Low
-‐
Medium
Medium
-‐
High
Durability
High
High
High
High
High
Data
Volume
1.6PB
Max
~Nodes
~Nodes
~Nodes
~Nodes
Managed
Yes
EMR
bootstrap
EMR
bootstrap
EMR
bootstrap
Yes
(EMR)
Storage
Na8ve
HDFS
HDFS/S3
HDFS/S3
HDFS/S3
#
of
BI
Tools
High
Medium
High
Low
High
Query
Latency
(Low
is
beoer)
62. v
What Stream Processing Technology Should I Use?
Spark
Streaming
Apache
Storm
+
Trident
Kinesis
Client
Library
Scale/Throughput
~
Nodes
~
Nodes
~
Nodes
Data
Volume
~
Nodes
~
Nodes
~
Nodes
Manageability
Yes
(EMR
bootstrap)
Do
it
yourself
EC2
+
Auto
Scaling
Fault
Tolerance
Built-‐in
Built-‐in
KCL
Check
poin8ng
Programming
languages
Java,
Python,
Scala
Java,
Scala,
Clojure
Java,
Python
69. SQL based processing
Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Log
Aggrega0on
tools
Amazon
Redshift
Petabyte scale
Columnar Data -
warehouse
70. SQL based processing for unstructured data
Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Log
Aggrega0on
tools
Amazon
EMR
Amazon
Redshift
Pre-processing
framework
Petabyte scale
Columnar Data -
warehouse
71. Your choice of BI Tools on the cloud
Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Log
Aggrega0on
tools
Amazon
EMR
Amazon
Redshift
Pre-processing
framework
73. Collaboration and Sharing insights
Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Log
Aggrega0on
tools
Amazon
EMR
Amazon
Redshift
74. Sharing results and visualizations
Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Log
Aggrega0on
tools
Amazon
EMR
Amazon
Redshift
Web App Server
Visualization tools
75. Sharing results and visualizations and scale
Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Log
Aggrega0on
tools
Amazon
EMR
Amazon
Redshift
Web App Server
Visualization tools
76. Sharing results and visualizations
Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Log
Aggrega0on
tools
Amazon
EMR
Amazon
Redshift Business
Intelligence Tools
Business
Intelligence Tools
77. Geospatial Visualizations
Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Log
Aggrega0on
tools
Amazon
EMR
Amazon
Redshift Business
Intelligence Tools
Business
Intelligence Tools
GIS tools on
hadoop
GIS tools
Visualization tools
78. Rinse and Repeat
Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Log
Aggrega0on
tools
Amazon
EMR
Amazon
Redshift
Visualization tools
Business
Intelligence Tools
Business
Intelligence Tools
GIS tools on
hadoop
GIS tools
Amazon data pipeline
79. The complete architecture
Amazon
SQS
Amazon S3
DynamoDB
Any
SQL
or
NO
SQL
Store
Log
Aggrega0on
tools
Amazon
EMR
Amazon
Redshift
Visualization tools
Business
Intelligence Tools
Business
Intelligence Tools
GIS tools on
hadoop
GIS tools
Amazon data pipeline
88. 온라인 자습 및 실습
다양한 온라인 강의 자
료 및 실습을 통해 AWS
에 대한 기초적인 사용
법 및 활용 방법을 익히
실 수 있습니다.
강의식 교육
AWS 전문 강사가 진행하는 강의를
통해 AWS 클라우드로 고가용성,
비용 효율성을 갖춘 안전한 애플리
케이션을 만드는 방법을 알아보세
요. 아키텍쳐 설계 및 구현에 대한
다양한 오프라인 강의가 개설되어
있습니다.
인증 시험을 통해 클라우
드에 대한 자신의 전문 지
식 및 경험을 공인받고 개
발 경력을 제시할 수 있습
니다.
AWS 공인 자격증
http://aws.amazon.com/ko/training
다양한 교육 프로그램
89. AWS 기초 웨비나 시리즈에 참여해 주셔서 감사합니다!
이번 웨비나가 여러분의 궁금증 해소에 도움이 되었길 바랍니다.
이후 이어질 설문 조사를 통해 오늘 웨비나에 대한 의견을 알려주세요.
aws-korea-marketing@amazon.com
http://twitter.com/AWSKorea
http://facebook.com/AmazonWebServices.ko
http://youtube.com/user/AWSKorea
http://slideshare.net/AWSKorea