This document discusses real-time data processing using Amazon Web Services. It describes how to use Amazon Kinesis for real-time data ingestion and processing and Amazon Elastic MapReduce (EMR) for batch processing. It provides examples of using EMR for batch processing large amounts of log data and for interactive querying of data stored in Amazon S3. It also discusses using Kinesis as a data broker to distribute streaming data to multiple applications and using Kinesis with EMR, Spark, and Storm for real-time analytics.
이제 빅데이터란 개념은 익숙한 것이 되었지만 이를 비지니스에 적용하고 최대의 효과를 얻는 방법에 대한 고찰은 여전히 필요합니다. 소중한 데이터를 쉽게 저장 및 분석하고 시각화하는 것은 비즈니스에 대한 통찰을 얻기 위한 중요한 과정입니다.
이 강연에서는 AWS Elastic MapReduce, Amazon Redshift, Amazon Kinesis 등 AWS가 제공하는 다양한 데이터 분석 도구를 활용해 보다 간편하고 빠른 빅데이터 분석 서비스를 구축하는 방법에 대해 소개합니다.
이제 빅데이터란 개념은 익숙한 것이 되었지만 이를 비지니스에 적용하고 최대의 효과를 얻는 방법에 대한 고찰은 여전히 필요합니다. 소중한 데이터를 쉽게 저장 및 분석하고 시각화하는 것은 비즈니스에 대한 통찰을 얻기 위한 중요한 과정입니다.
이 강연에서는 AWS Elastic MapReduce, Amazon Redshift, Amazon Kinesis 등 AWS가 제공하는 다양한 데이터 분석 도구를 활용해 보다 간편하고 빠른 빅데이터 분석 서비스를 구축하는 방법에 대해 소개합니다.
Premiers pas et bonnes pratiques sur Amazon AWS présenté par Carlos Condé Au Xebia Cloud Day 2012.
La vidéo de la présentation est disponible ici : http://vimeo.com/44228168
Le Xebia Cloud Day 2012 est une conférence gratuite dédiée au Cloud Computing focalisée sur l'écosystème Java.
http://blog.xebia.fr/22-mai-2012-cloud-day-chez-xebia/
이 강연에서는 AWS Big Data 분석 아키텍처 모범 사례를 살펴보고 표준 SQL을 사용해 Amazon S3에 저장된 데이터를 간편하게 분석할 수 있는 대화식 쿼리 서비스인 Amazon Athena의 특징과 최신 기능들에 대하여 고객 사례와 함께 소개드립니다.
연사: Greg Khairallah, 아마존 웹서비스 Amazon Big Data 및 Athena 총괄 사업 개발 매니저
Nesta sessão faremos uma demonstração de controle e defesa de tráfego aéreo utilizando processamento em tempo real. Trataremos das boas práticas para ingestão, armazenamento, processamento e visualização de dados através de serviços da AWS como Kinesis, DynamoDB, Lambda, Redshift, Quicksight e Amazon Machine Learning.
Amazon Web Services (AWS) can make hosting scalable, highly-available websites and web applications easier and less expensive for the Enterprise Education customers. Join us for an informative webinar on tools AWS provides to elastically scale your architecture to avoid underutilized resources while reducing complexity with templates, partners, and tools to do much of the heavy lifting of creating and running a website for you.
You have heard how containers are great for running microservices, but running and managing large scale applications with microservices architectures is hard and often requires operating complex container management infrastructure. So what exactly is needed to get microservices to run in production at scale?
In this session, we will explore the reasoning and concepts behind microservices and how containers simplify building microservices based applications, and we will walk through a number of patterns used by our customers to run their microservices platforms. We will also dive deep into some of the challenges of running microservices, such as load balancing, service discovery, and secrets management, and we’ll see how Amazon EC2 Container Service (ECS) can help address them. We will also demo how you can easily deploy complex microservices applications using Amazon ECS.
Database migration simple, cross-engine and cross-platform migrations with ...Amazon Web Services
Learn how you can migrate databases with minimal downtime from on-premises and Amazon EC2 environments to Amazon RDS, Amazon Redshift, Amazon Aurora and EC2 databases using AWS Database Migration Service. We'll discuss homogeneous (e.g. Oracle-to-Oracle, PostgreSQL-to-PostgreSQL, etc.) and heterogeneous (e.g. Oracle to Aurora, SQL Server to MariaDB) database migrations. We'll also talk about the new AWS Schema Conversion Tool that saves you development time when migrating your Oracle and SQL Server database schemas, including PL/SQL and T-SQL procedural code, to their MySQL, MariaDB and Aurora equivalents. Best of all, we'll spend most of the time demonstrating the product and showing use cases designed to help your business.
Premiers pas et bonnes pratiques sur Amazon AWS présenté par Carlos Condé Au Xebia Cloud Day 2012.
La vidéo de la présentation est disponible ici : http://vimeo.com/44228168
Le Xebia Cloud Day 2012 est une conférence gratuite dédiée au Cloud Computing focalisée sur l'écosystème Java.
http://blog.xebia.fr/22-mai-2012-cloud-day-chez-xebia/
이 강연에서는 AWS Big Data 분석 아키텍처 모범 사례를 살펴보고 표준 SQL을 사용해 Amazon S3에 저장된 데이터를 간편하게 분석할 수 있는 대화식 쿼리 서비스인 Amazon Athena의 특징과 최신 기능들에 대하여 고객 사례와 함께 소개드립니다.
연사: Greg Khairallah, 아마존 웹서비스 Amazon Big Data 및 Athena 총괄 사업 개발 매니저
Nesta sessão faremos uma demonstração de controle e defesa de tráfego aéreo utilizando processamento em tempo real. Trataremos das boas práticas para ingestão, armazenamento, processamento e visualização de dados através de serviços da AWS como Kinesis, DynamoDB, Lambda, Redshift, Quicksight e Amazon Machine Learning.
Amazon Web Services (AWS) can make hosting scalable, highly-available websites and web applications easier and less expensive for the Enterprise Education customers. Join us for an informative webinar on tools AWS provides to elastically scale your architecture to avoid underutilized resources while reducing complexity with templates, partners, and tools to do much of the heavy lifting of creating and running a website for you.
You have heard how containers are great for running microservices, but running and managing large scale applications with microservices architectures is hard and often requires operating complex container management infrastructure. So what exactly is needed to get microservices to run in production at scale?
In this session, we will explore the reasoning and concepts behind microservices and how containers simplify building microservices based applications, and we will walk through a number of patterns used by our customers to run their microservices platforms. We will also dive deep into some of the challenges of running microservices, such as load balancing, service discovery, and secrets management, and we’ll see how Amazon EC2 Container Service (ECS) can help address them. We will also demo how you can easily deploy complex microservices applications using Amazon ECS.
Database migration simple, cross-engine and cross-platform migrations with ...Amazon Web Services
Learn how you can migrate databases with minimal downtime from on-premises and Amazon EC2 environments to Amazon RDS, Amazon Redshift, Amazon Aurora and EC2 databases using AWS Database Migration Service. We'll discuss homogeneous (e.g. Oracle-to-Oracle, PostgreSQL-to-PostgreSQL, etc.) and heterogeneous (e.g. Oracle to Aurora, SQL Server to MariaDB) database migrations. We'll also talk about the new AWS Schema Conversion Tool that saves you development time when migrating your Oracle and SQL Server database schemas, including PL/SQL and T-SQL procedural code, to their MySQL, MariaDB and Aurora equivalents. Best of all, we'll spend most of the time demonstrating the product and showing use cases designed to help your business.
AWS 클라우드를 활용하면 사용자의 트래픽에 따라 IT 인프라 아키텍처를 확장할 수 있습니다. 이번 강연에서는 서비스 초기의 작은 트래픽에 대응할 수 있는 단순한 아키텍처로 시작해 사업 성장 후의 수백만 사용자에 달하는 대규모 트래픽을 지탱할 수 있는 고확장성 아키텍처에 이르기까지의 단계별 아키텍처 구성 방법에 대해 소개해 드리고 컴퓨팅 및 데이터베이스 선택 및 사용자 증가에 따른 트래픽 경감 방법, 오토스케일링 및 모니터링과 자동화, DB 부하 분산, 고가용성 확보 등에 대한 다양한 모범사례를 알려드릴 예정입니다.
Amazon Elastic MapReduce (EMR) is a web service that allows you to easily and securely provision and manage your Hadoop clusters. In this talk, we will introduce you to Amazon EMR design patterns, such as using various data stores such as Amazon S3, how to take advantage of both transient and active clusters, as well as other Amazon EMR architectural patterns. We will dive deep on how to dynamically scale your cluster and address the ways you can fine-tune your cluster. We will discuss bootstrapping Hadoop applications from our partner ecosystem that you can use natively with Amazon EMR. Lastly, we will share best practices on how to keep your Amazon EMR cluster cost-effective.
Deep Dive and Best Practices for Real Time Streaming ApplicationsAmazon Web Services
Get answers to technical questions, frequently asked by those starting to work with streaming data. Learn best practices for building a real-time streaming data architecture on AWS with Amazon Kinesis, Spark Streaming, AWS Lambda, and Amazon EMR. First, we will focus on building a scalable, durable streaming data ingestion workflow from data producers like mobile devices, servers, or even web browsers. We will provide guidelines to minimize duplicates and achieve exactly-once processing semantics in your stream-processing applications. Then, we will show some of the proven architectures for processing streaming data using a combination of tools including Amazon Kinesis Stream, AWS Lambda, and Spark Streaming running on Amazon EMR.
Working with big volumes of data is a complicated task, but it's even harder if you have to do everything in real time and try to figure it all out yourself. Over the past decades many open-source projects helped solve problems within the data analytics lifecycle around ingestion, storage, processing and visualisation of data. This session will use practical examples to discuss architectural best practices and lessons learned when solving real-time analytics and data visualisation decision-making problems with open-source at scale with the power of Amazon Web Services. It furthermore dives into a demo, using source code from the AWS Labs to visualise live data streams at scale.
Olivier Klein, Solutions Architect, Amazon Web Services, Greater China
Building Big Data Applications with Serverless Architectures - June 2017 AWS...Amazon Web Services
Learning Objectives:
- Use cases and best practices for serverless big data applications
- Leverage AWS technologies such as AWS Lambda and Amazon Kinesis
- Learn to perform ETL, event processing, ad-hoc analysis, real-time processing, and MapReduce with serverless
Building data processing applications is challenging and time-consuming, and often requires specialized expertise to deploy and operate. With serverless computing, you can perform real-time stream processing of multiple data types without needing to spin up servers or install software, allowing you to deploy big data applications quickly and more easily. Come learn how you can use AWS Lambda with Amazon Kinesis to analyze streaming data in real-time and then store the results in a managed NoSQL database such as Amazon DynamoDB. You’ll learn tips and tricks for doing in-line processing, data manipulation, and even distributed MapReduce on large data sets.
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...Amazon Web Services
It is becoming increasingly important to analyze real time streaming data. It allows organizations to remain competitive by uncovering relevant, actionable insights. AWS makes it easy to capture, store, and analyze real-time streaming data.
In this webinar, we will guide you through some of the proven architectures for processing streaming data, using a combination of tools including Amazon Kinesis Streams, AWS Lambda, and Spark Streaming on Amazon Elastic MapReduce (EMR). We will then talk about common use cases and best practices for real-time data analysis on AWS.
Learning Objectives:
Understand how you can analyze real-time data streams using Amazon Kinesis, AWS Lambda, and Spark running on Amazon EMR
Learn use cases and best practices for streaming data applications on AWS
If you are interested to know more about AWS Chicago Summit, please use the following to register: http://amzn.to/1RooPPL
Many AWS customers store vast amounts of data in Amazon S3, a low cost, scalable, and durable object store; Amazon DynamoDB, a NoSQL database; or Amazon Kinesis, a real time data stream processing service. With large datasets in various AWS services, how do you derive value from this information in a cost-effective way? Using Amazon Elastic MapReduce (Amazon EMR) with applications in the Apache Hadoop ecosystem, you can directly interact with data in each of these storage services for scalable analytics workloads or ad hoc queries. You can quickly and easily launch an Amazon EMR cluster from the AWS Management Console, and scale your cluster to match the compute and memory resources needed for your workflow, independent from the storage capacity used in your AWS storage services. The webinar will accelerate your use of Amazon EMR by showing you how to create and monitor Amazon EMR clusters, and provide several use cases and architectures for using Amazon EMR with different AWS data stores.
Learning Objectives: • Recognize when to use Amazon EMR • Understand the steps required to set up and monitor an Amazon EMR cluster • Architect applications that effectively use Amazon EMR • Understand how to use HUE for ad hoc query of data in Amazon S3
Who Should Attend: • Developers, LOB owners, Continuous Integration & Continuous Delivery (CICD) practitioners
Log Analytics with Amazon Elasticsearch Service and Amazon Kinesis - March 20...Amazon Web Services
Log analytics is a common big data use case that allows you to analyze log data from websites, mobile devices, servers, sensors, and more for a wide variety of applications including digital marketing, application monitoring, fraud detection, ad tech, gaming, and IoT. In this tech talk, we will walk you step-by-step through the process of building an end-to-end analytics solution that ingests, transforms, and loads streaming data using Amazon Kinesis Firehose, Amazon Kinesis Analytics and AWS Lambda. The processed data will be saved to an Amazon Elasticsearch Service cluster, and we will use Kibana to visualize the data in near real-time.
Learning Objectives:
1. Reference architecture for building a complete log analytics solution
2. Overview of the services used and how they fit together
3. Best practices for log analytics implementation
What if there were an easier way to perform big data analysis with less setup, instant scaling, and no servers to provision and manage? With serverless computing, you can perform real-time stream processing of multiple data types without needing to spin up servers or install software. Come learn how you can use AWS Lambda with Amazon Kinesis to analyze streaming data in real-time and then store the results in a managed NoSQL database such as Amazon DynamoDB. You’ll learn tips and tricks for doing in-line processing, data manipulation, and even distributed MapReduce on large data sets.
Amazon Elastic MapReduce is one of the largest Hadoop operators in the world. Since its launch five years ago, AWS customers have launched more than 5.5 million Hadoop clusters.
In this talk, we introduce you to Amazon EMR design patterns such as using Amazon S3 instead of HDFS, taking advantage of both long and short-lived clusters and other Amazon EMR architectural patterns. We talk about how to scale your cluster up or down dynamically and introduce you to ways you can fine-tune your cluster. We also share best practices to keep your Amazon EMR cluster cost efficient.
Speakers:
Ian Meyers, AWS Solutions Architect
Ian McDonald, IT Director, SwiftKey
Amazon Elastic MapReduce (Amazon EMR) is a web service that allows you to easily and securely provision and manage your Hadoop clusters. In this talk, we will introduce you to Amazon EMR design patterns, such as using various data stores like Amazon S3, how to take advantage of both transient and active clusters, and how to work with other Amazon EMR architectural patterns. We will dive deep on how to dynamically scale your cluster and address the ways you can fine-tune your cluster. We will discuss bootstrapping Hadoop applications from our partner ecosystem that you can use natively with Amazon EMR. Lastly, we will share best practices on how to keep your Amazon EMR cluster cost-effective.
AWS Webcast - Managing Big Data in the AWS Cloud_20140924Amazon Web Services
This presentation deck will cover specific services such as Amazon S3, Kinesis, Redshift, Elastic MapReduce, and DynamoDB, including their features and performance characteristics. It will also cover architectural designs for the optimal use of these services based on dimensions of your data source (structured or unstructured data, volume, item size and transfer rates) and application considerations - for latency, cost and durability. It will also share customer success stories and resources to help you get started.
The world is producing an ever increasing volume, velocity, and variety of big data. Consumers and businesses are demanding up-to-the-second (or even millisecond) analytics on their fast-moving data, in addition to classic batch processing. AWS delivers many technologies for solving big data problems. But what services should you use, why, when, and how? In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
Presented by: Arie Leeuwesteijn, Principal Solutions Architect, Amazon Web Services
Customer Guest: Sander Kieft, Sanoma
Data processing and analysis is where big data is most often consumed - driving business intelligence (BI) use cases that discover and report on meaningful patterns in the data. In this session, we will discuss options for processing, analyzing and visualizing data. We will also look at partner solutions and BI-enabling services from AWS. Attendees will learn about optimal approaches for stream processing, batch processing and Interactive analytics. AWS services to be covered include: Amazon Machine Learning, Elastic MapReduce (EMR), and Redshift.
사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...Amazon Web Services Korea
Database Migration Service(DMS)는 RDBMS 이외에도 다양한 데이터베이스 이관을 지원합니다. 실제 고객사 사례를 통해 DMS가 데이터베이스 이관, 통합, 분리를 수행하는 데 어떻게 활용되는지 알아보고, 동시에 데이터 분석을 위한 데이터 수집(Data Ingest)에도 어떤 역할을 하는지 살펴보겠습니다.
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...Amazon Web Services Korea
Amazon ElastiCache는 Redis 및 MemCached와 호환되는 완전관리형 서비스로서 현대적 애플리케이션의 성능을 최적의 비용으로 실시간으로 개선해 줍니다. ElastiCache의 Best Practice를 통해 최적의 성능과 서비스 최적화 방법에 대해 알아봅니다.
Internal Architecture of Amazon Aurora (Level 400) - 발표자: 정달영, APAC RDS Speci...Amazon Web Services Korea
ccAmazon Aurora 데이터베이스는 클라우드용으로 구축된 관계형 데이터베이스입니다. Aurora는 상용 데이터베이스의 성능과 가용성, 그리고 오픈소스 데이터베이스의 단순성과 비용 효율성을 모두 제공합니다. 이 세션은 Aurora의 고급 사용자들을 위한 세션으로써 Aurora의 내부 구조와 성능 최적화에 대해 알아봅니다.
[Keynote] 슬기로운 AWS 데이터베이스 선택하기 - 발표자: 강민석, Korea Database SA Manager, WWSO, A...Amazon Web Services Korea
오랫동안 관계형 데이터베이스가 가장 많이 사용되었으며 거의 모든 애플리케이션에서 널리 사용되었습니다. 따라서 애플리케이션 아키텍처에서 데이터베이스를 선택하기가 더 쉬웠지만, 구축할 수 있는 애플리케이션의 유형이 제한적이었습니다. 관계형 데이터베이스는 스위스 군용 칼과 같아서 많은 일을 할 수 있지만 특정 업무에는 완벽하게 적합하지는 않습니다. 클라우드 컴퓨팅의 등장으로 경제적인 방식으로 더욱 탄력적이고 확장 가능한 애플리케이션을 구축할 수 있게 되면서 기술적으로 가능한 일이 달라졌습니다. 이러한 변화는 전용 데이터베이스의 부상으로 이어졌습니다. 개발자는 더 이상 기본 관계형 데이터베이스를 사용할 필요가 없습니다. 개발자는 애플리케이션의 요구 사항을 신중하게 고려하고 이러한 요구 사항에 맞는 데이터베이스를 선택할 수 있습니다.
Demystify Streaming on AWS - 발표자: 이종혁, Sr Analytics Specialist, WWSO, AWS :::...Amazon Web Services Korea
실시간 분석은 AWS 고객의 사용 사례가 점점 늘어나고 있습니다. 이 세션에 참여하여 스트리밍 데이터 기술이 어떻게 데이터를 즉시 분석하고, 시스템 간에 데이터를 실시간으로 이동하고, 실행 가능한 통찰력을 더 빠르게 얻을 수 있는지 알아보십시오. 일반적인 스트리밍 데이터 사용 사례, 비즈니스에서 실시간 분석을 쉽게 활성화하는 단계, AWS가 Amazon Kinesis와 같은 AWS 스트리밍 데이터 서비스를 사용하도록 지원하는 방법을 다룹니다.
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...Amazon Web Services Korea
Amazon EMR은 Apache Spark, Hive, Presto, Trino, HBase 및 Flink와 같은 오픈 소스 프레임워크를 사용하여 분석 애플리케이션을 쉽게 실행할 수 있는 관리형 서비스를 제공합니다. Spark 및 Presto용 Amazon EMR 런타임에는 오픈 소스 Apache Spark 및 Presto에 비해 두 배 이상의 성능 향상을 제공하는 최적화 기능이 포함되어 있습니다. Amazon EMR Serverless는 Amazon EMR의 새로운 배포 옵션이지만 데이터 엔지니어와 분석가는 클라우드에서 페타바이트 규모의 데이터 분석을 쉽고 비용 효율적으로 실행할 수 있습니다. 이 세션에 참여하여 개념, 설계 패턴, 라이브 데모를 사용하여 Amazon EMR/EMR 서버리스를 살펴보고 Spark 및 Hive 워크로드, Amazon EMR 스튜디오 및 Amazon SageMaker Studio와의 Amazon EMR 통합을 실행하는 것이 얼마나 쉬운지 알아보십시오.
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...Amazon Web Services Korea
로그 및 지표 데이터를 쉽게 가져오고, OpenSearch 검색 API를 사용하고, OpenSearch 대시보드를 사용하여 시각화를 구축하는 등 Amazon OpenSearch의 새로운 기능과 기능에 대해 자세히 알아보십시오. 애플리케이션 문제를 디버깅할 수 있는 OpenSearch의 Observability 기능에 대해 알아보세요. Amazon OpenSearch Service를 통해 인프라 관리에 대해 걱정하지 않고 검색 또는 모니터링 문제에 집중할 수 있는 방법을 알아보십시오.
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...Amazon Web Services Korea
데이터 거버넌스는 전체 프로세스에서 데이터를 관리하여 데이터의 정확성과 완전성을 보장하고 필요한 사람들이 데이터에 액세스할 수 있도록 하는 프로세스입니다. 이 세션에 참여하여 AWS가 어떻게 분석 서비스 전반에서 데이터 준비 및 통합부터 데이터 액세스, 데이터 품질 및 메타데이터 관리에 이르기까지 포괄적인 데이터 거버넌스를 제공하는지 알아보십시오. AWS에서의 스트리밍에 대해 자세히 알아보십시오.
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...Amazon Web Services Korea
이 세션에 참여하여 Amazon Redshift의 새로운 기능을 자세히 살펴보십시오. Amazon Data Sharing, Amazon Redshift Serverless, Redshift Streaming, Redshift ML 및 자동 복사 등에 대한 자세한 내용과 데모를 통해 Amazon Redshift의 새로운 기능을 알고 싶은 사용자에게 적합합니다.
From Insights to Action, How to build and maintain a Data Driven Organization...Amazon Web Services Korea
데이터는 혁신과 변혁의 토대입니다. 비즈니스 혁신을 이끄는 혁신은 특정 시점의 전략이나 솔루션이 아니라 성장을 위한 반복적이고 집단적인 계획입니다. 혁신에 이러한 접근 방식을 채택하는 기업은 전략과 비즈니스 문화에서 데이터를 기반으로 하는 경우가 많습니다. 이러한 접근 방식을 개발하려면 리더가 데이터를 조직의 자산처럼 취급하고 조직이 더 나은 비즈니스 성과를 위해 데이터를 활용할 수 있도록 권한을 부여해야 합니다. AWS와 Amazon이 어떻게 데이터와 분석을 활용하여 확장 가능한 비즈니스 효율성을 창출하고 고객의 가장 복잡한 문제를 해결하는 메커니즘을 개발했는지 알아보십시오.
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...Amazon Web Services Korea
데이터는 최종 소비자의 성공에 초점을 맞춘 디지털 혁신에서 중추적인 역할을 하고 있습니다. 모든 기업들은 데이터를 자산으로 사용하여 사례 제공을 추진하고 까다로운 결과를 해결하고 있습니다. AWS 클라우드 기술과 분석 솔루션의 강력한 성능을 통해 고객은 혁신 여정을 가속화할 수 있습니다. 이 세션에서는 기업 고객들이 클라우드에서 데이터의 힘을 활용하여 혁신 목표를 달성하고 필요한 결과를 제공하는 방법에 대해 다룹니다.
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...Amazon Web Services Korea
LG ThinQ는 LG전자의 가전제품과 서비스를 아우르는 플랫폼 브랜드로서 앱 하나로 간편한 컨트롤, 똑똑한 케어, 스마트한 쇼핑까지 한번에 가능한 플랫폼입니다. ThinQ 플랫폼은 글로벌 서비스로 제공되고 있어, 작업 시간을 최소화하고, 서비스의 영향을 최소화 할 필요가 있었습니다. 따라서 DB 버전 업그레이드 작업 시 애플리케이션 배포가 필요없는 Blue/Green Deployment 방식은 최선의 선택이 되었습니다.
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...Amazon Web Services Korea
온프레미스 분석 플랫폼에는 자원 증설 비용, 자원 관리 비용, 신규 자원 도입 및 환경 설정의 리드타임 등 다양한 측면에서의 한계가 존재합니다. 이에 KB국민카드에서는 기존 분석 플랫폼의 한계를 극복함과 동시에 시너지를 낼 수 있는 클라우드 기반 분석 플랫폼을 설계 및 도입하였습니다. 본 사례 소개는 KB국민카드의 데이터 혁신 여정과 노하우를 소개합니다.
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...Amazon Web Services Korea
SK Telecom의 망관리 프로젝트인 TANGO에서는 오라클을 기반으로 시스템을 구축하여 운영해 왔습니다. 하지만 늘어나는 사용자와 데이터로 인해 유연하고 비용 효율적인 인프라가 필요하게 되었고, 이에 클라우드 도입을 검토 및 실행에 옮기게 되었습니다. TANGO 프로젝트의 클라우드 도입을 위한 검토부터 준비, 실행 및 이를 통해 얻게 된 교훈과 향후 계획에 대해 소개합니다.
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...Amazon Web Services Korea
2022년 코리안리는 핵심업무시스템(기간계/정보계 시스템)을 AWS 클라우드로 전환하는 사업과 AWS 클라우드 기반에서 손익분석을 위한 어플리케이션 구축 사업을 동시에 진행하고 있었습니다. 이에 따라 클라우드 전환 이후 시스템 간 상호운용성과 호환성을갖춘 데이터 분석 플랫폼 또한 필요하게 되었습니다. 코리안리 IT 환경에 적합한 플랫폼 선정을 위하여 AWS Native Analytics Platform, 3rd Party Analytics Platform (클라우데라, 데이터브릭스)과의 PoC를 진행하고, 최종적으로 AWS Native Analytics Platform 으로 확정하였습니다. 코리안리는 메가존클라우드와 함께 2022년 10월부터 4개월(구축 3개월, 안정화 및 교육 1개월) 동안 AWS 기반 데이터 분석 플랫폼을 구축하고 활용 범위를 지속적으로 확대하고 있습니다.
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...Amazon Web Services Korea
LG 이노텍은 세계 시장을 선도하는 글로벌 소재·부품기업으로, Amazon Redshift 을 데이터 분석 플랫폼의 핵심 서비스로 활용하고 있습니다.지속적인 데이터 증가와 업무 확대에 따른 유연한 아키텍처 개선의 필요성에 대처하기 위해, 2022년에 AWS 에서 발표된 Redshift Serverless 를 활용한, 비용 최적화된 아키텍처 개선 과정의 실사례를 엿볼수 있는 기회가 됩니다.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
7. Why Amazon EMR?
Easy to Use
Launch a cluster in minutes
Low Cost
Pay an hourly rate
Elastic
Easily add or remove capacity
Reliable
Spend less time monitoring
Secure
Manage firewalls
Flexible
Control the cluster
8. Easy to deploy
AWS Management Console Command Line
Or use the Amazon EMR API with your favorite SDK.
9. Easy to monitor and debug
Integrated with Amazon CloudWatch
Monitor Cluster, Node, and IO
Monitor Debug
13. Try different configurations to find your optimal architecture.
CPU
c3 family
cc1.4xlarge
cc2.8xlarge
Memory
m2 family
r3 family
Disk/IO
d2 family
i2 family
General
m1 family
m3 family
Choose your instance types
Batch Machine Spark and Large
process learning interactive HDFS
14. Easy to add and remove compute
capacity on your cluster.
Match compute
demands with
cluster sizing.
Resizable clusters
15. Spot Instances
for task nodes
Up to 90%
off Amazon EC2
on-demand
pricing
On-demand for
core nodes
Standard
Amazon EC2
pricing for
on-demand
capacity
Easy to use Spot Instances
Meet SLA at predictable cost Exceed SLA at lower cost
16. Use bootstrap actions to install applications…
https://github.com/awslabs/emr-bootstrap-actions
17. …or to configure Hadoop
--bootstrap-action s3://elasticmapreduce/bootstrap-actions/configure-
hadoop
--keyword-config-file (Merge values in new config to existing)
--keyword-key-value (Override values provided)
Configuration File
Name
Configuration File
Keyword
File Name
Shortcut
Key-Value Pair
Shortcut
core-site.xml core C c
hdfs-site.xml hdfs H h
mapred-site.xml mapred M m
yarn-site.xml yarn Y y
18. Read data directly into Hive,
Apache Pig, and Hadoop
Streaming and Cascading from
Amazon Kinesis streams
No intermediate data
persistence required
Simple way to introduce real-time sources into
batch-oriented systems
Multi-application support and automatic
checkpointing
Amazon EMR Integration with Amazon Kinesis
20. Amazon S3 as your persistent data store
• Amazon S3
– Designed for 99.999999999% durability
– Separate compute and storage
• Resize and shut down Amazon EMR
clusters with no data loss
• Point multiple Amazon EMR clusters at
same data in Amazon S3
21. EMRFS makes it easier to leverage Amazon S3
• Better performance and error handling options
• Transparent to applications – just read/write to “s3://”
• Consistent view
– For consistent list and read-after-write for new puts
• Support for Amazon S3 server-side and client-side encryption
• Faster listing using EMRFS metadata
22. EMRFS support for Amazon S3 client-side encryption
Amazon S3
AmazonS3encryption
clients
EMRFSenabledfor
AmazonS3client-sideencryption
Key vendor (AWS KMS or your custom key vendor)
(client-side encrypted objects)
23. Amazon S3 EMRFS metadata
in Amazon DynamoDB
• List and read-after-write consistency
• Faster list operations
Number of
objects
Without Consistent
Views
With Consistent
Views
1,000,000 147.72 29.70
100,000 12.70 3.69
Fast listing of Amazon S3 objects using
EMRFS metadata
*Tested using a single node cluster with a m3.xlarge instance.
24. Optimize to leverage HDFS
• Iterative workloads
– If you’re processing the same dataset more than once
• Disk I/O intensive workloads
Persist data on Amazon S3 and use S3DistCp to
copy to HDFS for processing.
26. Amazon EMR example #1: Batch processing
GBs of logs pushed
to Amazon S3 hourly
Daily Amazon EMR
cluster using Hive to
process data
Input and output
stored in Amazon S3
250 Amazon EMR jobs per day, processing 30 TB of data
http://aws.amazon.com/solutions/case-studies/yelp/
27. Amazon EMR example #2: Long-running cluster
Data pushed to
Amazon S3
Daily Amazon EMR cluster
Extract, Transform, and Load
(ETL) data into database
24/7 Amazon EMR cluster
running HBase holds last 2
years’ worth of data
Front-end service uses
HBase cluster to power
dashboard with high
concurrency
28. Amazon EMR example #3: Interactive query
TBs of logs sent daily
Logs stored in
Amazon S3
Amazon EMR cluster using Presto for ad hoc
analysis of entire log set
Interactive query using Presto on multipetabyte warehouse
http://techblog.netflix.com/2014/10/using-presto-in-our-big-
data-platform.html
37. Amazon Kinesis – stream and shards
•Stream: A named entity to
capture and store data
•Shards: Unit of capacity
•Put – 1 MB/sec or 1000
TPS
•Get - 2 MB/sec or 5 TPS
•Scale by adding or removing
shards
•Replay in 24-hr. window
38. How to size your Amazon Kinesis stream
Consider 2 producers, each producing 2 KB records at 500 TPS:
Minimum of 2 shards for ingress of 2 MB/s
2 Applications can read with egress of 4MB/s
Shard
Shard
2 KB * 500 TPS = 1000 KB/s
2 KB * 500 TPS = 1000 KB/s
Application
Producers
Application
39. How to size your Amazon Kinesis stream
Consider 3 consuming applications each processing the data
Simple! Add another shard to the stream to spread the load
Shard
Shard
2 KB * 500 TPS = 1000 KB/s
2 KB * 500 TPS = 1000 KB/s
Application
Application
Application
Producers
Shard
40. Amazon Kinesis – distributed streams
• From batch to continuous processing
• Scale UP or DOWN without losing sequencing
• Workers can replay records for up to 24 hours
• Scale up to GB/sec without losing durability
– Records stored across multiple Availability Zones
• Run multiple parallel Amazon Kinesis applications
42. Batch
Micro
batch
Real
time
Pattern for real-time analytics…
Batch
analysis
Data Warehouse
Hadoop
Notifications
& alerts
Dashboards/
visualizations
APIsStreaming
analytics
Data
streams
Deep learning
Dashboards/
visualizations
Spark-Streaming
Apache Storm
Amazon KCL
Data
archive
43. Real-time analytics
• Streaming
– Event-based response within seconds; for example,
detecting whether a transaction is a fraud or not
• Micro-batch
– Operational insights within minutes; for example,
monitor transactions from different regions
Kinesis
Client
Library
45. Amazon KCL design components
• Worker: The processing unit that maps to each application
instance
• Record processor: The processing unit that processes data
from a shard of an Amazon Kinesis stream
• Check-pointer: Keeps track of the records that have already
been processed in a given shard
Amazon KCL restarts the processing of the shard at the last-
known processed record if a worker fails
46. Amazon Kinesis Connector Library
• Amazon S3
– Archival of data
• Amazon Redshift
– Micro-batching loads
• Amazon DynamoDB
– Real-time Counters
• Elasticsearch
– Search and Index
S3 Dynamo DB Amazon
Redshift
Amazon
Kinesis
47. Read data directly into
Hive, Pig, Streaming,
and Cascading from
Amazon Kinesis
Real-time sources into batch-oriented systems
Multi-application support & check-pointing
EMR integration with Amazon
Kinesis
48. DStream
RDD@T1 RDD@T2
Messages
Receiver
Spark streaming – Basic concepts
• Higher-level abstraction called Discretized Streams
(DStreams)
• Represented as sequences of Resilient Distributed
Datasets (RDDs)
http://spark.apache.org/docs/latest/streaming-kinesis-integration.html
49. Apache Storm: Basic concepts
• Streams: Unbounded sequence of tuples
• Spout: Source of stream
• Bolts: Processes that input streams and output new streams
• Topologies: Network of spouts and bolts
https://github.com/awslabs/kinesis-storm-spout
52. Cost-saving tips
• Use Amazon S3 as your persistent data store (only pay for compute
when you need it!).
• Use Amazon EC2 Spot Instances (especially with task nodes) to
save 80 percent or more on the Amazon EC2 cost.
• Use Amazon EC2 Reserved Instances if you have steady
workloads.
• Create CloudWatch alerts to notify you if a cluster is underutilized
so that you can shut it down (e.g. Mappers running == 0 for more
than N hours).
• Contact your sales rep about custom pricing options, if you are
spending more than $10K per month on Amazon EMR.