Amazon Kinesis Data Streams is a scalable, durable, and low-cost streaming data service from
Amazon Web Services. Kinesis Data Streams can continuously capture gigabytes of data per second from
hundreds of thousands of sources, including website clickstreams, database event streams, financial
transactions, social media feeds, IT logs, and location-tracking events. The collected data is available
in milliseconds, which enables real-time dashboards, real-time anomaly detection, and dynamic pricing.
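As a rough illustration of the producer side, the sketch below groups events into batches shaped for the Kinesis `PutRecords` API, which accepts up to 500 records per request. The event fields, the `user_id` partition key, and the stream name in the usage note are hypothetical; the code only builds the payloads and does not call AWS.

```python
import json
from typing import Dict, Iterable, Iterator, List

# PutRecords accepts at most 500 records per request.
MAX_RECORDS_PER_CALL = 500


def to_put_records_batches(
    events: Iterable[Dict], key_field: str = "user_id"
) -> Iterator[List[Dict]]:
    """Yield lists of {"Data", "PartitionKey"} entries, at most 500 per list.

    key_field is a hypothetical event attribute used as the partition key,
    so that events for the same key land on the same shard.
    """
    batch: List[Dict] = []
    for event in events:
        batch.append(
            {
                "Data": json.dumps(event).encode("utf-8"),
                "PartitionKey": str(event[key_field]),
            }
        )
        if len(batch) == MAX_RECORDS_PER_CALL:
            yield batch
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield batch
```

With boto3, each batch could then be sent with something like `client.put_records(StreamName="clickstream", Records=batch)`, where `clickstream` is a placeholder stream name; a real producer would also inspect `FailedRecordCount` in the response and retry failed entries.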
Today, organizations find themselves in a data-rich world with a growing need for agility and easier access to all this data, so they can analyze it and derive insights that drive strategic decisions. Creating a data lake helps you manage all the disparate sources of data you collect, in their original format, and extract value from them. In this session, learn how to architect and implement an analytics data lake. Hear customer examples of best practices and learn from their architectural blueprints.
With distributed frameworks like Hadoop and Kafka, it is essential to deploy the right environment to support these workloads. Learn about the different block storage options from AWS, and walk through with our experts how to select the best option for your big data analytics workloads. We will demonstrate how to set up, select, and modify volume types to right-size your environment.
A data lake is a flat data store to collect data in its original form, without the need to enforce a predefined schema. Instead, new schemas or views are created "on demand", providing a far more agile and flexible architecture while enabling new types of analytical insights. AWS provides many of the building blocks required to help organizations implement a data lake. In this session, we introduce key concepts for a data lake and present aspects related to its implementation. We discuss critical success factors and pitfalls to avoid, as well as operational aspects such as security, governance, search, indexing, and metadata management. We also provide insight on how AWS enables a data lake architecture. Attendees get practical tips and recommendations to get started with their data lake implementations on AWS.
We will introduce key concepts for a data lake and present aspects related to its implementation. We will also discuss critical success factors, pitfalls to avoid, operational aspects, and how AWS enables a serverless data lake architecture.
Speaker: Sebastien Menant, Solutions Architect, Amazon Web Services
AWS Summit Singapore - Architecting a Serverless Data Lake on AWS | Amazon Web Services
Unni Pillai, Specialist Solution Architect, ASEAN, AWS.
Daniel Muller, Head of Cloud Infrastructure, Spuul.
As the volume and types of data continue to grow, customers often have valuable data that is not easily discoverable and available for analytics. A common challenge for data engineering teams is architecting a data lake that can cater to the needs of diverse users, from developers to business analysts to data scientists.
In this session, we will dive deep into building a data lake using Amazon S3, Amazon Kinesis, Amazon Athena and AWS Glue. We will also see how AWS Glue crawlers can automatically discover your data, extracting and cataloguing relevant metadata to reduce operations in preparing your data for downstream consumers.
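One common way to make objects in Amazon S3 easy for a crawler to catalogue is to write them under Hive-style `key=value` prefixes, which AWS Glue crawlers can pick up as partition columns. The helper below is a minimal sketch of that layout; the prefix, dataset, and file names are hypothetical and not taken from the session.

```python
from datetime import datetime, timezone


def partitioned_key(prefix: str, dataset: str, event_time: datetime, filename: str) -> str:
    """Build an S3 object key with year/month/day partitions in Hive style.

    The event time is normalised to UTC so that all writers agree on the
    partition a record belongs to.
    """
    t = event_time.astimezone(timezone.utc)
    return f"{prefix}/{dataset}/year={t:%Y}/month={t:%m}/day={t:%d}/{filename}"


# Hypothetical example: a clickstream file written on 5 April 2018 (UTC).
example_key = partitioned_key(
    "raw", "clicks", datetime(2018, 4, 5, 23, 30, tzinfo=timezone.utc), "part-0001.json.gz"
)
```

A query engine such as Amazon Athena can then restrict a scan to the relevant prefixes when a query filters on `year`, `month`, or `day`, assuming the table was registered with those partition columns.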
Furthermore, learn from our customer Spuul how they moved from data warehouse-based analytics to a serverless data lake. Why and how did Spuul undertake this journey? Hear about the benefits and challenges they encountered.
February 2016 Webinar Series - Architectural Patterns for Big Data on AWS | Amazon Web Services
With an ever-increasing set of technologies to process big data, organizations often struggle to understand how to build scalable and cost-effective big data applications.
In this webinar, we will simplify big data processing as a pipeline comprising various stages, and then show you how to choose the right technology for each stage based on criteria such as data structure, design patterns, and best practices.
Learning Objectives:
- Understand key AWS Big Data services including S3, Amazon EMR, Kinesis, and Redshift
- Learn architectural patterns for Big Data
- Hear best practices for building Big Data applications on AWS
Who Should Attend:
Architects, developers and data scientists who are looking to start a Big Data initiative
ABD318 Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ... | Amazon Web Services
Learn how to architect a data lake where different teams within your organization can publish and consume data in a self-service manner. As organizations aim to become more data-driven, data engineering teams have to build architectures that can cater to the needs of diverse users, from developers to business analysts to data scientists. Each of these user groups employs different tools, has different data needs, and accesses data in different ways.
In this talk, we will dive deep into assembling a data lake using Amazon S3, Amazon Kinesis, Amazon Athena, Amazon EMR, and AWS Glue. The session will feature Mohit Rao, Architect and Integration lead at Atlassian, the maker of products such as JIRA, Confluence, and Stride. First, we will look at a couple of common architectures for building a data lake. Then we will show how Atlassian built a self-service data lake, where any team within the company can publish a dataset to be consumed by a broad set of users.
Amazon QuickSight is a fast, cloud-powered business intelligence (BI) service that makes it easy to build visualizations, perform ad hoc analysis, and quickly get business insights from your data. In this session, we demonstrate how you can point Amazon QuickSight to AWS data stores, flat files, or other third-party data sources and begin visualizing your data in minutes. We also introduce you to SPICE, a Super-fast, Parallel, In-memory Calculation Engine in Amazon QuickSight, which performs advanced calculations and renders visualizations rapidly without requiring any additional infrastructure, SQL programming, or dimensional modeling, so you can seamlessly scale to hundreds of thousands of users and petabytes of data. Lastly, you will see how Amazon QuickSight provides smart visualizations and graphs that are optimized for your different data types, so that you get the most suitable visualization for your analysis, and how to share these visualization stories using the built-in collaboration tools.
ENT313 Deploying a Disaster Recovery Site on AWS: Minimal Cost with Maximum E... | Amazon Web Services
In the event of a disaster, you need to be able to recover lost data quickly to ensure business continuity. For critical applications, keeping your time to recover and data loss to a minimum as well as optimizing your overall capital expense can be challenging. This session presents AWS features and services along with Disaster Recovery architectures that you can leverage when building highly available and disaster resilient applications. We will provide recommendations on how to improve your Disaster Recovery plan and discuss example scenarios showing how to recover from a disaster.
Introduction to key architectural concepts to build a data lake using Amazon S3 as the storage layer and making this data available for processing with a broad set of analytic options including Amazon EMR and open source frameworks such as Apache Hadoop, Spark, Presto, and more.
How to build a data lake with AWS Glue Data Catalog (ABD213-R) re:Invent 2017 | Amazon Web Services
As data volumes grow and customers store more data on AWS, they often have valuable data that is not easily discoverable and available for analytics. The AWS Glue Data Catalog provides a central view of your data lake, making data readily available for analytics. We introduce key features of the AWS Glue Data Catalog and its use cases. Learn how crawlers can automatically discover your data, extract relevant metadata, and add it as table definitions to the AWS Glue Data Catalog. We will also explore the integration between AWS Glue Data Catalog and Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
(BDT310) Big Data Architectural Patterns and Best Practices on AWS | AWS re:I... | Amazon Web Services
The world is producing an ever-increasing volume, velocity, and variety of big data. Consumers and businesses are demanding up-to-the-second (or even millisecond) analytics on their fast-moving data, in addition to classic batch processing. AWS delivers many technologies for solving big data problems. But what services should you use, why, when, and how? In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
Increasing demands to collect, store, and analyze massive amounts of data often mean that the same tools and approaches that worked in the past don't work anymore. That's why many organizations are shifting to a data lake architecture. A data lake is an architectural approach that allows you to store massive amounts of data in a central location, so it's readily available to be categorized, processed, analyzed, and consumed by diverse groups within an organization. In this tech talk, we introduce key concepts for a data lake and present aspects related to its implementation. We highlight the core components of a data lake, such as storage, compute, analytics, databases, stream processing, data management, and security. We discuss how to choose the right technologies for each component of the data lake, based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. We also provide a reference architecture and recommendations to get started with a data lake implementation on AWS.
Learning Objectives:
- Understand key concepts and architectural components of a data lake architecture
- Describe how and when to use a broad set of analytic and data management tools in a data lake architecture
- Get insights on how to get started with a data lake implementation on AWS
ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa... | Amazon Web Services
Amazon RDS allows you to launch an optimally configured, secure and highly available database with just a few clicks. It provides cost-efficient and resizable capacity, automates time-consuming database administration tasks, and provides you with six familiar database engines to choose from: Amazon Aurora, Oracle, Microsoft SQL Server, PostgreSQL, MySQL and MariaDB. In this session, we will take a close look at the capabilities of Amazon RDS and explain how it works. We’ll also discuss the AWS Database Migration Service and AWS Schema Conversion Tool, which help you migrate databases and data warehouses with minimal downtime from on-premises and cloud environments to Amazon RDS and other Amazon services. Gain your freedom from expensive, proprietary databases while providing your applications with the fast performance, scalability, high availability, and compatibility they need.
Tackle Your Dark Data Challenge with AWS Glue - AWS Online Tech Talks | Amazon Web Services
Learning Objectives:
- Discover dark data that you are currently not analyzing.
- Analyze dark data without moving it into your data warehouse.
- Visualize the results of your dark data analytics.
Big Data Architectural Patterns and Best Practices on AWS | Amazon Web Services
The world is producing an ever increasing volume, velocity, and variety of big data. Consumers and businesses are demanding up-to-the-second (or even millisecond) analytics on their fast-moving data, in addition to classic batch processing. AWS delivers many technologies for solving big data problems. But what services should you use, why, when, and how? In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
(BDT308) Using Amazon Elastic MapReduce as Your Scalable Data Warehouse | AWS... | Amazon Web Services
In this presentation, we will demonstrate how to use Amazon Elastic MapReduce as your scalable data warehouse. Amazon EMR supports clusters with thousands of nodes and is used to access petabyte-scale data warehouses. Amazon EMR is not only fast but also easy to use for rapid development and ad hoc analysis. We will show you how to access large-scale data warehouses with emerging tools such as Hue and Hive, low-latency SQL applications like Presto, and alternative execution engines like Apache Spark. We will also show you how these tools integrate directly with other AWS big data services such as Amazon S3, Amazon DynamoDB, and Amazon Kinesis.
by Pubali Sen, Solutions Architect, AWS
Everything generates logs. Applications, infrastructure, security ... everything. Keeping track of the flood of log data is a big challenge, yet critical to your ability to understand your systems and troubleshoot (or prevent) issues. In this session, we will use both Amazon CloudWatch and application logs to show you how to build an end-to-end log analytics solution. First, we cover how to configure an Amazon Elasticsearch Service domain and ingest data into it using Amazon Kinesis Firehose, demonstrating how easy it is to transform data with Firehose. We look at best practices for choosing instance types, storage options, shard counts, and index rotations based on the throughput of incoming data, and configure a secure analytics environment. We demonstrate how to set up a Kibana dashboard and build custom dashboard widgets. Finally, we dive deep into the Elasticsearch query DSL and review approaches for generating custom, ad hoc reports.
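To give a flavour of the Elasticsearch query DSL mentioned above, here is a minimal sketch that builds the JSON body for an ad hoc report: recent error logs for one service, counted per host. The field names (`service`, `level`, `host`, `@timestamp`) are assumptions about how the logs were indexed, not details from the session, and the code only constructs the request body.

```python
import json
from typing import Dict


def error_count_query(service: str, minutes: int = 15) -> Dict:
    """Return a search body counting recent ERROR logs per host.

    Assumes hypothetical keyword fields "service", "level", and "host",
    and a date field "@timestamp".
    """
    return {
        "size": 0,  # return only aggregation results, no individual hits
        "query": {
            "bool": {
                # filter clauses match without contributing to relevance scoring
                "filter": [
                    {"term": {"service": service}},
                    {"term": {"level": "ERROR"}},
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                ]
            }
        },
        "aggs": {"by_host": {"terms": {"field": "host", "size": 10}}},
    }


# The body would be sent as JSON to an index's _search endpoint.
body = json.dumps(error_count_query("checkout", minutes=30))
```

Putting the conditions under `filter` rather than `must` is a deliberate choice for reports like this, since scoring is irrelevant when only counts are returned.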
As the volume and types of data continue to grow, customers often have valuable data that is not easily discoverable and available for analytics. A common challenge for data engineering teams is architecting a data lake that can cater to the needs of diverse users, from developers to business analysts to data scientists. In this session, dive deep into building a data lake using Amazon S3, Amazon Kinesis, Amazon Athena and AWS Glue. Learn how AWS Glue crawlers can automatically discover your data, extracting and cataloguing relevant metadata to reduce operations in preparing your data for downstream consumers.
by Avijit Goswami, Sr. Solutions Architect, AWS
A data lake can be used as a source for both structured and unstructured data - but how? We'll look at using open standards including Spark and Presto with Amazon EMR, Amazon Redshift Spectrum and Amazon Athena to process and understand data.
Big Data Architectural Patterns and Best Practices on AWS | Amazon Web Services
by Dario Rivera, Solutions Architect, AWS
The world is producing an ever-increasing volume, velocity, and variety of big data. Consumers and businesses are demanding up-to-the-second (or even millisecond) analytics on their fast-moving data, in addition to classic batch processing. AWS delivers many technologies for solving big data problems. But what services should you use, why, when, and how? In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
Join us for an in-depth look at the current state of big data at AWS. Learn about the latest big data trends and industry use cases. Hear how other organizations are using the AWS big data platform to innovate and remain competitive. Take a look at some of the most recent AWS big data developments.
Streaming ETL for Data Lakes using Amazon Kinesis Firehose - May 2017 AWS Onl... | Amazon Web Services
Learning Objectives:
- Understand key requirements for collecting, preparing, and loading streaming data into data lakes
- Get an overview of transmitting data using Amazon Kinesis Firehose
- Learn how to perform data transformations with Amazon Kinesis Firehose
Data lakes enable your employees across the organization to access and analyze massive amounts of unstructured and structured data from disparate data sources, many of which generate data continuously and rapidly. Making this data available in a timely fashion for analysis requires a streaming solution that can durably and cost-effectively ingest this data into your data lake. Amazon Kinesis Firehose is a fully managed service that makes it easy to prepare and load streaming data into AWS. In this tech talk, we will provide an overview of Amazon Kinesis Firehose and dive deep into how you can use the service to collect, transform, batch, compress, and load real-time streaming data into your Amazon S3 data lakes.
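Firehose data transformation is typically done with an AWS Lambda function that receives a batch of base64-encoded records and returns each one marked `Ok`, `Dropped`, or `ProcessingFailed`. The handler below is a minimal sketch of that contract; the rule of dropping `DEBUG` events and the `level` field are hypothetical choices made for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform a Firehose batch: drop DEBUG events, newline-delimit the rest."""
    output = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Undecodable input is returned unchanged and flagged as failed.
            output.append(
                {"recordId": record_id, "result": "ProcessingFailed", "data": record["data"]}
            )
            continue

        if payload.get("level") == "DEBUG":  # hypothetical filtering rule
            output.append({"recordId": record_id, "result": "Dropped", "data": record["data"]})
            continue

        # A trailing newline keeps one JSON document per line in the delivered objects.
        line = json.dumps(payload) + "\n"
        output.append(
            {
                "recordId": record_id,
                "result": "Ok",
                "data": base64.b64encode(line.encode("utf-8")).decode("ascii"),
            }
        )
    return {"records": output}
```

Every incoming `recordId` appears exactly once in the response, which is what allows Firehose to match transformed records back to the originals.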
This session focuses on showing how companies can optimize their resources through cloud-based solutions, with an emphasis on differentiation, innovation, and reducing infrastructure risk.
By Ricardo Rentería, Amazon
Join us for a series of introductory and technical sessions on AWS Big Data solutions. Gain a thorough understanding of what Amazon Web Services offers across the big data lifecycle and learn architectural best practices for applying those solutions to your projects.
We will kick off this technical seminar in the morning with an introduction to the AWS Big Data platform, including a discussion of popular use cases and reference architectures. In the afternoon, we will deep dive into Machine Learning and Streaming Analytics. We will then walk everyone through building your first Big Data application with AWS.
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...Amazon Web Services
"Learn how to architect a data lake where different teams within your organization can publish and consume data in a self-service manner. As organizations aim to become more data-driven, data engineering teams have to build architectures that can cater to the needs of diverse users - from developers, to business analysts, to data scientists. Each of these user groups employs different tools, have different data needs and access data in different ways.
In this talk, we will dive deep into assembling a data lake using Amazon S3, Amazon Kinesis, Amazon Athena, Amazon EMR, and AWS Glue. The session will feature Mohit Rao, Architect and Integration lead at Atlassian, the maker of products such as JIRA, Confluence, and Stride. First, we will look at a couple of common architectures for building a data lake. Then we will show how Atlassian built a self-service data lake, where any team within the company can publish a dataset to be consumed by a broad set of users."
Amazon QuickSight is a fast, cloud-powered business intelligence (BI) service that makes it easy to build visualizations, perform ad-hoc analysis, and quickly get business insights from your data. In this session, we demonstrate how you can point Amazon QuickSight to AWS data stores, flat files, or other third-party data sources and begin visualizing your data in minutes. We also introduce you to SPICE - a Super-fast, Parallel, In-memory, Calculation Engine in Amazon QuickSight, which performs advanced calculations and render visualizations rapidly without requiring any additional infrastructure, SQL programming, or dimensional modeling, so you can seamlessly scale to hundreds of thousands of users and petabytes of data. Lastly, you will see how Amazon QuickSight provides you with smart visualizations and graphs that are optimized for your different data types, to ensure the most suitable and appropriate visualization to conduct your analysis, and how to share these visualization stories using the built-in collaboration tools.
ENT313 Deploying a Disaster Recovery Site on AWS: Minimal Cost with Maximum E...Amazon Web Services
In the event of a disaster, you need to be able to recover lost data quickly to ensure business continuity. For critical applications, keeping your time to recover and data loss to a minimum as well as optimizing your overall capital expense can be challenging. This session presents AWS features and services along with Disaster Recovery architectures that you can leverage when building highly available and disaster resilient applications. We will provide recommendations on how to improve your Disaster Recovery plan and discuss example scenarios showing how to recover from a disaster.
Introduction to key architectural concepts to build a data lake using Amazon S3 as the storage layer and making this data available for processing with a broad set of analytic options including Amazon EMR and open source frameworks such as Apache Hadoop, Spark, Presto, and more.
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017Amazon Web Services
As data volumes grow and customers store more data on AWS, they often have valuable data that is not easily discoverable and available for analytics. The AWS Glue Data Catalog provides a central view of your data lake, making data readily available for analytics. We introduce key features of the AWS Glue Data Catalog and its use cases. Learn how crawlers can automatically discover your data, extract relevant metadata, and add it as table definitions to the AWS Glue Data Catalog. We will also explore the integration between AWS Glue Data Catalog and Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
(BDT310) Big Data Architectural Patterns and Best Practices on AWS | AWS re:I...Amazon Web Services
The world is producing an ever increasing volume, velocity, and variety of big data. Consumers and businesses are demanding up-to-the-second (or even millisecond) analytics on their fast-moving data, in addition to classic batch processing. AWS delivers many technologies for solving big data problems. But what services should you use, why, when, and how? In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
"Increasing demands to collect, store, and analyze massive amounts of data often means that the same tools and approaches that worked in the past, don't work anymore. That's why many organizations are shifting to a data lake architecture. A data lake is an architectural approach that allows you to store massive amounts of data into a central location, so it's readily available to be categorized, processed, analyzed and consumed by diverse groups within an organization. In this tech talk, we introduce key concepts for a data lake and present aspects related to its implementation. We highlight the core components of a data lake, such as storage, compute, analytics, databases, stream processing, data management, and security. We discuss how to choose the right technologies for each component of the data lake, based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. We also provide a reference architecture and recommendations to get started with a data lake implementation on AWS.
Learning Objectives:
Understand key concepts and architectural components of a data lake architecture
Describe how and when to use a broad set of analytic and data management tools in a data lake architecture
Get insights on how to get started with a data lake implementation on AWS"
ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa...Amazon Web Services
Amazon RDS allows you to launch an optimally configured, secure and highly available database with just a few clicks. It provides cost-efficient and resizable capacity, automates time-consuming database administration tasks, and provides you with six familiar database engines to choose from: Amazon Aurora, Oracle, Microsoft SQL Server, PostgreSQL, MySQL and MariaDB. In this session, we will take a close look at the capabilities of Amazon RDS and explain how it works. We’ll also discuss the AWS Database Migration Service and AWS Schema Conversion Tool, which help you migrate databases and data warehouses with minimal downtime from on-premises and cloud environments to Amazon RDS and other Amazon services. Gain your freedom from expensive, proprietary databases while providing your applications with the fast performance, scalability, high availability, and compatibility they need.
Tackle Your Dark Data Challenge with AWS Glue - AWS Online Tech TalksAmazon Web Services
Learning Objectives:
- Discover dark data that you are currently not analyzing.
- Analyze dark data without moving it into your data warehouse.
- Visualize the results of your dark data analytics.
Big Data Architectural Patterns and Best Practices on AWSAmazon Web Services
The world is producing an ever increasing volume, velocity, and variety of big data. Consumers and businesses are demanding up-to-the-second (or even millisecond) analytics on their fast-moving data, in addition to classic batch processing. AWS delivers many technologies for solving big data problems. But what services should you use, why, when, and how? In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
(BDT308) Using Amazon Elastic MapReduce as Your Scalable Data Warehouse | AWS...
Amazon Web Services
In this presentation, we demonstrate how to use Amazon Elastic MapReduce as your scalable data warehouse. Amazon EMR supports clusters with thousands of nodes and is used to access petabyte-scale data warehouses. Amazon EMR is not only fast, but also easy to use for rapid development and ad hoc analysis. We will show you how to access large-scale data warehouses with emerging tools such as Hue and Hive, low-latency SQL applications like Presto, and alternative execution engines like Apache Spark. We will also show you how these tools integrate directly with other AWS big data services such as Amazon S3, Amazon DynamoDB, and Amazon Kinesis.
by Pubali Sen, Solutions Architect, AWS
Everything generates logs. Applications, infrastructure, security ... everything. Keeping track of the flood of log data is a big challenge, yet critical to your ability to understand your systems and troubleshoot (or prevent) issues. In this session, we will use both Amazon CloudWatch and application logs to show you how to build an end-to-end log analytics solution. First, we cover how to configure an Amazon Elasticsearch Service domain and ingest data into it using Amazon Kinesis Firehose, demonstrating how easy it is to transform data with Firehose. We look at best practices for choosing instance types, storage options, shard counts, and index rotations based on the throughput of incoming data, and configure a secure analytics environment. We demonstrate how to set up a Kibana dashboard and build custom dashboard widgets. Finally, we dive deep into the Elasticsearch query DSL and review approaches for generating custom, ad hoc reports.
As the volume and types of data continue to grow, customers often have valuable data that is not easily discoverable and available for analytics. A common challenge for data engineering teams is architecting a data lake that can cater to the needs of diverse users, from developers to business analysts to data scientists. In this session, dive deep into building a data lake using Amazon S3, Amazon Kinesis, Amazon Athena and AWS Glue. Learn how AWS Glue crawlers can automatically discover your data, extracting and cataloguing relevant metadata to reduce the work of preparing your data for downstream consumers.
by Avijit Goswami, Sr. Solutions Architect, AWS
A data lake can be used as a source for both structured and unstructured data - but how? We'll look at using open standards including Spark and Presto with Amazon EMR, Amazon Redshift Spectrum and Amazon Athena to process and understand data.
Big Data Architectural Patterns and Best Practices on AWS
Amazon Web Services
by Dario Rivera, Solutions Architect, AWS
The world is producing an ever-increasing volume, velocity, and variety of big data. Consumers and businesses are demanding up-to-the-second (or even millisecond) analytics on their fast-moving data, in addition to classic batch processing. AWS delivers many technologies for solving big data problems. But what services should you use, why, when, and how? In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
Join us for an in-depth look at the current state of big data at AWS. Learn about the latest big data trends and industry use cases. Hear how other organizations are using the AWS big data platform to innovate and remain competitive. Take a look at some of the most recent AWS big data developments.
Streaming ETL for Data Lakes using Amazon Kinesis Firehose - May 2017 AWS Onl...
Amazon Web Services
Learning Objectives:
- Understand key requirements for collecting, preparing, and loading streaming data into data lakes
- Get an overview of transmitting data using Amazon Kinesis Firehose
- Learn how to perform data transformations with Amazon Kinesis Firehose
Data lakes enable your employees across the organization to access and analyze massive amounts of unstructured and structured data from disparate data sources, many of which generate data continuously and rapidly. Making this data available in a timely fashion for analysis requires a streaming solution that can durably and cost-effectively ingest this data into your data lake. Amazon Kinesis Firehose is a fully managed service that makes it easy to prepare and load streaming data into AWS. In this tech talk, we will provide an overview of Amazon Kinesis Firehose and dive deep into how you can use the service to collect, transform, batch, compress, and load real-time streaming data into your Amazon S3 data lakes.
This session focuses on showing how companies can optimize their resources through cloud-based solutions, with an emphasis on differentiation, innovation, and reducing infrastructure risk.
By Ricardo Rentería, Amazon
Join us for a series of introductory and technical sessions on AWS Big Data solutions. Gain a thorough understanding of what Amazon Web Services offers across the big data lifecycle and learn architectural best practices for applying those solutions to your projects.
We will kick off this technical seminar in the morning with an introduction to the AWS Big Data platform, including a discussion of popular use cases and reference architectures. In the afternoon, we will deep dive into Machine Learning and Streaming Analytics. We will then walk everyone through building your first Big Data application with AWS.
This presentation from the AWS Lab at Cloud Expo Europe 2014 contains details of newly announced services from Amazon Web Services, including Amazon Kinesis, Amazon WorkSpaces, AWS CloudTrail (beta), Amazon AppStream and Amazon RDS for PostgreSQL (beta)
This is the complete deck presented at the Westin Calgary Hotel, on August 16th, 2016.
It covers the current state of the AWS Big Data Solution set. Contains several use cases of Big Data, Machine Learning, and a tutorial on how to implement and use Big Data on the AWS Cloud Platform.
Introducing Amazon Kinesis: Real-time Processing of Streaming Big Data (BDT10...
Amazon Web Services
"This presentation will introduce Kinesis, the new AWS service for real-time streaming big data ingestion and processing.
We’ll provide an overview of the key scenarios and business use cases suitable for real-time processing, and discuss how AWS designed Amazon Kinesis to help customers shift from a traditional batch-oriented processing of data to a continual real-time processing model. We’ll provide an overview of the key concepts, attributes, APIs and features of the service, and discuss building a Kinesis-enabled application for real-time processing. We’ll also contrast with other approaches for streaming data ingestion and processing. Finally, we’ll also discuss how Kinesis fits as part of a larger big data infrastructure on AWS, including S3, DynamoDB, EMR, and Redshift."
AWS FSI Symposium 2017 NYC - Moving at the Speed of Serverless ft Broadridge
Amazon Web Services
AWS’ suite of serverless technology has enabled enterprises in Financial Services to move quickly from conception to reality. By leveraging AWS, you can run code without provisioning or managing servers—and you only pay for what you use. In this session, we will walk through how we worked with Broadridge to take their Experience Manager application from design to deployment and provide details around how numerous AWS services were leveraged, including Cognito, Lambda, S3, DynamoDB, and SES. We will also dive into how the use of serverless technology can enable developers to move quickly, while improving security postures, minimizing management, and simplifying operations.
Amazon RDS allows you to launch an optimally configured, secure and highly available database with just a few clicks. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks, freeing you to focus on your applications and business. We’ll discuss Amazon RDS fundamentals, learn about the seven available database engines, and examine customer success stories.
AWS re:Invent 2016: AWS Database State of the Union (DAT320)
Amazon Web Services
Raju Gulabani, vice president of AWS Database Services (AWS), discusses the evolution of database services on AWS and the new database services and features we launched this year, and shares our vision for continued innovation in this space. We are witnessing an unprecedented growth in the amount of data collected, in many different shapes and forms. Storage, management, and analysis of this data requires database services that scale and perform in ways not possible before. AWS offers a collection of such database and other data services like Amazon Aurora, Amazon DynamoDB, Amazon RDS, Amazon Redshift, Amazon ElastiCache, Amazon Kinesis, and Amazon EMR to process, store, manage, and analyze data. In this session, we provide an overview of AWS database services and discuss how our customers are using these services today.
by Joyjeet Banerjee, Enterprise Solution Architect, AWS
Amazon RDS allows you to launch an optimally configured, secure and highly available database with just a few clicks. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks, freeing you to focus on your applications and business. We’ll discuss Amazon RDS fundamentals, learn about the seven available database engines, and examine customer success stories. Level 100
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Web Services
Real-time streaming analytics has become popular across many verticals and use cases. In AdTech, gaming, financial services, and IoT, AWS customers are leveraging the Amazon Kinesis platform to ingest billions of events every day and process them in real time. In this session, we will discuss Amazon Kinesis Streams, Amazon Kinesis Firehose, and Amazon Kinesis Analytics. We will show best practices and design patterns for integrating the Amazon Kinesis platform with other services like Amazon EMR, Amazon Redshift, Amazon Elasticsearch Service, and AWS Lambda, as well as third-party connectors like Storm, Spark, and more.
Building Data Analytics pipelines in the cloud using serverless technology
Domino Data Lab
Big data analytics is well known for uncovering hidden insights that give an organization an edge over the competition. But data does not need to be big in order to be useful. Smaller companies and startups may lack the volume of data that qualifies as big data, yet the variety of their data can still yield a trove of insights that help drive business strategy. Startups may also lack the resources to fund an additional, seemingly expensive development project. The key is simplicity: start small and simple, and architect for scalability and performance. But how do you start? In this presentation, we share our experience in building a cost-effective, AWS serverless data analytics platform that became an invaluable tool for sales, marketing and operational efficiency.
Serverless architectures simplify development work because servers and software are managed by a third-party cloud provider. Developers can focus on building the data wrangling and data analysis logic, while critical aspects like scalability and high availability are handled by the cloud provider. Serverless services also offer a pay-as-you-go model, where you pay only for the resources you use, so costs can be managed based on usage. In this presentation we will focus on techniques and best practices to build a big data analytics platform using AWS serverless services like Lambda, DynamoDB, S3, Kinesis, Athena, QuickSight and Amazon ML. We will highlight the strengths of each of these services and the role each plays in the data analytics pipeline. We compare and contrast these services with some of the other popular big data technologies like Hadoop, Spark and Kafka. We also demonstrate the use of these services to build intelligent components that detect anomalies, yield recommendations, simulate chat bots and generate predictive analytics.
MSC203_How Citrix Uses AWS Marketplace Solutions To Accelerate Analytic Workl...
Amazon Web Services
Find out how Citrix built a solution using Matillion ETL for Amazon Redshift from AWS Marketplace to load all data into an Amazon Redshift cluster, allowing them to do their analytics on the entire environment at a single time. We’ll discuss the transition made to consolidate multiple disparate databases in order to run analytic workloads, get a holistic view of all their data sources, and prevent inconsistent data from being captured.
What is AWS Lake Formation?
AWS Lake Formation is a fully managed service that makes it easy to create, maintain, and manage data
lakes. AWS Lake Formation automates most of the data lake construction process, reducing
the time it takes from months to weeks. The service acts as a single point of control for identifying,
retrieving, cleaning, and transforming data from thousands of sources, as well as enforcing security
policies across various services and acquiring and managing fresh data. AWS Lake Formation is
controlled via a single dashboard from which you can configure and alter all data lake lifecycle phases
and operations.
What is AWS Glue?
AWS Glue is a fully managed ETL (extract, transform, and load) data integration service.
Data integration is the process of preparing and combining data for analytics, machine learning,
and application development. AWS Glue is designed to make it simple and inexpensive not just to
categorize your data, but also to clean, enrich, and move it.
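To make the extract, transform, and load steps described above concrete, here is a minimal sketch in plain Python. It does not use the AWS Glue API; the sample CSV, the field names, and the `extract`, `transform`, and `load` helpers are all hypothetical and exist only to illustrate the pattern.

```python
import csv
import io
import json

# Hypothetical raw input: one row has stray whitespace, one has no amount.
RAW_CSV = """order_id,customer,amount
1001, alice ,19.99
1002,BOB,
1003,carol,5.00
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount, normalize names, cast types."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append({
            "order_id": int(row["order_id"]),
            "customer": row["customer"].strip().lower(),
            "amount": float(amount),
        })
    return cleaned


def load(rows):
    """Load: serialize rows as JSON Lines, standing in for a write to a target store."""
    return "\n".join(json.dumps(row) for row in rows)


if __name__ == "__main__":
    print(load(transform(extract(RAW_CSV))))
```

In a managed service, the same three stages would typically run as a scheduled job against real sources and targets rather than in-memory strings.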
What is AWS Data Pipeline?
AWS Data Pipeline is a web service that enables you to process and move data reliably across AWS
compute and storage services, as well as on-premises data sources. It
automates the extraction, transformation, combination, validation, and loading of data for further
analysis and visualization. It can handle many data streams at the same time and improves end-to-end
throughput by reducing errors and delays.
Amazon QuickSight is a cloud-based machine learning-powered business analytics tool provided by
Amazon Web Services. It allows businesses to make more informed, data-driven decisions.
Businesses may use Amazon QuickSight BI to generate and analyze data visualizations and receive
easy-to-understand insights to help them make better business decisions. These dynamic dashboards
may be integrated into a variety of apps, portals, and websites with ease. Because Amazon
QuickSight is scalable, it can handle tens of thousands of users without the need for extra
infrastructure or capacity planning. It is also device agnostic.
What is Amazon Redshift?
Amazon Redshift is a fully managed, petabyte-scale, cloud-based data warehouse product designed for large-scale dataset storage and analysis. It is also used to perform large-scale database migrations. Amazon Redshift helps keep your costs predictable and offers price performance that is up to three times better than other cloud data warehouses.
What is Amazon OpenSearch Service?
OpenSearch is a distributed, open-source search and analytics suite that can be used for real-time
application monitoring, log analysis, and website search, among other things. Together with OpenSearch
Dashboards, an integrated visualization tool that makes it easy for users to explore their data,
OpenSearch provides a highly scalable system for fast access and response to large volumes of
data. Like Elasticsearch and Apache Solr, OpenSearch is powered by the Apache Lucene search library.
OpenSearch and OpenSearch Dashboards were originally derived from Elasticsearch 7.10.2 and Kibana
7.10.2. All software in the OpenSearch project is released under the Apache License, Version 2.0 (ALv2).
Amazon MSK is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data. Apache Kafka is an open-source platform for building real-time streaming data pipelines and applications. With Amazon MSK, you can use native Apache Kafka APIs to populate data lakes, stream changes to and from databases, and power machine learning and analytics applications.
Amazon CloudSearch is a fully managed search service. It is easy to set up and use, and it is a
cost-effective search option. The underlying text search engine of Amazon CloudSearch is Apache Solr,
which supports full-text search, faceted search, real-time indexing, dynamic clustering, database
integration, NoSQL capabilities, and rich document handling.
Amazon EMR is an Amazon-managed cluster platform that makes it easier to handle and
analyze massive amounts of data using big data frameworks like Apache Hadoop and Apache
Spark on AWS. EMR may be used to process data for analytical and business intelligence
tasks in combination with Apache Hive and Apache Pig. EMR lets you transform and move
large amounts of data across AWS data stores and databases.
Amazon Athena is a serverless query service that makes it easy to analyze data in Amazon S3 using standard SQL. With Athena, there is no infrastructure to set up or manage, and you can start analyzing your data immediately. You don’t even need to load your data into Athena; it works directly with data stored in S3.
2. WHAT IS
KINESIS DATA STREAMS?
AMAZON KINESIS DATA STREAMS IS A
SCALABLE, LONG-LASTING, AND LOW-COST
STREAMING DATA SOLUTION FROM
AMAZON.
3. FEATURES OF KINESIS
Serverless
Highly available and durable
Low latency
Dedicated throughput per consumer
Choose between on-demand and provisioned capacity mode
Secure and compliant
Integrated with other AWS services
4. WHAT CAN I DO WITH
KINESIS DATA STREAMS?
Accelerated log and data feed intake and
processing
Real-time metrics and reporting
Real-time data analytics
Complex stream processing
5. WHY USE KINESIS DATA STREAMS?
As a streaming technology, Kinesis provides a
number of specific advantages. It is, in
particular, a managed service, which means
that AWS, rather than developers, is in charge
of most of the system administration. This
allows developers to concentrate more on
their code and less on system management.
6. QUOTAS AND LIMITS
There is no limit to the number of streams you may have
in your account while using provisioned mode.
Before base64 encoding, a record's data payload can be
up to 1 MB in size.
You can switch between on-demand and provisioned capacity
modes twice within 24 hours for any data stream in your AWS
account.
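The quotas above can be checked on the client side before anything is sent to a stream. The following is a rough sketch using only the Python standard library, not the AWS SDK. The function names are hypothetical, and it assumes that "1 MB" means 1,048,576 bytes and that the two-switch allowance applies to a rolling 24-hour window.

```python
import base64

# Assumption: "1 MB" is read as 1,048,576 bytes, measured before base64 encoding.
MAX_PAYLOAD_BYTES = 1024 * 1024
# Capacity-mode switches allowed per stream within 24 hours, per the quota above.
MAX_MODE_SWITCHES = 2
WINDOW_SECONDS = 24 * 60 * 60


def fits_in_record(payload: bytes) -> bool:
    """Return True if the raw payload is within the per-record size limit."""
    return len(payload) <= MAX_PAYLOAD_BYTES


def encoded_size(payload: bytes) -> int:
    """Size of the payload after base64 encoding (roughly 4/3 of the raw size)."""
    return len(base64.b64encode(payload))


def can_switch_mode(previous_switches, now) -> bool:
    """Return True if another mode switch is allowed at time `now`.

    `previous_switches` holds timestamps (in seconds) of earlier switches for
    one stream. Treating the quota as a rolling window is an assumption.
    """
    recent = [t for t in previous_switches if 0 <= now - t < WINDOW_SECONDS]
    return len(recent) < MAX_MODE_SWITCHES


if __name__ == "__main__":
    payload = b"x" * MAX_PAYLOAD_BYTES
    print(fits_in_record(payload), encoded_size(payload))
    print(can_switch_mode([0, 3600], now=7200))
```

Because the limit applies before encoding, a payload that passes `fits_in_record` will still occupy about a third more bytes once it is base64-encoded for transport.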