Database Week at the San Francisco Loft
ElastiCache & Redis
Redis is an open source, in-memory data store that delivers sub-millisecond response times enabling millions of requests per second to power real-time applications. It can be used as a fast database, cache, message broker, and queue. Amazon ElastiCache delivers the ease-of-use and power of Redis along with the availability, reliability, scalability, security, and performance suitable for the most demanding applications. We’ll take a close look at Redis and how to use it to power different use cases.
Speakers:
Smitty Weygant - Solutions Architect, AWS
Ben Willett - Sr. Solutions Architect, AWS
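As a quick illustration of the claims above, here is a minimal sketch of Redis serving as both a cache and a queue, using the redis-py client; the endpoint, key names, and the load_from_db helper are placeholders for this example, not part of the session material.

```python
import json
import redis

# Placeholder ElastiCache for Redis primary endpoint.
r = redis.Redis(host="my-cluster.xxxxxx.use1.cache.amazonaws.com",
                port=6379, decode_responses=True)

# Cache-aside: try the cache first, fall back to the database on a miss.
def get_user(user_id, load_from_db):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    user = load_from_db(user_id)          # hypothetical database loader
    r.setex(key, 300, json.dumps(user))   # keep it cached for five minutes
    return user

# Queue: producers LPUSH jobs, workers block on BRPOP.
r.lpush("jobs", json.dumps({"task": "resize", "image": "cat.png"}))
item = r.brpop("jobs", timeout=5)         # (queue, payload) or None on timeout
if item:
    _, payload = item
    print(json.loads(payload))
```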
This document discusses Amazon ElastiCache, a fully managed in-memory cache and database service. It provides Redis and Memcached compatible data stores that can be used for fast databases, caches, and other use cases. The document outlines key features of ElastiCache like security, high availability, scalability, and common usage patterns. It also provides an example of how GE uses ElastiCache Redis to power its Predix platform and make it easy for developers to create Redis clusters.
This document discusses the rise of non-relational databases and their advantages over traditional relational databases for modern cloud applications. It outlines how characteristics like scale, data volume, and developer access have changed. It promotes the idea of using different data store technologies based on data needs, rather than relying on a single database. Examples of Amazon's non-relational database services are provided, including DynamoDB, ElastiCache, and the new Neptune graph database.
by Jeff Duffy, Database Specialist Solutions Architect, AWS
Database Week at the AWS Loft is an opportunity to learn about Amazon’s broad and deep family of managed database services. These services provide easy, scalable, reliable, and cost-effective ways to manage your data in the cloud. We explain the fundamentals and take a technical deep dive into Amazon RDS and Amazon Aurora relational databases, Amazon DynamoDB non-relational databases, Amazon Neptune graph databases, and Amazon ElastiCache managed Redis, along with options for database migration, caching, search, and more. You'll learn how to get started, how to support applications, and how to scale.
Trafodion is a transactional SQL engine that runs on Hadoop and HBase, providing ANSI SQL access via ODBC/JDBC drivers. It maintains compatibility with Hadoop APIs while adding relational schema support, distributed transactions, secondary indexes and automatic parallelism. Trafodion uses HBase for storage but adds features like ACID compliance across rows and tables and more optimized performance for transactional workloads. By running SQL on Hadoop, Trafodion allows users to leverage existing SQL skills while gaining scalability and flexibility of big data platforms.
Trafodion – An Enterprise-Class SQL Based on Hadoop, by Krishna-Kumar
Trafodion is a joint HP Labs and HP-IT research project to develop an enterprise-class SQL-on-Hadoop DBMS engine that specifically targets operational workloads as opposed to analytic workloads. Operational SQL describes workloads previously known as OLTP (online transaction processing) and Operational Data Store (ODS) workloads, but expands that definition beyond the broad range of enterprise-level transactional applications (ERP, CRM, etc.) to include the new transactions generated from social and mobile data interactions and observations, and the new mixing of structured and semi-structured data.
Trafodion is an open source project that provides transactional SQL capabilities on HBase. It allows for distributed ACID transactions across SQL statements and tables. Trafodion addresses limitations of existing SQL-on-Hadoop solutions by providing features important for operational workloads like concurrency, interactive write speeds, and transactional data consistency guarantees. It is well suited for applications that involve real-time transaction processing, such as billing systems, reservation systems, and claims processing.
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud, by DataWorks Summit
This document discusses how organizations can leverage data and analytics to power their business models. It provides examples of Fortune 100 companies that are using Attunity products to build data lakes and ingest data from SAP and other sources into Hadoop, Apache Kafka, and the cloud in order to perform real-time analytics. The document outlines the benefits of Attunity's data replication tools for extracting, transforming, and loading SAP and other enterprise data into data lakes and data warehouses.
Next-generation Python Big Data Tools, powered by Apache Arrow, by Wes McKinney
This document discusses Apache Arrow, a new open source project that aims to standardize in-memory columnar data representations. It will enable faster data sharing and analysis across systems by avoiding costly serialization. The document outlines how Arrow focuses on CPU efficiency through cache locality, vectorized operations, and minimal overhead. It provides examples of how Arrow could improve I/O performance for Python tools interacting with big data systems and the Feather file format developed using Arrow. Language bindings for Arrow are under development for Python, R, Java and other languages.
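For a concrete feel of the ideas above, here is a small sketch using the pyarrow package: an in-memory columnar table round-tripped through the Feather format. The column names and values are invented for the example.

```python
import pyarrow as pa
import pyarrow.feather as feather

# Build an in-memory columnar table; no row-by-row serialization involved.
table = pa.table({
    "user_id": pa.array([1, 2, 3], type=pa.int64()),
    "score": pa.array([0.7, 0.9, 0.4], type=pa.float64()),
})

# Feather writes the columnar buffers almost directly to disk, so another
# process (Python or R) can read them back cheaply.
feather.write_feather(table, "scores.feather")
roundtrip = feather.read_table("scores.feather")
print(roundtrip.schema)
```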
Modern data is massive, quickly evolving, unstructured, and increasingly hard to catalog and understand from multiple consumers and applications. This presentation will guide you through the best practices for designing a robust data architecture, highlighting the benefits and typical challenges of data lakes and data warehouses. We will build a scalable solution based on managed services such as Amazon Athena, AWS Glue, and AWS Lake Formation.
Building Data Lakes and Analytics on AWS (IPExpo Manchester), by Javier Ramirez
Over 90% of today's data was generated in the last 2 years, and the rate of data growth isn't slowing down. In this session, we'll step through the challenges and best practices on how to capture all the data that is being generated, understand what data you have, and start driving insights and even predict the future using purpose built AWS Services.
We'll frame the session around common pitfalls of building data lakes and how to successfully drive analytics and insights from the data. This session focuses on the architecture patterns that bring together key AWS services rather than a deep dive on any single service. We'll show how services such as Amazon S3, AWS Glue, Amazon Redshift, Amazon Athena, Amazon EMR, Amazon Kinesis, and even Amazon Machine Learning services are put together to build a successful data lake for various roles, including both data scientists and business users.
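As one concrete piece of such an architecture, the sketch below runs an Athena query against data in S3 via boto3; the database, table, and result bucket names are placeholders.

```python
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Placeholder database, table, and result bucket.
started = athena.start_query_execution(
    QueryString="SELECT page, COUNT(*) AS hits FROM weblogs GROUP BY page",
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
query_id = started["QueryExecutionId"]

# Poll until the query finishes, then fetch the first page of results.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
print(rows[:5])
```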
Building an end to end image recognition service - Tel Aviv Summit 2018, by Amazon Web Services
In this session, we’ll learn how to build and deploy end to end solutions for ingesting and processing computer vision solutions, using machine learning models connected to live video streams, and getting insights such as face detection and object analysis. At the end of the session developers of all skill levels will be able to build their own deep learning powered, computer-vision applications. Attendees will learn how to experiment with different projects for face detection, object recognition and other video-based AWS Machine Learning services.
Big Data, Data Engineering, and Data Lakes on AWS, by Javier Ramirez
Epic Games uses AWS services extensively to gain insights from player data and ensure Fortnite remains engaging for its over 125 million players. Telemetry data from clients is collected with Kinesis and analyzed in real-time using Spark on EMR. Game designers use these insights to inform decisions. Epic also uses S3 as a data lake, DynamoDB for real-time queries, and EMR for batch processing. This analytics platform on AWS allows constant feedback to optimize the player experience.
Performance Optimizations in Apache Impala, by Cloudera, Inc.
Apache Impala is a modern, open-source MPP SQL engine architected from the ground up for the Hadoop data processing environment. Impala provides low latency and high concurrency for BI/analytic read-mostly queries on Hadoop, not delivered by batch frameworks such as Hive or Spark. Impala is written from the ground up in C++ and Java. It maintains Hadoop’s flexibility by utilizing standard components (HDFS, HBase, Metastore, Sentry) and is able to read the majority of the widely used file formats (e.g., Parquet, Avro, RCFile).
To reduce latency, such as that incurred from utilizing MapReduce or by reading data remotely, Impala implements a distributed architecture based on daemon processes that are responsible for all aspects of query execution and that run on the same machines as the rest of the Hadoop infrastructure. Impala employs runtime code generation using LLVM in order to improve execution times and uses static and dynamic partition pruning to significantly reduce the amount of data accessed. The result is performance that is on par with, or exceeds, that of commercial MPP analytic DBMSs, depending on the particular workload. Although initially designed for running on-premises against HDFS-stored data, Impala can also run on public clouds and access data stored in various storage engines such as object stores (e.g. AWS S3), Apache Kudu, and HBase. In this talk, we present Impala's architecture in detail and discuss the integration with different storage engines and the cloud.
How to Confidently Unleash Data to Meet the Needs of Your Entire Organization..., by Amazon Web Services
Where are you on the spectrum of IT leaders? Are you confident that you’re providing the technology and solutions that consistently meet or exceed the needs of your internal customers? Do your peers at the executive table see you as an innovative technology leader? Innovative IT leaders understand the value of getting data and analytics directly into the hands of decision makers, and into their own. In this session, Daren Thayne, Domo’s Chief Technology Officer, shares how innovative IT leaders are helping drive a culture change at their organizations. See how transformative it can be to have real-time access to all of the data that is relevant to YOUR job (including a complete view of your entire AWS environment), and understand how it can help you lead the way in applying that same pattern throughout your entire company.
This document discusses strategies for filling a data lake by improving the process of data onboarding. It advocates using a template-based approach to streamline data ingestion from various sources and reduce dependence on hardcoded procedures. The key aspects are managing ELT templates and metadata through automated metadata extraction. This allows generating integration jobs dynamically based on metadata passed at runtime, providing flexibility to handle different source data with one template. It emphasizes reducing the risks associated with large data onboarding projects by maintaining a standardized and organized data lake.
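The template idea can be illustrated with a toy, library-free sketch: one generic job whose column mapping comes entirely from metadata supplied at runtime. All field and file names here are invented.

```python
import csv

def run_ingestion(template, rows):
    """Apply a metadata-described column mapping to source rows."""
    for row in rows:
        yield {target: row[source]
               for target, source in template["mapping"].items()}

# In a real pipeline this metadata would come from automated extraction,
# not be hardcoded.
template = {
    "source": "orders.csv",
    "mapping": {"order_id": "id", "amount_usd": "amt", "placed_at": "ts"},
}

with open(template["source"], newline="") as f:
    for record in run_ingestion(template, csv.DictReader(f)):
        print(record)  # a real job would load this into the lake's raw zone
```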
From the Hadoop Summit 2015 Session with Tomer Shiran.
To deliver real-time impact from big data, organizations must evolve beyond traditional analytic approaches to support a new class of agile, distributed applications. Real-time Hadoop overcomes the limitations of batch programs reliant on data transformations and schema management. This session highlights how leading organizations are leveraging Hadoop and NoSQL to merge analytics and production data so they can make adjustments while business is happening to optimize revenue, mitigate risk, and reduce operational costs. Details include how companies have achieved real-time impact on their business, collapsed data silos, and automated in-line analytics with operational data for immediate impact.
Modern data ecosystems require new paradigms to address diverse data sources and user needs. Traditional assumptions about data originating from internal systems and a single data warehouse no longer apply. A new model called "Data Regions" establishes multiple environments for different data usage scenarios, including source onboarding, exploration, reporting, analytics and more. By supporting varied access, structures, domains and integrity across regions, Data Regions can address today's complex data challenges and modernize companies' data ecosystems.
What is aerospike database and why is it vastly superior to other database an..., by Aerospike
This document discusses Aerospike's hyperscale data solutions and its advantages over other NoSQL solutions. It highlights Aerospike's superior reliability and persistence, uniquely hyperscale architecture, proven adoption by industry pioneers, and ability to eliminate costs and complexity. It also discusses Aerospike's patented flash-optimized storage layer, multi-threaded massively parallel processing, and self-healing clusters. The document positions Aerospike as simplifying legacy architectures while solving scaling problems for enterprises.
As the official MongoDB-as-a-Service offering from MongoDB Inc., the maker of MongoDB, Atlas is becoming a very popular service for those who wish to build their applications in the cloud, whether on AWS, Azure, or GCP. One lesser-known cloud product offered on the Atlas platform is Stitch, a group of services designed to interact with Atlas in every conceivable way, including creating endpoints, triggers, user authentication flows, serverless functions, and a UI to handle all of this. Adding these together, you have a serverless solution running on top of the MongoDB cloud.
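A minimal sketch of the Atlas side with pymongo, assuming a placeholder SRV connection string of the kind copied from the Atlas UI:

```python
from pymongo import MongoClient

# Placeholder SRV connection string, as copied from the Atlas UI.
client = MongoClient(
    "mongodb+srv://appuser:secret@cluster0.example.mongodb.net/shop"
    "?retryWrites=true&w=majority"
)
db = client["shop"]

# Insert one document and read it back.
db.orders.insert_one({"order_id": 1, "status": "paid"})
print(db.orders.find_one({"order_id": 1}))
```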
The document discusses using Aerospike for online machine learning. Some key points:
- Aerospike provides a high-performance database that can store large-scale user data and models to power real-time analytics and online learning.
- Online learning allows models to be created and evolve continuously based on new data, rather than in batches, enabling more accurate predictions.
- Nielsen Marketing Cloud uses Aerospike to store over 150 billion model scores per day across thousands of concurrent models for applications like ad targeting and fraud detection.
- eXelate also leverages Aerospike for online learning, processing billions of events to train models each day across many nodes while monitoring model performance in real time.
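The serving-path access pattern such systems depend on looks roughly like the sketch below, written with the Aerospike Python client; the host, namespace, set, and bin names are assumptions for illustration.

```python
import aerospike

# Placeholder host; a real deployment would list cluster seed nodes.
config = {"hosts": [("127.0.0.1", 3000)]}
client = aerospike.client(config).connect()

# Store a per-user model score under a composite key (namespace, set, key).
key = ("test", "model_scores", "user:42")
client.put(key, {"model": "ctr_v3", "score": 0.137})

# Low-latency point read on the serving path.
_, _, bins = client.get(key)
print(bins["score"])
client.close()
```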
This document provides an agenda and overview for a hands-on introductory course on Spark and Zeppelin. The agenda includes a quick demo, overview of Spark and Zeppelin, a 1 hour lab, discussion of Spark 2.0 features, and a Q&A session. The overview sections explain key Spark concepts like RDDs, DataFrames, and MLlib as well as how Spark SQL, Streaming, and GraphX work. It also introduces the Apache Zeppelin notebook platform and Hortonworks Data Platform sandbox for experimenting with Spark and Hadoop technologies.
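A short taste of the DataFrame and Spark SQL concepts the lab covers, runnable in a Zeppelin note or any local PySpark installation; the sample events are invented.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("intro-lab").getOrCreate()

df = spark.createDataFrame(
    [("alice", "click"), ("bob", "click"), ("alice", "purchase")],
    ["user", "event"],
)

# The DataFrame API and Spark SQL are two views of the same engine.
df.groupBy("event").count().show()
df.createOrReplaceTempView("events")
spark.sql("SELECT user, COUNT(*) AS n FROM events GROUP BY user").show()
```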
Work with Machine Learning in Amazon SageMaker - BDA203 - Toronto AWS Summit, by Amazon Web Services
Organizations are using machine learning (ML) to address a host of business challenges, from product recommendations to demand forecasting. Until recently, developing these ML models took considerable time and effort, and it required expertise. In this session, we dive deep into Amazon SageMaker, a fully managed ML service that enables developers and data scientists to develop and deploy deep learning models quickly and easily. We walk through the features and benefits of Amazon SageMaker to get your ML models from concept to production.
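Once a model is deployed, calling it from an application is a short call against the runtime API. A hedged sketch via boto3 follows; the endpoint name and CSV payload format are assumptions about a hypothetical model.

```python
import boto3

runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

response = runtime.invoke_endpoint(
    EndpointName="demand-forecast-prod",  # hypothetical deployed endpoint
    ContentType="text/csv",
    Body="34.1,2,0,712\n",                # one feature row; format is model-specific
)
prediction = response["Body"].read().decode("utf-8")
print(prediction)
```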
What's new with Amazon Redshift - ADB203 - New York AWS Summit, by Amazon Web Services
Organizations cannot afford to have a data warehouse that scales slowly or requires a trade-off between performance and concurrency. Amazon Redshift scales to provide consistently fast performance with rapidly growing data and high user and query concurrency. In this session, we highlight Amazon Redshift’s current features and those that are coming soon. Next, we discuss how your Amazon Redshift data warehouse and Amazon S3 data lake enable you to scale storage and compute resources automatically and on demand. We also demo Amazon Redshift’s intelligent maintenance and administration operations that ensure your clusters perform at any scale.
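As a hedged sketch of driving such a warehouse programmatically, the snippet below uses the Redshift Data API through boto3, which avoids managing JDBC connections; the cluster, database, user, and table names are placeholders.

```python
import time
import boto3

rsd = boto3.client("redshift-data", region_name="us-east-1")

# Placeholder cluster, database, user, and table names.
stmt = rsd.execute_statement(
    ClusterIdentifier="analytics-cluster",
    Database="dev",
    DbUser="awsuser",
    Sql="SELECT event_date, COUNT(*) FROM sales GROUP BY event_date",
)

# Poll until the statement finishes, then fetch the result set.
while rsd.describe_statement(Id=stmt["Id"])["Status"] not in (
        "FINISHED", "FAILED", "ABORTED"):
    time.sleep(1)

result = rsd.get_statement_result(Id=stmt["Id"])
print(result["Records"][:5])
```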
Scalable, secure log analytics with Amazon ES - ADB302 - Chicago AWS Summit, by Amazon Web Services
The document discusses Amazon Elasticsearch Service (Amazon ES) and how it can be used for log analytics. Amazon ES is a fully managed service that makes it easy to deploy, manage, and scale Elasticsearch and Kibana in AWS. It allows users to ingest and analyze log data in real time to gain valuable insights from machine-generated data. The document provides examples of how various organizations use Amazon ES for infrastructure monitoring, application monitoring, container monitoring, and security information and event management. It also covers best practices for scaling Amazon ES as data volume increases.
ElastiCache: Deep Dive Best Practices and Usage Patterns - AWS Online Tech Talks, by Amazon Web Services
This document provides an overview of Amazon ElastiCache and discusses best practices and usage patterns. It describes how ElastiCache provides fully managed, in-memory caching for internet-scale applications using Redis or Memcached. Examples of common usage patterns are discussed, such as caching, real-time analytics, gaming leaderboards, and geospatial applications. Customer examples from BBC and Expedia are also presented that discuss how they leverage ElastiCache.
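One of the patterns mentioned, a gaming leaderboard, is a natural fit for Redis sorted sets. A minimal redis-py sketch follows, with invented players and a placeholder endpoint.

```python
import redis

# Placeholder endpoint; point this at an ElastiCache for Redis cluster.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# ZADD keeps members ordered by score; updates are O(log N).
r.zadd("leaderboard", {"ayla": 3120, "crono": 2890, "frog": 3305})
r.zincrby("leaderboard", 150, "crono")  # award points mid-game

# Top three players, highest score first.
top = r.zrevrange("leaderboard", 0, 2, withscores=True)
for rank, (player, score) in enumerate(top, start=1):
    print(rank, player, int(score))
```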
Reliable & Scalable Redis in the Cloud with Amazon ElastiCache (DAT202) - AWS..., by Amazon Web Services
This session covers the features and enhancements in our Redis-compatible service, Amazon ElastiCache for Redis. We cover key features, such as Redis 5, scalability and performance improvements, security and compliance, and much more. We also discuss upcoming features and customer case studies.
by Andre Hass, Specialist Technical Account Manager, AWS
Organizations use reports, dashboards, and analytics tools to extract insights from their data, monitor performance, and support decision making. To support these tools, data must be collected and prepared for use. We'll look at two approaches: the structured, centralized data repository of a Data Warehouse and the less-structured repository of a Data Lake. We'll compare these approaches, examine the services that support each, and explore how they work together.
Data Warehouses & Data Lakes: Data Analytics Week at the SF Loft, by Amazon Web Services
Data Warehouses and Data Lakes: Data Analytics Week at the San Francisco Loft
Organizations use reports, dashboards, and analytics tools to extract insights from their data, monitor performance, and support decision making. To support these tools, data must be collected and prepared for use. We'll look at two approaches: the structured, centralized data repository of a Data Warehouse and the less-structured repository of a Data Lake. We'll compare these approaches, examine the services that support each, and explore how they work together.
Level: Intermediate
Speakers:
Aser Moustafa - Data Warehouse Specialist Solutions Architect, AWS
Asim Kumar Sasmal - Big Data Consultant, AWS
This document provides an overview of data warehousing and data lake concepts. It discusses key stages in data collection, storage, analysis and consumption. Different data storage options like Amazon S3, DynamoDB and Redshift are presented along with considerations for which tool to use based on data characteristics. The document also covers stream storage options and best practices for building cost-conscious and decoupled data architectures.
by Amy Che, Sr Solutions Delivery Manager, AWS, and Marie Yap, Technical Account Manager, AWS
AWS Data & Analytics Week is an opportunity to learn about Amazon’s family of managed analytics services. These services provide easy, scalable, reliable, and cost-effective ways to manage your data in the cloud. We explain the fundamentals and take a technical deep dive into the Amazon Redshift data warehouse; Data Lake services including Amazon EMR, Amazon Athena, and Amazon Redshift Spectrum; Log Analytics with Amazon Elasticsearch Service; and data preparation and placement services with AWS Glue and Amazon Kinesis. You'll learn how to get started, how to support applications, and how to scale.
by Mikhail Prudnikov, Sr. Solutions Architect, AWS
In-memory data stores, such as ElastiCache for Redis, enable applications where response times are measured in microseconds. We’ll look at how to design and deploy high-performance applications using ElastiCache, Aurora, DynamoDB, DAX, and Lambda, then we’ll try it ourselves in a hands-on lab. You’ll need a laptop with a Firefox or Chrome browser.
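A hedged sketch of the pattern such a lab typically builds: a Lambda handler that serves reads from ElastiCache for Redis and falls back to DynamoDB on a miss. The table name, Redis endpoint, and event shape are assumptions.

```python
import json
import boto3
import redis

# Placeholder endpoint and table name.
cache = redis.Redis(host="my-redis.xxxxxx.use1.cache.amazonaws.com",
                    port=6379, decode_responses=True)
table = boto3.resource("dynamodb").Table("Products")

def handler(event, context):
    product_id = event["pathParameters"]["id"]
    key = f"product:{product_id}"

    cached = cache.get(key)
    if cached:                       # cache hit: microsecond-scale read
        return {"statusCode": 200, "body": cached}

    item = table.get_item(Key={"id": product_id}).get("Item", {})
    body = json.dumps(item, default=str)
    cache.setex(key, 60, body)       # short TTL keeps the cache fresh
    return {"statusCode": 200, "body": body}
```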
Accelerate Analytics at Scale with Amazon EMR - AWS Summit Sydney 2018, by Amazon Web Services
Accelerate Data Analytics at Scale with Amazon EMR
In this session you will learn the best practices and various use cases for performing data analytics at scale with Amazon EMR. We will introduce you to Amazon EMR design patterns and share how to use big data analytics to provide business insights.
Jonathan Fritz, Principal Product Manager, Amazon Web Services
by Ben Willett, Solutions Architect, AWS
Organizations use reports, dashboards, and analytics tools to extract insights from their data, monitor performance, and support decision making. To support these tools, data must be collected and prepared for use. We'll look at two approaches: the structured, centralized data repository of a Data Warehouse and the less-structured repository of a Data Lake. We'll compare these approaches, examine the services that support each, and explore how they work together.
Connecting the dots - How Amazon Neptune and Graph Databases can transform yo..., by Amazon Web Services
This document discusses Amazon Neptune, a fully managed graph database service. It provides an overview of graph databases and their advantages over traditional databases for modeling connected data. It then describes Amazon Neptune's key features, like automatic scaling, high availability across Availability Zones, integration with open standards like Gremlin and SPARQL, and ease of use on AWS. Examples are given showing how to model and query graph data using Gremlin and SPARQL. Finally, it discusses Amazon Neptune's architecture and roadmap for general availability later in 2018.
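Querying Neptune over its Gremlin endpoint from Python looks roughly like the sketch below, using the gremlinpython driver; the cluster endpoint and the person/knows schema are placeholders.

```python
from gremlin_python.driver import client

# Placeholder Neptune cluster endpoint.
gremlin = client.Client(
    "wss://my-neptune.cluster-xxxx.us-east-1.neptune.amazonaws.com:8182/gremlin",
    "g",
)

# Create two vertices and a 'knows' edge between them.
gremlin.submit(
    "g.addV('person').property('name','ada').as('a')."
    "addV('person').property('name','alan').as('b')."
    "addE('knows').from('a').to('b')"
).all().result()

# Follow the relationship back out.
friends = gremlin.submit(
    "g.V().has('person','name','ada').out('knows').values('name')"
).all().result()
print(friends)
gremlin.close()
```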
How to Scale from Zero to Your First 10 Million Users, by Amazon Web Services
AWS Summit Milano 2018
How to Scale from Zero to Your First 10 Million Users
Speaker: Giorgio Bonfiglio, AWS Technical Account Manager - Enterprise Support
NoSQL is a term used to describe high-performance, non-relational databases. NoSQL databases use a variety of data models, including document, graph, and key-value. NoSQL databases are recognized for ease of development, scalable performance, high availability, and resilience. With AWS, you can use fully managed NoSQL database services such as Amazon DynamoDB for all applications that need consistent, single-digit millisecond latency at any scale; Amazon Neptune, a graph database built for the cloud; and Amazon ElastiCache, an in-memory data store service where you can choose Redis or Memcached to power your real-time applications.
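The DynamoDB piece of that claim reduces to simple key-value calls through boto3. A minimal sketch, assuming a hypothetical Sessions table keyed on the string attribute "pk":

```python
import boto3

# Hypothetical table keyed on the string attribute "pk".
table = boto3.resource("dynamodb", region_name="us-east-1").Table("Sessions")

table.put_item(Item={"pk": "session#123", "user": "ada", "ttl": 1700000000})
item = table.get_item(Key={"pk": "session#123"}).get("Item")
print(item)
```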
ElastiCache Deep Dive: Design Patterns for In-Memory Data Stores (DAT302-R1) ..., by Amazon Web Services
In this session, we provide a behind-the-scenes peek to learn about the design and architecture of Amazon ElastiCache. See common design patterns with our Redis and Memcached offerings and how customers use them for in-memory data processing to reduce latency and improve application throughput. We review ElastiCache best practices, design patterns, and anti-patterns.
A decade ago, relational databases were used for nearly every use case. Today, new technologies are enabling a revolution in databases, creating new options for document, key: value, in-memory, search, and graph capabilities that do not use relational tables. We’ll discuss this revolution in database options and who is using them.
Building low latency apps with a serverless architecture and in-memory data I..., by AWS Germany
In-memory data stores such as ElastiCache for Redis enable applications with response times in microseconds. Using Aurora, DynamoDB, DAX, Lambda, and ElastiCache, we explored how to design and deploy high-performance applications. Learn more here: https://aws.amazon.com/products/databases/
The cloud has empowered many companies to scale up and serve their users in an efficient and cost-effective way. Sometimes starting from zero can be hard. In this slide deck we will review best practices and AWS services that can help you go from 0 to 10 million users. And we will go even further: we'll explore together how to break the remaining barriers by making your infrastructure truly global and bringing it closer to your customers.
What are the different options for developers to run their database in the cloud? This session looks at the different options and how to choose the right database for your workload.
Lessons Learned from a Large-Scale Legacy Migration with Sysco (STG311) - AWS..., by Amazon Web Services
Migrating enterprise applications to the cloud requires thorough planning and consideration for a number of variables. Should you move your application to a similar infrastructure in the cloud (in a lift-and-shift scenario)? Or should you refactor your application to take advantage of cloud-native services for object storage, serverless, auto-scaling, and so on? In this session, an AWS expert walks through the ten commandments that enterprises should follow when moving applications to the cloud and refactoring them for optimal performance. Then, a representative of Sysco Corporation, a Fortune 50 company, shares how the company migrated mission-critical legacy business systems and modernized them to take advantage of the AWS Cloud. Learn how the company moved its enterprise purchasing system, which processes millions of dollars in sales daily, to the AWS Cloud while achieving a 60% decrease in run costs. Also discover the lessons learned and highlights of the migration, which resulted in 30% increase in performance, 3x improvement in user accessibility, and a significant decrease in order backlogs and outages.
Nearly everything in IT - servers, applications, websites, connected devices, and other things - generate discrete, time-stamped records of events called logs. Processing and analyzing these logs to gain actionable insights is log analytics. We'll look at how to use centralized log analytics across multiple sources with Amazon Elasticsearch Service.
Level: Intermediate
Speaker: Karan Desai - Solutions Architect, AWS
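A hedged sketch of the ingestion side of that pipeline: shipping one log event to an Amazon ES domain with elasticsearch-py and SigV4-signed requests via the requests-aws4auth package. The domain endpoint and index naming are placeholders.

```python
import datetime
import boto3
from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                   "us-east-1", "es", session_token=credentials.token)

# Placeholder Amazon ES domain endpoint.
es = Elasticsearch(
    hosts=[{"host": "search-mylogs-xxxx.us-east-1.es.amazonaws.com",
            "port": 443}],
    http_auth=awsauth, use_ssl=True, verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# Index one structured log event; Kibana can then visualize the index.
es.index(index="app-logs-2019.06.01", body={
    "@timestamp": datetime.datetime.utcnow().isoformat(),
    "level": "ERROR",
    "message": "payment service timeout",
})
```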
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit..., by Amazon Web Services
"Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop/Spark to AWS in order to save costs, increase availability, and improve performance. In this session, AWS customers Airbnb and Guardian Life discuss how they migrated their workload to Amazon EMR. This session focuses on key motivations to move to the cloud. It details key architectural changes and the benefits of migrating Hadoop/Spark workloads to the cloud.
"
Similar to ElastiCache & Redis: Database Week SF
How to Build Forecasting Services Using ML and Deep Learn... Algorithms, by Amazon Web Services
Forecasting is an important process for many companies and is used in many domains to accurately predict product growth and distribution, the resources required on production lines, financial projections, and much more. Amazon uses advanced forecasting techniques, and some of these services have been made available to all AWS customers.
In this session we will show how to pre-process data with a temporal component and then use an algorithm that, based on the type of data analyzed, produces an accurate forecast.
Big Data for Startups: How to Create Big Data Applications in Server... Mode, by Amazon Web Services
The variety and volume of data created every day is accelerating ever faster and represents a unique opportunity to innovate and create new startups.
However, managing large amounts of data can seem complex: building large-scale Big Data clusters looks like an investment only established companies can afford. But the elasticity of the cloud and, in particular, serverless services let us break through these limits.
We will see how to develop Big Data applications quickly, without worrying about infrastructure, devoting all our resources to developing our ideas and creating innovative products.
You can now use Amazon Elastic Kubernetes Service (EKS) to run Kubernetes pods on AWS Fargate, the serverless compute engine built for containers on AWS. This makes it easier than ever to build and run your Kubernetes applications in the AWS cloud. In this session we will present the main features of the service and how to deploy your application in a few steps.
Twenty years ago, Amazon went through a radical transformation aimed at increasing the pace of innovation. Over that period we learned how changing our approach to application development allowed us to dramatically increase agility and release velocity, and ultimately to build more reliable and scalable applications. In this session we will explain how we define modern applications and how building modern apps affects not only application architecture but also organizational structure, development release pipelines, and even the operating model. We will also describe common approaches to modernization, including the approach used by Amazon.com itself.
How to Spend up to 90% Less with Containers and Spot Instances, by Amazon Web Services
L’utilizzo dei container è in continua crescita.
Se correttamente disegnate, le applicazioni basate su Container sono molto spesso stateless e flessibili.
I servizi AWS ECS, EKS e Kubernetes su EC2 possono sfruttare le istanze Spot, portando ad un risparmio medio del 70% rispetto alle istanze On Demand. In questa sessione scopriremo insieme quali sono le caratteristiche delle istanze Spot e come possono essere utilizzate facilmente su AWS. Impareremo inoltre come Spreaker sfrutta le istanze spot per eseguire applicazioni di diverso tipo, in produzione, ad una frazione del costo on-demand!
In recent months, many customers have been asking us how to monetise Open APIs, simplify Fintech integrations, and accelerate adoption of various Open Banking business models. AWS and FinConecta would therefore like to invite you to the Open Finance marketplace presentation on October 20th.
Event Agenda :
Open banking so far (short recap)
• PSD2, OB UK, OB Australia, OB LATAM, OB Israel
Intro to Open Finance marketplace
• Scope
• Features
• Tech overview and Demo
The role of the Cloud
The Future of APIs
• Complying with regulation
• Monetizing data / APIs
• Business models
• Time to market
One platform for all: a Strategic approach
Q&A
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
To create value and build a differentiated, recognizable offering, successful startups know how to combine established technologies with innovative, purpose-built components.
AWS provides ready-to-use services and, at the same time, lets you customize and create the differentiating elements of your offering.
Focusing on Machine Learning technologies, we will see how to select the artificial intelligence services offered by AWS and, with the help of a demo, how to build custom Machine Learning models using SageMaker Studio.
OpsWorks Configuration Management: Automate the Management and Deployments of..., by Amazon Web Services
With the traditional approach to IT, implementing DevOps techniques was difficult for many years; they often involved manual activities that occasionally led to application downtime and interrupted user operations. With the advent of the cloud, DevOps techniques are now within everyone's reach at low cost for any kind of workload, guaranteeing greater system reliability and yielding significant improvements in business continuity.
AWS offers AWS OpsWorks as a Configuration Management tool that aims to automate and simplify the management and deployment of EC2 instances by means of Chef and Puppet workloads.
Learn how to leverage AWS OpsWorks to guarantee the reliability of your application running on EC2 instances.
Microsoft Active Directory on AWS to Support Your Windows Workloads, by Amazon Web Services
Want to know your options for running Microsoft Active Directory on AWS? When moving Microsoft workloads to AWS, it is important to consider how to deploy Microsoft Active Directory to support Group Policy management, authentication, and authorization. In this session, we discuss options for deploying Microsoft Active Directory on AWS, including AWS Directory Service for Microsoft Active Directory and running Active Directory on Windows on Amazon Elastic Compute Cloud (Amazon EC2). We cover topics such as integrating your on-premises Microsoft Active Directory environment with the cloud and using SaaS applications, such as Office 365, with AWS Single Sign-On.
From facial recognition to detecting fraud or manufacturing defects, image and video analysis powered by artificial intelligence techniques is evolving and being refined at a rapid pace. In this webinar we will explore the possibilities offered by AWS services for applying state-of-the-art computer vision techniques to real-world scenarios.
Amazon Web Services and VMware are hosting a free virtual event on Wednesday, October 14, from 12:00 to 13:00, dedicated to VMware Cloud™ on AWS, the on-demand service that lets you run applications in cloud environments based on VMware vSphere® and access a broad range of AWS services, taking full advantage of the AWS cloud while protecting existing VMware investments.
Create Your First Serverless Ledger-Based App with QLDB and NodeJS, by Amazon Web Services
Many companies today build applications with ledger-like functionality, for example to verify the history of credits and debits in banking transactions, or to track the supply-chain flow of their products.
At the core of these solutions are ledger databases, which provide a transparent, immutable, and cryptographically verifiable transaction log, but they are complex and costly tools to manage.
Amazon QLDB eliminates the need to build complex custom systems by providing a fully managed, serverless ledger database.
In this session we will see how to build a complete serverless application that uses QLDB's features.
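A taste of what that application code can look like with the pyqldb driver (the session itself uses NodeJS); the ledger and table names are placeholders, and the Transactions table is assumed to already exist.

```python
from pyqldb.driver.qldb_driver import QldbDriver

# Placeholder ledger; the Transactions table is assumed to exist.
driver = QldbDriver(ledger_name="payments-ledger")

def record_and_fetch(txn):
    txn.execute_statement(
        "INSERT INTO Transactions ?", {"account": "A-100", "amount": 25})
    cursor = txn.execute_statement(
        "SELECT * FROM Transactions WHERE account = ?", "A-100")
    return list(cursor)

# Both statements commit atomically in one ledger transaction.
rows = driver.execute_lambda(record_and_fetch)
print(rows)
```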
Con l’ascesa delle architetture di microservizi e delle ricche applicazioni mobili e Web, le API sono più importanti che mai per offrire agli utenti finali una user experience eccezionale. In questa sessione impareremo come affrontare le moderne sfide di progettazione delle API con GraphQL, un linguaggio di query API open source utilizzato da Facebook, Amazon e altro e come utilizzare AWS AppSync, un servizio GraphQL serverless gestito su AWS. Approfondiremo diversi scenari, comprendendo come AppSync può aiutare a risolvere questi casi d’uso creando API moderne con funzionalità di aggiornamento dati in tempo reale e offline.
Inoltre, impareremo come Sky Italia utilizza AWS AppSync per fornire aggiornamenti sportivi in tempo reale agli utenti del proprio portale web.
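Calling an AppSync GraphQL API from a backend script is plain HTTPS. A minimal sketch with an API key, where the endpoint, key, and schema fields are all placeholders:

```python
import requests

# Placeholder endpoint, API key, and schema fields.
APPSYNC_URL = "https://abcdefg.appsync-api.eu-west-1.amazonaws.com/graphql"
API_KEY = "da2-xxxxxxxxxxxx"

query = """
query LatestScores {
  listScores(limit: 3) { items { match score updatedAt } }
}
"""

resp = requests.post(
    APPSYNC_URL,
    json={"query": query},
    headers={"x-api-key": API_KEY},
)
print(resp.json())
```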
Oracle Databases and VMware Cloud™ on AWS: Myths to Debunk, by Amazon Web Services
Many organizations take advantage of the cloud by migrating their Oracle workloads, gaining significant benefits in agility and cost efficiency.
Migrating these workloads can introduce complexity during application modernization and refactoring, compounded by performance risks that can arise when moving applications out of on-premises data centers.
In these slides, AWS and VMware experts present simple, practical tips to ease and simplify the migration of Oracle workloads while accelerating cloud transformation; they dive into the architecture and show how to exploit the full potential of VMware Cloud™ on AWS.
1) The document discusses building a minimum viable product (MVP) using Amazon Web Services (AWS).
2) It provides an example of an MVP for an omni-channel messenger platform, built starting in 2017, that connects ecommerce stores to customers via web chat, Facebook Messenger, WhatsApp, and other channels.
3) The founder discusses how they started with an MVP in 2017 with 200 ecommerce stores in Hong Kong and Taiwan, and have since expanded to over 5000 clients across Southeast Asia using AWS for scaling.
This document discusses pitch decks and fundraising materials. It explains that venture capitalists will typically spend only 3 minutes and 44 seconds reviewing a pitch deck. Therefore, the deck needs to tell a compelling story to grab their attention. It also provides tips on tailoring different types of decks for different purposes, such as creating a concise 1-2 page teaser, a presentation deck for pitching in-person, and a more detailed read-only or fundraising deck. The document stresses the importance of including key information like the problem, solution, product, traction, market size, plans, team, and ask.
This document discusses building serverless web applications using AWS services like API Gateway, Lambda, DynamoDB, S3 and Amplify. It provides an overview of each service and how they can work together to create a scalable, secure and cost-effective serverless application stack without having to manage servers or infrastructure. Key services covered include API Gateway for hosting APIs, Lambda for backend logic, DynamoDB for database needs, S3 for static content, and Amplify for frontend hosting and continuous deployment.
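The glue between those services is small. Below is a minimal sketch of an API Gateway (Lambda proxy integration) handler backed by DynamoDB; the Notes table and the id path parameter are assumptions for this example.

```python
import json
import boto3

# Hypothetical table and path parameter.
table = boto3.resource("dynamodb").Table("Notes")

def handler(event, context):
    note_id = event["pathParameters"]["id"]
    item = table.get_item(Key={"id": note_id}).get("Item")
    if item is None:
        return {"statusCode": 404,
                "body": json.dumps({"error": "not found"})}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(item, default=str),
    }
```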
This document provides tips for fundraising from startup founders Roland Yau and Sze Lok Chan. It discusses generating competition to create urgency for investors, fundraising in parallel rather than sequentially, having a clear fundraising narrative focused on what you do and why it's compelling, and prioritizing relationships with people over firms. It also notes how the pandemic has changed fundraising, with examples of deals done virtually during this time. The tips emphasize being fully prepared before fundraising and cultivating connections with investors in advance.
AWS_HK_StartupDay_Building Interactive websites while automating for efficien..., by Amazon Web Services
This document discusses Amazon's machine learning services for building conversational interfaces and extracting insights from unstructured text and audio. It describes Amazon Lex for creating chatbots, Amazon Comprehend for natural language processing tasks like entity extraction and sentiment analysis, and how they can be used together for applications like intelligent call centers and content analysis. Pre-trained APIs simplify adding machine learning to apps without requiring ML expertise.
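A brief sketch of those pre-trained Comprehend APIs via boto3; the sample sentence is invented, and no model training is involved.

```python
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

text = "The delivery from the Hong Kong store arrived late and the box was damaged."

sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
entities = comprehend.detect_entities(Text=text, LanguageCode="en")

print(sentiment["Sentiment"])                     # e.g. NEGATIVE
print([e["Text"] for e in entities["Entities"]])  # detected entities
```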
Amazon Elastic Container Service (Amazon ECS) is a highly scalable container management service that simplifies managing Docker containers through an orchestration layer controlling deployment and lifecycle. In this session we will present the main features of the service, reference architectures for different workloads, and the simple steps needed to quickly migrate one or more of your containers.