This document describes our products. FlyData makes it easy to load data automatically and continuously to Amazon Redshift. You can also refer to our HP ( http://flydata.com/ ) for more information.
Best Practices for Migrating your Data Warehouse to Amazon RedshiftAmazon Web Services
You can gain substantially more business insights and save costs by migrating your existing data warehouse to Amazon Redshift. This session will cover the key benefits of migrating to Amazon Redshift, migration strategies, and tools and resources that can help you in the process.
Powering Interactive Data Analysis at Pinterest by Amazon RedshiftJie Li
In the last six month, we have set up Amazon Redshift to power our interactive data analysis at Pinterest. It has tremendously improved the speed of analyzing our data.
AWS July Webinar Series: Amazon redshift migration and load data 20150722Amazon Web Services
Amazon Redshift is a fast, petabyte-scale data warehouse that makes it easy to analyze your data for a fraction of the cost of traditional data warehouses.
In this webinar, you will learn how to easily migrate your data from other data warehouses into Amazon Redshift, efficiently load your data with Amazon Redshift's massively parallel processing (MPP) capabilities, and automate data loading with AWS Lambda and AWS Data Pipeline. You will also learn about ETL tools from our partners to extract, transform, and prepare data from disparate data sources before loading it into Amazon Redshift.
Learning Objectives:
Understand common patterns for migrating your data to Amazon Redshift
See live examples of the Copy command that fully parallelizes data ingestion
Learn how to automate the load process using AWS Lambda & AWS Data Pipleline
Techniques for real time data loading
Options for ETL tools from our partners
Take an in-depth look at data warehousing with Amazon Redshift and get answers to your technical questions. We will cover performance tuning techniques that take advantage of Amazon Redshift's columnar technology and massively parallel processing architecture. We will also discuss best practices for migrating from existing data warehouses, optimizing your schema, loading data efficiently, and using work load management and interleaved sorting.
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013Amazon Web Services
Amazon Redshift is a fast, fully-managed, petabyte-scale data warehouse service that costs less than $1,000 per terabyte per year—less than a tenth the price of most traditional data warehousing solutions. In this session, you get an overview of Amazon Redshift, including how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. Finally, we announce new features that we've been working on over the past few months.
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftAmazon Web Services
by Darin Briskman, Technical Evangelist, AWS
You can gain substantially more business insights and save costs by migrating your existing data warehouse to Amazon Redshift. This session will cover the key benefits of migrating to Amazon Redshift, migration strategies, and tools and resources that can help you in the process. We’ll learn about AWS Database Migration Service and AWS Schema Migration Tool, which were recently enhanced to import data from six common data warehouse platforms. Level: 200
Best Practices for Migrating your Data Warehouse to Amazon RedshiftAmazon Web Services
You can gain substantially more business insights and save costs by migrating your existing data warehouse to Amazon Redshift. This session will cover the key benefits of migrating to Amazon Redshift, migration strategies, and tools and resources that can help you in the process.
Powering Interactive Data Analysis at Pinterest by Amazon RedshiftJie Li
In the last six month, we have set up Amazon Redshift to power our interactive data analysis at Pinterest. It has tremendously improved the speed of analyzing our data.
AWS July Webinar Series: Amazon redshift migration and load data 20150722Amazon Web Services
Amazon Redshift is a fast, petabyte-scale data warehouse that makes it easy to analyze your data for a fraction of the cost of traditional data warehouses.
In this webinar, you will learn how to easily migrate your data from other data warehouses into Amazon Redshift, efficiently load your data with Amazon Redshift's massively parallel processing (MPP) capabilities, and automate data loading with AWS Lambda and AWS Data Pipeline. You will also learn about ETL tools from our partners to extract, transform, and prepare data from disparate data sources before loading it into Amazon Redshift.
Learning Objectives:
Understand common patterns for migrating your data to Amazon Redshift
See live examples of the Copy command that fully parallelizes data ingestion
Learn how to automate the load process using AWS Lambda & AWS Data Pipleline
Techniques for real time data loading
Options for ETL tools from our partners
Take an in-depth look at data warehousing with Amazon Redshift and get answers to your technical questions. We will cover performance tuning techniques that take advantage of Amazon Redshift's columnar technology and massively parallel processing architecture. We will also discuss best practices for migrating from existing data warehouses, optimizing your schema, loading data efficiently, and using work load management and interleaved sorting.
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013Amazon Web Services
Amazon Redshift is a fast, fully-managed, petabyte-scale data warehouse service that costs less than $1,000 per terabyte per year—less than a tenth the price of most traditional data warehousing solutions. In this session, you get an overview of Amazon Redshift, including how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. Finally, we announce new features that we've been working on over the past few months.
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftAmazon Web Services
by Darin Briskman, Technical Evangelist, AWS
You can gain substantially more business insights and save costs by migrating your existing data warehouse to Amazon Redshift. This session will cover the key benefits of migrating to Amazon Redshift, migration strategies, and tools and resources that can help you in the process. We’ll learn about AWS Database Migration Service and AWS Schema Migration Tool, which were recently enhanced to import data from six common data warehouse platforms. Level: 200
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon RedshiftAmazon Web Services
Learn how Boingo Wireless and online media provider Edmunds gained substantial business insights and saved money and time by migrating to Amazon Redshift. Get an inside look into how they accomplished their migration from on-premises solutions. Learn how they tuned their schema and queries to take full advantage of the columnar MPP architecture in Amazon Redshift, how they leveraged third party solutions, and how they met their business intelligence needs in record time.
In this session, you get an overview of Amazon Redshift, a fast, fully-managed, petabyte-scale data warehouse service. We'll cover how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. We'll also discuss new features, architecture best practices, and share how customers are using Amazon Redshift for their Big Data workloads.
Analyzing big data quickly and efficiently requires a data warehouse optimized to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyze big data for a fraction of the cost of traditional data warehouses. By following a few best practices, you can take advantage of Amazon Redshift’s columnar technology and parallel processing capabilities to minimize I/O and deliver high throughput and query performance. This webinar will cover techniques to load data efficiently, design optimal schemas, and use work load management.
Learning Objectives:
• Get an inside look at Amazon Redshift's columnar technology and parallel processing capabilities
• Learn how to migrate from existing data warehouses, optimize schemas, and load data efficiently
• Learn best practices for managing workload, tuning your queries, and using Amazon Redshift's interleaved sorting features
Who Should Attend:
• Data Warehouse Developers, Big Data Architects, BI Managers, and Data Engineers
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools.
This webinar will provide an overview of Redshift with an emphasis on the many changes we recently introduced. In particular, we will address the newly released DW2 instance types and what you can do with them.
This content is designed for database developers and architects interested in Amazon Redshift.
In this presentation, you will get a look under the covers of Amazon Redshift, a fast, fully-managed, petabyte-scale data warehouse service for less than $1,000 per TB per year. Learn how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. We'll also walk through techniques for optimizing performance and, you’ll hear from a specific customer and their use case to take advantage of fast performance on enormous datasets leveraging economies of scale on the AWS platform.
Speakers:
Ian Meyers, AWS Solutions Architect
Toby Moore, Chief Technology Officer, Space Ape
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAmazon Web Services
Analyzing big data quickly and efficiently requires a data warehouse optimized to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyze big data for a fraction of the cost of traditional data warehouses. By following a few best practices, you can take advantage of Amazon Redshift’s columnar technology and parallel processing capabilities to minimize I/O and deliver high throughput and query performance. This webinar will cover techniques to load data efficiently, design optimal schemas, and tune query and database performance.
Learning Objectives:
Get an inside look at Amazon Redshift's columnar technology and parallel processing capabilities
Learn how to migrate from existing data warehouses, optimize schemas, and load data efficiently
Learn best practices for managing workload, tuning your queries, and using Amazon Redshift's interleaved sorting features
In this presentation, you will get a look under the covers of Amazon Redshift, a fast, fully-managed, petabyte-scale data warehouse service for less than $1,000 per TB per year. Learn how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. We'll also walk through techniques for optimizing performance and, you’ll hear from a specific customer and their use case to take advantage of fast performance on enormous datasets leveraging economies of scale on the AWS platform.
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesAmazon Web Services
Analyzing big data quickly and efficiently requires a data warehouse optimized to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyze big data for a fraction of the cost of traditional data warehouses. By following a few best practices, you can take advantage of Amazon Redshift’s columnar technology and parallel processing capabilities to minimize I/O and deliver high throughput and query performance. This webinar will cover techniques to load data efficiently, design optimal schemas, and tune query and database performance.
Learning Objectives:
• Get an inside look at Amazon Redshift's columnar technology and parallel processing capabilities
• Learn how to migrate from existing data warehouses, optimize schemas, and load data efficiently
• Learn best practices for managing workload, tuning your queries, and using Amazon Redshift's interleaved sorting features
SQL Server to Redshift Data Load Using SSISMarc Leinbach
In this article we will try to learn how to load data from SQL Server to Amazon Redshift Data warehouse using SSIS. Techniques outlined in this article can be also applied while extracting data from other Relational Source (e.g. Loading Data from MySQL to Redshift, Oracle to Redshift etc). First we will discuss steps needed to load data into Amazon Redshift Data Warehouse, challenges and then we will simplify whole process using SSIS Task for Amazon Redshift Data Transfer.
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon RedshiftAmazon Web Services
Amazon Redshift is a fast, fully managed petabyte-scale data warehouse service that costs less than $1,000 a TB a year, under a tenth the price of most traditional data warehousing solutions. Learn how Yahoo! uses both to build a billion event a day infrastructure that is fast, easy, and cost-effective. Dive into how Yahoo performs advanced user retention and cohort analysis to make near–real time product and marketing decisions.
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...Amazon Web Services
Since Amazon Redshift launched last year, it has been adopted by a wide variety of companies for data warehousing. In this session, learn how customers NASDAQ, HauteLook, and Roundarch Isobar are taking advantage of Amazon Redshift for three unique use cases: enterprise, big data, and SaaS. Learn about their implementations and how they made data analysis faster, cheaper, and easier with Amazon Redshift.
Traditional data warehouses become expensive and slow down as the volume of your data grows. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it easy to analyze all of your data using existing business intelligence tools for 1/10th the traditional cost. This session will provide an introduction to Amazon Redshift and cover the essentials you need to deploy your data warehouse in the cloud so that you can achieve faster analytics and save costs. We’ll also cover the recently announced Redshift Spectrum, which allows you to query unstructured data directly from Amazon S3.
In this performance-oriented session, we will cover tuning techniques that take advantage of Amazon Redshift's columnar technology and massively parallel processing architecture. We will also discuss best practices for migrating from existing data warehouses, optimizing your schema, loading data efficiently, and using work load management and interleaved sorting.
AWS June Webinar Series - Getting Started: Amazon RedshiftAmazon Web Services
Amazon Redshift is a fast, fully-managed petabyte-scale data warehouse service, for less than $1,000 per TB per year. In this presentation, you'll get an overview of Amazon Redshift, including how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. Learn how, with just a few clicks in the AWS Management Console, you can set up with a fully functional data warehouse, ready to accept data without learning any new languages and easily plugging in with the existing business intelligence tools and applications you use today. This webinar is ideal for anyone looking to gain deeper insight into their data, without the usual challenges of time, cost and effort. In this webinar, you will learn: • Understand what Amazon Redshift is and how it works • Create a data warehouse interactively through the AWS Management Console • Load some data into your new Amazon Redshift data warehouse from S3 Who Should Attend • IT professionals, developers, line-of-business managers
How Amazon.com is Leveraging Amazon Redshift (DAT306) | AWS re:Invent 2013Amazon Web Services
Learn how Amazon’s enterprise data warehouse, one of the world's largest data warehouses managing petabytes of data, is leveraging Amazon Redshift. Learn about Amazon's enterprise data warehouse best practices and solutions, and how they’re using Amazon Redshift technology to handle design and scale challenges.
Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r...Amazon Web Services
Amazon DynamoDB is a fully-managed, zero-admin, high-speed NoSQL database service. Amazon DynamoDB was built to support applications at any scale. With the click of a button, you can scale your database capacity from a few hundred I/Os per second to hundreds of thousands of I/Os per second or more. You can dynamically scale your database to keep up with your application's requirements while minimizing costs during low-traffic periods. The service has no limit on storage. You also learn about Amazon DynamoDB's design principles and history.
(BDT314) A Big Data & Analytics App on Amazon EMR & Amazon RedshiftAmazon Web Services
"No matter the industry, leading organizations need to closely integrate, deploy, secure, and scale diverse technologies to support workloads while containing costs. Nasdaq, Inc.—a leading provider of trading, clearing, and exchange technology—is no exception.
After migrating more than 1,100 tables from a legacy data warehouse into Amazon Redshift, Nasdaq, Inc. is now implementing a fully-integrated, big data architecture that also includes Amazon S3, Amazon EMR, and Presto to securely analyze large historical data sets in a highly regulated environment. Drawing from this experience, Nasdaq, Inc. shares lessons learned and best practices for deploying a highly secure, unified, big data architecture on AWS.
Attendees learn:
Architectural recommendations to extend an Amazon Redshift data warehouse with Amazon EMR and Presto.
Tips to migrate historical data from an on-premises solution and Amazon Redshift to Amazon S3, making it consumable.
Best practices for securing critical data and applications leveraging encryption, SELinux, and VPC."
(BDT401) Amazon Redshift Deep Dive: Tuning and Best PracticesAmazon Web Services
Get a look under the covers: Learn tuning best practices for taking advantage of Amazon Redshift's columnar technology and parallel processing capabilities to improve your delivery of queries and improve overall database performance. This session explains how to migrate from existing data warehouses, create an optimized schema, efficiently load data, use work load management, tune your queries, and use Amazon Redshift's interleaved sorting features. Finally, learn how TripAdvisor uses these best practices to give their entire organization access to analytic insights at scale.
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftAmazon Web Services
Analyzing big data quickly and efficiently requires a data warehouse optimized to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all of your data for a fraction of the cost of traditional data warehouses. In this session, we take an in-depth look at data warehousing with Amazon Redshift for big data analytics. We cover best practices to take advantage of Amazon Redshift's columnar technology and parallel processing capabilities to deliver high throughput and query performance. We also discuss how to design optimal schemas, load data efficiently, and use work load management.
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...Amazon Web Services
Analyze Big Data for Consumer Applications with Looker BI and Amazon Redshift Customizing the customer experience based on user behavior is a constant challenge for today’s consumer apps. Business intelligence helps analyze and model large amounts of data. Looker offers a modern approach to BI leveraging AWS that’s fast, agile, and easy to manage. Join this webinar to learn how MessageMe, which provides emotionally engaging messaging apps to consumers, leverages Looker business intelligence software and the Amazon Redshift data warehouse service to analyze billions of rows of customer data in seconds.
Webinar topics include:
• How MessageMe turns billions of rows of customer data stored in Amazon Redshift into actionable insights
• How Looker connects directly to Amazon Redshift in just a few clicks, enabling MessageMe to build a modern, big data analytics in the cloud. Who should attend
• Information or Solution Architects, Data Analysts, BI Directors, DBAs, Development Leads, Developers, or Technical IT Leaders.
Presenters:
• Justin Rosenthal, CTO, MessageMe
• Keenan Rice, VP, Marketing & Alliances, Looker
• Tina Adams, Senior Product Manager, AWS
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon RedshiftAmazon Web Services
Learn how Boingo Wireless and online media provider Edmunds gained substantial business insights and saved money and time by migrating to Amazon Redshift. Get an inside look into how they accomplished their migration from on-premises solutions. Learn how they tuned their schema and queries to take full advantage of the columnar MPP architecture in Amazon Redshift, how they leveraged third party solutions, and how they met their business intelligence needs in record time.
In this session, you get an overview of Amazon Redshift, a fast, fully-managed, petabyte-scale data warehouse service. We'll cover how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. We'll also discuss new features, architecture best practices, and share how customers are using Amazon Redshift for their Big Data workloads.
Analyzing big data quickly and efficiently requires a data warehouse optimized to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyze big data for a fraction of the cost of traditional data warehouses. By following a few best practices, you can take advantage of Amazon Redshift’s columnar technology and parallel processing capabilities to minimize I/O and deliver high throughput and query performance. This webinar will cover techniques to load data efficiently, design optimal schemas, and use work load management.
Learning Objectives:
• Get an inside look at Amazon Redshift's columnar technology and parallel processing capabilities
• Learn how to migrate from existing data warehouses, optimize schemas, and load data efficiently
• Learn best practices for managing workload, tuning your queries, and using Amazon Redshift's interleaved sorting features
Who Should Attend:
• Data Warehouse Developers, Big Data Architects, BI Managers, and Data Engineers
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools.
This webinar will provide an overview of Redshift with an emphasis on the many changes we recently introduced. In particular, we will address the newly released DW2 instance types and what you can do with them.
This content is designed for database developers and architects interested in Amazon Redshift.
In this presentation, you will get a look under the covers of Amazon Redshift, a fast, fully-managed, petabyte-scale data warehouse service for less than $1,000 per TB per year. Learn how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. We'll also walk through techniques for optimizing performance and, you’ll hear from a specific customer and their use case to take advantage of fast performance on enormous datasets leveraging economies of scale on the AWS platform.
Speakers:
Ian Meyers, AWS Solutions Architect
Toby Moore, Chief Technology Officer, Space Ape
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAmazon Web Services
Analyzing big data quickly and efficiently requires a data warehouse optimized to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyze big data for a fraction of the cost of traditional data warehouses. By following a few best practices, you can take advantage of Amazon Redshift’s columnar technology and parallel processing capabilities to minimize I/O and deliver high throughput and query performance. This webinar will cover techniques to load data efficiently, design optimal schemas, and tune query and database performance.
Learning Objectives:
Get an inside look at Amazon Redshift's columnar technology and parallel processing capabilities
Learn how to migrate from existing data warehouses, optimize schemas, and load data efficiently
Learn best practices for managing workload, tuning your queries, and using Amazon Redshift's interleaved sorting features
In this presentation, you will get a look under the covers of Amazon Redshift, a fast, fully-managed, petabyte-scale data warehouse service for less than $1,000 per TB per year. Learn how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. We'll also walk through techniques for optimizing performance and, you’ll hear from a specific customer and their use case to take advantage of fast performance on enormous datasets leveraging economies of scale on the AWS platform.
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesAmazon Web Services
Analyzing big data quickly and efficiently requires a data warehouse optimized to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyze big data for a fraction of the cost of traditional data warehouses. By following a few best practices, you can take advantage of Amazon Redshift’s columnar technology and parallel processing capabilities to minimize I/O and deliver high throughput and query performance. This webinar will cover techniques to load data efficiently, design optimal schemas, and tune query and database performance.
Learning Objectives:
• Get an inside look at Amazon Redshift's columnar technology and parallel processing capabilities
• Learn how to migrate from existing data warehouses, optimize schemas, and load data efficiently
• Learn best practices for managing workload, tuning your queries, and using Amazon Redshift's interleaved sorting features
SQL Server to Redshift Data Load Using SSISMarc Leinbach
In this article we will try to learn how to load data from SQL Server to Amazon Redshift Data warehouse using SSIS. Techniques outlined in this article can be also applied while extracting data from other Relational Source (e.g. Loading Data from MySQL to Redshift, Oracle to Redshift etc). First we will discuss steps needed to load data into Amazon Redshift Data Warehouse, challenges and then we will simplify whole process using SSIS Task for Amazon Redshift Data Transfer.
(DAT308) Yahoo! Analyzes Billions of Events a Day on Amazon RedshiftAmazon Web Services
Amazon Redshift is a fast, fully managed petabyte-scale data warehouse service that costs less than $1,000 a TB a year, under a tenth the price of most traditional data warehousing solutions. Learn how Yahoo! uses both to build a billion event a day infrastructure that is fast, easy, and cost-effective. Dive into how Yahoo performs advanced user retention and cohort analysis to make near–real time product and marketing decisions.
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...Amazon Web Services
Since Amazon Redshift launched last year, it has been adopted by a wide variety of companies for data warehousing. In this session, learn how customers NASDAQ, HauteLook, and Roundarch Isobar are taking advantage of Amazon Redshift for three unique use cases: enterprise, big data, and SaaS. Learn about their implementations and how they made data analysis faster, cheaper, and easier with Amazon Redshift.
Traditional data warehouses become expensive and slow down as the volume of your data grows. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it easy to analyze all of your data using existing business intelligence tools for 1/10th the traditional cost. This session will provide an introduction to Amazon Redshift and cover the essentials you need to deploy your data warehouse in the cloud so that you can achieve faster analytics and save costs. We’ll also cover the recently announced Redshift Spectrum, which allows you to query unstructured data directly from Amazon S3.
In this performance-oriented session, we will cover tuning techniques that take advantage of Amazon Redshift's columnar technology and massively parallel processing architecture. We will also discuss best practices for migrating from existing data warehouses, optimizing your schema, loading data efficiently, and using work load management and interleaved sorting.
AWS June Webinar Series - Getting Started: Amazon RedshiftAmazon Web Services
Amazon Redshift is a fast, fully-managed petabyte-scale data warehouse service, for less than $1,000 per TB per year. In this presentation, you'll get an overview of Amazon Redshift, including how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. Learn how, with just a few clicks in the AWS Management Console, you can set up with a fully functional data warehouse, ready to accept data without learning any new languages and easily plugging in with the existing business intelligence tools and applications you use today. This webinar is ideal for anyone looking to gain deeper insight into their data, without the usual challenges of time, cost and effort. In this webinar, you will learn: • Understand what Amazon Redshift is and how it works • Create a data warehouse interactively through the AWS Management Console • Load some data into your new Amazon Redshift data warehouse from S3 Who Should Attend • IT professionals, developers, line-of-business managers
How Amazon.com is Leveraging Amazon Redshift (DAT306) | AWS re:Invent 2013Amazon Web Services
Learn how Amazon’s enterprise data warehouse, one of the world's largest data warehouses managing petabytes of data, is leveraging Amazon Redshift. Learn about Amazon's enterprise data warehouse best practices and solutions, and how they’re using Amazon Redshift technology to handle design and scale challenges.
Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS r...Amazon Web Services
Amazon DynamoDB is a fully-managed, zero-admin, high-speed NoSQL database service. Amazon DynamoDB was built to support applications at any scale. With the click of a button, you can scale your database capacity from a few hundred I/Os per second to hundreds of thousands of I/Os per second or more. You can dynamically scale your database to keep up with your application's requirements while minimizing costs during low-traffic periods. The service has no limit on storage. You also learn about Amazon DynamoDB's design principles and history.
(BDT314) A Big Data & Analytics App on Amazon EMR & Amazon RedshiftAmazon Web Services
"No matter the industry, leading organizations need to closely integrate, deploy, secure, and scale diverse technologies to support workloads while containing costs. Nasdaq, Inc.—a leading provider of trading, clearing, and exchange technology—is no exception.
After migrating more than 1,100 tables from a legacy data warehouse into Amazon Redshift, Nasdaq, Inc. is now implementing a fully-integrated, big data architecture that also includes Amazon S3, Amazon EMR, and Presto to securely analyze large historical data sets in a highly regulated environment. Drawing from this experience, Nasdaq, Inc. shares lessons learned and best practices for deploying a highly secure, unified, big data architecture on AWS.
Attendees learn:
Architectural recommendations to extend an Amazon Redshift data warehouse with Amazon EMR and Presto.
Tips to migrate historical data from an on-premises solution and Amazon Redshift to Amazon S3, making it consumable.
Best practices for securing critical data and applications leveraging encryption, SELinux, and VPC."
(BDT401) Amazon Redshift Deep Dive: Tuning and Best PracticesAmazon Web Services
Get a look under the covers: Learn tuning best practices for taking advantage of Amazon Redshift's columnar technology and parallel processing capabilities to improve your delivery of queries and improve overall database performance. This session explains how to migrate from existing data warehouses, create an optimized schema, efficiently load data, use work load management, tune your queries, and use Amazon Redshift's interleaved sorting features. Finally, learn how TripAdvisor uses these best practices to give their entire organization access to analytic insights at scale.
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftAmazon Web Services
Analyzing big data quickly and efficiently requires a data warehouse optimized to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all of your data for a fraction of the cost of traditional data warehouses. In this session, we take an in-depth look at data warehousing with Amazon Redshift for big data analytics. We cover best practices to take advantage of Amazon Redshift's columnar technology and parallel processing capabilities to deliver high throughput and query performance. We also discuss how to design optimal schemas, load data efficiently, and use work load management.
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...Amazon Web Services
Analyze Big Data for Consumer Applications with Looker BI and Amazon Redshift Customizing the customer experience based on user behavior is a constant challenge for today’s consumer apps. Business intelligence helps analyze and model large amounts of data. Looker offers a modern approach to BI leveraging AWS that’s fast, agile, and easy to manage. Join this webinar to learn how MessageMe, which provides emotionally engaging messaging apps to consumers, leverages Looker business intelligence software and the Amazon Redshift data warehouse service to analyze billions of rows of customer data in seconds.
Webinar topics include:
• How MessageMe turns billions of rows of customer data stored in Amazon Redshift into actionable insights
• How Looker connects directly to Amazon Redshift in just a few clicks, enabling MessageMe to build a modern, big data analytics in the cloud. Who should attend
• Information or Solution Architects, Data Analysts, BI Directors, DBAs, Development Leads, Developers, or Technical IT Leaders.
Presenters:
• Justin Rosenthal, CTO, MessageMe
• Keenan Rice, VP, Marketing & Alliances, Looker
• Tina Adams, Senior Product Manager, AWS
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...Amazon Web Services
In this session, we discuss our learnings from migrating the financial ledger and accounting system that Amazon uses from Oracle to AWS. We share the performance and cost benefits to enterprises who migrate critical systems from Oracle to AWS, the decision frameworks used to pick the appropriate AWS service for appropriate application, and best practices in project management.
How city of chicago boosts their sap business objects environment prepares fo...Sebastien Goiffon
SAPinsider webinar ran by City of Chicago. How they streamline Business Objects using GB&SMITH 360Suite products around:
Business Objects backup disaster recovery
Business Objects security
Business Objects regression testing
Business Objects bi 4.2 migration
Business Objects audit
BusinessObjects backup disaster recovery
BusinessObjects security
BusinessObjects regression testing
BusinessObjects bi 4.2 migration
BusinessObjects audit
Your data has value for multiple business functions in your organization. Shorten your time to analytics and take faster, better decisions based on data.
In this session you will learn how you can access your data from a myriad of tools such as multiple EMR clusters, Athena & Redshift.
Next generation analytics isn’t on its way… it’s already arrived. Most businesses are in the process of developing their new data platforms on the cloud, or moving their existing analytics infrastructure to the cloud. Attend this webinar to learn model architectures and best practices for analytics on AWS. You’ll also learn how you can leverage cloud to spread insight throughout your organization.
Join us to learn:
• What cloud data infrastructure should look like
• How to optimize your analytics deployment on the cloud
• Using Tableau to find and share new insights with everyone in your organization
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Amazon Web Services
Originally, Hadoop was used as a batch analytics tool; however, this is rapidly changing, as applications move towards real-time processing and streaming. Amazon Elastic MapReduce has made running Hadoop in the cloud easier and more accessible than ever. Each day, tens of thousands of Hadoop clusters are run on the Amazon Elastic MapReduce infrastructure by users of every size — from university students to Fortune 50 companies. We recently launched Amazon Kinesis – a managed service for real-time processing of high volume, streaming data. Amazon Kinesis enables a new class of big data applications which can continuously analyze data at any volume and throughput, in real-time. Adi will discuss each service, dive into how customers are adopting the services for different use cases, and share emerging best practices. Learn how you can architect Amazon Kinesis and Amazon Elastic MapReduce together to create a highly scalable real-time analytics solution which can ingest and process terabytes of data per hour from hundreds of thousands of different concurrent sources. Forever change how you process web site click-streams, marketing and financial transactions, social media feeds, logs and metering data, and location-tracking events.
Holistics - The Perfect Companion for Amazon Redshift
Amazon Redshift is great for the storing and processing of your data. However, there is still a significant effort involved in managing your data pipeline and analytics infrastructure.
Holistics provide Amazon Redshift customers with an end-to-end user interface across their data imports, data transformation and data presentation to address the complexities and challenges of managing their data analytics infrastructure pipeline.
Code first in the cloud: going serverless with AzureJeremy Likness
The popularity of microservices combined with the emergence of serverless based solutions has transformed how modern developers tackle cloud native apps. Microsoft's Azure cloud provides a feature known as serverless functions (including Azure Functions and Logic Apps) that enable developers to stand up integrated end points leveraging the programming language of their choice without having to worry about the supporting infrastructure. Learn how to develop serverless .NET apps and connect them with queues, web requests, and databases or seamlessly integrate with third-party APIs like Twitter and Slack.
by Manish Mohite, Solutions Architect, AWS
How do you get data from your sources into your Redshift data warehouse? We'll show how to use AWS Glue and Amazon Kinesis Firehose to make it easy to automate the work to get data loaded.
Using AWS to design and build your data architecture has never been easier to gain insights and uncover new opportunities to scale and grow your business. Join this workshop to learn how you can gain insights at scale with the right big data applications.
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...Amazon Web Services
Amazon QuickSight is a fast, cloud-powered business intelligence (BI) service that makes it easy to build visualizations, perform ad-hoc analysis, and quickly get business insights from your data. In this session, we demonstrate how you can point Amazon QuickSight to AWS data stores, flat files, or other third-party data sources and begin visualizing your data in minutes. We also introduce SPICE - a new Super-fast, Parallel, In-memory, Calculation Engine in Amazon QuickSight, which performs advanced calculations and render visualizations rapidly without requiring any additional infrastructure, SQL programming, or dimensional modeling, so you can seamlessly scale to hundreds of thousands of users and petabytes of data. Lastly, you will see how Amazon QuickSight provides you with smart visualizations and graphs that are optimized for your different data types, to ensure the most suitable and appropriate visualization to conduct your analysis, and how to share these visualization stories using the built-in collaboration tools. NOTE: Make this more themed towards QuickSight as it applies to other AWS Big Data Services - Redshift, Athena, S3, RDS.
AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)Amazon Web Services
Amazon Redshift is a fast, simple, cost-effective data warehousing solution, and in this session, we look at the tools and techniques you can use to migrate your existing data warehouse to Amazon Redshift. We will then present a case study on Scholastic’s migration to Amazon Redshift. Scholastic, a large 100-year-old publishing company, was running their business with older, on-premise, data warehousing and analytics solutions, which could not keep up with business needs and were expensive. Scholastic also needed to include new capabilities like streaming data and real time analytics. Scholastic migrated to Amazon Redshift, and achieved agility and faster time to insight while dramatically reducing costs. In this session, Scholastic will discuss how they achieved this, including options considered, technical architecture implemented, results, and lessons learned.
Data Analytics Week at the San Francisco Loft
Loading Data Into Redshift
How do you get data from your sources into your Redshift data warehouse? We'll show how to use AWS Glue and Amazon Kinesis Firehose to make it easy to automate the work to get data loaded.
Speakers:
Jay Formosa - Solutions Architect, AWS
Asser Moustafa - Data Warehouse Specialist Solutions Architect, AWS
Finally available on SlideShare, Dan Goodinson's SAPinsider presentation: How to pinpoint and fix sources of performance problems in your SAP BusinessObjects BI reports and dashboards.
In this presentation, you will get a look under the covers of Amazon Redshift, a fast, fully-managed, petabyte-scale data warehouse service for less than $1,000 per TB per year. Learn how Amazon Redshift uses columnar technology, optimized hardware, and massively parallel processing to deliver fast query performance on data sets ranging in size from hundreds of gigabytes to a petabyte or more. We'll also walk through techniques for optimizing performance and, you’ll hear from a specific customer and their use case to take advantage of fast performance on enormous datasets leveraging economies of scale on the AWS platform.
Similar to Near Real-Time Data Analysis With FlyData (20)
What is Change Data Capture (CDC) and Why is it Important?FlyData Inc.
Check out what Change Data Capture (CDC) is and why it is becoming ever more important. Slides also include useful tips on how to design your CDC implementation.
What's So Unique About a Columnar Database?FlyData Inc.
Looking for the right database technology to use? Luckily there are many database technologies to choose from, including relational databases (MySQL, Postgres), NoSQL (MongoDB), columnar databases (Amazon Redshift, BigQuery), and others. Each choice has its own pros and cons, but today let’s walk through how columnar databases are unique, by comparing it against the more traditional row-oriented database (e.g., MySQL).
Three Things to Consider When Making Investments in Your Big Data InfrastructureFlyData Inc.
Is your company data-driven? Thought so. Data and “data-drivenness” has become so integral to companies’ success nowadays, that it would feel weird to hear somebody say “oh, our company doesn’t care about data”.\r\n\r\nIn fact, it sounds almost as if it’s a crime to disrespect data like that. So, for those of you who want to do the morally right thing and get the most out of your data, let’s go over what you need to consider when making the necessary technology investments to support your data infrastructure.
Biases come in many different forms, but in this article, we’ll touch on few of the major ones that are known to have considerable effects on research and science. Here is a brief list of four cognitive biases that may affect you as a researcher or data scientist.
From wearable computing to heart sensors that send frequent updates to other devices, IoT envisions a future of near-infinite connectivity. In a nutshell, the evolution of technology will bring us to a future where all of our devices will not be just connected to the internet, but to each other.
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Accelerate Enterprise Software Engineering with PlatformlessWSO2
Key takeaways:
Challenges of building platforms and the benefits of platformless.
Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
How Choreo enables the platformless experience.
How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
Demo of an end-to-end app built and deployed on Choreo.
Multiple Your Crypto Portfolio with the Innovative Features of Advanced Crypt...Hivelance Technology
Cryptocurrency trading bots are computer programs designed to automate buying, selling, and managing cryptocurrency transactions. These bots utilize advanced algorithms and machine learning techniques to analyze market data, identify trading opportunities, and execute trades on behalf of their users. By automating the decision-making process, crypto trading bots can react to market changes faster than human traders
Hivelance, a leading provider of cryptocurrency trading bot development services, stands out as the premier choice for crypto traders and developers. Hivelance boasts a team of seasoned cryptocurrency experts and software engineers who deeply understand the crypto market and the latest trends in automated trading, Hivelance leverages the latest technologies and tools in the industry, including advanced AI and machine learning algorithms, to create highly efficient and adaptable crypto trading bots
First Steps with Globus Compute Multi-User EndpointsGlobus
In this presentation we will share our experiences around getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in the researcher's workflows, which they wish to manage from their familiar Windows environments, onto the NeSI (New Zealand eScience Infrastructure) cluster. Some of the challenges we have encountered were that each researcher had to set up and manage their own single-user globus compute endpoint and that the workloads had varying resource requirements (CPUs, memory and wall time) between different runs. We hope that the multi-user endpoint will help to address these challenges and share an update on our progress here.
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.
How Recreation Management Software Can Streamline Your Operations.pptxwottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Cyaniclab : Software Development Agency Portfolio.pdfCyanic lab
CyanicLab, an offshore custom software development company based in Sweden,India, Finland, is your go-to partner for startup development and innovative web design solutions. Our expert team specializes in crafting cutting-edge software tailored to meet the unique needs of startups and established enterprises alike. From conceptualization to execution, we offer comprehensive services including web and mobile app development, UI/UX design, and ongoing software maintenance. Ready to elevate your business? Contact CyanicLab today and let us propel your vision to success with our top-notch IT solutions.
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...informapgpstrackings
Keep tabs on your field staff effortlessly with Informap Technology Centre LLC. Real-time tracking, task assignment, and smart features for efficient management. Request a live demo today!
For more details, visit us : https://informapuae.com/field-staff-tracking/
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
Understanding Globus Data Transfers with NetSageGlobus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
3. Difficulty of loading data to Redshift
~Difference between traditional DBs and Redshift~
MySQL, PostgreSQL, Oracle, etc.
Amazon Redshift
Transactional
RDB
Data
warehouse
SQL
INSERT
Bulk upload, but
how?
synchronous
asynchronous
www.flydata.com
4. MySQL, PostgreSQL, Oracle, etc.
Amazon Redshift
SQL
INSERT
FlyData
Transactional
RDB
Data
warehouse
synchronous
asynchronous
Difficulty of loading data to Redshift
~Difference between traditional DBs and Redshift~
www.flydata.com
5. Process of upload to Redshift
1. Data extraction (E)
2. Transform data (T)
3. Upload TSV file to S3
4. Run COPY command to
load data from S3 to
Redshift (L)
5. Error Handling
Amazon
Redshift
S3TSV
Data Extraction
(E) and
Transform (T)
Client
Server
Log Files
Load to
DB(L)
Error
Handling
www.flydata.com
6. FlyData: Near-Real Time Upload To
Redshift
Manage all with
FlyData
Amazon
RedshiftClient
Server
www.flydata.com
9. FlyData – A service for Amazon Redshift
1. Continuous Loading
2. Flexible JSON format Support
3. Query Scheduling and Management
4. All-in-One package for Amazon Redshift
www.flydata.com
10. Continuous Loading
• Near Real-time Data: Send data to Redshift
periodically, every 5 minutes
• Scaling. FlyData can handle large amounts of data
(100GB+ per day) for many tables, while optimizing
appropriately with scheduled COPY commands
• Error handling.
– Retry and notifications.
– Even when Redshift is in its
maintenance window
www.flydata.com
11. Nested JSON and Apache Log
Formats
• Support for Nested JSON logs and Apache log
formats, not yet offered by AWS
• Dynamic Column Creation
– Brings flexibility to tables
– Less need to predefine table schema
• Smooth handling of nested data
– Auto-creation of parent-child table relationships
www.flydata.com
12. Example of auto-creating tables from
JSON Logs
Your JSON logs:
Get stored in RS as:
www.flydata.com
13. Flexible JSON format Support
• Your JSON log can be loaded into Redshift
directly!
• Automatic creation of tables and columns for
Redshift from your JSON log
• Nested JSON support
– Handles structure by creating
parent-child table relations
with foreign keys
www.flydata.com
14. Query Scheduling and Management
• Stored SQL management on web console
• Mail notifications and downloads for queries
that take a long time to run
• Periodical query scheduling
(under development)
– Time scheduled query processing
– Running maintenance tasks
www.flydata.com
15. All in One package for Amazon
Redshift
• We are an Amazon Redshift partner
– Officially listed on
https://aws.amazon.com/redshift/partners/
• Complete technical support for FlyData & Redshift
• As a Reseller Partner,
we can provide Amazon
Redshift under a flexible
pricing schedule
www.flydata.com
17. FlyData Sync
• Released in January 2014
• Enables Synchronization between RDBMS to
Redshift. (Currently supporting MySQL)
• Just another feature of FlyData for Redshift
– Easy setup through web/command-line interface
– One-line install command
• Supporting Insert / Delete / Update statements
www.flydata.com
18. 18
Amazon Redshift
Customer Data Center or Cloud
FlyData Client
Replication
binlog access
binlog access
Read Replica
is Optional
scalable
data servers
Amazon S3
Load Controller
Load
Optimization
for Redshift
FlyData Sync for MySQL
www.flydata.com
19. FlyData Sync Requirements
• Support currently limited to MySQL
• FlyData module must be installed on a data server with
access to MySQL transaction logs
• Supported MySQL DB Engines: InnoDB and MyISAM
• Transaction log format: ROW
– --binlog-format=ROW
• Synced table must have Primary Key set
• For data types not supported on Redshift:
– MySQL’s "binary”,"varbinary” switched to “VARCHAR”, etc.
www.flydata.com
20. Use Case: Game Analytics
• Multi-platform game titles
FlyData client module makes it easy to manage
• Basic Log Format: JSON
Makes analytics flexible and reduces data
• Large amounts of data in popular titles (200GB / day)
– Large amounts of data are concentrated in a specific table
– Hard to load in real-time (due to Redshift restrictions)
FlyData can handle it!
www.flydata.com
21. Contact Information
• sales@flydata.com
• Toll Free: 1-855-427-9787
• http://flydata.com
We are an official data
integration partner of
Amazon Redshift
www.flydata.com
24. Real-time analytics for gaming client
• Case
– Client is a leading mobile gaming company in Japan with multiple released
game titles
– Previously large amount of data was stored MySQL cluster
– MySQL often went down because of the large amount of data. Repair took
weeks of man-hours every time this happened.
– Historical analysis over multiple years was simply impossible, given the data
size.
• Solution
– Implemented FlyData Enterprise with JSON logs across multiple titles
– Outputs user activity by application into JSON log files
– Data is automatically fed to Amazon Redshift
• Result
– Engineering time is saved and real-time BI insights can be fed back to
application development cycle
– Client saves 2 weeks of man-hours every month, with added insight into user
behavior. As a result, the client continues to steadily grow its user base and its
bottom line.
www.flydata.com
26. Data analytics on Online Ad
Effectiveness
• Case
– Client is a online advertisement startup in the US with Display Ads shown across multiple websites
– User activity from the duration of engagement to the position of the cursor is all logged to measure
viewer engagement
– Client needs to save large amounts of data, and be able to query that data real-time. This data will then
be used to generate Ad Performance Reports.
– Their initial option Hadoop turned out to be too costly in terms of Engineering time. The learning curve
for the team was steep, for both query generation and maintenance of their Hadoop clusters
• Solution
– Implemented FlyData Enterprise using “Extended” Apache logs
– Outputs all user activity in Apache logs with additional information appended, such as key-value pair
information for URL parameters and custom variables
– Data is automatically fed to Amazon Redshift in the appropriate columns. When appropriate columns
do not exist, the columns are added on the fly. This allows for added flexibility in table schema design
– Customer can now know the real-time effectiveness of their online advertisements through Ad
Performance Reports
– The client’s internal BI team can quickly analyze which ads are working and which are not,
in real-time and can gain insight or optimize for the best performing ads
• Result
– With a more cost-effective solution than Hadoop, client was able to increase revenue by steadily
increasing the quality of ads based on data gathered by FlyData and analyzed in Amazon Redshift.
– Client has an implemented scalable backend reporting system that can handle multi-TB sized ad
campaigns.
www.flydata.com
28. Faster Feedback, Faster Development
Cycles
• Case
– Client is a digital media startup in the US that has a website with rapid growth in user
access, becoming one of the most “Like”d pages on Facebook 1000万を超える
– User activity logs are carefully analyzed and assessed both for the website content and
for the user experience
– Used log data to perform funnel analysis on customer conversion rates
– Client received user activity from its site as JSON objects, before storing it in MongoDB
– Given the nature of the queries they wanted to run, MongoDB became very slow as their
user base grew
• Solution
– Implemented FlyData Enterprise using nested JSON logs
– Outputs all user activity as a JSON log file
– FlyData automatically uploads the data into Redshift, so BI team (= App Development
team) can simply query their user activity logs
– Client now can quickly perform funnel analysis on customer data
• Result
– Query speed dramatically improved. Queries that took 20 minutes before, now take less
than a minute, while still being able to have the flexibility of JSON.
– Faster development cycles (Build-Measure-Learn cycles) were achieved.
www.flydata.com
29. Contact Information
• sales@flydata.com
• Toll Free: 1-855-427-9787
• http://flydata.com
We are an official data
integration partner of
Amazon Redshift
www.flydata.com
30. www.flydata.com www.flydata.com
Check us out!
-> http://flydata.com
sales@flydata.com
Toll Free: 1-855-427-9787
http://flydata.com
We are an official data integration
partner of Amazon Redshift