Increasing demands to collect, store, and analyze massive amounts of data often mean that the tools and approaches that worked in the past no longer work. That’s why many organizations are shifting to a data lake, an architectural approach that allows you to store massive amounts of data in a central location, so it’s readily available to be categorized, processed, analyzed, and consumed by diverse groups within an organization. In this tech talk, we introduce key data lake concepts and present aspects of its implementation. We highlight the core components of a data lake, such as storage, compute, analytics, databases, stream processing, data management, and security. We discuss how to choose the right technologies for each component of the data lake, based on criteria such as data structure, query latency, cost, request rate, item size, data volume, and durability. We also provide a reference architecture and recommendations to get started with a data lake implementation on AWS.
· Understand the key concepts and architectural components of a data lake
· Describe how and when to use a broad set of analytic and data management tools in a data lake architecture
· Learn how to get started with a data lake implementation on AWS