This document describes how to build a data lake on AWS. Amazon S3 stores structured, semi-structured, and unstructured data at scale, and Amazon Kinesis handles streaming ingestion. A metadata catalogue built on Amazon DynamoDB and AWS Lambda supports data discovery and governance. IAM policies control access, and encryption with AWS KMS protects the data. APIs built with Amazon API Gateway provide programmatic access to the data lake's resources.
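
As a rough illustration of the metadata catalogue idea, the sketch below shows how a Lambda function might turn an S3 event notification into catalogue entries. It is a minimal sketch under stated assumptions, not the implementation the document describes: the attribute names (`object_id`, `size_bytes`, and so on) and the function names are hypothetical, and the DynamoDB write is left as a comment so the example runs without AWS credentials.

```python
from urllib.parse import unquote_plus


def catalogue_items(event):
    """Convert an S3 event notification into a list of catalogue entries.

    Each entry is a plain dict; the attribute names are illustrative
    assumptions, not a schema taken from the source document.
    """
    items = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # S3 URL-encodes object keys in event notifications, so decode them.
        key = unquote_plus(s3["object"]["key"])
        items.append({
            "object_id": f"{bucket}/{key}",  # hypothetical partition key
            "bucket": bucket,
            "key": key,
            "size_bytes": s3["object"].get("size", 0),
            "event_time": record.get("eventTime"),
        })
    return items


def handler(event, context):
    """Lambda-style entry point (hypothetical name)."""
    items = catalogue_items(event)
    # In a deployed function, each item would be written to the catalogue
    # table here, for example with boto3's Table.put_item(Item=item).
    return {"catalogued": len(items)}
```

Keeping the event parsing in a pure function separate from the handler makes the catalogue logic easy to test locally, with the storage call added only at the edge.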