Be the first to like this
Since Doug Cutting invented Hadoop and Amazon Web Services released S3 ten years ago, we've seen quite a bit of innovation in large-scale data storage and processing. These innovations have enabled engineers to build data infrastructure at scale, many of them fail to fill their scalable systems with useful data, struggling to unify data silos or failing to collect logs from thousands of servers and millions of containers. Fluentd and Embulk are two projects that I've been involved to solve the unsexy yet critical problem of data collection and transport. In this talk, I will give an overview of Fluentd and Embulk and give a survey of how they are used at companies like Microsoft and Atlassian or in projects like Docker and Kubernetes.