This document collects stories of failures in production Kafka clusters and the lessons learned from them. Examples include losing all data by deleting an important topic, running an outdated Kafka version, misconfiguring replication factors, and storing Kafka's log data in a temporary directory, which resulted in data loss. The goal is to share these stories so that others can learn from these mistakes and configure their own Kafka clusters for reliability.
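Several of the failure modes listed here can be spotted by reading a broker's configuration file. The following is a minimal sketch, not a definitive audit tool. It assumes a `server.properties`-style text and uses the real broker settings `log.dirs`, `log.dir`, `default.replication.factor`, and `delete.topic.enable`. The helper names `parse_properties` and `find_risky_settings`, the list of temporary path prefixes, and the choice of which values to flag are illustrative assumptions and do not come from the stories themselves.

```python
# Hypothetical helper: flag broker settings that resemble the mistakes described above.
# Paths treated as temporary are an assumption; adjust for your own systems.
TEMP_PREFIXES = ("/tmp", "/var/tmp")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def find_risky_settings(props):
    """Return a list of human-readable warnings for a parsed broker config."""
    warnings = []

    # When log.dirs is unset Kafka falls back to log.dir, whose default is /tmp/kafka-logs.
    log_dirs = props.get("log.dirs") or props.get("log.dir") or "/tmp/kafka-logs"
    for directory in (d.strip() for d in log_dirs.split(",")):
        if any(directory == p or directory.startswith(p + "/") for p in TEMP_PREFIXES):
            warnings.append(f"log directory {directory} is under a temporary path")

    # A replication factor of 1 keeps a single copy of each partition.
    replication = int(props.get("default.replication.factor", "1"))
    if replication < 2:
        warnings.append("default.replication.factor is 1, so new topics have no replicas")

    # Assumption: treating enabled topic deletion as worth a warning is a policy choice.
    if props.get("delete.topic.enable", "true").lower() == "true":
        warnings.append("delete.topic.enable is true, so topics can be deleted")

    return warnings


if __name__ == "__main__":
    sample = "log.dirs=/tmp/kafka-logs\ndefault.replication.factor=1\n"
    for warning in find_risky_settings(parse_properties(sample)):
        print("WARNING:", warning)
```

A check like this only covers broker defaults. Topics created with explicit settings, and problems such as an outdated Kafka version, would need to be inspected separately.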