This document provides an overview of entropy and conditional entropy in information theory. It begins with examples of encoding the outcomes of random variables whose values occur with different probabilities, choosing codes that minimize the expected number of bits needed. It then defines entropy as the average number of bits needed to encode an event drawn from a probability distribution. Several example distributions are given along with their entropies. Finally, it defines conditional entropy as the expected entropy of one variable given knowledge of another.
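As a brief sketch of the two definitions summarized above (the notation here is assumed for illustration, not taken verbatim from the document), the entropy of a discrete variable $X$ and the conditional entropy of $Y$ given $X$ can be written as

$$
H(X) = -\sum_{x} p(x)\,\log_2 p(x),
\qquad
H(Y \mid X) = \sum_{x} p(x)\, H(Y \mid X = x) = -\sum_{x,y} p(x,y)\,\log_2 p(y \mid x).
$$

For example, a fair coin has entropy $H(X) = 1$ bit, while a biased coin with $p = 0.9$ has $H(X) \approx 0.47$ bits, reflecting that more predictable outcomes need fewer bits on average.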