This document provides an overview of bio big data and related technologies. It discusses what big data is and why bio big data is necessary given the large size of genomic data sets. It then outlines and describes Hadoop, Spark, machine learning, and streaming in the context of bio big data. For Hadoop, it explains HDFS, MapReduce, and the Hadoop ecosystem. For Spark, it covers RDDs, Spark SQL, MLlib, and Spark Streaming. The document is intended as an introduction to key concepts and tools for working with large biological data sets.