This document provides an introduction to Apache Hive, including what Hive is, its key features and architecture. Hive is a data warehouse infrastructure tool used to process structured data in Hadoop. It allows users to query and analyze large datasets using SQL-like queries. Hive resides on top of Hadoop and uses MapReduce to process queries internally. It includes a metastore to store metadata, query compiler and execution engine to process queries, and can operate on data stored in HDFS or HBase.