HBase is a distributed, column-oriented database that runs on top of HDFS. It provides random real-time read/write access to big data. HBase tables are sharded into regions distributed across region servers. The master server manages the cluster and schema changes. Clients communicate with the master and region servers to read and write data. Keys are sorted and data is stored in columns within column families.