Sort order across all keys.Columns can differ across rows.Value can have many different types.Entry can convey information with an empty value.Each key has a visibility – a given read will see a subset of the data.Visibility is part of key uniqueness – PSYCH_JD is withholding information from JD.
Tablet Servers have 4 primary functions:Hosting RPCs (read, write, etc.)Managing resources (RAM, CPU, File I/O, etc.)Scheduling background tasks (compactions, caching, etc.)Handling key/value pairs (via Iterators)BecauseAccumulodoesn’t use hashing to assign key-value pairs to servers, we need:We need to store the mapping of TabletServer-to-Tablet . This mapping is stored in another Tablet in Accumulo called the Metadata Table. A client need only scan a portion of the Metdata table to find which TabletServers have the Tablets it wants. (binary search through the metadata hierarchy (NEED input, is this correct) The Metadata table’sTabletServer-to-Tablet assignments must also be stored somewhere. These are written to the first Tablet of the Metadata table, called the Root Tablet. However, you may notice that the Root Tablet itself is stored in Accumulo! Somewhat of a circular dependency. That’s what we use Zookeeper for: The location of the Root Tablet is always known to ZooKeeper.
ApachAccumulo runs on top of Hadoop and ZooKeeperIt relies on HDFS for storage of data and Zookeeper for:- storage of config data (location of metadata Root Tablet) - locking of tabletsZookeeper uses Quorum consistency algorithms for High Availability to prevent Single Point of FailureZookeeper itself is actually a K/V datastore, but holds very little dataAlthough we’ll be running ZK on the same machines as Accumulo and Hadoop, it’s recommended that the Quorum of servers be separate.
1. Securely explore your data
Adam Fuchs and John Vines
January 9, 2013