Getting started (OSX)
brew install rabbitmq maven coreutils wget
# Check this works without a passphrase
# Check that the GNU coreutils cmds
# (grm, gcp, gln, gmv) are on your PATH
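One way to run that PATH check is sketched below. The helper name `missing_cmds` is made up for this example and is not part of Hydra or Homebrew; it relies only on the standard `command -v` shell builtin.

```shell
# Print the name of every command given as an argument that is NOT on PATH.
# Prints nothing when all of them are found.
# (missing_cmds is a hypothetical helper, not something Hydra ships.)
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# The four GNU coreutils commands named above.
missing_cmds grm gcp gln gmv
```

If this prints any names, those commands are not visible to the current shell, and the PATH likely needs adjusting before going further.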
# Clone & build
git clone https://github.com/addthis/hydra.git
Getting started (2)
# Start local stack
# yes, twice!
# UI should now be running
# Sample job definition file available at
# Click ‘Create’, copy-paste the job config,
# save the job and click ‘Kick’ to run it.
# Click the ‘Q’ button to open the query UI
# and see the resulting data.
Analysing text files
## “files” source is broken. Use “mesh2”.
## Docs are out of date. Read the source
# Mesh filesystem root is here:
# Here’s an example job config I used to
# parse some TSV-formatted Apache logs
● If you have Small Data,
use grep, awk, sort, uniq
● If you have Big Data,
● If you really like trees,
use Hydra ;)
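For the Small Data case, a minimal sketch of the awk, sort, uniq approach on a TSV log might look like this. The function name `count_by_column` and the column number in the usage note are assumptions for illustration; the real log layout is not given in these slides.

```shell
# Count how often each value appears in one column of tab-separated input
# read from stdin, most frequent first. Output is "value<TAB>count".
# (count_by_column is a hypothetical helper for this example.)
count_by_column() {
  col="$1"
  awk -F '\t' -v c="$col" '{ print $c }' |
    sort |
    uniq -c |
    sort -rn |
    # uniq -c emits "<spaces><count> <value>"; move the count to the end.
    awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print $0 "\t" n }'
}
```

Usage would be something like `count_by_column 2 < access.tsv`, where both the file name and the column index are placeholders. Everything is held by `sort`, so this stays practical only while the data fits comfortably on one machine.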