This document provides an overview of Hadoop and lessons learned from running it in production. It discusses why Hadoop is used, how MapReduce and HDFS work, tips for integration and operations, and the outlook for the Hadoop community as it moves toward real-time capabilities and refined APIs. Key takeaways: use Hadoop only if you must, really understand your data pipeline, and "unbox the black box" of Hadoop.
23. Custom Writables
public static class Play extends CustomWritable {
  public final LongWritable time = new LongWritable();
  public final LongWritable owner_id = new LongWritable();
  public final LongWritable track_id = new LongWritable();

  public Play() {
    fields = new WritableComparable[] { owner_id, track_id, time };
  }
}
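`CustomWritable` is not part of Hadoop's API; the slide implies a base class that serializes the `fields` array in declaration order, so subclasses only list their fields once instead of hand-writing `write`/`readFields`. A minimal stand-in sketch of that idea, using plain `java.io` to stay self-contained (`LongField` is a hypothetical substitute for Hadoop's `LongWritable`):

```java
import java.io.*;

// Hypothetical stand-in for Hadoop's LongWritable: a mutable boxed long.
class LongField {
    private long value;
    public void set(long v) { value = v; }
    public long get() { return value; }
    public void write(DataOutput out) throws IOException { out.writeLong(value); }
    public void readFields(DataInput in) throws IOException { value = in.readLong(); }
}

// Sketch of the CustomWritable idea: write and read the fields array in
// the same order, so subclasses only declare their fields once.
abstract class CustomWritable {
    protected LongField[] fields;
    public void write(DataOutput out) throws IOException {
        for (LongField f : fields) f.write(out);
    }
    public void readFields(DataInput in) throws IOException {
        for (LongField f : fields) f.readFields(in);
    }
}

class Play extends CustomWritable {
    public final LongField time = new LongField();
    public final LongField owner_id = new LongField();
    public final LongField track_id = new LongField();
    public Play() { fields = new LongField[] { owner_id, track_id, time }; }
}
```

A round trip through a byte stream recovers the field values, because serialization and deserialization walk the same array.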
25. Re-Iterate
public void reduce(
    LongTriple key,
    Iterable<LongWritable> values,
    Context ctx) {
  // Broken: the values iterator can only be consumed once,
  // so the second loop sees nothing.
  for (LongWritable v : values) { }
  for (LongWritable v : values) { }
}

public void reduce(
    LongTriple key,
    Iterable<LongWritable> values,
    Context ctx) {
  // Works: buffer on the first pass, then re-iterate the buffer.
  // Copy each value -- Hadoop reuses the same Writable instance
  // across the iterator.
  buffer.clear();
  for (LongWritable v : values) { buffer.add(new LongWritable(v.get())); }
  for (LongWritable v : buffer) { }
}
HADOOP-5266 (applied to 0.21.0)
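The reason buffering must copy values: Hadoop hands the reducer the same mutable `Writable` instance on every step of the iterator, so a buffer of references ends up holding N pointers to the last value. A self-contained sketch of that reuse behavior in plain Java (`Holder` and `ReusingIterable` are hypothetical stand-ins for the reused Writable and the values iterable):

```java
import java.util.*;

// Hypothetical stand-in for a reused Writable: the iterable below hands
// out the SAME mutable instance on every call to next().
class Holder {
    long value;
}

class ReusingIterable implements Iterable<Holder> {
    private final long[] data;
    private final Holder shared = new Holder(); // one instance, reused
    ReusingIterable(long... data) { this.data = data; }
    public Iterator<Holder> iterator() {
        return new Iterator<Holder>() {
            int i = 0;
            public boolean hasNext() { return i < data.length; }
            public Holder next() { shared.value = data[i++]; return shared; }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }
}

class Demo {
    // Buggy: buffers references to the shared instance, so every entry
    // ends up reading the last value.
    static List<Long> bufferRefs(ReusingIterable values) {
        List<Holder> buffer = new ArrayList<Holder>();
        for (Holder v : values) buffer.add(v);
        List<Long> out = new ArrayList<Long>();
        for (Holder v : buffer) out.add(v.value);
        return out;
    }
    // Correct: copies the value out of the shared instance.
    static List<Long> bufferCopies(ReusingIterable values) {
        List<Long> buffer = new ArrayList<Long>();
        for (Holder v : values) buffer.add(v.value);
        return buffer;
    }
}
```

Buffering references yields `[3, 3, 3]` for the input `1, 2, 3`, while buffering copies preserves `[1, 2, 3]` — the same trap the `new LongWritable(v.get())` copy above avoids.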
26. BitSets
long min = 1;
long max = 10000000;
FastBitSet set = new FastBitSet(min, max);
for (long i = min; i < max; i++) {
  set.set(i);
}
org.apache.lucene.util.*BitSet
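`FastBitSet` is not a standard class; from the constructor arguments it is presumably a bit set over the id range `[min, max)`, stored offset by `min` so the underlying array starts at zero. A self-contained sketch of that idea (here named `RangeBitSet` and backed by `java.util.BitSet`; the slide's pointer to `org.apache.lucene.util` is for Lucene's long-indexed bit set implementations, which suit much larger ranges):

```java
import java.util.BitSet;

// Sketch of a bit set over the id range [min, max), stored offset by
// min so ids far from zero don't waste memory on unused low bits.
// Backed by java.util.BitSet (int-indexed) to stay self-contained.
class RangeBitSet {
    private final long min;
    private final BitSet bits;

    RangeBitSet(long min, long max) {
        this.min = min;
        this.bits = new BitSet((int) (max - min));
    }
    void set(long i) { bits.set((int) (i - min)); }
    boolean get(long i) { return bits.get((int) (i - min)); }
    long cardinality() { return bits.cardinality(); }
}
```

Setting every id in `[1, 10000000)` as in the slide's loop costs roughly `(max - min) / 8` bytes, versus a hash set of boxed longs at dozens of bytes per entry.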
28. General Tips
· test on small datasets, and test on your own machine first
· use many reducers
· always consider a combiner and a partitioner
· pig / streaming for one-time jobs, java / scala for recurring ones
http://bit.ly/map-reduce-book
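Why "always consider a combiner and a partitioner": a combiner pre-aggregates map output locally, so the shuffle moves one partial record per distinct key per mapper instead of one record per input occurrence, and the partitioner decides which reducer each key lands on. A self-contained sketch of both effects (plain Java collections standing in for Hadoop's machinery; the `partition` formula matches Hadoop's default `HashPartitioner`):

```java
import java.util.*;

class CombinerDemo {
    // Combiner effect: local pre-aggregation collapses repeated keys
    // into one (key, partialCount) record before the shuffle.
    static Map<String, Long> combine(List<String> mapOutputKeys) {
        Map<String, Long> partial = new HashMap<String, Long>();
        for (String k : mapOutputKeys) {
            Long c = partial.get(k);
            partial.put(k, c == null ? 1L : c + 1L);
        }
        return partial;
    }

    // Hadoop's default HashPartitioner routing: the same key always
    // lands on the same reducer, and results stay in [0, numReducers).
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }
}
```

Four map records with two distinct keys shrink to two shuffled records; the reducer then sums the partial counts, which works because addition is associative and commutative.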
29. Operations
· chef / puppet for configuration management
· runit / init.d for service supervision
· pdsh / dsh for running commands across the cluster, e.g.
pdsh -w "hdd[001-019]" "sudo sv restart /etc/sv/hadoop-tasktracker"
30. Hardware
· 2x name nodes, RAID 1: 12 cores, 48GB RAM, xfs, 2x1TB
· n x data nodes, no RAID: 12 cores, 16GB RAM, xfs, 4x2TB
48. Real Time Datamining and Aggregation at Scale (Ted Dunning)
Eventually Consistent Data Structures (Sean Cribbs)
Real-time Analytics with HBase (Alex Baranau)
Profiling and performance-tuning your Hadoop pipelines (Aaron Beppu)
From Batch to Realtime with Hadoop (Lars George)
Event-Stream Processing with Kafka (Tim Lossen)
Real-/Neartime analysis with Hadoop & VoltDB (Ralf Neeb)
49. Takeaways
· use hadoop only if you must
· really understand the pipeline
· unbox the black box
50. That’s it
folks!
@tcurdt
github.com/tcurdt
yourdailygeekery.com