It’s been ten years since scientists sequenced the human genome. But what do all these letters mean?
Researchers could identify in its 3 billion letters many of the regions that code for proteins, but those make
up little more than 1% of the genome, contained in around 21,000 genes: a few familiar objects in an otherwise stark and unrecognizable landscape. Many biologists suspected that the information responsible
for the wondrous complexity of humans lay somewhere in the ‘deserts’ between the genes (The ENCODE Project Consortium, 2012).
Interpreting the human genome sequence is one of the leading challenges of 21st-century biology
(Collins et al., 2003). In 2003, the National Human Genome Research Institute (NHGRI) embarked on an
ambitious project, the Encyclopedia of DNA Elements (ENCODE), aiming to delineate all of the functional elements encoded in the human genome sequence (The ENCODE Project Consortium, 2004). To further
this goal, NHGRI organized the ENCODE Consortium, an international group of investigators with diverse
backgrounds and expertise in the production and analysis of high-throughput functional genomic data. In a pilot phase spanning 2003–2007, the Consortium applied and compared a variety of experimental and computational methods to annotate functional elements in a defined 1% of the human genome (The ENCODE Project Consortium, 2007).