Biotea is a semantic dataset produced by RDFizing (converting to RDF) the open-access subset of PubMed Central. It interconnects scholarly documents and their metadata through extensive use of existing ontologies and semantic enrichment services, yielding machine-readable scholarly documents that are self-describing. The Biotea dataset and tools provide a flexible and adaptable way to enrich and process biomedical documents into a highly interconnected dataset.
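
To make the idea of RDFization concrete, the following minimal sketch serializes one article's metadata and ontology annotations as N-Triples. It is an illustration under stated assumptions, not Biotea's actual pipeline or data model: the `http://example.org/pmc/` namespace, the function names, and the sample identifier are hypothetical, and the choice of Dublin Core Terms and BIBO properties is only one plausible way to reuse existing vocabularies.

```python
from typing import List, Optional, Sequence

# Widely used vocabularies; selecting these particular terms is an assumption
# made for this sketch, not a statement about Biotea's own model.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS = "http://purl.org/dc/terms/"
BIBO = "http://purl.org/ontology/bibo/"

# Hypothetical base namespace for article URIs (not Biotea's real one).
BASE = "http://example.org/pmc/"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def article_to_ntriples(
    pmcid: str,
    title: str,
    doi: Optional[str] = None,
    subject_uris: Sequence[str] = (),
) -> List[str]:
    """Return N-Triples lines describing one article and its annotations."""
    subject = f"<{BASE}{pmcid}>"
    triples = [
        # Type the resource so consumers know what kind of thing it is.
        f"{subject} <{RDF_TYPE}> <{BIBO}AcademicArticle> .",
        f'{subject} <{DCTERMS}title> "{escape_literal(title)}" .',
    ]
    if doi is not None:
        triples.append(f'{subject} <{BIBO}doi> "{escape_literal(doi)}" .')
    # Each annotation links the article to an ontology term by URI, which is
    # what connects the document to other datasets using the same ontology.
    for uri in subject_uris:
        triples.append(f"{subject} <{DCTERMS}subject> <{uri}> .")
    return triples


# Illustrative input only: the identifier and title are made up.
for line in article_to_ntriples(
    "PMC0000000",
    "An example article",
    subject_uris=["http://purl.obolibrary.org/obo/GO_0008150"],
):
    print(line)
```

Because annotations are stored as links to ontology term URIs rather than as free text, two articles annotated with the same term become connected without any further processing, which is the sense in which such a dataset is interconnected.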