This document discusses combining big data technology with semantic web standards for data integration. It describes loading multiple data sources into Hadoop and mapping each one to a common domain vocabulary, so that the combined data can be queried using SPARQL. Adding a new data source then means mapping it to the existing domain vocabulary; the existing queries do not need to change. Key technologies mentioned include RDF for the data model, RDFS for schemas, SPARQL for querying, and R2RML for mapping relational data to RDF triples.
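
The idea that a new source is integrated by mapping rather than by rewriting queries can be illustrated with a small sketch. It is plain Python, not real RDF, SPARQL, or R2RML tooling: triples are modelled as tuples, a dictionary stands in for an R2RML-style column-to-property mapping, and a one-predicate lookup stands in for a SPARQL triple pattern. The vocabulary URIs, source names, column names, and sample rows are all hypothetical and chosen only for illustration.

```python
# Minimal sketch: two tabular sources mapped into one shared vocabulary.
# All URIs, column names, and rows below are hypothetical examples.
BASE = "http://example.org/"
VOCAB = BASE + "domain#"

# Per-source mappings: source column name -> domain vocabulary property.
# These play the role an R2RML mapping would play for a relational table.
CRM_MAPPING = {"cust_name": VOCAB + "name", "cust_email": VOCAB + "email"}
BILLING_MAPPING = {"full_name": VOCAB + "name", "mail": VOCAB + "email"}


def rows_to_triples(rows, key_column, mapping, subject_prefix):
    """Turn tabular rows into a set of (subject, predicate, object) triples."""
    triples = set()
    for row in rows:
        subject = subject_prefix + str(row[key_column])
        for column, predicate in mapping.items():
            value = row.get(column)
            if value is not None:
                triples.add((subject, predicate, value))
    return triples


def match(triples, predicate):
    """Return sorted (subject, object) pairs for one predicate.

    This stands in for a single SPARQL triple pattern such as
    `?s <predicate> ?o`; it is not a SPARQL engine.
    """
    return sorted((s, o) for s, p, o in triples if p == predicate)


crm_rows = [{"id": 1, "cust_name": "Ada", "cust_email": "ada@example.org"}]
billing_rows = [{"acct": 7, "full_name": "Grace", "mail": "grace@example.org"}]

# Load the first source and run the "query".
graph = rows_to_triples(crm_rows, "id", CRM_MAPPING, BASE + "crm/")
names_before = match(graph, VOCAB + "name")

# Add a second source: only a new mapping is needed, the query is unchanged.
graph |= rows_to_triples(billing_rows, "acct", BILLING_MAPPING, BASE + "billing/")
names_after = match(graph, VOCAB + "name")

print(names_before)
print(names_after)
```

The same `match(graph, VOCAB + "name")` call returns rows from both sources once the second mapping is applied, which is the property the approach relies on: the query is written against the domain vocabulary, not against any one source's column names.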