This talk was given at the IIPC General Assembly in Paris in May 2014. It introduces the distributed, parallel extraction framework provided by the Web Data Commons project. The framework is publicly accessible and tailored to the Amazon Web Services stack. The presentation also includes an excerpt of the datasets that were extracted from over 100 TB of crawl data; these datasets are also available at http://webdatacommons.org.