The document discusses large-scale digitization initiatives undertaken by the British Library. It details projects to digitize over 4 million pages of historic newspapers dating from 1620 to 1900, as well as collaborations with other organizations to digitize over 100,000 19th-century books. It examines the challenges of quality assurance, data storage, and text extraction from digitized materials. Automated processes and outsourcing were used to quality-check large volumes of digitized pages. Data storage requirements are listed for different file formats, with JPEG 2000 noted as offering good compression ratios for digitized text.