The report discusses the challenges of, and methods for, extracting statistical information from 130 years of digitized Medical Officer of Health reports, a collection of over 70,000 documents. Current practice shows that standard OCR is inadequate for accurate data extraction, which calls for automated solutions combined with advanced recognition techniques. Future work aims to improve data accessibility and quality through better table recognition algorithms and integrated data resources.