This document discusses the challenges and results of extracting articles from a large digitized collection of Finnish historical newspapers (1771-1929) using the Pivaj software. Current article extraction algorithms achieve around 80-85% accuracy. In the analysis, Pivaj, and in particular its offline application, showed promising results in learning to segment and label articles correctly. Despite some limitations, such as its handling of advertisement-heavy pages, Pivaj outperformed other systems, including Docworks, in identifying articles on newspaper pages.