The document summarizes a research project involving several Finnish universities that analyzes publishing trends and the development of public discourse in Finland between 1771 and 1910, using text mining techniques on digitized historical newspapers and journals. The project includes analyzing metadata about books and publications, identifying text reuse through machine learning methods, and publishing results in academic conferences and journals.
Computational History and the Transformation of Public Discourse in Finland, 1640 - 1910 (COMHIS)
1. Consortium partners:
• National Library of Finland, Centre for Preservation and Digitisation
• University of Helsinki, Faculty of Humanities
• University of Turku, Dept of Information Technology
• University of Turku, Dept of Cultural History
More info at http://goo.gl/tMH4RE
2. Researchers:
• National Library of Finland, Centre for Preservation and Digitisation
Kimmo Kettunen (PI), Mika Koistinen, Teemu Ruokolainen
• University of Helsinki, Faculty of Humanities
Mikko Tolonen (PI), Leo Lahti, Jani Marjanen, Hege Roivainen, Ville Vaara
• University of Turku, Dept of Information Technology
Tapio Salakoski (PI), Filip Ginter, Aleksi Vesanto
• University of Turku, Dept of Cultural History
Hannu Salmi (Consortium PI), Asko Nivala, Heli Rantala, Reetta Sippola
4. COMHIS Overview
Work package 1: Publishing Trends and the Development of Public Discourse
WP 1.1: Large-scale Analysis of Library Catalogue Metadata Collections
WP 1.2: Intellectual Geography and Transcending of National Borders
Work package 2: Viral Texts and Social Networks of Finnish Public Discourse in Newspapers and Journals 1771–1910
WP 2.1: Improving the Quality of Newspaper Digital Archives
WP 2.2: Virality of Newspaper and Journal Discourse in Nineteenth-Century Finland: Cultural Rhizomes and Social Networks
Work package 3: Data Analytical Ecosystem for Newspapers and Historical Document Collections
WP 3.1: Quantitative Tools for Bibliographic Library Catalogue Metadata Collections and Finnish Book Production (1488–1910)
WP 3.2: Machine Learning Methods for Text Mining
WP 3.3: Text Reuse and Paraphrasing in Finnish Newspapers and Journals, 1771–1910
WP 3.4: Open Source Statistical Workflows
5. National Library of Finland
NLF has a large digitized newspaper and journal collection covering 1771–1920 (and newer)
• http://digi.kansalliskirjasto.fi
Newspapers
Digitized: 4,501,147 pages.
Free use: 2,954,424 pages (65%) (–1920).
Copyrighted material: 1,546,723 pages (35%) (1921–).
Journals
Digitized: 6,378,717 pages.
Free use: 2,161,748 pages (33%) (–1920).
Copyrighted material: 4,216,969 pages (67%) (1921–).
12. Text reuse
• How much did newspapers and journals share each other's content?
• We have found 8 million clusters of repeated texts in the corpus of Finnish newspapers and journals 1771–1910, comprising a total of 49 million occurrences (hits)
• Different forms of text reuse: advertisements, notices, news, anecdotes, poems, etc.
• Long-term reuse
• Viral chains, explosive replication
14. Finding text reuse
• Programme called NCBI BLAST
• Used to compare and align biological sequences, like protein sequences
• Finds all similar sub-sequence pairs
• Our data is just text, not protein sequences
• We had to encode our data into protein sequences
• 23 amino acids available
• We formed a mapping from the 23 most common letters to the available amino acids
• We encoded the data using this mapping and discarded characters that didn’t have an equivalent
• “This is an example sentence” → “DSCHCHBEGBNQFGHGEDGEG”
• BLAST outputs all similar sub-sequences from our data
• We formed clusters by assigning all sub-sequences that overlap enough to the same cluster
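The encoding and clustering steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the project's actual code: the 23-symbol alphabet (`AMINO_ALPHABET`), the function names `build_mapping`, `encode` and `cluster_hits`, the span format `(doc_id, start, end)` and the `min_overlap` threshold are all hypothetical. BLAST itself is not invoked; the sketch assumes its hits have already been parsed into pairs of spans.

```python
from collections import Counter

# Assumption: the 20 standard amino-acid codes plus B, Z and X give 23 symbols.
# The slides do not say which 23 symbols the authors actually used.
AMINO_ALPHABET = "ACDEFGHIKLMNPQRSTVWYBZX"


def build_mapping(corpus, alphabet=AMINO_ALPHABET):
    """Map the most frequent letters of `corpus` onto the symbols of `alphabet`."""
    counts = Counter(ch for ch in corpus.lower() if ch.isalpha())
    # Most frequent first; ties broken alphabetically so the result is deterministic.
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    # zip() stops at the shorter input, so at most len(alphabet) letters are mapped.
    return dict(zip(ranked, alphabet))


def encode(text, mapping):
    """Encode `text`, discarding every character that has no equivalent."""
    return "".join(mapping[ch] for ch in text.lower() if ch in mapping)


def overlap_ratio(a, b):
    """Overlap length relative to the shorter span; 0.0 for different documents."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    shorter = min(a[2] - a[1], b[2] - b[1])
    return max(shared, 0) / shorter if shorter > 0 else 0.0


def cluster_hits(hits, min_overlap=0.8):
    """Group spans into clusters with a union-find structure.

    `hits` is a list of (span_a, span_b) pairs, each span a (doc_id, start, end)
    tuple. Two spans share a cluster if a hit aligns them, or if they lie in the
    same document and overlap by at least `min_overlap`.
    """
    spans = sorted({span for hit in hits for span in hit})
    parent = {span: span for span in spans}

    def find(span):
        while parent[span] != span:
            parent[span] = parent[parent[span]]  # path halving
            span = parent[span]
        return span

    for a, b in hits:
        parent[find(a)] = find(b)
    # Quadratic pairwise check: fine for a sketch, too slow for millions of hits.
    for i, a in enumerate(spans):
        for b in spans[i + 1:]:
            if overlap_ratio(a, b) >= min_overlap:
                parent[find(a)] = find(b)

    clusters = {}
    for span in spans:
        clusters.setdefault(find(span), set()).add(span)
    return list(clusters.values())
```

In this sketch, "overlap enough" is read as a fixed fraction of the shorter span, which is only one possible interpretation. A corpus-scale version would need an index over span positions instead of the pairwise comparison.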
15.
16. Publications
• Kimmo Kettunen, Tuula Pääkkönen: “Measuring Lexical Quality of a Historical Finnish Newspaper Collection – Analysis of Garbled OCR Data with Basic Language Technology Tools and Means”, LREC 2016.
• Kimmo Kettunen, Eetu Mäkelä, Juha Kuokkala, Teemu Ruokolainen, Jyrki Niemi: “Modern Tools
for Old Content - in Search of Named Entities in a Finnish OCRed Historical Newspaper Collection
1771-1910”, LWDA 2016: 124-135.
• Tuula Pääkkönen, Jukka Kervinen, Kimmo Kettunen, Asko Nivala, Eetu Mäkelä: “Exporting Finnish
Digitized Historical Newspaper Contents for Offline Use”, D-Lib Magazine 22(7/8) (2016).
• Mikko Tolonen, Jani Marjanen, Niko Ilomäki, Hege Roivainen and Leo Lahti, “Printing in a
Periphery: a Quantitative Study of Finnish Knowledge Production, 1640-1828”, Proceedings of
Digital Humanities 2016, long papers, Kraków, Poland, July, 2016
• Mikko Tolonen, Leo Lahti and Niko Ilomäki, “A Quantitative Analysis of History in the ESTC
catalogue”, Liber Quarterly, 25(2), pp. 87–116, 2016. DOI: http://doi.org/10.18352/lq.10112
• Aleksi Vesanto, Asko Nivala, Tapio Salakoski, Hannu Salmi and Filip Ginter: “A System for
Identifying and Exploring Text Repetition in Large Historical Document Corpora”, In Proceedings
of the 21st Nordic Conference of Computational Linguistics. Gothenburg, Sweden, 23–24 May
2017 (Linköping 2017), 330–333, http://www.ep.liu.se/ecp/131/049/ecp17131049.pdf
• Aleksi Vesanto, Asko Nivala, Heli Rantala, Tapio Salakoski, Hannu Salmi and Filip Ginter: “Applying
BLAST to Text Reuse Detection in Finnish Newspapers and Journals, 1771-1910”, Proceedings of
the 21st Nordic Conference of Computational Linguistics. Gothenburg, Sweden, 23–24 May 2017
(Linköping 2017), 54–58, http://www.ep.liu.se/ecp/133/010/ecp17133010.pdf