Short presentation on text and data mining from a digital heritage and library perspective, given at the FutureTDM Knowledge Café in Helsinki during the LIBER 2016 conference.
10 8 2007 Digital Classicist Work in Progress seminar (Stuart Dunn)
Digital Classicist Work in Progress seminar - broadly, but not totally, some reflections on the geospatial computing workshop in Edinburgh, July 23rd and 24th 2007
(BIG) DATA SCIENCE AND HISTORICAL ARCHAEOLOGICAL STUDIES: A METHODOLOGICAL, ... (4Science)
This document discusses the opportunities and challenges of applying big data science and digital humanities approaches to historical archaeological studies. It argues that while archaeological and historical data is growing in volume, it remains fragmented, contextual, and produced by humans rather than instruments. Therefore, historians and archaeologists need multidisciplinary data management and analysis skills to integrate diverse data sources while maintaining contextual understanding. A digital humanities framework can help analyze these relationships and contextual associations. However, approaches also require strong domain knowledge to avoid decontextualizing data. Virtual research environments may help manage the data lifecycle if they integrate into daily workflows and provide collaborative analysis and modeling tools.
Towards a Graph of Ancient World Data & an Ecosystem of Gazetteers (aboutgeo)
Pelagios 3 is a 2-year project funded by the Andrew W. Mellon Foundation to annotate geographic documents predating 1492 by associating places mentioned in them with entries from various gazetteers. It aims to grow a graph of linked ancient world data through annotation; over 39 partners from 6 countries have contributed around 830,000 annotations so far. The project develops tools to annotate documents and link gazetteer entries, and defines profiles for publishing gazetteer links and metadata online to allow cross-searching.
A presentation to attendees of our Arabic Scientific Manuscripts ground truth for OCR transcription workshop.
For more details see: https://www.eventbrite.co.uk/e/arabic-scientific-manuscripts-transcription-workshop-tickets-43303096728
About the project: http://blogs.bl.uk/digital-scholarship/2018/03/arabic-handwrittten-ocr.html
HUMlab: Virtual Worlds Learning and Research (James Barrett)
The document discusses the concept of archives and their evolution from Archive 1.0 to Archive 3.0. Archive 1.0 refers to early state archives stored as inscriptions, while Archive 2.0 denotes digitized archives with efficient search and retrieval. Archive 3.0 involves new architectures for producing and sharing archival resources in animated and interactive ways through remixing, engagement and regeneration. The document also lists various learning and research projects conducted in virtual worlds at HUMlab, including machinima filmmaking, language learning, and pharmacy simulations.
This document discusses a project funded by DEFRA to create a taxonomy backbone - a graph of taxonomy - for names and natural history collections. The project has three phases focusing on names and taxonomy, collections, and taxon-based information. The goal is to create, curate, and cite semantically meaningful objects or "things" rather than just strings. A three-layered model is proposed with strings at the bottom layer, things in the middle layer representing names in a nomenclatural sense, and a graph of things at the top layer representing taxonomic concepts and the relationships between them. This changes how data is created, curated, and cited, and allows for integration of different classification data and analysis of differences of opinions in
Learn essential legal literacies (copyright, contracts, privacy, ethics & policy, and special use cases/statutes) for supporting text data mining (TDM) research.
Professionalizing via Digital Humanities - Roopika Risam (Roopsi Risam)
Slides for a talk at the New England American Studies Association Spring Colloquium - "Professional Realities Inside and Outside the Academy" (May 3, 2014)
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA... (Micah Altman)
The WorldMap platform http://worldmap.harvard.edu is the largest open source collaborative mapping system in the world, with over 13,000 map layers contributed by thousands of users from Harvard and around the world. Researchers may upload large spatial datasets to the system, create data-driven visualizations, edit data, and control access. Users may keep their data private, share it in groups, or publish to the world.
The user base is interdisciplinary, including scholars from the humanities, social sciences, sciences, public health, design, planning, etc. All are able to access, view, and use one another’s data, either online, via map services, or by downloading.
Work is underway to create and maintain a global registry of map services, a step closer to one-stop access for public geospatial data. Another project is working on tools to support the visualization of spatial datasets with over a billion features. Collaborations are underway with groups inside Harvard, such as Dataverse, HarvardX, and various departments, and with groups outside Harvard, such as Cornell University and the University of Pennsylvania. Major additional contributors to the underlying source code include the World Bank, the U.S. State Department, and the United Nations.
The source code for the WorldMap platform is available on GitHub https://github.com/cga-harvard/cga-worldmap.
Location: E25-202
Discussant: Ben Lewis is system architect and project manager for WorldMap, an open source infrastructure that supports collaborative research centered on geospatial information. Before joining Harvard, Ben was a project manager with Advanced Technology Solutions of Pennsylvania, where he led the company in adopting platform-independent approaches to GIS system development. Ben studied Chinese at the University of Wisconsin and has a Master's in Planning from the University of Pennsylvania. After Penn, Ben helped start the GIS Lab at U.C. Berkeley, founded the GIS group for transportation engineering firm McCormick Taylor, and coordinated the Land Acquisition Mapping System for South Florida Water Management District. Ben is especially interested in technologies that lower the barrier to spatial technology access.
Information Science Brown Bag talks, hosted by the Program on Information Science, consist of regular discussions and brainstorming sessions on all aspects of information science and uses of information science and technology to assess and solve institutional, social and research problems. These are informal talks. Discussions are often inspired by real-world problems being faced by the lead discussant.
Context-free data analysis with Transcendental Information Cascades (Markus Luczak-Rösch)
To discover hidden relationships and patterns in data streams from multiple heterogeneous sources, we work on a method for exploratory data analysis. We disregard any system-specific context to generate generic networks of information co-occurrence. These networks allow for more informed sampling and filtering. Case-specific context can be added once these networks have been created to support accurate decision making.
Accounting for Productivity and Spillover Effects in Emerging Energy Technolo... (Richard Bowers)
This document summarizes a proposal for a capstone project examining cost reductions in the wind power industry in California from 1985-1995. The proposal identifies several reasons why costs may decline for emerging industries over time, including: 1) reductions within firms from R&D, learning by doing, and economies of scale; 2) reductions from spillovers between firms in an industry from learning and R&D; and 3) demand spillovers from increased usage of a technology. The proposal will examine evidence of "learning by doing" and potential for subsidies in the wind power industry in California, focusing on changes in quarterly electricity output at individual plants to determine if operational learning occurred.
Presentation on basic processes in HR management (final version) (abelgarcia52)
This document discusses two key processes in human resources management: job analysis and human resources planning. Job analysis involves describing the tasks, responsibilities, skills, and competencies required for each position. Human resources planning seeks to anticipate an organization's future staffing needs in order to ensure adequate recruitment, development, and retention. Together, these processes help organizations manage
This document describes several popular types of extreme sports, including bungee jumping, skydiving, mountaineering, and paragliding. It explains that extreme sports involve a degree of risk and of physical and mental difficulty. It then gives more detail on each sport, describing the activities involved in bungee jumping, skydiving, and mountaineering, and the equipment and techniques used in paragliding.
The document covers several eye conditions, such as red eye syndrome, blepharitis, stye, chalazion, conjunctivitis, corneal ulcer, acute glaucoma, acute anterior uveitis, and the ocular manifestations of AIDS. It provides details on the differential diagnosis of these conditions and their clinical symptoms.
The document presents the curriculum vitae of Alejandro Cejudo Dueñas, an architect, with telephone number 55 3366 1537 and email address alejandro.cejudo@gmail.com. It includes a list of 10 completed projects, ranging from Rhinoceros programs and Juzzfit stands to designs for a sustainable school, a museum, and apartment and commercial buildings.
Elizabeth Brickland has over 20 years of experience in administration, secretarial work, sales, and information technology. Her career began in administrative roles and she was quickly promoted to administrative team leader. She developed strong organizational, communication, and management skills. Later she worked in IT for over 15 years, holding roles from desktop support technician to senior service desk analyst. She is skilled in areas such as customer service, problem solving, and technical support.
Mid-Atlantic Wind - Overcoming the Challenges (Richard Bowers)
This document provides a summary of a study on the challenges and opportunities for developing offshore wind energy projects in the mid-Atlantic region of the United States. It finds that key barriers include regulatory hurdles at the state level relating to renewable portfolio goals and zoning. Technically, developing offshore wind farms faces challenges relating to construction costs, grid integration, and environmental impacts. However, the region has strong wind resources that could help states meet renewable goals in a cost-effective manner if these barriers are successfully addressed. The study evaluates various offshore wind project scenarios and finds some may be economically viable if costs continue to decline and states implement policies supporting offshore wind.
This document provides a final technical report for the Mid-Atlantic Offshore Wind Interconnection and Transmission project funded by the DOE. The report summarizes the project's key tasks and findings. It analyzed five scenarios of offshore wind buildout off the coast of PJM up to 69.7 GW of installed capacity. It found that an offshore HVDC transmission system like the Atlantic Wind Connection would result in lower transmission losses than piecemeal radial connections. High-resolution wind resource and forecasting modeling was performed. Analysis showed forecasts had satisfactory performance for capturing wind variability and power output at offshore nodes. The project provided insights into how different levels of offshore wind could affect transmission needs, congestion, and demand for grid services within PJM
The document describes the elements that make up an administrative act, including will, object, grounds, and form. It also classifies administrative acts by their nature, purpose, manner of expressing will, effects, and content. Finally, it explains the ways in which an administrative act may be extinguished, such as through its own fulfillment, nullity, revocation, or nonexistence.
This document presents a guide to the management consulting profession. It explains the nature and objectives of consulting, the different types of services it offers, and the consulting process. It also covers topics such as the consultant-client relationship, organizational change, culture, professional ethics, and the management of a consulting firm.
This document provides an overview of digital humanities (DH), including definitions, history, tools and projects. It discusses DH as using technology to enhance humanities research and communication. Definitions presented emphasize DH as an umbrella term for diverse activities involving technology and humanities scholarship. The history outlines early use of computers in humanities and development of standards like TEI. Tools discussed include network analysis, data visualization, text analysis, and GIS. Examples provided are DH projects mapping relationships and visualizing data. The role of libraries in supporting DH through collections, expertise, partnerships and experimentation is also covered.
This document provides an overview of digital humanities (DH), including definitions, a brief history, tools used in DH, and examples of DH projects and centers. DH is defined as using computational tools and methods to expand humanities research and communication. It has evolved from humanities computing beginning in the 1960s. Libraries play a key role in DH through activities like digitization, curation, and providing tools and space for DH work. The document discusses several DH tools and projects in South Africa and worldwide as illustrations.
A whirlwind introduction to digital humanities for the CDP workshop "Digital Humanities: Collections & Heritage - current challenges and futures", held February 22, 2018 at the Imperial War Museum.
Brown Bag: New Models of Scholarly Communication for Digital Scholarship, by ... (Micah Altman)
In his talk for the MIT Libraries Program on Information Science, Steve Griffin discusses how research libraries can play a key and expanded role in enabling digital scholarship and creating the supporting activities that sustain it.
This document provides an overview of digital humanities (DH), including brief definitions, a history of DH, examples of DH tools and projects, and recommendations for further reading. It describes DH as using digital technologies to enhance research in the humanities and explores new methods of scholarly communication. The history discusses early examples from the 1940s onwards and the rise of digital libraries and DH centers from the 1990s on. Tools highlighted include visualization, text analysis, GIS, and digital exhibits. Recommended resources give context to the role of libraries and provide examples of digital projects and tools.
Andrea Scharnhorst (2016) Humanities and ICT. Introduction at the Workshop National Infrastructure, Social Science and Humanities, January 20, 2015, ePlan workshop at NLeSC, Amsterdam.
This document discusses new directions for e-science in the arts and humanities. Specifically, it discusses using networks to connect resources like virtual libraries and museums. It also addresses challenges like dealing with large datasets from simulations and linking heterogeneous resources. Finally, it provides examples of past e-science projects in areas like dance documentation, image analysis, and musicology that have helped map e-science approaches to digital humanities research.
Oulu-e-Science Methods in Arts and Humanities (Stuart Dunn)
The document discusses the development of e-Science methods in the arts and humanities. It describes several projects that apply e-Science approaches, including using virtual research communities, geospatial computing, and ontologies. These projects involve digital resources in areas like dance, history, archaeology and music. The document advocates further developing e-Science methods to enable new forms of collaboration, access to cultural artifacts, and ways of analysis across disciplines.
This document discusses the use of e-science, or collaborative science using advanced computing and networking infrastructure, in archaeology. It describes how e-science allows for global collaboration, sharing of resources securely over networks, and new forms of collaboration. Examples provided include projects linking digital archives and publications, using geospatial modeling to simulate ancient battles, and constructing geodatabases of archaeological evidence like tephra deposits. E-science provides opportunities to better analyze and understand large, heterogeneous archaeological data sources.
Are New Digital Literacies Skills Needed rscd2018 (SusanMRob)
"Remarrying research and collection services around access to corpora and text mining: are new technical literacy skills needed?" was presented by Ingrid Mason (Deployment Strategist, AARNet) at the Research Support Community Day 2018.
Faculty center dh talk 2 s2016 pedagogical provocations (Jennifer Dellner)
This document discusses digital humanities (DH) pedagogy and contrasts it with traditional "ed tech" approaches. It argues that DH is local and contextual, involving specific configurations of tools, faculty, and students based on an institution's strengths and mission. DH emphasizes hands-on learning through making and production, using tools like programming, audio/video creation, and mapping in project-based ways. Examples provided include open-access textbook projects, rewriting Wikipedia, and digital mapping and narrative projects. The document advocates for DH approaches that encourage exploration, distraction, and making over purely delivering content.
Presentation given at the CBS (Central Bureau of Statistics) by CEDAR members on 06-11-2014 for the Studiemiddag "Digitalisering historische CBS-collectie" (digitisation of the CBS historical collection). It covers converting Excel spreadsheets to RDF Data Cube, harmonisation, and using Linked Data to standardize statistical data on the Web.
This document provides an overview of digital humanities (DH), including brief definitions and history, examples of DH projects and tools, and the role of libraries in supporting DH. Some key points include:
- DH uses computational methods to study the humanities and involves activities like digitization of collections, text analysis, and data visualization.
- It has roots in earlier humanities computing projects from the 1940s-1970s and grew with text encoding standards, digital libraries and DH centers in the 1990s-2000s.
- Example projects include Mapping the Republic of Letters, digital archives of WWI poetry, and datasets on the transatlantic slave trade.
- Libraries support DH through digitization, technical skills, project
This presentation was provided by Twyla Gibson and Ann Campion Riley, both of the University of Missouri, during the NISO Virtual Conference, The Computer Campus: Integrating Information Systems and Services, held on August 15, 2018.
Chaos&Order: Using visualization as a means to explore large heritage collec... TimelessFuture
Note: download the original PowerPoint to view the animations. Presentation at the 4th Int. Alexandria Workshop (19-20 October 2017): Foundations for Temporal Retrieval, Exploration and Analytics in Web Archives.
A Case Study Protocol For Meta-Research Into Digital Practices In The Humanities Jeff Brooks
This document presents a case study protocol for conducting meta-research on digital practices in the humanities. The protocol was developed by the Digital Methods and Practices Observatory working group to help researchers adopt this methodology across disciplines and approaches. The document discusses three pilot meta-research studies on digital practices that informed the protocol's development. It also provides several examples of how digital tools are being integrated into various stages of humanities research in uneven ways and highlights how research practices are unpredictable and assembled in response to specific project needs.
Digital Scholarship Intersection Scale Social Machines David De Roure
This document discusses digital scholarship and social machines. It begins with an overview of digital humanities and social machines. It then provides examples of digital scholarship projects that utilize large datasets, citizen science, and social annotation. These examples demonstrate how digital methods can facilitate collaboration at scale. The document argues that a digital strategy is needed to guide investment and support for research using digital infrastructure and methods at universities.
Developing tools in humanities computing Dave Marcial
This document discusses several projects related to digital humanities and computing in the Philippines. It describes projects involving indigenous knowledge preservation in history, linguistics, literature, art, archaeology and music. Other projects include a role-playing game about Negros Oriental history, a simulation game about a university's enrollment process and history, and a system for cataloging specimens in a natural history museum. The document emphasizes that humanities computing is interdisciplinary in nature and involves collaboration between humanities experts and computing specialists.
1. Text and Data Mining: Explaining the Relevance
dr. Steven Claeyssens | @sclaeyssens
2.
3. Text and data
= result of
more than 200 years of collecting
over 30 years of digitisation
almost 10 years of collecting born-digital
= machine readable, mostly textual
= structured or semi-structured
= legally as open as possible
8. Text and Data Mining
= using text corpora in bulk as complex (‘biggish’, structured and semi-structured) data
= using computational techniques (IR, NLP, ML, NER, vector space models, …) to derive information
by computer scientists and (digital) humanities scholars
e.g. historians: track actors (networks), concepts (semantic fields) and ideas over space and time
=> identifying patterns and needles (longue durée and microhistory)
= new ways to help us understand culture, society, humanity
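As a rough illustration of the historian's use case above (tracking a concept over time), the sketch below counts how often a term occurs per year, relative to all words from that year. It is a minimal sketch and not the method used in the presentation: the toy corpus, the `tokenize` helper and the `relative_frequency_by_year` function are all hypothetical, and real collections would need OCR correction, proper NLP tokenisation and far more text.

```python
import re
from collections import Counter

# Hypothetical toy corpus: (year, text) pairs standing in for digitised articles.
CORPUS = [
    (1900, "The railway reached the harbour and the railway company grew."),
    (1900, "The harbour trade was slow this winter."),
    (1950, "Television arrived while the railway declined."),
    (1950, "Television sets sold quickly and television changed evenings."),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relative_frequency_by_year(corpus, term):
    """Return {year: occurrences of term / total tokens for that year}."""
    term_counts = Counter()
    totals = Counter()
    for year, text in corpus:
        tokens = tokenize(text)
        totals[year] += len(tokens)
        term_counts[year] += tokens.count(term)
    return {year: term_counts[year] / totals[year] for year in sorted(totals)}


if __name__ == "__main__":
    # Print the per-year share of each term in the toy corpus.
    for term in ("railway", "television"):
        print(term, relative_frequency_by_year(CORPUS, term))
```

Normalising by the total number of tokens per year matters because the amount of digitised text usually differs strongly between periods; raw counts would mostly reflect collection size rather than changing usage.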