“Data? I don’t have data” is a common refrain for researchers working in the arts and humanities. Yet whether or not you consider yourself a “digital humanist,” the reality is that most of us are working digitally now, and there are different techniques for managing digital research assets than physical ones. This workshop explores how scholars of all stripes can add value to their research by making the products of their work more organized, transparent, usable, and ethical. In addition to instruction in best practices for managing research assets, participants in this workshop will create a short “data management plan,” which is excellent practice for fulfilling the NEA, NEH, and IMLS data management plan grant requirements!
Data Management for the Arts and Humanities
5. • MEET GRANT AND JOURNAL REQUIREMENTS
• PROMOTE TRANSPARENCY
• ENABLE NEW DISCOVERIES FROM YOUR DATA
• MAKE THE RESULTS OF PUBLICLY FUNDED RESEARCH
PUBLICLY AVAILABLE
8. “WHEN YOU CALL SOMETHING DATA, YOU IMPLY THAT IT EXISTS IN
DISCRETE, FUNGIBLE UNITS; THAT IT IS COMPUTATIONALLY
TRACTABLE; THAT ITS MEANINGFUL QUALITIES CAN BE ENUMERATED
IN A FINITE LIST; THAT SOMEONE ELSE PERFORMING THE SAME
OPERATIONS ON THE SAME DATA WILL COME UP WITH THE SAME
RESULTS. THIS IS NOT HOW HUMANISTS THINK OF THE MATERIAL
THEY WORK WITH.”
- MIRIAM POSNER, “HUMANITIES DATA: A NECESSARY CONTRADICTION”
11. • 2 PAGE LIMIT
• MUST ADDRESS TWO MAIN TOPICS:
• WHAT DATA WILL YOUR RESEARCH GENERATE?
• WHAT IS YOUR PLAN FOR MANAGING THE DATA?
• MUST REFLECT BEST PRACTICES IN THE APPLICANT’S AREA OF RESEARCH AND SHOULD
BE APPROPRIATE TO THE DATA THE PROJECT WILL GENERATE
• DMP COMPLIANCE WILL BE EVALUATED IN POST-AWARD MONITORING/REPORTS.
• RECOMMENDATION TO LOOK AT ICPSR, SUCCESSFUL ODH APPLICATIONS, AND
DIGITAL HUMANITIES CURATION GUIDE
12. • 5,000 CHARACTER LIMIT (ABOUT 2 PAGES)
• DOCUMENT HOW “ANY RAW DATA AND METADATA RESULTING FROM THE
PROPOSED PROJECT WILL BE MAINTAINED DURING AND BEYOND THE LIFE OF THE
GRANT.”
• DISCUSS CONFIDENTIALITY AS APPROPRIATE
• COSTS OF STORING AND SHARING ARE ALLOWABLE DURING THE GRANT PERIOD
• NO DETAILED PLAN IS NEEDED, AS LONG AS THE STATEMENT IS ACCOMPANIED BY A
CLEAR JUSTIFICATION.
13. IMLS– NEW! (2017)
• “DIGITAL PRODUCT FORM”
• COMMITTED TO EXPANDING PUBLIC ACCESS TO FEDERALLY FUNDED DIGITAL PRODUCTS
(E.G., DIGITAL CONTENT, RESOURCES, ASSETS, SOFTWARE, AND DATASETS).
• MUCH MORE STRUCTURED THAN NORMAL DMP REQUIREMENTS – 9 PAGE FORM
• SPECIAL SECTIONS FOR DATASETS, SOFTWARE, AND INTELLECTUAL PROPERTY
• SPECIFICALLY TELLS YOU TO CHARGE THE AWARD FOR PUBLICATION AND SHARING OF
DIGITAL PRODUCTS -- EVEN COSTS INCURRED AFTER PROJECT CLOSEOUT.
40. • CONSIDER ALL THE TYPES OF FILES YOU WILL HANDLE DURING THE COURSE OF
YOUR PROJECT.
• DEVELOP A NESTED FOLDER STRUCTURE THAT MAKES SENSE FOR YOUR PROJECT
AND YOUR TEAM’S RETRIEVAL NEEDS.
• NAME FOLDERS CLEARLY, WITHOUT SPECIAL CHARACTERS (AVOID REDUNDANCY)
• USE A STANDARD FOLDER STRUCTURE FOR EACH PROJECT OR SUBPROJECT
(INCLUDING MAKING FOLDERS FOR FILES NOT YET CREATED)
• CREATE A REFERENCE DOCUMENT (README FILE) THAT NOTES THE PURPOSE OF
EACH FOLDER.
University of Massachusetts Medical School Library http://libraryguides.umassmed.edu/file_management
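The folder advice on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the folder names, their descriptions, and the helper name scaffold_project are all hypothetical, not part of the slide or the UMass guide, and should be adapted to your own project.

```python
from pathlib import Path

# Hypothetical folder names and purposes; adapt to your project and team.
FOLDERS = {
    "data/raw": "Original files exactly as collected; never edited.",
    "data/processed": "Cleaned or transformed copies of the raw files.",
    "docs": "Consent forms, protocols, and the data management plan.",
    "outputs": "Drafts, figures, and other finished products.",
}


def scaffold_project(root):
    """Create the nested folders and a README noting each folder's purpose."""
    root = Path(root)
    for name in FOLDERS:
        # exist_ok lets the script be re-run without raising an error.
        (root / name).mkdir(parents=True, exist_ok=True)
    lines = ["Folder guide", ""]
    lines += [f"{name}: {purpose}" for name, purpose in FOLDERS.items()]
    readme = root / "README.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

Running scaffold_project("my_project") before any files exist gives every project or subproject the same starting structure, including folders for files not yet created.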
42. 1. Is there a better way to organize these files?
2. Can you spot any problems with the way these files are named?
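One way to make the file-naming question concrete is a small checker that flags common problems. The rules below (no spaces, no special characters, a date in the name, no vague words like "final") are assumptions chosen for illustration, not a standard, and name_problems is a hypothetical helper.

```python
import re

# Assumed convention: letters, digits, hyphens, underscores, one extension.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9]+$")
# Assumed convention: a date such as 2018-06-01 or 20180601 in the name.
HAS_DATE = re.compile(r"\d{4}-?\d{2}-?\d{2}")
# Words that say nothing about which version this really is.
VAGUE_WORD = re.compile(
    r"(?<![A-Za-z])(final|new|latest)(?![A-Za-z])", re.IGNORECASE
)


def name_problems(filename):
    """Return a list of human-readable problems found in a file name."""
    problems = []
    if " " in filename:
        problems.append("contains spaces")
    if not SAFE_NAME.match(filename):
        problems.append("has special characters or no extension")
    if not HAS_DATE.search(filename):
        problems.append("has no date")
    if VAGUE_WORD.search(filename):
        problems.append("uses a vague version word")
    return problems
```

Under these assumed rules, a name like 2018-06-01_interview_smith_v02.wav comes back clean, while "final draft!!.docx" is flagged on every count.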
48. “ALL DATA FILES WILL BE STORED ON THE UNIVERSITY SERVER THAT IS BACKED UP
NIGHTLY. THE UNIVERSITY'S COMPUTING NETWORK IS PROTECTED FROM VIRUSES BY A
FIREWALL AND ANTI-VIRUS SOFTWARE. DIGITAL RECORDINGS WILL BE COPIED TO THE
SERVER EACH DAY AFTER INTERVIEWS.
SIGNED CONSENT FORMS WILL BE STORED IN A LOCKED CABINET IN THE OFFICE.
INTERVIEW RECORDINGS AND TRANSCRIPTS, WHICH MAY CONTAIN PERSONAL
INFORMATION, WILL BE PASSWORD PROTECTED AT FILE-LEVEL AND STORED ON THE
SERVER.
ORIGINAL VERSIONS OF THE FILES WILL ALWAYS BE KEPT ON THE SERVER. IF COPIES OF
FILES ARE HELD ON A LAPTOP AND EDITS MADE, THEIR FILE NAMES WILL BE CHANGED.”
57. Unstructured Data / Structured Data
Title: Growth of rodent kidney cells in serum media and the effect of viral transformations on growth.
Author: Gary Bradshaw
Date: 1982
Publisher: University of Nebraska Medical Center
Subject: Kidney -- Cytology
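To show why structured metadata is useful, here is the slide's example record written as a Python dictionary and saved as JSON. The field values come from the slide; the lowercase field names, the REQUIRED list, and the missing_fields helper are assumptions made for this sketch, not a metadata standard.

```python
import json

# Values are taken from the slide's example record.
record = {
    "title": ("Growth of rodent kidney cells in serum media and the "
              "effect of viral transformations on growth."),
    "author": "Gary Bradshaw",
    "date": "1982",
    "publisher": "University of Nebraska Medical Center",
    "subject": "Kidney -- Cytology",
}

REQUIRED = ("title", "author", "date")  # assumed minimum, not a standard


def missing_fields(rec):
    """List required fields that are absent or empty in a record."""
    return [name for name in REQUIRED if not rec.get(name)]


# Structured data can be checked field by field and saved in an open format.
as_json = json.dumps(record, indent=2)
```

Because each value sits in a named field, a script can find every record with a given author or date, which is not possible with the same information buried in a paragraph of free text.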
70. • PUBLISH YOUR DATA ONLINE WITH A PERSISTENT
IDENTIFIER (DOI OR ARK)
• PUBLISH YOUR DATA IN A REPUTABLE, PUBLIC DATA
REPOSITORY
• CONVERT YOUR DATA TO STABLE, NON-PROPRIETARY
FORMATS FOR LONG-TERM ACCESS
• PUBLISH ENOUGH CONTEXT TO MAKE YOUR DATA
UNDERSTANDABLE (METADATA, CODE, WORKFLOWS)
• LINK YOUR DATA TO YOUR PUBLICATIONS AS OFTEN AS
POSSIBLE
71. • STATE HOW YOU WANT TO GET CREDIT FOR YOUR
DATA
• ALWAYS CITE THE SOURCES OF DATA THAT YOU USE
AND INCLUDE DATA CITATIONS WITH YOUR
DATASETS
• INCLUDE PUBLIC RESEARCH ASSETS IN YOUR
FACULTY PROFILE
Content from “Ten Simple Rules for the Care and Feeding of Scientific Data”
http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003542
74. Data archive services may include:
• STORAGE REDUNDANCY
• SECURITY/CONFIDENTIALITY
• LONG TERM PRESERVATION (FIXITY CHECKS, FORWARD MIGRATION)
• PERSISTENT IDENTIFIERS
• Metadata preparation
• Wider visibility of research and access to data
• Secondary analysis tools
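A fixity check, as mentioned under long term preservation, simply confirms that a file has not changed since a checksum was last recorded. The sketch below uses SHA-256 from Python's standard library; the function names and the choice of SHA-256 are assumptions for illustration, and a real archive may use different tools or algorithms.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file under folder (by relative path) to its checksum."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*")) if p.is_file()
    }


def changed_files(folder, manifest):
    """Return files whose checksum differs from the manifest, or are gone."""
    current = build_manifest(folder)
    return sorted(n for n in manifest if current.get(n) != manifest[n])
```

The idea is to build the manifest once when the files are deposited, store it alongside them, and re-run changed_files on a schedule; an empty list means nothing has been altered or lost.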
Introduction
I have wanted to do this version of a talk for the last few years...
Data is a term we aren’t all that comfortable with in the arts and humanities. I’ve worked with a few humanists on data management plans, and the first thing they often say is “I don’t have data.” Even for digital humanists who might feel more comfortable talking about data, few of them have been steeped in best practices for data management.
My hope for today is that we can talk a little bit about best practices for managing whatever digital objects you are working with, so you can make informed choices when it comes to collecting, organizing, naming, storing, and sharing the products of your research.
“The Sculptor” https://collections.lib.utah.edu/ark:/87278/s628079t
Let's start with a definition of data management. Some people conflate it with a single element, like data storage, but that doesn’t encompass everything data management involves. (READ DEFINITION)
Data management is the process of intervening in the research process to migrate data into new formats, to enhance it through additional layers of context, markup, or metadata, and to otherwise ensure that data is maintained in as highly-functional a form as possible. (https://guide.dhcuration.org/contents/intro/)
Another term often used is data curation, which of course has its roots in the museum and gallery world where curation means to add value to something, which is exactly what we do when we manage our data well.
An important note here is that DM is not something we do at just one point in the project. It doesn’t happen at the end when we package everything up and share it on GitHub or in a repository. It happens throughout the entire lifecycle of your data: planning your project, collecting and analyzing your data, publication, and long-term archiving.
_________________
Discovery and Planning – collecting new data, combining datasets, using secondary data? Need to consider these things before the project begins
Type and format of data
Consider privacy, confidentiality, ethical issues, consider documentation.
Identify potential users of data; will it be useful for secondary analysis
Identify data repository
Consider data management costs and budget
Data collection
File organization – naming conventions; versioning policies
Backup and storage policies
Quality Assurance Protocols – implement protocols to check on the data
Consider access control and data security
Preparation and Data Analysis Phase
Clean, manipulate, or process the raw data
Document any changes to the data
Create a “master” version to be analyzed and eventually archived (MAKE the final version of the data read only)
Document analysis procedures
Publication and Sharing
Prepare data files and other research materials for future reuse;
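The note above about making the final "master" version read only can be done by hand in your file manager, but here is a minimal sketch of the same step in Python. The helper name freeze_master and the idea of a separate master folder are assumptions for illustration; on Windows, os.chmod only toggles the read-only flag.

```python
import os
import shutil
import stat
from pathlib import Path


def freeze_master(source, master_dir):
    """Copy a cleaned file into master_dir and mark the copy read-only."""
    master_dir = Path(master_dir)
    master_dir.mkdir(parents=True, exist_ok=True)
    target = master_dir / Path(source).name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    # Remove write permission for owner, group, and others.
    mode = target.stat().st_mode
    os.chmod(target, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
    return target
```

Analysis then works from copies of the frozen file, so an accidental save cannot overwrite the version you plan to archive. Freezing the same file name twice will fail on purpose, because the existing master cannot be overwritten.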
Thinking about data management earlier is better. Consider what choices you can make so your data is more understandable, more open, and ready to share, in a format that is open or supported by an appropriate repository.
Let’s talk a little bit about what we mean when we say data. Data is incredibly diverse, and has a tendency to look different in different fields.
Scientific data – observations about the natural world, computational models, lab notebooks, streaming data coming in from sensors, and hand-collected data in the field.
Social sciences –surveys/public opinion, interviews, video recordings, field notes
Humanities – we don’t talk about data as much, but data might be a corpus of texts, big or small; records of human history like newspapers or yearbooks; images; letters; or birth, death, and marriage records. It might be quantitative or qualitative depending on your research.
Arts - sketchbooks, log books, sets of images, video recordings, trials, prototypes, ceramic glaze recipes, found objects, and correspondence.
Raw Materials of your research
Whatever research you are conducting, whatever your findings are, data is the alleged evidence you are using to support those findings.
Most of our research has an endpoint. A research paper, a finished art piece in an exhibit, a book. Data is the stuff you used and collected along the way that led to or validated the finished product. Traditionally in research, we shared the finished product. But what we are finding is that the underlying research materials might be just as important. And in a digital world it’s possible to make a curated set of the underlying research assets available and link them to the finished product to add context, transparency, and evidence to your work.
Why manage data?
The main reason you should manage your data/research assets is for yourself and for your own research team.
Prevent data loss or data errors
Data management is one of those essential skills you need to acquire, just like learning how to manage citations or understand research methods.
It can feel a bit boring, like filing. But five years down the road, when you want to locate a file, or even understand it, your future self will thank you.
There are other reasons data management is getting more attention besides being a best practice. So many people are talking about it because an increasing number of funders and journals require researchers to share their data.
2003 – NIH; 2011 – NSF; 2013 – OSTP memo: federal agencies with over $100 million/year in R&D must develop a plan to support public access to research. Now there are divisions of NEH and NEA that require it, as well as private funders like the Bill & Melinda Gates Foundation.
So why else would you want to even think about recording your creative process and all the stuff that comes with it? This gets me thinking about Jackson Pollock, the famous American abstract painter. Many of the photographs of Pollock and his drip painting process became as renowned as the paintings themselves.
Ceramic glaze recipe
Oil spot and hare’s fur glazes are beautiful and fascinating. In a nutshell, they are high-iron glazes that are applied in thick layers, which bubble up through one another and generate patterns ranging from metallic crystals to running streaks. These effects resemble, you guessed it, oil spots or the striated patterns in the fur of a rabbit. Of course, the explanation for how and why this happens is far more complex than that, so it’s a good thing John Britt did his homework and explains it so well in this post! –Jennifer Poellot Harnetty, editor.
Reading the 2,000 word post on how this look was achieved gave me a new appreciation that artist methods are not that dissimilar from scientific methods.
Not so much so that your work can be reproduced, but so that it can be understood.
The bottom line is that we are all working digitally now, and there are different techniques for managing digital assets than physical ones.
Version control – what gets kept? Some datasets never stop amassing data. (Twitter archive, Hubble telescope)
Ethical considerations - Lomax archive; Human subjects – many consent forms don’t include how the researcher plans on sharing the data.
Listening to the Lomax Archive: The Sonic Rhetorics of American Folksong in the 1930s. https://humanities.utah.edu/awards/jonathanstoneneh.php
Additional Challenge in the Arts and Humanities:
“it’s that humanists have a very different way of engaging with evidence than most scientists or even social scientists. And we have different ways of knowing things than people in other fields. We can know something to be true without being able to point to a dataset, as it’s traditionally understood.
And I would argue that the notion of reproducible research in the humanities just doesn’t have much currency, the way it does in the sciences, because humanists tend to believe that the scholar’s own subject position is inextricably linked to the scholarship she produces.”
- Miriam Posner
Nevertheless! Despite these challenges and reservations and the ways we might squabble over terminology, we have some real data management challenges in the arts and humanities and room for improvement when it comes to managing digital research assets.
So humanists and artists — even those who aren’t digital humanists — desperately need some help managing their stuff. – Miriam Posner
I’ve structured this workshop around common elements required in DMPs.
DMPs are the reason many of us are thinking about data management, and I like active learning, so I’ve framed today’s workshop around them; I think they are useful tools for learning the key concepts of data management. Many of us care about being competitive with grant funding…
Reviewing grants for NSF in 2016
Funding agencies don’t have identical requirements when it comes to DMPs. I’m going to highlight the requirements from 3 different funding agencies where I thought people attending this might look for funding opportunities.
https://www.neh.gov/sites/default/files/2018-06/data_management_plans_2018.pdf
Distribute DMPs
Think a bit about what materials you produce as part of your research process that might add context or clarity or just be of interest to others.
I thought it might be useful to show examples of what data might be in the arts and humanities, and then we're going to take 5 minutes to brainstorm about what your data is.
https://exhibits.lib.utah.edu/s/century-of-black-mormons/page/flake-green#?#documents&xywh=-1140%2C-6%2C3302%2C1246
Sometimes you don’t have to actually keep all the data you use, especially if you’re using secondary data that you can point to, like census records. But you need to provide enough context that someone else can find your data sources.
text files extracted from a corpus of texts by Optical Character Recognition software
Notice I don’t list things like word clouds here.
Getty Research Institute – “Streets of Los Angeles”
https://www.getty.edu/research/scholars/digital_art_history/pdfs/gri_ruscha_proposals.pdf
https://www.getty.edu/research/special_collections/notable/ruscha.html
The archive comprises over half a million images to date, including negatives, digital files, hundreds of contact sheets, and the complete production archive of Ruscha’s seminal artist book, Every Building on the Sunset Strip (1966).
https://artivity.io/
https://blog.spoongraphics.co.uk/tutorials/how-to-create-a-double-exposure-effect-in-photoshop
https://researchdata.jiscinvolve.org/wp/2016/11/22/research-data-creative-performing-arts/
Artivity
Funded by JISC
Open source
Created by the University of the Arts, London
Understanding the techniques of artists is an essential part of studying art and art history. The process of creating an artwork is often more valuable than the artwork itself.
In traditional art historical discourses, art forms such as painting, sculpture, and printmaking can be studied by technically examining the artwork for evidence of technique. In digital art, this evidence is often lost as soon as the editing session in a piece of software ends.
Artivity can document the creation process of your digital artwork. This is critical for attributing art, which is increasingly shared online, but also for interpreting individual artworks and their context within a given social and technical environment. It is a long-term self-archiving tool that does not interfere with your practice.
Artivity is a project that aims to produce a toolkit for capturing contextual data produced during the creative process of artists and designers while working on a computer. The Artivity open source software is developed by Semiodesk GmbH in partnership with the University of the Arts, London. The project was initiated by Dr. Athanasios Velios at the Ligatus Research Centre. It has been funded by JISC since March 2015.
https://www.semiodesk.com/2015/11/23/artivity-available-windows-mac/
KML is a file format used to display geographic data in an Earth browser such as Google Earth. KML uses a tag-based structure with nested elements and attributes and is based on the XML standard. All tags are case-sensitive and must appear exactly as they are listed in the KML Reference. The Reference indicates which tags are optional. Within a given element, tags must appear in the order shown in the Reference.
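To make that tag structure concrete, here is a minimal sketch in Python that builds a one-placemark KML file using only the standard library. The helper name `build_kml` and the place name, description, and coordinates are illustration values I made up, not part of any project shown here. The one KML-specific detail to notice is that coordinates are written longitude first, then latitude.

```python
import xml.etree.ElementTree as ET

# Namespace used by KML 2.2 documents.
KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(name, description, lon, lat):
    """Return a KML string with one Placemark (hypothetical helper)."""
    # Register KML as the default namespace so tags print without a prefix.
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
    # Tags are case-sensitive and are added here in the order
    # name, description, Point.
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    ET.SubElement(placemark, f"{{{KML_NS}}}description").text = description
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML coordinates are "longitude,latitude".
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Made-up example values for illustration only.
example_kml = build_kml("Example building", "Photographed for a street survey", -118.38, 34.09)
```

Saving `example_kml` to a file with a `.kml` extension should give you something an Earth browser can open.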
#stopkavanaugh vs. #confirmkavanaugh
Maybe you don’t have the raw data; say how you accessed the data.
“A book about a sonic archive should not be silent!” Stone said. “As a digital monograph under contract with The University of Michigan Press, I hope Listening to the Lomax Archive breaks new scholarly ground as it brings the sounds and other material artifacts of the Library of Congress’s American Folklife Center to the reader in ways not possible with conventional publishing,” he explained. “My plan is to take careful advantage of the format, demonstrating how digital scholarship can meet traditional scholarly expectations for monographs, while also changing the shape and sound of scholarly argumentation.”
Data – song files
Prisoners Breaking up Rocks at a Prison Camp or Road Construction Site. , None. [Between 1934 and 1950] [Photograph] Retrieved from the Library of Congress, https://www.loc.gov/item/2007660387/.
Take less than five minutes, turn to the person next to you and brainstorm about what your data might be. Think about one project or several projects. Talk about your data. Jot down notes on your handout.
We’ve talked in broad strokes about data management, but now we are going to focus in on some of the more specific aspects of managing data well.
One of the simplest things that you can do is to be more consistent with file naming, version control, and folder structures.
This section has a lot to do with organizing and naming your research materials so that you can find them later and so they will open in any environment.
These are the four main things you need to do when managing your research assets or data. Starting at the beginning of your project.
This is important whether you are doing traditional humanist work, digital humanities, or social science research. Too many of us continue to accrue digital content without ever thinking about how we plan on managing it.
We’ve talked about data management at kind of a high level. What is data? Why should you manage it well?
Now we are going to talk about some of the nuts and bolts of data management. Starting with file naming. How do you currently name files? Do you have a system?
To some extent we are all guilty of bad file naming but when it comes to your research it is important to create a system that makes sense not just to you, but other people as well.
Here are four best practices for naming your digital files.
File names should reflect the contents of a file and enough information to uniquely identify the data file without getting way too long.
Don’t be generic. Avoid file names like “MyData” that may conflict with other files when moved from one location to another.
Appropriate length – should be long enough to be descriptive but not so long that it becomes absurd.
Be consistent!!!! – whatever you do, do it consistently. If you are working as a group, document your file naming practices ahead of time in a shared document. Ensure the rules are followed systematically. Document your system, don’t rely on file names as your sole source of documentation.
Think critically about what can be added and what can be omitted in your file names. If you are the only person on a project, you probably don’t need your name. However, if you are submitting a paper for a class, the file name should start with your name, not with “Assignment #1” and the date. What differentiates your file from everyone else’s is you, not the date or the assignment number.
Here are some file naming best practices that will make sure your file will open in any environment, in any browser and with any operating system from as many eras as possible.
Special characters can have special meaning in certain programming languages and operating systems and can be misinterpreted in file names.
Ex: $ marks the beginning of a variable name in PHP. A backslash separates the parts of a file path in the Windows operating system.
Spaces make things easier for humans to read, but some browsers and software don’t know how to interpret spaces. Sometimes they read a file name only up to the first space, which can cause problems.
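If you want to see what those two rules look like in practice, here is a rough Python sketch that cleans up a file name. The function name `safe_file_name` and the choice of allowed characters (letters, digits, underscores, hyphens) are my own assumptions for illustration, not an official standard.

```python
import re


def safe_file_name(name: str) -> str:
    """Replace spaces and drop special characters (hypothetical helper)."""
    # Split off the extension so the dot before it is preserved.
    stem, dot, ext = name.rpartition(".")
    if not dot:
        stem, ext = name, ""
    # Runs of whitespace become a single underscore.
    stem = re.sub(r"\s+", "_", stem.strip())
    # Assumption: keep only letters, digits, underscores, and hyphens.
    stem = re.sub(r"[^A-Za-z0-9_-]", "", stem)
    return f"{stem}.{ext.lower()}" if ext else stem


cleaned = safe_file_name("My Data $final!.TXT")
```

Here `cleaned` comes out as `My_Data_final.txt`: the spaces are gone, and so are the dollar sign and the exclamation point.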
There are also best practices around version control and numbering.
Version control is often achieved by using dates or a standard numbering system.
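Here is one possible way to combine both approaches, sketched in Python. I am assuming a YYYYMMDD date and a zero-padded version number like v02; the function name, the field order, and the example project are hypothetical choices, so adapt them to whatever convention your group documents.

```python
from datetime import date


def versioned_name(project: str, description: str, when: date, version: int, ext: str) -> str:
    """Build a file name with a sortable date and a padded version number."""
    # %Y%m%d puts the year first, so names sort in date order.
    # :02d pads the version so v02 sorts before v10.
    return f"{project}_{description}_{when:%Y%m%d}_v{version:02d}.{ext}"


# Hypothetical example files, listed out of order on purpose.
names = [
    versioned_name("lomax", "transcript", date(2019, 3, 5), 2, "txt"),
    versioned_name("lomax", "transcript", date(2018, 11, 20), 1, "txt"),
    versioned_name("lomax", "transcript", date(2019, 3, 5), 1, "txt"),
]
in_order = sorted(names)
```

Because the year comes first and the version is padded, a plain alphabetical sort puts the files in chronological order, which is exactly what irregular dates fail to do.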
Recommend a couple tools for organizing your sources.
Tropy is free, open-source desktop software that allows you to organize and describe photographs of research material. Once you have imported your photos into Tropy, you can combine photos into items (e.g., photos of the three pages of a letter into a single item), and group photos into lists. You can also describe the content of a photograph. Tropy uses customizable metadata templates with multiple fields for different properties of the content of your photo, for example, title, date, author, box, folder, collection, archive. You can enter information in the template for an individual photo or select multiple photos and add or edit information to them in bulk. Tropy also lets you tag photos. You can also add one or more notes to a photo; a note could be a transcription of a document. A search function lets you find material in your photos, using metadata, tags, and notes.
JPG/JPEG
PNG
SVG
https://vimeo.com/239557418
Quick check for understanding
#1 is the best one.
Descriptive
Not too long, not too short
#2 is the best choice here.
First example here has spaces, irregular dates that won’t line up in order, special characters
Third example may not be descriptive enough for a secondary user. Also, beware of the “FINAL” as opposed to using a standardized numbering system.
That is how to name an individual file. What about your whole file structure?
All your research materials need to be in one folder. The top-level folder should include the project title and year. If the project spans multiple years, include the first and last year in the title.
The substructures should have a clear and consistent naming convention that is documented in a README file.
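As a rough illustration, the Python sketch below sets up a top-level folder named with the project title and years, a few subfolders, and a starter README. The subfolder names (`data_raw`, `data_processed`, `docs`) and the README wording are placeholders I chose for the example; use whatever structure fits your project.

```python
from pathlib import Path


def project_folder_name(title, first_year, last_year=None):
    """Combine the project title and year(s) into one folder name."""
    slug = "_".join(title.lower().split())
    if last_year is None or last_year == first_year:
        years = str(first_year)
    else:
        # Multi-year projects carry the first and last year.
        years = f"{first_year}-{last_year}"
    return f"{slug}_{years}"


def create_layout(root, title, first_year, last_year=None,
                  subfolders=("data_raw", "data_processed", "docs")):
    """Create the folders and a README that documents the convention."""
    top = Path(root) / project_folder_name(title, first_year, last_year)
    top.mkdir(parents=True, exist_ok=True)
    for sub in subfolders:
        (top / sub).mkdir(exist_ok=True)
    readme = top / "README.txt"
    if not readme.exists():
        lines = [f"Project: {title}", "Folders:"]
        lines += [f"  {sub}/ - describe what belongs here" for sub in subfolders]
        lines.append("File naming convention: describe it here")
        readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return top
```

Calling `create_layout(".", "Utah Oral Histories", 2017, 2019)` would create a folder named `utah_oral_histories_2017-2019` in the current directory.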
Exercise!!
You are a historian and you have conducted several oral histories with Utah politicians. As of now, you’ve been dumping everything into one folder. Can you think of a better way to organize these files?
Possible solutions:
Organize by type of file (all transcripts in one folder all audio recordings in another)
Organize by person (Have a Cliff Barrett folder and a Robert Bennett folder)
Problems with file names:
Dates are not standardized
Special characters/spaces
File type in the file name which is unnecessary
Unnecessary information in file name – “found on Internet, think okay, better than mine” picture
NO consistency to file naming
Before we move to the next section of your data management plan, we’re going to talk about storage because usually these things go together.
Through the course of your research your data needs to be stored securely, backed up, and maintained regularly. Once again this sounds like common sense, but you will be happy you paid some attention to it (e.g., when your laptop crashes or is stolen).
General Washington’s Treasure Chests - https://calisphere.org/item/ark:/13030/kt287013jk/
#1 rule of data storage – never just keep your data on one device. You are one dropped computer, one spilled glass of water, one unscrupulous thief away from losing all of your data. Every single day I go to Mom’s Café and see people leave their computers at their table while they go to the bathroom or grab a cup of coffee.
LOCKSS (Lots of Copies Keep Stuff Safe) - There should never be just one copy of your data. Do you back up your data? It is the most important data management task. Keep no fewer than two, preferably three, copies of your research data.
How well are you covered against unexpected loss? Make sure that when disaster strikes, it isn’t a disaster
I know this is harder when you have huge data.
There are three options for data storage
Personal computers and laptops – Convenient for storing your data while in use. Should not be used for storing master copies of your data.
Networked drives – Highly recommended. You can share data. Your data is stored in a single place and backed up regularly. Available to you from any place at any time. If you use a department drive or Box, your data is stored securely, minimizing the risk of loss, theft, or unauthorized access. BEST!!!
External storage devices – thumb drives, flash drives, external hard drive. Cheap, easy to store and pass around. Feel better knowing it’s in your hands where you can see it. Not recommended for the long-term storage of your data.
1 TB of free storage and an additional 50 GB if you are on a sponsored project.
Free!
Secure!
You can share with individuals outside of the institution!
When you leave you can take a copy with you or create a new account
3-2-1 – 3 copies, on 2 different types of media, with at least 1 copy in a different physical location.
This is the gold standard
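If you want to check your own setup, here is a small Python sketch. It assumes one common formulation of 3-2-1 (at least three copies, at least two kinds of storage media, at least one copy away from where you normally work). The class, the function, and the example copies are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCopy:
    label: str      # what you call this copy
    medium: str     # kind of storage, e.g. "cloud" or "external drive"
    location: str   # where the copy physically lives


def check_321(copies, home_location):
    """Report which parts of the 3-2-1 rule (as assumed above) are met."""
    media = {copy.medium for copy in copies}
    offsite = [copy for copy in copies if copy.location != home_location]
    return {
        "three_copies": len(copies) >= 3,
        "two_media": len(media) >= 2,
        "one_offsite": len(offsite) >= 1,
    }


# Hypothetical storage plan for illustration.
plan = [
    StoredCopy("laptop", "internal drive", "office"),
    StoredCopy("Box", "cloud", "vendor data center"),
    StoredCopy("backup drive", "external drive", "office"),
]
report = check_321(plan, home_location="office")
```

With this plan every entry in `report` is `True`. Drop the Box copy and `one_offsite` becomes `False`, which is the laptop-and-thumb-drive-in-the-same-bag situation.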
If your research involves human subjects you need to consider legal and ethical obligations in managing and sharing your data. You need a clear view of how you will protect your research subjects.
The success of social science research relies on the willingness of research participants to take part in our research. It is critically important to protect the identities of our research subjects.
As datasets have become available online and it has become easier to link data to publicly available external databases, disclosure risk or reidentification of research subjects has grown.
We need to do our part to minimize disclosure risks and keep sensitive data confidential and secure.
Human subject data
Environmental data
Potentially patentable data
Throughout the course of your research, many of you may collect data that is referred to as human subject data. If you do this, you will need to work with the IRB office on campus to figure out how to protect the privacy of your research subjects. Ultimately, the IRB has the final say, but here are some tips for keeping your confidential data confidential.
Direct vs. Indirect identifiers
All consent forms will need to be reviewed and approved by IRB. Include a “Provision for Data Sharing”
Data cannot be shared without the consent of the research subject
Tools for qualitative data masking – QualAnon, SSN
Next we are going to talk about data description.
While digital data are machine readable, understanding their meaning is a job for humans. The importance of documenting your data throughout your research project cannot be overestimated.
Document your data with a certain level of reuse in mind. Replication? Verification? Inspection?
First and foremost, metadata includes any surrounding documentation you may need to make sense of your data. An excel spreadsheet of survey responses is fairly useless if you haven’t kept the survey that generated those responses.
A common definition of metadata is “data about data,” which is circular and not terribly useful.
For our purposes, that “something else” is our primary research data.
Who created the data, when the data were created or published, title or descriptive name used for the data.
Documentation is meant to be read by humans, metadata is meant to be computer readable. Allows for people to search for your data by title, author, year, variable name, etc.
“Actionable information”
Metadata is very important for people to be able to find your data and to be able to search by fields like title, author, year, and subject. You may need to seek help from librarians or data repositories to collect metadata but it should be a goal.
From Helen Tibbo’s Coursera Class – Types of data and metadata
Slide about Dublin Core and DDI
Additional metadata elements
Data collection processes
Variable descriptions
Methodologies
Simple standard, low barrier to entry
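To give a feel for how low that barrier is, here is a hedged Python sketch that writes a few Dublin Core elements as XML. The fifteen element names and the namespace come from the Dublin Core Metadata Element Set. The `<metadata>` wrapper, the function name, and every value in the example record are placeholders I made up.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of simple Dublin Core.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def dublin_core_xml(pairs):
    """Serialize (element, value) pairs as Dublin Core XML (sketch)."""
    ET.register_namespace("dc", DC_NS)
    # Assumption: a plain <metadata> wrapper; repositories use their own.
    root = ET.Element("metadata")
    for element, value in pairs:
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        ET.SubElement(root, f"{{{DC_NS}}}{element}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record for illustration only.
example_record = [
    ("title", "Oral history interview transcripts (hypothetical example)"),
    ("creator", "Example Researcher"),
    ("date", "2019"),
    ("subject", "Oral history"),
    ("format", "text/plain"),
    ("rights", "CC BY 4.0"),
]
record_xml = dublin_core_xml(example_record)
```

Elements can repeat, so a dataset with several subjects simply gets several `subject` pairs. A real repository will usually have its own submission form or schema on top of this.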
https://www.getty.edu/research/publications/electronic_publications/intro_controlled_vocab/what.pdf
Getty – Art and Architecture Thesaurus – shows hierarchy
If you are considering building a database of digital objects, talk to a metadata librarian like Anna or Jeremy about using a controlled vocabulary. Allows it to mingle with other collections.
When you start to think about sharing your data, you have to think about the intellectual property aspects of it. Do you own your data? Does someone else? Can you license your data? Under what conditions?
Might be more complicated than you imagine.
With all of these stakeholders it can be difficult to know who is the actual owner of the data.
This is a study that was done at UNC in 2012 that asked researchers on campus, who owns the data? (Read question)
46% of the researchers thought that it was the researcher who owns the data. 15% thought the university owned the data…
8% - funding agency, 9% the public
As we saw earlier with the DaMaRo study, this issue is far from understood in the research communities.
There are a lot of potential stakeholders
In the news…
Paul Aisen, an Alzheimer’s researcher at UCSD, moved his research to USC.
A California judge issued an injunction to restore control of a massive database to UCSD after the researcher tried to take the database to USC.
UCSD filed the lawsuit to get the database back when the researcher moved it to USC. Claimed they were the rightful owners of the dataset.
As faculty members, there are a lot of policies around data ownership and stewardship.
Try to summarize:
Something most of you know: You own the copyright over your traditional scholarly products – books, journals, creative works. You don’t own the copyright over some of the other things that you produce including your data.
2.) Commercializing your data. Check with TVC.
We didn’t collect the data; may not have any rights over it.
Metadata is usually copyright free
Can’t share full-text of Cormac McCarthy’s novels
Future Death - can’t share our corpus either.
Avail yourself of creative commons licenses
Another area of data management that you will have to consider is data archiving.
Archiving adds additional value to your data.
Long-term preservation
Metadata
Sharable, usually through a persistent identifier
Makes data citable
Your project will end one day. You publish. Where do your research assets go?
OAIS Compliant – Open Archival Information Systems
Three copies – 1 in a different geographic location
In 53 years ICPSR has never had a security breach; consulting
Long-term preservation: fixity means that a digital file has not been changed between two points in time.
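One common way to check fixity is to record a checksum when you deposit a file and compare it later. The Python sketch below uses SHA-256 from the standard library; the function names and the sample content are my own, and this is not a description of how any particular repository does it.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Compute a SHA-256 checksum by reading the stream in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path):
    """Checksum a file on disk (opened in binary mode)."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)


def fixity_unchanged(recorded_checksum, stream):
    """True if the content still matches the checksum recorded earlier."""
    return sha256_of_stream(stream) == recorded_checksum


# In-memory stand-ins for a file at two points in time (made-up content).
recorded = sha256_of_stream(io.BytesIO(b"interview transcript"))
same_file = fixity_unchanged(recorded, io.BytesIO(b"interview transcript"))
altered_file = fixity_unchanged(recorded, io.BytesIO(b"interview transcript!"))
```

Even a one-character change produces a different checksum, so `same_file` is `True` and `altered_file` is `False`.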
If you remember one thing…
I’m going to ask you to remember four things