This document summarizes an information quality assessment of the file system at the Department of Arkansas Heritage (DAH). The assessment comprised a stakeholder survey, file system scans, and an evaluation of the results. It found wasted storage space, duplicate files, difficulty finding files, and a lack of naming conventions and governance. It recommended establishing agency-level groups to define naming standards and manage files, providing training, and running regular scans to measure improvement under a centralized governance structure. The goal is to clean up the file system in a non-invasive way through cooperation across DAH agencies.
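The duplicate-detection part of a file system scan like the one described above can be sketched with a content-hash pass over a directory tree. This is an illustrative sketch, not the assessment's actual tooling; the function name and report shape are assumptions.

```python
# Sketch: scan a directory tree and group duplicate files by content hash.
# Hypothetical helper, not the tool used in the DAH assessment.
import hashlib
import os
from collections import defaultdict

def find_duplicates(root):
    """Map SHA-256 digest -> list of file paths sharing that exact content."""
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Hash in chunks so large files do not load into memory at once.
                for chunk in iter(lambda: f.read(8192), b""):
                    h.update(chunk)
            by_hash[h.hexdigest()].append(path)
    # Keep only digests seen more than once, i.e. true duplicates.
    return {d: paths for d, paths in by_hash.items() if len(paths) > 1}
```

Running this periodically and counting duplicate groups gives a simple before/after metric for the kind of cleanup the assessment recommends.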
Empower to the People - Non-invasive Governance: Meeting People Where They Are (Shelley Keith, MSIQ)
"Governance" sounds like work. Like bureaucracy. Like "no." It doesn't have to be this way.
Governance is most effective when it looks like help and sounds like "yes." Governance should be empowering, enlightening, supportive of goals, and flexible enough to allow innovation. It should be non-invasive.
We know that most people want to succeed at whatever they're attempting to do, so let's talk about how to help all the other-duties-as-assigned folks do good work on your site even though it's probably not their primary function.
In this session, you'll learn:
*How to establish a web governance structure that meets your users where they are
*How to focus on your users' own goals to sell them on the importance of content strategy and governance
*Tactics and strategies for turning other-duties-as-assigned content contributors into quality stewards (even if they don't know that's what they are)
Spring 2014 Data Management Lab: Session 1 Slides (more details at http://ulib.iupui.edu/digitalscholarship/dataservices/datamgmtlab)
What you will learn:
1. Build awareness of research data management issues associated with digital data.
2. Introduce methods to address common data management issues and facilitate data integrity.
3. Introduce institutional resources supporting effective data management methods.
4. Build proficiency in applying these methods.
5. Build strategic skills that enable attendees to solve new data management problems.
Presentation on electronic records management and archival issues. Originally presented at the Fall 2008 meeting of the Southeastern Wisconsin Archivists Group
This document summarizes a webinar about managing and preserving scientific data sets. It discusses the federal government's definition of science data, why science data differs from other data, and current trends and challenges in digital preservation for science. It outlines several levels of digital preservation and provides examples of data being preserved. The webinar covers the benefits of data management, such as supporting open access and future funding, and describes existing problems, including a lack of standards, resources, and staffing. Potential solutions discussed include implementing research data management plans and using existing and upcoming tools to support the research lifecycle, from data creation to long-term preservation and access.
This document presents a draft maturity matrix for long-term scientific data stewardship. The matrix defines 5 levels of maturity for 10 key components of data stewardship, including preservation, accessibility, usability, production sustainability, and data quality. Each increasing level represents more advanced and formalized approaches to managing the data according to established standards and community best practices. The authors thank various subject matter experts who helped define the maturity levels based on their expertise in areas such as data archiving, access, and product development.
The document summarizes the results of Raytheon's efforts to improve their information management and search capabilities. It found that most information was unstructured and not tagged, leading to duplication and difficulty finding information. User surveys identified needs like filtering searches by attributes. Raytheon implemented taxonomies in key areas and saw improvements like increased search and category usage after launching an updated search tool.
Research Data Management Fundamentals for MSU Engineering Students (Aaron Collie)
This document discusses the importance of research data management and outlines best practices. It notes that data is expensive to produce but is the primary output of research. Funding agencies now require data management plans to facilitate data sharing and reuse. The document recommends storing data on multiple types of storage, avoiding single points of failure, creating backup strategies, documenting projects and data, and selecting open file formats. Overall, it emphasizes that data management is an important skill for researchers.
Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t... (Anastasija Nikiforova)
This presentation was delivered as part of the Data Science Seminar titled “When, Why and How? The Importance of Business Intelligence” organized by the Institute of Computer Science (University of Tartu) in cooperation with Swedbank.
In this presentation I talked about:
*“Data warehouse vs. data lake – what are they and what is the difference between them?” (structured vs. unstructured, static vs. dynamic (real-time) data, schema-on-write vs. schema-on-read, ETL vs. ELT), with further elaboration on their goals and purposes, their target audiences, and their pros and cons.
*“Is the data warehouse the only data repository suitable for BI?” – no, today data lakes can also be suitable, and both are considered keys to “a single version of the truth”. If descriptive BI is the only purpose, it might still be better to stay with a data warehouse. But if you want predictive BI, want to use your data for ML, or do not yet know how you will use the data but want to be able to explore it effectively and efficiently, a data warehouse might not be the best option.
*“So the data lake will save me a lot of resources, because I do not have to worry about how to store or allocate the data – just put it all in one storage and voilà?” – no; in that case your data lake will turn into a data swamp, and you are forgetting about the data quality you should (must!) be thinking of.
*“But how do you prevent the data lake from becoming a data swamp?” – in short, proper data governance and metadata management are the answer (though that is not as easy as it sounds – do not forget about your data engineers, and be friendly with them, always), and also think about the culture in your organization.
*“So the use of a data warehouse is the key to high-quality data?” – no, it is not! Having ETL does not guarantee the quality of your data (transform-and-load is not data quality management). Think about data quality regardless of the repository.
*“Are data warehouses and data lakes the only options to consider, or are we missing something?” – we are: the data lakehouse!
*“If a data lakehouse combines the benefits of a data warehouse and a data lake, is it a silver bullet?” – no, it is not! It is another, relatively immature option to consider that may be the best fit for you, but it is not a panacea. Dealing with data is (still) not easy…
In addition, this talk briefly introduced ongoing research into integrating the data lake as a data repository with data wrangling, aiming at increased data quality in information systems. In short, this is somewhat like an improved data lakehouse, where data governance and data wrangling are integrated in order to really deliver the benefits that data lakehouses promise (although we still call it a data lake, since the data lakehouse is not yet a sufficiently mature concept and has competing definitions).
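The schema-on-write vs. schema-on-read contrast from the first bullet above can be sketched in a few lines. The tiny in-memory "store" and the two-field schema are illustrative assumptions, not anything from the presentation itself.

```python
# Sketch: schema-on-write (warehouse-style: validate before storing)
# vs. schema-on-read (lake-style: store raw, interpret only at query time).
import json

# Hypothetical schema: field name -> required Python type.
SCHEMA = {"id": int, "amount": float}

def write_validated(store, record):
    """Schema-on-write: reject any record that does not match the schema."""
    for field, ftype in SCHEMA.items():
        if not isinstance(record.get(field), ftype):
            raise ValueError(f"bad field: {field}")
    store.append(json.dumps(record))

def read_with_schema(store):
    """Schema-on-read: everything was kept; apply the schema on read,
    coercing where possible and skipping records that cannot be coerced."""
    out = []
    for raw in store:
        rec = json.loads(raw)
        try:
            out.append({f: t(rec[f]) for f, t in SCHEMA.items()})
        except (KeyError, TypeError, ValueError):
            continue  # malformed records stay in the lake, but drop from this view
    return out
```

The trade-off in the talk shows up directly: the write path keeps the store clean but rigid, while the read path keeps everything and pushes the quality problem to query time, which is exactly how a neglected lake becomes a swamp.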
Curation-Friendly Tools for the Scientific Researcher (bwestra)
Presentation for the Online Northwest Conference, Corvallis, Oregon, February 10, 2012.
Highlights electronic lab notebooks (ELN) and OMERO (Open Microscopy Environment) as two tools that enable researchers to better manage their research data.
The document discusses advances in data management practices and technologies for ecosystem science. It describes the role of a data manager in facilitating data management, from collecting raw data to organizing it in standard formats and metadata according to community practices. Well-managed data is stored and shared through repositories to enable discovery, access, interoperability and future reuse. Resources and experts are available to help researchers improve their data management.
Slides from Thursday 2nd August 2018 - Data in the Scholarly Communications Life Cycle Course which is part of the FORCE11 Scholarly Communications Institute.
Presenter - Natasha Simons
Opening/Framing Comments: John Behrens, Vice President, Center for Digital Data, Analytics, & Adaptive Learning Pearson
Discussion of how the field of educational measurement is changing, how long-held assumptions may no longer be taken for granted, and how new terminology and language are coming into the field.
Panel 1: Beyond the Construct: New Forms of Measurement
This panel presents new views of what assessment can be and new species of big data that push our understanding for what can be used in evidentiary arguments.
Marcia Linn, Lydia Liu from UC Berkeley and ETS discuss continuous assessment of science and new kinds of constructs that relate to collaboration and student reasoning.
John Byrnes from SRI International discusses text and other semi-structured data sources and different methods of analysis.
Kristin Dicerbo from Pearson discusses hidden assessments and the different student interactions and events that can be used in inferential processes.
Panel 2: The Test is Just the Beginning: Assessments Meet Systems Context
This panel looks at how assessments are not the end game, but often the first step in larger big-data practices at districts/state/national levels.
Gerald Tindal from the University of Oregon discusses state data systems and special education, including curriculum-based measurement across geographic settings.
Jack Buckley, Commissioner of the National Center for Education Statistics, discusses national datasets where tests and other data connect.
Lindsay Page and Will Marinell from the Strategic Data Project at Harvard discuss state and district datasets used for evaluating teachers, colleges of education, and student progress.
Panel 3: Connecting the Dots: Research Agendas to Integrate Different Worlds
This panel looks at how research organizations view the connections between the perspectives presented in Panels 1 and 2: what is known, and what is still to be discovered in order to achieve the promise of big connected data in education.
Andrea Conklin Bueschel, Program Director at the Spencer Foundation
Ed Dieterle, Senior Program Officer at the Bill and Melinda Gates Foundation
Edith Gummer, Program Manager at the National Science Foundation
This document summarizes a seminar on data management for undergraduate researchers. It discusses what data is, why it needs to be managed, and key aspects of the data management process such as data organization, metadata, storage, and archiving. Topics covered include file naming best practices, version control, documentation, metadata standards, storage options, and long-term archiving. The goal is to help researchers organize and document their data so it can be understood, preserved, and reused.
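The file-naming and versioning advice summarized above can be made concrete with a small name-builder. The exact pattern (lowercase slugs, ISO dates, zero-padded versions) is a common convention assumed here for illustration, not a standard taken from the seminar.

```python
# Sketch: build a standardized data file name following common RDM advice:
# no spaces, ISO 8601 dates, and zero-padded version numbers so names
# sort correctly. The pattern itself is an illustrative assumption.
import re
from datetime import date

def data_filename(project, description, when, version, ext="csv"):
    """Return e.g. 'soil-survey_ph-readings_2014-06-02_v03.csv'."""
    def slug(text):
        # Lowercase, and replace runs of non-alphanumerics with hyphens.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return (f"{slug(project)}_{slug(description)}_"
            f"{when.isoformat()}_v{version:02d}.{ext}")
```

Names built this way stay sortable by project, then date, then version, which is most of what a lightweight versioning convention needs before a real version-control system takes over.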
Research data management & planning: an introduction (Maggie Neilson)
This document provides an introduction to research data management (RDM). It defines RDM as the organization and stewardship of research data throughout a research project and beyond. Key components of RDM include data management plans, metadata, sharing and preservation, and ethical and legal obligations. The document discusses why RDM is important, outlines the goals of the Tri-Agency Statement on digital data management, and provides resources for writing data management plans, creating metadata, sharing data, and addressing privacy and ethics.
Research Data Management and Sharing for the Social Sciences and Humanities (Rebekah Cummings)
This document summarizes a presentation on research data management for social and behavioral sciences and humanities. The presentation covered topics such as what data management is, why it is important to manage and share data, how to create data management plans, organize data files through naming conventions and folder structures, describe data through metadata and codebooks, issues around data ownership, and data storage, archiving and sharing options. The presentation was aimed at providing guidance to researchers at the University of Utah on best practices for managing and sharing their research data.
This document discusses research lifecycles and data management. It begins by outlining typical stages in a research lifecycle from planning to publication. It then discusses how data is created and managed at various stages, and raises questions researchers should consider around formatting, documenting, storing, sharing and preserving data. The document provides examples of research lifecycle models and gives advice on best practices for managing data at each stage of the research process to support reuse and ensure data is well documented and preserved.
2-6-14 ESI Supplemental Webinar: The Data Information Literacy Project (DuraSpace)
The document summarizes a webinar about the past, present, and future of the Data Information Literacy Project. The project aims to identify data literacy skills for different disciplines, build infrastructure for teaching those skills, and develop a toolkit for librarians. Case studies were conducted at 5 universities to determine data needs of students and faculty. Educational programs were developed and a symposium and toolkit are planned next. The project identifies 12 core data literacy competencies and aims to develop standards in this area.
Data Management Lab: Session 3 slides (more details at http://ulib.iupui.edu/digitalscholarship/dataservices/datamgmtlab)
What you will learn:
1. Build awareness of research data management issues associated with digital data.
2. Introduce methods to address common data management issues and facilitate data integrity.
3. Introduce institutional resources supporting effective data management methods.
4. Build proficiency in applying these methods.
5. Build strategic skills that enable attendees to solve new data management problems.
This document discusses best practices for data management for research. It covers topics such as file organization, documentation, storage, sharing and publishing data, and archiving. Good practices include using file naming conventions and open formats, documenting projects, processes, and data, making backups in multiple locations, and publishing and archiving data in repositories to enable access and preservation. Data management is important for research reproducibility, sharing, and complying with funder requirements.
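The backup advice above ("backups in multiple locations") implies a way to check that a copy still matches the original. A checksum manifest is the usual fixity technique; this sketch assumes a simple path-to-digest mapping rather than any particular manifest standard.

```python
# Sketch: record a checksum manifest for a data directory and verify a
# backup copy against it. The manifest format (relative path -> SHA-256
# hex digest) is an illustrative assumption.
import hashlib
import os

def checksum(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def make_manifest(root):
    """Map each file's path relative to root to its content digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = checksum(full)
    return manifest

def verify_backup(manifest, backup_root):
    """Return relative paths that are missing or altered in the backup."""
    bad = []
    for rel, digest in manifest.items():
        full = os.path.join(backup_root, rel)
        if not os.path.exists(full) or checksum(full) != digest:
            bad.append(rel)
    return sorted(bad)
```

Running `verify_backup` against each backup location turns "keep backups in multiple locations" into a checkable property rather than a hope.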
It's 2015. Do You Know Where Your Data Are? (Patricia Hswe)
This document summarizes a presentation on research data management. It discusses definitions of research data and why data should be shared. It provides tips for best practices in file naming, description standards, formats and storage. Tools, resources and services for research data management from Penn State and beyond are presented, including ScholarSphere and DMPTool. The importance of having an online presence and sharing research is discussed.
Meeting Federal Research Requirements for Data Management Plans, Public Acces... (ICPSR)
These slides cover evolving federal research requirements for sharing scientific data. Provided are updates on federal agency responses to the 2013 OSTP memo, guidance on data management plans, resources for data management and curation training for staff/researchers, and tips for evaluating public data-sharing services. ICPSR's public data-sharing service, openICPSR, is also presented. Recording of this presentation is here: https://www.youtube.com/watch?v=2_erMkASSv4&feature=youtu.be
DRC is a privately held company founded in 1978 that provides survey research, education, document, and data collection services. It has 550 regular employees and 4,000 seasonal/temporary employees across multiple US locations. DRC's services include survey design and administration, test development, printing, data processing, and psychometric analysis to serve educational, government, and commercial clients. It prides itself on quality, security, and long-term client relationships.
Research Data Curation _ Grad Humanities Class (Aaron Collie)
This document discusses best practices for research data curation and management. It covers topics such as data storage, file organization, documentation, sharing, and archiving. Effective data management practices include making backups in multiple locations, using logical file naming conventions and organization schemes, documenting projects, processes, and data, publishing and sharing data when appropriate, and archiving data for long-term preservation and access. Proper data management ensures that valuable research data is organized, preserved, and accessible to enable future research and verification of results.
The document summarizes the Research Data Alliance (RDA), including its vision of open data sharing across disciplines to address societal challenges, groups working on specific issues, and governance structure involving a council, secretariat, and technical advisory board to guide its work. RDA has over 4,000 members from over 100 countries working in various interest and working groups to develop standards and recommendations to make data sharing and use more effective.
The document provides logistics for a webinar on data curation profiles and the DMPTool. It includes instructions for calling into the audio, asking questions in the chat, and finding recordings and slides. The webinar will discuss the history of data curation profiles, comparing them to data management plans, and a case study of using data curation profiles. Data curation profiles involve interviewing researchers about their data practices and needs in order to understand how to support them, while data management plans focus on requirements for funding. Both tools can help librarians engage with researchers, though data curation profiles provide a more in-depth understanding of researchers' full data lifecycles.
This presentation introduced participants to the DC 101 course and was given at the Digital Curation and Preservation Outreach and Capacity Building Workshop in Belfast on September 14-15, 2009.
http://www.dcc.ac.uk/events/workshops/digital-curation-and-preservation-outreach-and-capacity-building-workshop
This document is a proposal for an enrollment system project for Campus Recreation at Auraria (CRA). It outlines the current problems with CRA's manual enrollment process and proposes building a web-based enrollment system. The proposal describes the technical approach, which includes gathering requirements, designing the system architecture and database, and implementing a prototype. It provides details on the system requirements, design diagrams, and implementation plan. It also includes a quality assurance plan and outlines the project schedule, budget, and expected results. The goal of the new system is to automate CRA's enrollment process and provide a better experience for members.
This document summarizes a session from the Force 11 Scholarly Communications Institute Summer School on data discovery. The session covered metadata, including what it is, types of metadata, and standards. It discussed how people search for and find data through various sources. The session also explored the FAIR data principles of findable, accessible, interoperable and reusable data and had breakout groups discuss applying these principles in practice.
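The FAIR discussion above lends itself to a small completeness check over a dataset's metadata record. The field names and their mapping onto the four principles below are illustrative assumptions, not an official FAIR validator.

```python
# Sketch: flag metadata fields missing from a dataset record, grouped by
# the FAIR principle they loosely support. The field list is a
# hypothetical mapping chosen for illustration.
FAIR_FIELDS = {
    "findable": ["identifier", "title", "keywords"],
    "accessible": ["access_url"],
    "interoperable": ["format"],
    "reusable": ["license", "provenance"],
}

def fair_gaps(record):
    """Return {principle: [missing fields]} for fields absent or empty."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [f for f in fields if not record.get(f)]
        if missing:
            gaps[principle] = missing
    return gaps
```

A check like this makes the breakout-group question "how do we apply FAIR in practice?" operational: an empty result means the record at least names an identifier, access point, format, license, and provenance.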
The document discusses non-invasive governance strategies for digital assets like websites. It advocates for a collaborative, transparent approach that focuses on continuous improvement through small steps. The key aspects are maximizing existing resources, documenting goals and results, building buy-in across stakeholders, and training people on best practices. The goal is to evolve processes over time in a way that works for the organization's culture and resources.
The web is this magical place where everyone has opinions on what isn't working and why. The call for website (and, let's face it, primarily homepage) redesigns comes from all sides at the most inopportune times. Instead of embarking on a massive project where scope creep and HiPPOs (highest-paid person's opinions) will eat your soul, start making small, goal-oriented, data-driven changes to content and IA. The results may surprise even the most vocal of HiPPOs.
Similar to Thesis Defense - MSIQ Program - December 2014
Data Management Lab: Session 3 slides (more details at http://ulib.iupui.edu/digitalscholarship/dataservices/datamgmtlab)
What you will learn:
1. Build awareness of research data management issues associated with digital data.
2. Introduce methods to address common data management issues and facilitate data integrity.
3. Introduce institutional resources supporting effective data management methods.
4. Build proficiency in applying these methods.
5. Build strategic skills that enable attendees to solve new data management problems.
This document discusses best practices for data management for research. It covers topics such as file organization, documentation, storage, sharing and publishing data, and archiving. Good practices include using file naming conventions and open formats, documenting projects, processes, and data, making backups in multiple locations, and publishing and archiving data in repositories to enable access and preservation. Data management is important for research reproducibility, sharing, and complying with funder requirements.
It's 2015. Do You Know Where Your Data Are?Patricia Hswe
This document summarizes a presentation on research data management. It discusses definitions of research data and why data should be shared. It provides tips for best practices in file naming, description standards, formats and storage. Tools, resources and services for research data management from Penn State and beyond are presented, including ScholarSphere and DMPTool. The importance of having an online presence and sharing research is discussed.
Meeting Federal Research Requirements for Data Management Plans, Public Acces...ICPSR
These slides cover evolving federal research requirements for sharing scientific data. Provided are updates on federal agency responses to the 2013 OSTP memo, guidance on data management plans, resources for data management and curation training for staff/researchers, and tips for evaluating public data-sharing services. ICPSR's public data-sharing service, openICPSR, is also presented. Recording of this presentation is here: https://www.youtube.com/watch?v=2_erMkASSv4&feature=youtu.be
DRC is a privately held company founded in 1978 that provides survey research, education, document, and data collection services. It has 550 regular employees and 4,000 seasonal/temporary employees across multiple US locations. DRC's services include survey design and administration, test development, printing, data processing, and psychometric analysis to serve educational, government, and commercial clients. It prides itself on quality, security, and long-term client relationships.
Research Data Curation _ Grad Humanities ClassAaron Collie
This document discusses best practices for research data curation and management. It covers topics such as data storage, file organization, documentation, sharing, and archiving. Effective data management practices include making backups in multiple locations, using logical file naming conventions and organization schemes, documenting projects, processes, and data, publishing and sharing data when appropriate, and archiving data for long-term preservation and access. Proper data management ensures that valuable research data is organized, preserved, and accessible to enable future research and verification of results.
The document summarizes the Research Data Alliance (RDA), including its vision of open data sharing across disciplines to address societal challenges, groups working on specific issues, and governance structure involving a council, secretariat, and technical advisory board to guide its work. RDA has over 4,000 members from over 100 countries working in various interest and working groups to develop standards and recommendations to make data sharing and use more effective.
The document provides logistics for a webinar on data curation profiles and the DMPTool. It includes instructions for calling into the audio, asking questions in the chat, and finding recordings and slides. The webinar will discuss the history of data curation profiles, comparing them to data management plans, and a case study of using data curation profiles. Data curation profiles involve interviewing researchers about their data practices and needs in order to understand how to support them, while data management plans focus on requirements for funding. Both tools can help librarians engage with researchers, though data curation profiles provide a more in-depth understanding of researchers' full data lifecycles.
This presentation introduced participants to the DC 101 course and was given at the Digital Curation and Preservation Outreach and Capacity Building Workshop in Belfast on September 14-15 2009.
http://www.dcc.ac.uk/events/workshops/digital-curation-and-preservation-outreach-and-capacity-building-workshop
This document is a proposal for an enrollment system project for Campus Recreation at Auraria (CRA). It outlines the current problems with CRA's manual enrollment process and proposes building a web-based enrollment system. The proposal describes the technical approach, which includes gathering requirements, designing the system architecture and database, and implementing a prototype. It provides details on the system requirements, design diagrams, and implementation plan. It also includes a quality assurance plan and outlines the project schedule, budget, and expected results. The goal of the new system is to automate CRA's enrollment process and provide a better experience for members.
This document summarizes a session from the Force 11 Scholarly Communications Institute Summer School on data discovery. The session covered metadata, including what it is, types of metadata, and standards. It discussed how people search for and find data through various sources. The session also explored the FAIR data principles of findable, accessible, interoperable and reusable data and had breakout groups discuss applying these principles in practice.
1. UNIVERSITY OF ARKANSAS AT LITTLE ROCK
Information Quality Program
Information Quality and File System Management at the Department of Arkansas Heritage
BY T.M. “SHELLEY” KEITH
2. Department of Arkansas Heritage
Seven-“arm” state organization, plus a central director’s office
◦ Each arm has its own mission and staff.
◦ Some have their own regulatory requirements.
Identified issues with file system, email
◦ Lack of naming conventions
◦ Operational inefficiencies
◦ Concerns about waste, archives, backups, resources
Digital photo storage
◦ space, conventions, backups
Training
IQ AND FILE SYSTEM MANAGEMENT AT DAH 2
Step 1: Define Business Need and Approach
3. Approach Rationale
Quantify issues
◦ Verify problems identified by leadership
◦ What other problems exist that might be contributing to, or more critical than, what’s been reported?
Prioritize
◦ Triage identified issues and begin understanding the source
Define improvement
◦ What is “better” for this organization?
Plan
◦ What will it take to start making progress toward “better?”
4. Project Approach
Establish A Data Quality Baseline
◦ Step 1: Define Business Need and Approach
◦ Step 2: Analyze Information Environment
◦ Step 3: Assess Data Quality
◦ Step 4: Assess Business Impact
◦ Step 5: Identify Root Causes
◦ Step 6: Develop Improvement Plans
◦ Step 10: Communicate Actions and Results
Goal
◦ Uncover problems
◦ Determine which ones are worth addressing
◦ Identify root causes for high priority issues
◦ Develop realistic action plans
McGilvray pp 242-243
5. Project Goals
1. Assess the current ecosystem from an Information Quality perspective.
I. Primary Dimensions
I. Duplication
II. Ease of Use & Maintainability
III. Data Specifications
2. Provide a set of formal recommendations for naming conventions.
I. Folder names and file system organization
II. Metadata
III. File names
3. Provide a path to and structure for unified, consistent file system governance.
Step 1: Define Business Need and Approach
6. The Organization
Department of Arkansas Heritage
◦ Director’s Office
◦ Museums
  ◦ Historic Arkansas Museum (HAM)
  ◦ Delta Cultural Center (DCC)
  ◦ Mosaic Templars Cultural Center (MTCC)
  ◦ Old State House Museum (OSH)
◦ Heritage Resource Agencies
  ◦ Arkansas Arts Council (AAC)
  ◦ Arkansas Natural Heritage Commission (ANHC)
  ◦ Arkansas Historic Preservation Program (AHPP)
Step 1: Define Business Need and Approach
7. DAH Network Access
Each agency has a dedicated network drive (T)
Each agency has access to a central shared drive (S)
Each user has their own personal network drive (U)
S: drive folders: AAC, ANHC, Central, MTCC, AHPP, DCC, HAM, OSH
Step 2: Analyze the Information Environment
8. Project Plan & Tools
Plan
◦ File System Review
◦ Manual evaluation of the file names and folder structures across the network.
◦ Stakeholder Survey
◦ Understand perceptions across agencies and user types
◦ Administrative, Professional, Leadership
◦ Identify issues throughout the organization
◦ Uncover root causes
◦ File System Scan
◦ Quantitative measurements for the health of the file system
Tools
◦ MailChimp
◦ Google Form
◦ Microsoft Excel
◦ DiskBoss Pro
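The quantitative scan step can be sketched in a few lines of Python. This is a minimal illustration of the kind of measurement involved, not the DiskBoss Pro scan the project actually ran; the one-year staleness threshold is an assumption for the example.

```python
import os
import time

def scan(root):
    """Walk a directory tree and tally total size, file count,
    and files not accessed in over a year (a rough 'stale' measure)."""
    total_bytes = file_count = stale_count = 0
    year_ago = time.time() - 365 * 24 * 3600
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip unreadable files
            total_bytes += st.st_size
            file_count += 1
            if st.st_atime < year_ago:
                stale_count += 1
    return total_bytes, file_count, stale_count
```

A real scan would add the per-user and per-file-type breakdowns the survey tools produced; the walk-and-stat loop above is the common core.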
9. Stakeholder Survey
37 questions
◦ 46 input opportunities once broken down into survey tool
◦ 5 required
112 responses of 213 employees emailed (53%)
Questions specific to Leadership & IT staff
Applicable to:
◦ Dimensions of Data Quality
◦ Business Impact Techniques
◦ Information Life Cycle
◦ 10-Step Process
Organized by:
◦ Theme
◦ Employee type
◦ Agency
10. Stakeholder Survey – IQ Map

Information Life Cycle | Business Impact Technique | Dimension(s) of Data Quality | 10-Step Process | Theme
Plan | Usage | Ease of Use | Define Business Need & Approach | General information
Obtain | Anecdotes | Duplication | Analyze Information Environment | Time spent on/frequency of encounters
Store & Share | Cost of Low-Quality Data | Timeliness & Availability | Assess Data Quality | Preferences
Maintain | Process Impact | Perception, Relevance, & Trust | Assess Business Impact | File storage behaviors
Apply | Ranking & Prioritization | Data Specifications | Identify Root Causes | Regulatory awareness
Dispose | | | Develop Improvement Plans |
11. Survey Responses – Agency Information
Responses by agency:
◦ Arkansas Arts Council: 11 (10%)
◦ Arkansas Historic Preservation Program: 21 (19%)
◦ Arkansas Natural Heritage Commission: 18 (16%)
◦ Delta Cultural Center: 3 (3%)
◦ Director’s Office: 19 (17%)
◦ Historic Arkansas Museum: 16 (14%)
◦ Mosaic Templars Cultural Center: 8 (7%)
◦ Old State House Museum: 16 (14%)
12. Survey Responses – Category
Employee category:
◦ Leadership: 13 (12%)
◦ Administrative: 29 (26%)
◦ Professional: 70 (62%)
Responses by agency type:
◦ Director’s Office: 19 (17%)
◦ Museums: 43 (38%)
◦ Heritage Resource Agencies: 50 (45%)
13. Survey Response – File Types
[Bar chart of file types respondents reported working with; the counts survive in the transcript (led by 102, 75, and 71 mentions), but the file-type labels do not.]
14. Survey Responses – File Findability
[Charts: responses to “How easy is it to locate existing files?”, rated 1 (easy) to 5 (hard), broken out by employee category (Administrative, Leadership, Professional) as raw counts and as percentages; the per-category values are not recoverable from the transcript.]
15. Survey Responses – Time & Frequency
At least once a month:
◦ 26% reported recreating existing files because they couldn’t find the file they needed.
◦ 25% reported being unable to find the source file for an archive document type like PDF.
◦ 26% reported having to ask someone to email a file because they can’t find it or it’s stored where they don’t have access.
16. Survey Responses – Time & Frequency
At least once a year:
◦ 32% reported encountering files that were supposed to be current, but actually contained outdated or incorrect information.
◦ 23% reported discovering conflicting copies of the same file.
17. Survey Responses – Time & Frequency
Time per week:
◦ Less than 5 hours: 98 (90%)
◦ Less than 10 hours: 7 (6%)
◦ Less than 20 hours: 3 (3%)
◦ 20 hours or more: 1 (1%)
18. Survey Responses – File Storage Behaviors
Storing files on non-network drives:
◦ Local drives: Yes 86%, No 14%
◦ External drives: Yes 53%, No 47%
19. Survey Responses – Regulatory Awareness
Organization wide: Yes 76 (70%), No 33 (30%)
By category:
◦ Administrative: Yes 18, No 9
◦ Leadership: Yes 11, No 2
◦ Professional: Yes 47, No 22
20. Survey Responses – Preferences
Presence of file naming preferences: Yes 61%, No 39%
Examples:
◦ [project number].[artifact_id]
◦ [location]_[year]_[description]
◦ [historic resource number]-[historic name]-[description]
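A convention like [location]_[year]_[description] can be checked mechanically. The sketch below is hedged: the exact pattern rules (letters-only location, four-digit year, hyphen/word description) are assumptions for illustration, since DAH’s real conventions would be set by the working groups.

```python
import re

# Hypothetical rule: location code, 4-digit year, free-text description,
# e.g. "LittleRock_2013_annual-report.docx"
NAME_PATTERN = re.compile(
    r"^(?P<location>[A-Za-z]+)_"
    r"(?P<year>\d{4})_"
    r"(?P<description>[\w-]+)\.\w+$"
)

def check_name(filename):
    """Return the parsed parts if the name follows the convention, else None."""
    m = NAME_PATTERN.match(filename)
    return m.groupdict() if m else None
```

A script like this, run over a drive, gives working groups a list of non-conforming files to triage rather than a manual hunt.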
21. File System Evaluation – Drive Scans

Drive order in each row: S | Central | AAC | ANHC | AHPP | OSH | MTCC | HAM | DCC (AAC through DCC are the agency T drives), followed by the total.

Measure | S | Central | AAC | ANHC | AHPP | OSH | MTCC | HAM | DCC | Total
Wasted Space (GB) | 9.2 | 0.33163 | 28.05 | 153.31 | 184.3 | 28.55 | 244.8 | 80.25 | 9.26 | 728.79
% wasted | 3% | 1% | 19% | 11% | 11% | 7% | 32% | 19% | 17% | 14%
Disk Space (GB) | 305.54 | 38.04 | 147.37 | 1380 | 1690 | 388.02 | 754.64 | 418.68 | 55.03 | 5122.2
Number of Files | 68739 | 16805 | 85890 | 387661 | 409190 | 60059 | 114067 | 140869 | 27018 | 1283280
Duplicate files | 9101 | 1046 | 11699 | 88089 | 56439 | 5080 | 8613 | 33555 | 3086 | 213622
% duplicate | 13% | 6% | 14% | 23% | 14% | 8% | 8% | 24% | 11% | 17%

Supporting measures as transcribed, same drive order (blank cells omitted):
◦ Wasted space, by last accessed: 1-2 years, 1-3 months, 1-2 years, 3-5 years, 3-6 months, 1-2 years, 1-3 months, 6-12 months, 6-12 months
◦ Wasted space, by user name: Administrators, Jessica.Crenshaw, Administrators, Administrators, Shelle, Administrators, bryan.mcdade, Patricia
◦ Wasted space, by file type: JPG, JPG, JPG, JPG, JPG, TIF, TIF, JPG, TIF
◦ Disk space, by last accessed: 1-2 years, 1-2 years, 2-3 years, 3-5 years, 1-2 years, 3-5 years, 6-12 months, 6-12 months
◦ Disk space, by modified: 5+ years, 2-3 years, 2-3 years, 5+ years, 5+ years, 5+ years, 5+ years, 5+ years, 5+ years
◦ Disk space, by user name: Administrators, Administrators, Scotty, Administrators, Administrators, Administrators, jaime
◦ Disk space, by file type: TIF, VHD, JPG, JPG, JPG, TIF, MTS, JPG, TIF
◦ Files, by last accessed: 1-2 years, 1-2 years, 3-5 years, 3-5 years, 1-2 years, 3-5 years, 6-12 months
◦ Files, by modified: 3-5 years, 5+ years, 5+ years, 5+ years, 5+ years, 5+ years, 5+ years
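The duplicate-file counts in the scans come from DiskBoss Pro; the same kind of measure can be reproduced by hashing file contents. A minimal sketch (it assumes read access to every file, and groups files strictly by identical bytes):

```python
import hashlib
import os
from collections import defaultdict

def find_duplicates(root):
    """Group files under `root` by content hash; return the groups of
    paths that share identical contents (i.e., duplicates)."""
    by_hash = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            h = hashlib.sha256()
            try:
                with open(path, "rb") as fh:
                    # Read in 1 MB chunks so large TIF/VHD files fit in memory
                    for chunk in iter(lambda: fh.read(1 << 20), b""):
                        h.update(chunk)
            except OSError:
                continue  # unreadable file: skip it
            by_hash[h.hexdigest()].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Hashing by content, rather than comparing names, is what lets a scan catch the conflicting copies survey respondents reported even when the copies are named differently.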
22. Wasted Space
Wasted space per drive:
◦ S: 3%
◦ Central: 1%
◦ Arts: 19%
◦ ANHC: 11%
◦ AHPP: 11%
◦ OSH: 7%
◦ MTCC: 32%
◦ HAM: 19%
◦ DCC: 17%
23. Duplicate Files
Percentage of duplicate files on each drive:
◦ S: 13%
◦ Central: 6%
◦ Arts: 14%
◦ ANHC: 23%
◦ AHPP: 14%
◦ OSH: 8%
◦ MTCC: 8%
◦ HAM: 24%
◦ DCC: 11%
24. Network Waste
14% wasted disk space
17% duplicate files
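The two network-wide figures follow directly from the totals in the drive-scan table; a quick arithmetic check:

```python
# Totals from the drive scans: wasted space vs. total disk space (GB),
# and duplicate files vs. total file count
wasted_gb, disk_gb = 728.79, 5122.2
duplicates, total_files = 213622, 1283280

pct_wasted = 100 * wasted_gb / disk_gb        # rounds to 14%
pct_duplicate = 100 * duplicates / total_files  # rounds to 17%
```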
25. File System Age
Reported age of files:
◦ 1 year: 38%
◦ 5 years: 34%
◦ 10 years: 17%
◦ Older: 11%
[Chart: last-accessed age distribution (< 1 year through 5+ years) for wasted space, disk space, and files; the per-bucket values are not recoverable from the transcript.]
26. Stakeholder Support
Value perception – organization (rated 1 to 5):
◦ 1: 4%
◦ 2: 4%
◦ 3: 13%
◦ 4: 29%
◦ 5: 50%
27. Recommendations
Create agency-level working groups to steward the resource. Include IT.
a. Naming conventions
b. Folder hierarchies
c. Metadata
d. Deletion/archiving plans
Create a central working group made up of agency stewards and IT.
a. Formalize and support the work being done at the agency level.
b. Establish “S” drive requirements for appropriate use, naming, and archiving.
Provide regular training on conventions, metadata, and the use of existing tools.
Continually scan network drives to identify areas of focus for working groups. Define and measure improvement.
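The last recommendation implies keeping each scan’s summary and comparing it against the previous baseline. A minimal sketch of that bookkeeping; the summary structure is an assumption, and only the baseline figures come from the 2014 scans (the follow-up numbers are hypothetical):

```python
def duplicate_pct(summary):
    """Duplicate files as a percentage of all files in one scan summary."""
    return 100 * summary["duplicate_files"] / summary["total_files"]

def improvement(before, after):
    """Drop in duplicate percentage between two scans (positive = better)."""
    return duplicate_pct(before) - duplicate_pct(after)

# Baseline from the 2014 scans; the rescan numbers are hypothetical.
baseline = {"duplicate_files": 213622, "total_files": 1283280}
rescan = {"duplicate_files": 150000, "total_files": 1250000}
```

Tracking one or two such numbers per drive over time gives the working groups the "define and measure improvement" loop without any new tooling.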
28. Conclusion
Interview
Survey
Scans
Refine
Iterate
Sweeping change is not likely to render desired results. A non-invasive approach will allow agencies to establish conventions and protocols that work for their requirements while achieving the desired result of a cleaner, more efficient, more sustainable file system.
29. Future Considerations
Digital Asset Management
Geodatabase
SharePoint or other “intranet”-type file versioning tool
Editor's Notes
Regulatory requirements: archival needs, file content/format, naming conventions governed by other agencies they work closely with, etc.
Reality vs. perception. Leadership vs. the rest of the organization vs. file system scans.
Leadership had anecdotes and the desire for consistency, but no hard data to understand the actual state of their file system. They also didn’t have a clear understanding of how issues were impacting the whole organization.
Establish a data quality baseline
10-Step Process
Business Impact Techniques
Information Life Cycle (POSMAD)
Data Quality Dimensions
This project focused on the S and T drives.
The start of building comparative data for measuring progress over time.
Attempt to ensure that any corrective measures don’t miss the mark.
In many cases, the multiple choice questions existed just to get the user in the right mindset to respond to the long-answer portion. So much value comes from letting people tell their stories.
Use cases, anecdotes
Frustrations with systems and services
The survey responses helped inform the scan priorities. The scans gave context to survey results.
Use steps 1-6 of the 10-Step Process, as prescribed by the project approach. Note that Step 10 isn’t reflected in the survey.
The largest group was the Historic Preservation program with 21 respondents, followed by the Director’s office, the Natural Heritage Commission, and then by a tie between the Historic Arkansas Museum and the Old State House Museum. Delta Cultural Center contributed 3 responses to the survey.
Respondents were asked “How easy is it to locate existing files?” The overall responses clearly skewed “easy” in raw numbers, but when it was broken down into percentages for each employee category, we see that, on average, a larger percentage of leadership and professional respondents skewed toward “hard.” These opposing trends may indicate that users have adapted to the system, or that users don’t perceive reported issues as factors that increase the level of difficulty of file findability. It may also indicate that leadership and professional users rely on administrative staff for some of these functions.
A quarter of the responses indicated consistent problems finding files.
Many of the comments provided examples of files renamed or deleted by coworkers, mislabeled or misfiled files, and one mentioned a file that had been password protected by a former employee. The most common issues associated with difficult to find files were those involving images and GIS data. The lack of consistent metadata was repeatedly cited as a contributing factor.
Email files: This is of concern because it results in duplicate files and/or multiple versions of files across the network.
Seventeen percent of respondents indicated they regularly create files only to discover that a similar file already exists. Note that these figures reflect only self-reported instances of these scenarios. In the case of discovering that files already exist, the probability is high that existing files go undiscovered as well.
Question 10: How often have you encountered files that were supposed to be current, but actually contained outdated or incorrect information? This issue can arise because files haven’t been updated to match new information but are still the most current version; because updated versions are being stored elsewhere; or because an old version is the most readily available to the respondent. Thirty-two percent responded that they encounter this issue more than once per year.
Question 11: How often have you encountered conflicting copies of the same file or form? Question 11 measures the frequency of discovering conflicting files, rather than simply outdated ones. Twenty-three percent of respondents encounter this issue more than once per year.
Respondents were asked to self-report the impact of information quality issues in the DAH file system in terms of hours per week. Ninety percent indicated less than 5 hours per week spent on these scenarios. Overall, while frequency may be an issue for some situations, the actual time spent working through these problems is perceived to be minimal.
Keeping copies of files is a common, and sometimes necessary, behavior in networked environments. It can also be indicative of and a contributor to versioning and duplication issues. Eighty-six percent of respondents reported keeping copies of files on their computer.
Like storing files on a local computer rather than on the network drives, storing files or copies in the cloud or on other external devices can be a best practice for archiving purposes, but can also lead to versioning and duplication issues. Fifty-three percent of respondents reported storing files in this way. Reasons cited include easier sharing of files, fear of loss due to drive failure, and the ability to access files from outside the office.
Discussions with leadership indicated the possibility that employees within the agencies might not be fully aware of internal and external regulatory requirements governing the storage and deletion of files. Seventy percent of survey respondents indicated an awareness that there was a policy or policies. However, a request to describe the requirement resulted in responses ranging from “I have no idea” to “3 years” to “until legislators say it’s ok to delete them” and even “someone else keeps up with that.” Couple this lack of awareness with repeated reports of “other people” deleting needed files from network drives and we start to see some of the root cause of issues overall.
Sixty-one percent of respondents indicated they already have a method for naming files, and many provided examples in the comments. While some respondents had very general guidelines, such as date and location for photos relevant to a geographic area, others were very specific in their methods. Examples are included above.
Many examples were provided of hierarchical folder structures that sort files in ways appropriate to the agency. One respondent indicated use of National Park Service naming conventions, and another suggested adherence to an ISO standard. The diversity of use cases throughout DAH will factor heavily into any efforts toward file name and folder structure consistency.
The leadership interview and employee survey informed the list of needs for the file system evaluation. Not only did we need to understand the state of the file system, we also needed to understand how the perceptions captured in the survey were reflected in the actual state of the file system.
Thorough scans were taken of each shared drive, and a set of key indicators (age of the file system, file types, and users of interest), shown in the table below, was selected to profile the overall health of the file system. File system age is demonstrated by (1) how long ago the largest portion of the drive was last accessed and last modified, and (2) how long ago the largest portion of the duplicate space was last accessed.
In addition to the evaluation criteria, we’re also able to see the usernames associated with wasted space, who uses the most disk space, who has the most duplicate files, which file types are the most common, and the most commonly duplicated.
Fourteen percent of the scanned file system (729 GB) qualifies as “wasted space,” meaning it is occupied by duplicate files.
Seventeen percent of the files on the system (213,622) are duplicates.
One of the key indicators of file system age is when files were last accessed, which includes simply opening them. The chart shows the last-accessed times for the largest portion of the disk space or files of each type. To clarify, the wasted-space indicator is the last-opened time for the largest chunk of wasted files on each drive; disk space is calculated the same way, and files are counted by actual number of files.
What we see here is a comparison between users’ perception of the age of the files they interact with and an actual reading of how recently different types of files are accessed.
Respondents were asked to rate, on a scale of 1 (not at all valuable) to 5 (very valuable), how valuable they thought a consistent naming convention would be in the context of finding files and information. Only eight percent of respondents felt a level of consistency had little to no value. Overwhelmingly, respondents felt consistency would be valuable, with half indicating significant value in the effort. Many did voice concerns about the level of effort such an initiative would require. Those with well-evolved naming and storage methods, especially those in use agency-wide, were opposed to imposed standards that would require significant time and manpower to adopt.
Realistic action plan(s) – Accountability, ownership, relevance.
Create agency-specific naming conventions. Consider legacy files.
Create agency-specific file cleanup plans to remove drafts and unnecessary versions.
Establish agency-specific metadata conventions. Consider metadata editors.
Provide training on existing tools such as the Microsoft Suite, Adobe, etc.
Folders should be relevant to the organization, not to individuals. (Personal files should be stored in non-work spaces.)
Formalize IT processes and communication surrounding backups, archiving, and loss recovery. It would be beneficial for IT to clearly define and make available archiving and backup protocols, including when backups are set to run. Make it known when issues occur, such as backup failures, and communicate updates and changes early and often.
Note the cyclical nature. Use the baselines to measure improvement. Keep testing.
Interview/survey – measure perceptions, identify things to test
Scan – tests of previous and new metrics. How you identify actual progress or lack thereof.
Refine, iterate – identify changes to be made for the next cycle (almost agile, selecting the focus of the next sprint), update survey instrument, select new scans to be run/old to be dropped
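The "measure improvement" step of each cycle can be as simple as diffing key metrics between the baseline scan and the latest one. A sketch, using the report's baseline figures and hypothetical follow-up numbers for illustration:

```python
def improvement(baseline, current):
    """Percentage-point change for each metric shared by two scans.
    Negative values mean the metric decreased (e.g., less waste)."""
    return {k: round(current[k] - baseline[k], 1)
            for k in baseline if k in current}

# Baseline figures from the report; follow-up values are hypothetical.
baseline = {"wasted_space_pct": 14.0, "duplicate_files_pct": 17.0}
followup = {"wasted_space_pct": 9.5, "duplicate_files_pct": 12.0}
```

Feeding each cycle's scan results through a comparison like this turns the "refine, iterate" loop into a concrete, trackable trend line for the working groups.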
The overwhelming majority of stakeholders, including leadership and IT, support a more sustainable, strategic file system management process. The least optimal outcome of this initiative, per leadership and IT survey responses, is to do nothing. It should not go unnoticed that the status quo is designated ‘least optimal’ in these responses. The optimal outcome is agency-specific plans designed to support the needs and requirements of each branch of DAH, alleviate some of the pressure on the physical servers, and make transparent the naming, storing, and archiving of files throughout the organization.
As this is the preliminary study in a long-term rehabilitation effort, it will be necessary to re-survey stakeholders regularly to determine needed course adjustments and identify new issues.
Based on stakeholder feedback, there is significant interest and perceived need for both a Digital Asset Management System (DAM) and a geodatabase. Enterprise cloud storage options may alleviate some of the network-based issues associated with archiving and availability. Additionally, tools that include change management and version control options, such as Microsoft Sharepoint, could serve to correct user behaviors regarding archiving and versioning. These considerations should not be taken as recommendations, but rather starting points for further evaluation and review.
It should be noted that the first steps recommended here are necessary for successful implementation of enterprise systems.