This document discusses Stanford's efforts to process and manage born-digital materials from several collections received in the late 1990s and 2000s. It outlines challenges around reading legacy media formats, describing technical metadata, and providing long-term access. The document also describes Stanford's collaboration with other institutions on the AIMS project and their use of FTK forensic software to extract metadata and organize large email collections.
AIMS Workshop Case Study 3: Arrangement and Description Case Study - Stephen ... (AIMS_Archives)
This document summarizes the arrangement and description of the Stephen Jay Gould papers, which include both analog and born-digital materials. The analog materials consist of 550 linear feet of papers in various containers, while the digital materials include over 2,500 files in formats such as WordPerfect, Microsoft Word, Excel, and others stored on floppy disks, computer tapes, and punch cards. Tools like AccessData FTK are being used to process the digital materials, including rendering files, identifying duplicates, full-text searching, and flagging restricted files. Labels are being applied to files to indicate access restrictions, file types, and subjects to carry that metadata when files are exported to the access repository. Finding aids like EAD are
This document provides guidance on using library databases such as ProQuest and JSTOR to research topics and find scholarly journal articles and other sources. It outlines how to develop keywords from a topic, search databases effectively using Boolean logic and filters, collect and save search results, and get help from a librarian if needed. Databases contain peer-reviewed sources not available elsewhere and can save time compared to general web searches. Tips are provided on refining searches, choosing file formats, and excluding book reviews from JSTOR results.
Born digital archives refer to personal and corporate archives that are created and stored in digital formats, rather than physical formats. They typically include draft works, diaries, correspondence, photographs, and other digital files and objects. These archives pose challenges for preservation due to the variety of file formats, operating systems, and storage media used over time as technologies become obsolete. Institutions must address issues related to representing relationships within archives, scaling workflows, data protection, and educating users on access to these archives.
Presentation on the use of the Eureka Research Workbench to store data and scientific workflow information. Presented online as part of the Dial-a-molecule 'Liberating Laboratory Data' event (http://www.dial-a-molecule.org/wp/events-listing/liberating-laboratory-data/)
DOIs and Other Persistent Identifiers in Research Data, Eugene Barsky (ORCID, Inc.)
- Persistent identifiers like DOIs, Handles, ARKs and PURLs provide long-lasting references to digital resources. They ensure the provenance and persistence of cited resources over time.
- DOIs have additional benefits like discoverability, making resources findable across scholarly databases. UBC Library has developed a GUI to mint DOIs as a service for researchers to identify their work.
- UBC has signed an agreement with DataCite Canada to issue DOIs, which can be done for individual resources, via CSV files, or programmatically (see the sketch below). The library is open to collaborating on issuing DOIs both within UBC and beyond.
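As a rough illustration of the programmatic route, minting a DOI through the DataCite REST API can look like the following sketch. The endpoint, repository ID, password, and prefix are placeholder assumptions, not UBC's actual service:

```python
import requests

# Illustrative values: a real integration would use credentials issued
# under the institution's DataCite (e.g. DataCite Canada) agreement.
DATACITE_API = "https://api.test.datacite.org/dois"  # test endpoint
REPO_ID, REPO_PASSWORD = "UBC.EXAMPLE", "secret"     # hypothetical

payload = {
    "data": {
        "type": "dois",
        "attributes": {
            "prefix": "10.80000",  # hypothetical test prefix
            "titles": [{"title": "Example research dataset"}],
            "creators": [{"name": "Doe, Jane"}],
            "publisher": "University of British Columbia",
            "publicationYear": 2024,
            "types": {"resourceTypeGeneral": "Dataset"},
            "url": "https://example.org/dataset/123",
            "event": "publish",  # register the DOI and make it findable
        },
    }
}

resp = requests.post(DATACITE_API, json=payload, auth=(REPO_ID, REPO_PASSWORD))
resp.raise_for_status()
print("Minted DOI:", resp.json()["data"]["id"])
```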
The document discusses digital preservation and file format selection for long-term preservation of digital assets. It notes that file formats can become obsolete over time and presents five criteria for selecting preservation-suitable formats: 1) widespread adoption, 2) lack of technological dependencies, 3) disclosure of specifications, 4) transparency/identifiability, and 5) ability to embed metadata. It also discusses using a "performance model" where the significant properties and essence of a digital object are maintained regardless of file format changes over time. The key recommendation is to select file formats that align with a preservation strategy articulating the repository's purpose and community needs.
Donat Agosti & Norman F. Johnson - Copyright: the new taxonomic impediment (ICZN)
1. Taxonomic publications are considered "legal documents" as they establish nomenclatural decisions under taxonomic codes. As such, everyone should have access to these legally binding documents.
2. Taxonomic descriptions are factual knowledge based on direct observations, so the descriptive parts of publications cannot be copyrighted and should be open access.
3. Publications can be broken down into the basic data elements of individual taxon descriptions, which contain details like descriptions, specimens examined, and characters. These descriptions are the building blocks of taxonomic knowledge.
The Names Project presentation discusses using Names to disambiguate researcher identities and integrate researcher data across different sources. Names extracts data from repositories like EPrints and Zetoc and makes it available through APIs and other standardized formats. Over 30 million researcher records have been made permanent in Names so far. Future work includes processing more data sources, adding more identifiers like ISNI and ORCID, and developing plugins to help repositories integrate with Names.
The document proposes a data model for an MP3/CD music collection database. It lists attributes like file name, genre, artist, album, size and time for music files. An entity-relationship model is designed with a MUSIC FILE entity connected to attributes through a Track_ID primary key. The primary key will uniquely identify each record and allow relationships between database entities, and the database can be expanded through new music files and references to other databases.
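A minimal sketch of that model in SQLite; the table and column names are adapted from the attributes listed above, and the exact schema in the document may differ:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for a persistent DB
conn.execute("""
    CREATE TABLE music_file (
        track_id   INTEGER PRIMARY KEY,  -- uniquely identifies each record
        file_name  TEXT NOT NULL,
        genre      TEXT,
        artist     TEXT,
        album      TEXT,
        size_bytes INTEGER,              -- file size
        time_secs  INTEGER               -- track length
    )
""")
conn.execute(
    "INSERT INTO music_file (file_name, genre, artist, album, size_bytes, time_secs) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("track01.mp3", "Jazz", "Example Artist", "Example Album", 4_200_000, 215),
)
for row in conn.execute("SELECT track_id, file_name, artist FROM music_file"):
    print(row)
```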
How to do things with metadata: From rights statements to speech acts (Richard Urban)
This document discusses metadata rights statements from the perspective of speech act theory. It analyzes a sample of 488 unique rights statements from the Digital Public Library of America and codes them according to Searle's taxonomy of speech acts. The majority were coded as assertives (199) or directives (272) regarding copyright and usage permissions. Other speech acts identified include one commissive and one expressive statement. Non-speech acts (130) were also present. The analysis suggests rights statements communicate different types of speech acts and exploring how to automatically classify them could help improve metadata quality.
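The study itself codes statements manually; as a toy sketch of what automatic classification might start from, a rule-based coder over Searle's categories could look like this. The cue phrases are illustrative guesses, not taken from the study:

```python
# Toy rule-based coder for rights statements, loosely following the
# Searle categories named above. Cue phrases are illustrative only.
DIRECTIVE_CUES = ("may not", "must", "please contact", "permission required")
COMMISSIVE_CUES = ("we will",)
EXPRESSIVE_CUES = ("thanks", "courtesy of")
ASSERTIVE_CUES = ("is in the public domain", "copyright", "all rights reserved")

def code_statement(text: str) -> str:
    t = text.lower()
    if any(cue in t for cue in DIRECTIVE_CUES):
        return "directive"
    if any(cue in t for cue in COMMISSIVE_CUES):
        return "commissive"
    if any(cue in t for cue in EXPRESSIVE_CUES):
        return "expressive"
    if any(cue in t for cue in ASSERTIVE_CUES):
        return "assertive"
    return "non-speech act"

print(code_statement("This item is in the public domain."))               # assertive
print(code_statement("Images may not be reproduced without permission."))  # directive
```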
Online Library and Information Systems: the DLSU Experience (Fe Angela Verzosa)
Presented at a seminar sponsored by the UP Library Science Alumni Association, held at the UP College of Engineering Theater, Diliman, Quezon City, Philippines, on 23 October 1997.
1. The document provides an overview of teaching systems and fundamentals of technology to elementary and middle school students.
2. It outlines topics such as basic computer parts, file types, organizing and storing files, troubleshooting problems, and engagement strategies for teaching these concepts.
3. Key areas covered include basic computer hardware, software, file extensions, setting up file organization systems, defining problems, and common troubleshooting solutions.
The document discusses fundamental file processing operations including opening, closing, reading, writing and seeking files. It defines physical files as those that exist on storage while logical files are how programs view files. Opening a file can create a new file or open an existing one. Closing a file frees it up to be used by another program. Reading and writing are essential I/O operations for file processing. Seeking allows moving to a specific position in a file defined by an offset from the start. Special characters can cause issues when creating file structures.
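These operations map directly onto any standard file API; a minimal Python illustration of open/create, write, seek-by-offset, read, and close:

```python
# Demonstrates the core operations: open/create, write, seek, read, close.
with open("demo.dat", "wb") as f:      # opening can create a new file
    f.write(b"HEADER--RECORD1--RECORD2")

with open("demo.dat", "rb") as f:      # open an existing file for reading
    f.seek(8)                          # move to byte offset 8 from the start
    record = f.read(9)                 # read 9 bytes from that position
    print(record)                      # b'RECORD1--'
# Leaving each 'with' block closes the file, flushing output and
# freeing it for use by other programs.
```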
This document discusses key aspects of building databases to catalog global biodiversity in the 2000s, including standards, technology, data sharing challenges, and classification methods. It covers how database infrastructure requires stable standards and technology to ensure data accessibility over time. Issues around data ownership, privacy, and ensuring data can be shared and reused across disciplines are also addressed. Classification systems are evolving from paper-based to digital formats using tools like cladistics and computer programs to help organize the vast amounts of data being collected through worldwide biodiversity projects.
S. Alvarado revision wk 7 copyright crash course (salvara85)
This document discusses copyright and fair use guidelines for using copyrighted materials. It outlines the differences between implied licenses and express licenses, notes that orphan works lack ownership information, and addresses penalties for copyright infringement. The document also describes the four fair use factors to determine if permission is needed and provides resources for obtaining permission or determining fair use.
This document provides an introduction to files and file systems. It defines what files are, including that they are containers for storing information and come in different types like text, data, binary and graphic files. It outlines key file attributes like name, size, permissions. It also describes different file access methods like sequential, direct/random, and indexed sequential access. File operations like create, write, read, delete and truncate are also covered. The document concludes with definitions of flat file databases and their advantages and disadvantages compared to relational databases.
1) Physical files exist on storage while logical files are how programs view files without knowing the actual physical file.
2) Opening files creates a new file or accesses an existing one, while closing files frees up the file descriptor for another file and ensures all output is written.
3) Core file processing operations include reading, writing, and seeking within a file (see the sketch below).
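A sketch of the physical/logical distinction using OS-level file descriptors: the descriptor is the program's logical view, while the bytes on storage are the physical file.

```python
import os

# A "physical" file exists on storage; the file descriptor below is the
# program's "logical" handle onto it - the program never touches the
# physical layout directly.
fd = os.open("notes.txt", os.O_CREAT | os.O_RDWR)  # open (or create)

os.write(fd, b"first line\nsecond line\n")  # write through the descriptor
os.lseek(fd, 0, os.SEEK_SET)                # seek back to the start
print(os.read(fd, 10))                      # read 10 bytes: b'first line'

os.close(fd)  # closing frees the descriptor and ensures output is written
```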
This document summarizes a presentation on the Hypatia platform, which was developed to help archivists manage, preserve, and provide access to digital archival materials. Key points include:
- Hypatia is an open source software based on Hydra and Fedora that aims to be a repository solution for digital archives.
- It grew out of the Archives Information Management System (AIMS) project and leverages the Hydra framework.
- The presentation covered Hypatia's functional requirements gathering, data models, demonstration of capabilities, and plans for future development and community involvement.
247th ACS Meeting: The Eureka Research Workbench (Stuart Chalk)
Academic scientists need a tool to capture the science they do so that it can be shared as open science, integrated with linked data, and searched. Eureka is an evolving platform to do this.
The document discusses challenges and strategies for digital preservation. It outlines a life cycle approach to digital archiving including metadata, storage, access, and preservation strategies like migration. Examples of digital preservation projects at Rutgers University are provided, such as databases of historical information and digital collections. Ensuring long-term access to digital content requires standards, documentation, addressing technology obsolescence, and establishing trusted digital repositories.
eScience: A Transformed Scientific Method (Duncan Hull)
The document discusses the concept of eScience, which involves synthesizing information technology and science. It explains how science is becoming more data-driven and computational, requiring new tools to manage large amounts of data. It recommends that organizations foster the development of tools to help with data capture, analysis, publication, and access across various scientific disciplines.
This poster presents guidelines for researchers to improve reproducibility in scientific research by better documenting the key entities of research: data, software, workflow, and research output. It recommends documenting data sources and processing steps, writing descriptive code with examples, and using tools like Docker, Jupyter notebooks, LaTeX, and data repositories to capture the experimental environment and research process. Following these guidelines helps researchers communicate and verify their work, allowing others to build on their research findings.
This document summarizes Peter Chan's presentation on accessioning born-digital materials. The presentation covered literature reviews on best practices, putting accessioning in context within Stanford's workflow, and a demonstration of their forensic workflow. The workflow involves surveying collections, creating accession records, photographing media, virus checking, creating disk images, generating summaries, and transferring data to secure storage. Questions from attendees were also taken and a tour of the forensic lab was included.
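The presentation relies on dedicated forensic hardware and tools; purely as an illustration of the imaging-plus-fixity step, here is a minimal Python sketch. The device path and image name are placeholders:

```python
import hashlib

DEVICE = "/dev/sdb"          # placeholder: the write-blocked source media
IMAGE = "accession_0001.dd"  # raw disk image to create
CHUNK = 1024 * 1024

md5 = hashlib.md5()
with open(DEVICE, "rb") as src, open(IMAGE, "wb") as dst:
    while chunk := src.read(CHUNK):   # stream so large media fit in memory
        dst.write(chunk)
        md5.update(chunk)

# Record the checksum alongside the image for later fixity checks.
with open(IMAGE + ".md5", "w") as f:
    f.write(f"{md5.hexdigest()}  {IMAGE}\n")
print("imaged", DEVICE, "->", IMAGE, "md5:", md5.hexdigest())
```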
This document summarizes a presentation about EZID, a service that provides persistent identifiers and supports data citation. It introduces DataCite as an international consortium that develops specifications for data citation. The presentation outlines new features being developed for EZID, including service replicas, URN support, suffix pass-through identifiers, and identifier status indicators. It also discusses the ARK community and governance.
The document discusses the evolution of data storage and retrieval from oral traditions to modern databases integrated with the World Wide Web. It describes how early databases used file-based systems that had limitations in efficiency and usability. The development of relational databases and the ability to dynamically query databases from web servers enabled more powerful data-driven websites and applications. The integration of databases and client-side technologies like Flash further enhanced the interactivity and capabilities of websites and web applications.
SAA Session 502: Born Digital Archives in Collecting Repositories (AIMS_Archives)
Digital archivists from the Universities of Hull (UK), Stanford, and Yale are currently collaborating on an Andrew W. Mellon Foundation-funded project. Born-Digital Collections: An Inter-Institutional Model for Stewardship (AIMS) will produce a common framework for managing born-digital archives. Each digital archivist presents a short case study covering areas of the workflow for electronic records: collection development, accessioning, arrangement and description, and discovery and access.
BP301: Q: What’s Your Second Most Valuable Asset and Nearly Doubles Every Year? (panagenda)
A: Data! But do you know where this data is duplicated, by whom and exactly how it’s scattered across laptops, desktops, file servers and IBM Domino databases?
Let us show you how to analyze local drives, network drives and server based apps to get a grasp of what data is out there and what it means to your business. Learn how to collect, aggregate and analyze file sizes and types, as well as identify knowledge sharing patterns. This session will empower you to work towards reducing your data storage costs and increasing collaboration efficiency!
Linux is a freely distributed open source operating system similar to Unix. It was developed by Linus Torvalds and has become widely used by companies, academics, and individuals due to its free source code and ability to scale across systems. Helix is a Linux distribution tailored for computer forensics that contains tools like Adepto for acquiring forensic images and Autopsy for analyzing the images to extract evidence from investigations.
BioHDF is a project to develop open binary file formats and software tools for managing large-scale genomic data from next-generation DNA sequencing. The project aims to address challenges related to the proliferation of file formats, redundancy of data, and computational overhead by building on the HDF5 data model and libraries. BioHDF will develop models and applications to support primary and secondary data analysis from sequencing, with collaborations planned with software developers and research groups.
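BioHDF's specific schemas are its own; as a generic sketch of the underlying HDF5 data model with h5py (not BioHDF's actual layout):

```python
import h5py
import numpy as np

# Generic HDF5 sketch, not BioHDF's actual schema: one dataset of read
# sequences, one of per-read quality scores, plus file-level metadata.
reads = ["ACGTACGT", "TTGACGTA", "CCATGGAA"]
quals = np.array([[30] * 8, [28] * 8, [35] * 8], dtype=np.uint8)

with h5py.File("reads.h5", "w") as f:
    f.attrs["instrument"] = "example-sequencer"   # embedded metadata
    f.create_dataset("sequences", data=reads, dtype=h5py.string_dtype())
    f.create_dataset("qualities", data=quals, compression="gzip")

with h5py.File("reads.h5", "r") as f:
    print(f["sequences"][0], f["qualities"][0].mean())
```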
This document provides guidance on managing research data. It discusses planning ahead to consider data needs, formats, and volume. It emphasizes organizing data through file naming, metadata, references, email, and remote access. It stresses preserving data by determining what to keep/delete, using long-term storage such as repositories or archives. Finally, it examines reasons to share data such as scientific integrity, funding mandates, and increasing impact and collaboration.
The document provides guidance on early planning for data management, including becoming familiar with funder requirements, planning for the types and formats of data that will be created, designing a system for taking notes, organizing files through consistent naming schemes and use of folders, adding metadata to files to aid in documentation and discovery, and using RSS feeds to organize web-based information. It also touches on issues like plagiarism, data protection, intellectual property rights, and remote access to and backup of data.
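As one illustrative naming scheme of the kind recommended (date, project, description, version); the exact pattern is an assumption, not prescribed by the document:

```python
import datetime
import re

def data_filename(project: str, description: str, version: int, ext: str) -> str:
    """Build a sortable, self-documenting name: YYYY-MM-DD_project_desc_v##.ext"""
    today = datetime.date.today().isoformat()
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{today}_{project}_{slug}_v{version:02d}.{ext}"

print(data_filename("riverstudy", "Sensor calibration run", 3, "csv"))
# e.g. 2024-05-30_riverstudy_sensor-calibration-run_v03.csv
```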
Watching the Detectives: Using digital forensics techniques to investigate th... (GarethKnight)
This document discusses digital forensics techniques used by law enforcement and researchers. It describes how digital forensics emerged in response to criminal use of electronic devices and emphasizes scientifically valid methods. Key techniques discussed include imaging media to obtain evidence, using hashing to filter known files, and data carving to recover deleted information. Challenges include analyzing increasing digital data and addressing ethical issues when recovering deleted files.
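Hash-based filtering of known files amounts to comparing each file's digest against a reference set; a minimal sketch, where the reference hashes would normally come from a published library such as the NSRL:

```python
import hashlib
from pathlib import Path

# Hypothetical reference set of known-file hashes (e.g. stock OS files),
# normally loaded from a published hash library such as the NSRL.
KNOWN_HASHES = {
    "d41d8cd98f00b204e9800998ecf8427e",  # MD5 of an empty file
}

def md5_of(path: Path) -> str:
    h = hashlib.md5()
    with path.open("rb") as f:
        while chunk := f.read(65536):
            h.update(chunk)
    return h.hexdigest()

# Keep only files NOT in the known set - these merit examination.
for path in Path("evidence_mount").rglob("*"):
    if path.is_file() and md5_of(path) not in KNOWN_HASHES:
        print("unknown file:", path)
```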
AntiForensics - Leveraging OS and File System Artifacts.pdf (ekobelasting)
The document discusses anti-forensics techniques that can be used to hide evidence on a hard drive and frustrate forensic investigations. It covers how tools like file wipers, log injectors, and timestamp manipulators can destroy artifacts and obscure timelines. It also details the operating system and file system artifacts that examiners can analyze, such as Prefetch files, Jump Lists, Volume Shadow Copies, and the MFT, to potentially detect the use of anti-forensics and recover deleted files and events. The document aims to help examiners understand criminal perspectives and common artifacts in order to catch anti-forensics activities.
The document discusses Apache Tika, an open source content analysis and detection toolkit. It provides an overview of Tika's history and capabilities, including MIME type detection, language identification, and metadata extraction. It also describes how NASA uses Tika within its Earth science data systems to process large volumes of scientific data files in formats like HDF and netCDF.
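A small sketch of MIME detection and metadata extraction from Python, assuming the tika bindings (pip install tika, plus a Java runtime) rather than NASA's actual pipeline:

```python
# Sketch using the Python 'tika' bindings, which start and call a local
# Apache Tika server under the hood.
from tika import detector, parser

path = "sample.nc"  # e.g. a netCDF data file

mime_type = detector.from_file(path)  # MIME type detection
parsed = parser.from_file(path)       # metadata + text extraction

print("MIME type:", mime_type)
print("Metadata keys:", sorted(parsed["metadata"].keys())[:10])
print("First 200 chars of content:", (parsed["content"] or "")[:200])
```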
The document discusses the impact of Covid-19 on learning and education, including long-term effects on academic setups due to lack of physical access and digital divides. It also discusses the need for and benefits of institutional repositories to manage and provide access to scholarly works. Key benefits include increased visibility, centralized storage, and supporting learning and teaching. Challenges include difficulties generating content and issues around policies, incentives, and costs. The document then focuses on the open-source DSpace software as a tool for creating institutional repositories, covering its features, requirements, structures, workflows, and examples of existing DSpace-based repositories.
Threats to mobile devices are increasingly prevalent and growing in scope and complexity. Users want to take full advantage of the features available on their devices, but many of those features trade security for convenience and capability. This best practices guide outlines steps users can take to better protect personal devices and information.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... (SOFTTECHHUB)
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
HCL Notes and Domino License Cost Reduction in the World of DLAU (panagenda)
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and licensing under the CCB and CCX models have been a hot topic for many in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new kind of licensing works and what benefits it brings you. Above all, you surely want to stay within your budget and save costs wherever possible. We understand that, and we want to help!
We explain how to resolve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove redundant or unused accounts to save money. There are also some approaches that can lead to unnecessary expenses, for example when a person document is used instead of a mail-in for shared mailboxes. We show you such cases and their solutions. And of course we explain the new license model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder introduce you to this new world. It gives you the tools and know-how to keep an overview. You will be able to reduce your costs through an optimized Domino configuration and keep them low in the future.
Topics covered:
- Reducing license costs by finding and fixing misconfigurations and redundant accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how best to use it
- Tips for common problem areas, such as team mailboxes and functional/test users
- Practical examples and best practices you can apply immediately
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Infrastructure Challenges in Scaling RAG with Custom AI Models (Zilliz)
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
GraphRAG for Life Science to increase LLM accuracy (Tomaz Bratanic)
GraphRAG for the life science domain, where you retrieve information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Full-RAG: A modern architecture for hyper-personalization (Zilliz)
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
UiPath Test Automation using UiPath Test Suite series, part 5 (DianaGray10)
Welcome to part 5 of the UiPath Test Automation using UiPath Test Suite series. In this session, we will cover CI/CD with DevOps.
Topics covered:
CI/CD within UiPath
End-to-end overview of a CI/CD pipeline with Azure DevOps
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
2. Collections in the late 1990s
- Apple Computer Inc. records
- Douglas Engelbart papers
- Stephen Cabrinety collection
By 2000, over 7,000 items of legacy computer media had been received as part of hybrid collections; now over 26,000 items have been recorded during the accessioning process.
4. First Digital Lives Research Conference: Personal Digital Archives for the 21st Century
5. Hardware: FRED (Forensic Recovery of Evidence Device, from Digital Intelligence). Software: FTK suite (AccessData); EnCase.
6. AIMS (Born-Digital Collections: An Inter-Institutional Model for Stewardship)
- University of Virginia
- Yale University
- Hull University
- Stanford University
Funded by the Andrew W. Mellon Foundation.
7. The four collections:
- Robert Creeley papers
- Stephen Jay Gould papers
- Keith Henson papers (re: Project Xanadu)
- Peter Rutledge Koch papers
9. Stephen Jay Gould. Influential American paleontologist, evolutionary biologist and historian of science, Gould began his career at Harvard University in 1967 and worked until his death in 2002.
- 98 3½-inch floppy diskettes
- 61 5¼-inch floppy diskettes
- 4 sets of punch cards
- 3 computer tapes
10. "Dear Peter, Unfortunately we do not manufacture any motherboards nowadays which can support the 5.25 floppy. The interfaces are different than 3.5, and they are becoming obsolete and are no longer available on the newer motherboards."
38. Email Mining on Peter Koch’s Emails http://suif.stanford.edu/~hangal/muse/
39. What are our Roles?
- Donors & users expect us to acquire, organize, preserve and provide access to b-d collections
- Special Collections staff capture, appraise, arrange and describe b-d materials AND contribute to requirements for both access and delivery as well as arrangement and description tools
- Our digital group will preserve in our preservation repository (SDR) and provide public access and invite participation – Hypatia (under development)
40. Challenges
- Read contents from storage media (punch cards, tapes, 8/5.25/3.5-inch floppy diskettes, Zip disks, etc.)
- View contents in different formats (WordPerfect, Lotus 1-2-3, Quark files, etc.)
- Organize "large" collections (420,000 files, or multiple computers in one collection)
- Long-term preservation (hardware failure, obsolete file formats, unknown future, etc.)
- Wide scope of knowledge needed: computer hardware, operating systems, application, repository and virtualization software, archival processing, security (authentication, encryption), Web 3.0, digital preservation, natural language processing, etc.
- Descriptive standards are in flux
- Accessioning procedures under development
- Delivery options for different formats
[INTRO] A little over two years ago, a few elements converged and a core group of us at Stanford began to get more serious about developing a viable method for processing the born-digital "papers" in our collections. Most of my talk is centered around our first trials, but I'd first like to describe the context and pressures at SUL that put us on this path …
The major pressure was the growing quantity of legacy media in our "backlog". With Stanford situated in Silicon Valley, it's no big surprise that we have a lot of computer collections that contain old legacy media. Hence our acquisition in the late 1990s of the records of Apple Computer Inc., the papers of Douglas Engelbart, and a really large collection of computer games and software. [images: mouse/Engelbart, box of Atari games/Cabrinety]
Because of those very acquisitions, in 1997 the Manuscripts Division began tracking the incoming quantities – just an overall count – of legacy computer media contained in new accessions. By the end of the decade we had recorded over 7,000 "items". Increasingly, our b-d material comes from faculty, artists, writers, and organizations. Today, we have over 26,000 items of legacy media recorded in our backlog. [Univ. Archives has ~700 listed]
The other element was an event in February 2009. A staff member on our digital team (Michael Olson), who had previously worked in Manuscripts, attended the Digital Lives Project's first conference at the British Library. Two things occurred: he heard about a study* done at the B.L. on data loss in legacy computer media (3% per year), and he saw that the B.L. was exploring the use of forensic tools for capturing data from media. Based on this, and coupled with the weight of our growing backlog of media, we decided on two courses of action. *McLeod, Rory. "Risk Assessment: Using a Risk-Based Approach to Prioritise Handheld Digital Information" (2008).
First, we purchased forensic hardware and software to enable us to capture and view legacy media and files: hardware from Digital Intelligence (FRED), and software – we purchased and tested both the FTK and EnCase forensic packages. This formed the nucleus of our digital lab. And yet most forensic equipment is geared toward current/modern media, so we searched eBay for old floppy disk drives to use with FRED.
Next, we partnered with 3 other institutions (U.Va., Yale and Hull) as part of the AIMS Project, funded by the Andrew W. Mellon Foundation (AIMS: Born-Digital Collections: An Inter-Institutional Model for Stewardship). The goals of the project were to process b-d material from 13 (mostly legacy) collections and to deliver the b-d material in some fashion by the end of the grant.
Each repository hired a digital archivist. Peter Chan was hired at SUL in January 2010 and began actual work on imaging the disks for our 4 collections and trying out various methods for "processing" the data. We chose collections that contained different types of media and content:
- Robert Creeley papers (poet – mostly email, some writing)
- Stephen Jay Gould papers (paleontologist, author – writing and some data sets)
- Peter Rutledge Koch papers (fine press printer – a mix of files: email, text, image, and design files like Adobe's InDesign; this was the only collection with files transferred directly from a donor's current computer)
- Xanadu Project records (early hypertext project – software program on 6 hard drives)
It is now 1.5 years later, and we have created a viable (although under constant development) workflow for accessioning and processing b-d materials using forensic tools. This is the more detailed workflow for collections that would be "fully" processed. We are also working on a minimal processing workflow, and on one that would fully "accession" the data – i.e., remove it from the physical storage media – and store it for later processing.
One of the collections that has informed our development of b-d practice is that of Stephen Jay Gould, which contains both paper (analog) – over 500 linear feet – and 3 cartons of digital material:
- 98 3.5-inch floppy disks
- 61 5.25-inch floppy disks
- 4 sets of punch cards
- 3 computer tapes
In total, over 550 linear feet have been received in 8 accessions. His papers and the audio/video are being processed concurrently – by archivist Jenny Johnson – and will be done this August. This month, the processing team discovered another 5 cartons of punch cards in the 2008 accession (21 sets). [This recent find won't be resolved by the end of the grant.]
Using the two different capture stations – FRED and the floppy/zip station – we created disk images of all the disks. 8 sets of punch cards were successfully read by our neighbors at the Computer History Museum; 1 set was unreadable, as it had no sorting key. We also began tracking our own loss statistics – the "success or failure" of captures – in a spreadsheet, which we link to our accession records in Archivists' Toolkit. The loss rate for floppies in Gould is 5%; loss in other collections was higher.
- Creeley: 6% loss (1 out of 12 CDs unreadable; 3 out of 53 floppies unreadable) [1987-2004?]
- Xanadu: 4 of 6 hard drives inoperable, or 67% damaged. [Peter Chan's report: there were mechanical or electrical problems with two of the drives (one didn't spin after it was powered up, and one gave a "dong" sound after it was powered up). We are not sure what the problems with the remaining two drives are – they do spin after power up, but we cannot access the data.] Cost to recover: ~$10,000 ($2.5K/drive).
To process the materials during our initial trial, we used Windows Explorer. Folders were created that mirrored "series" and "titles" in EAD, and files were moved from the original media folder into the appropriate place. This, however, changed data associated with the files – such as the original path. At this point, Peter Chan attended a week-long session on the use of forensic software at Digital Intelligence, focusing on FTK. While it is much more robust than we needed for archival work, he decided that many of the tools in FTK could easily be adapted for archival processing. We discovered that this practice mirrored work beginning at both the BL and Oxford.
Technical metadata for the disk images are displayed here, arranged by floppy disk and showing file format (where identifiable), file size, checksum, creation dates, etc. One can change the view to add additional columns, such as duplicate or primary file.
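For context, roughly the same technical metadata (path, size, checksum, timestamps) can be approximated outside FTK with the Python standard library. A sketch, with the mount point of a disk image as a placeholder:

```python
import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path

MOUNT = Path("/mnt/floppy_image")  # placeholder: a mounted disk image

with open("tech_metadata.csv", "w", newline="") as out:
    w = csv.writer(out)
    w.writerow(["path", "size_bytes", "md5", "modified_utc"])
    for p in MOUNT.rglob("*"):
        if p.is_file():
            st = p.stat()
            # read_bytes() is fine at floppy scale; stream larger files
            md5 = hashlib.md5(p.read_bytes()).hexdigest()
            mtime = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
            w.writerow([str(p), st.st_size, md5, mtime.isoformat()])
```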
The embedded viewer in FTK – from the same company that does Quick View Plus – allows you to quickly see the contents of many of the files
Here are two quick screen shots showing archival HIERARCHY using FTK's "bookmark" feature. Series or subseries can be added as metadata to individual or groups of files by highlighting or checking the boxes of the files in the lower panel.
Description for the three different formats in Gould – paper, audio/video, and born-digital files – will be merged at the end of the summer or early fall, but the level of description will be different:
- Gould's papers are processed to the folder level for most of the collection
- The audio and video are listed at the item level to facilitate any future digitization
- The born-digital material will have series-level description, with notes about original media, capture and processing methods, loss/damaged media, and delivery methods
Here is a partial view of our working draft for processing notes for Gould b-d “series”
We encountered different issues in our other AIMS collections; the main one I will mention is the Robert Creeley collection. His papers originally contained 53 floppies, 5 zip disks, and 3 CDs. Initially the computer media was segregated into a separate collection, but it will need to be merged into the main collection record and finding aid in the fall. After processing with FTK, the disk images garnered:
- 50K emails identified
- 8 files related to health records identified
- 69 files with SS# identified
A recent addendum complicates the processing of Creeley's born-digital material: material received in May 2011 containing b-d media will need to be processed, and may allow us to have a more complete set of emails, drafts, etc.:
- 7 computers
- 3 zip drives
- 121 optical discs
- 422 3.5-inch floppy diskettes
- 1 Zip 250 USB drive
- 1 Olympus C-4000 Camedia digital camera & flash cards
- 1 20-gigabyte iPod
We have yet to analyze the data in the new accession and compare it to the original data, but two issues cropped up:
- How to process and deliver multiple computers over a creator's life cycle
- Data was captured from various CDs and computers to create an overview of the b-d material before transfer to SUL – what got changed in the process?!
[image from Wikipedia taken by Elsa Dorfman]
In processing the initial computer media, Peter Chan used folder titles on the disks as keywords for files.
Using Creeley's initial text data, we have worked with two individuals – one working in the Digital Humanities (Elijah Meeks) – who took the header info from the 50K emails and created a network graph: header information from Robert Creeley's 50,000+ emails, emphasizing the connection between the poet and Gerard Malanga.
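A minimal sketch of the kind of graph construction involved, not Elijah Meeks's actual workflow: weight an edge between correspondents for each From/To pair in the headers.

```python
# Illustrative correspondence graph built from parsed email headers.
import networkx as nx

headers = [  # (from, to) pairs parsed from message headers
    ("Robert Creeley", "Gerard Malanga"),
    ("Gerard Malanga", "Robert Creeley"),
    ("Robert Creeley", "Another Poet"),
]

G = nx.Graph()
for sender, recipient in headers:
    if G.has_edge(sender, recipient):
        G[sender][recipient]["weight"] += 1   # repeated exchanges add weight
    else:
        G.add_edge(sender, recipient, weight=1)

# Strongest connections to the poet:
print(sorted(G["Robert Creeley"].items(),
             key=lambda kv: kv[1]["weight"], reverse=True))
```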
To wrap up:
- Donors & users expect us to acquire, organize, preserve and provide access to b-d collections
- Special Collections staff capture, appraise, arrange and describe b-d materials AND contribute to requirements for both access and delivery as well as arrangement and description tools
- Our digital group will preserve in our preservation repository (SDR) and provide public access and invite participation – Hypatia (under development)