The document discusses best practices for large-scale digitization of archival materials. It addresses topics like scan quality, resolution, file formats, and compression methods. The goal is to digitize collections at a large scale while keeping production and storage costs low, without compromising readability for research purposes.
This is a copy of the presentation given by Ellen Fleurbaay and Marc Holtman of the Amsterdam City Archives at the MARAC Plenary Session in Jersey City on Friday, October 30, 2009.
Presentation at the mini-symposium "Koning of Bedelaar" ("King or Beggar") on the impact of digitization on the information management of archival institutions, Amsterdam, 6 February 2012. Marc Holtman.
Learn about the basic decisions required for business document scanning: indexing, file formats, document resolution, color space, and more. Learn about estimating volumes and automated capture technologies such as barcode recognition, OCR, and batch document processing.
DocuFile enables automatic archiving of files to reduce storage costs. Because files are typically kept indefinitely rather than deleted, storage must constantly expand, raising procurement, operation, power, and cooling costs; roughly 20% of stored data is rarely accessed yet still burdens IT budgets and file servers. DocuFile addresses this by migrating rarely used files to low-cost storage based on rules of age, size, type, or other attributes; indexing them and managing retention policies; exchanging the originals for small reference files; and restoring archived files on request.
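The archive-and-stub pattern described here can be sketched generically in a few lines. This is an illustration of the general technique, not DocuFile's actual implementation; the 30-day threshold, `.stub` naming, and plain-text stub format are all invented for the example.

```python
import shutil
import time
from pathlib import Path

def archive_old_files(live_dir, archive_dir, max_age_days):
    """Move files not modified within max_age_days into archive_dir,
    leaving a small '<name>.stub' reference file in their place."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    cutoff = time.time() - max_age_days * 86400
    archived = []
    for f in sorted(Path(live_dir).iterdir()):
        if f.is_file() and not f.name.endswith(".stub") and f.stat().st_mtime < cutoff:
            target = archive_dir / f.name
            shutil.move(str(f), str(target))                # exchange the original...
            Path(str(f) + ".stub").write_text(str(target))  # ...for a reference file
            archived.append(f.name)
    return archived

def restore_file(stub_path):
    """Restore an archived file on request, using the path stored in its stub."""
    stub = Path(stub_path)
    target = Path(stub.read_text())
    original = Path(str(stub).removesuffix(".stub"))
    shutil.move(str(target), str(original))
    stub.unlink()
    return original
```

A production system would add indexing and retention metadata on top; the core rule engine (predicate over file attributes, swap for a reference, restore on demand) is what the two functions above show.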
PressMart's digital library management and archiving software converts the historic archives of newspapers, libraries, magazines, journals, and catalogs into digital format. PressMart's magazine and newspaper publishing software is among the industry's most advanced online publishing platforms; its e-magazines and e-papers are used in 21 countries, including the US, UK, India, Spain, and Germany. For more information, visit http://www.pressmart.net/
Beginning an Imaging Program: Achieving Success and Avoiding the Pitfalls – A... (Raymond Cunningham)
This document summarizes key points about beginning an imaging program, including defining the audience and purpose, choosing a small pilot project, gaining support, and addressing technical considerations like scanning resolution, storage, indexing, and distribution methods. It emphasizes starting small, focusing on usability, and avoiding over-engineering or indexing more than necessary.
Document Technologies Inc. is the largest independent provider of litigation support services, assisting clients throughout the legal process. They have over 1,500 employees across 20 offices and offer end-to-end solutions including data collection, processing, review and production. Their Washington D.C. office provides local document review facilities, scanning, and computer forensics services.
Document Technologies Inc. is the largest independent provider of litigation support services, assisting clients throughout the legal process. They have a national presence with over 1,500 employees across 20 markets. As an end-to-end provider, they offer services including data preservation, review, processing and production. They utilize best-of-breed technologies and have facilities in DC and Atlanta capable of large-scale electronic discovery.
Research Data (and Software) Management at Imperial: (Everything you need to ... (Sarah Anna Stewart)
A presentation on research data management tools, workflows and best practices at Imperial College London with a focus on software management. Presented at the 2017 session of the HPC Summer School (Dept. of Computing).
Public cloud storage might look cost-effective at first glance, but AWS, Azure, and Google Cloud will saddle you with egress charges for every file you pull out of the cloud, and these add up quickly. So how can you predict your real cloud storage TCO?
Cloud content migration strategies frequently overlook file access performance and storage costs. In this session, we will explore how to:
• Identify hidden dangers in cloud content storage that are quietly taxing IT budgets
• Build specific strategies to help you better forecast your cloud storage investment
• Detect cloud cost drivers in your own systems
• Protect your organization from runaway cloud costs – Before it’s too late!
If you are responsible for cost containment, records/document/archive/content management, or even developing your own in-house applications that require document capture, optimized compression, archiving of documents, this session is for you.
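A back-of-the-envelope model makes the egress effect concrete. The rates below are illustrative placeholders, not actual AWS, Azure, or Google Cloud prices; the point is how quickly read-back traffic can rival the storage line item itself.

```python
def monthly_cloud_cost(stored_gb, egress_gb,
                       storage_rate=0.023,   # $/GB-month, placeholder rate
                       egress_rate=0.09):    # $/GB retrieved, placeholder rate
    """Estimate one month's storage-plus-egress bill."""
    return stored_gb * storage_rate + egress_gb * egress_rate

# A 10 TB archive: storage alone vs. reading the whole set back once a quarter.
storage_only = monthly_cloud_cost(10_000, 0)
with_reads = monthly_cloud_cost(10_000, 10_000 / 3)  # egress amortized monthly
```

With these placeholder rates, quarterly full retrieval adds more than double the storage-only bill, which is exactly the kind of hidden cost driver the session description warns about.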
Digitization is revolutionizing library management by increasing access and preserving fragile materials. The document discusses best practices for digitization including choosing materials, file formats, metadata standards, copyright issues, outsourcing options, and long-term digital preservation. It also provides examples of the Memorial's digitization of WW1 records and considerations for developing an enterprise content management system.
The document discusses a new digital forensic data capture device called the Forensic Dossier launched by Logicube. The Dossier allows investigators to capture data from suspect drives at speeds of up to 6GB per minute. It supports capturing from RAID drives and various flash media. The Dossier features built-in support for many drive types and connections. It includes advanced authentication and other forensic features. The Dossier will be showcased at the 2009 International CES conference in Las Vegas.
Cost, Risk, Loss and other fun things (PrestoCentre)
Presentation given by Matthew Addis (IT Innovation Centre) of the PrestoPRIME project at the Screening the Future conference, March 14-15, at the Netherlands Institute for Sound and Vision in Hilversum.
IIA Conference 2017 - Edmonton, AB - Paperless Government (Bruce Covington)
If you missed our presentation at IIA on September 26, we've got you covered.
Presented by:
Hassan Qureshi, Partner, MNP LLP
Stephanie Armstrong, Senior Consultant, MNP LLP
Bruce Covington, Director, Document Imaging Services, PSPC
The document discusses digitizing legacy materials for online courses. Legacy materials may include older formats like video tapes, floppy disks, and microfiche. The best approaches come from museums, libraries, and universities that have dealt with file size constraints and conversion costs. Factors to consider for digitization include costs, intended use of materials, condition of originals, and future migration needs. Different materials like paper, audio, and video have different complexity levels. Formats are chosen based on intended preservation, use, access, or commerce objectives.
This presentation looks at what organizations on the path to paperlite need to do in the planning stage to ensure they reap the rewards of document imaging.
Learn about batch document processing and the technologies used such as barcode recognition, content mining, OCR and more for unattended, automated processing. See how index data can be captured, files can be split, named, routed, cleaned, converted and more with little to no user action to save you money and time.
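The split-and-name step can be illustrated with a toy version: scanned pages arrive as a flat stream, and a barcode separator sheet both splits the batch and supplies the index value used to name the document that follows. The dict-based page representation here is a stand-in for real scanner output, not any particular product's API.

```python
def split_batch(pages):
    """Split a flat page stream into named documents.
    A page dict with a 'barcode' key acts as a separator sheet whose
    value names the next document; other pages carry 'image' content."""
    documents = {}
    current_name = None
    for page in pages:
        if "barcode" in page:              # separator sheet: start a new document
            current_name = page["barcode"]
            documents[current_name] = []
        elif current_name is not None:     # content page: append to current doc
            documents[current_name].append(page["image"])
    return documents

batch = [
    {"barcode": "INV-001"}, {"image": "p1"}, {"image": "p2"},
    {"barcode": "INV-002"}, {"image": "p3"},
]
# split_batch(batch) -> {"INV-001": ["p1", "p2"], "INV-002": ["p3"]}
```

Real capture systems layer OCR, routing, cleanup, and format conversion on top, but the unattended splitting-and-naming logic reduces to this kind of pass over the page stream.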
Slides from a half day workshop that I gave a couple of times in 2009. Better late than never I suppose. You need to read my blog post here: http://frommelbin.blogspot.com/2010/09/some-old-news-about-digitisation.html for an explanation about some slides and for references.
This document summarizes the pros, cons, and ethical considerations of insourcing versus outsourcing e-discovery services. It discusses the risks of insourcing such as bandwidth limitations and lack of technical expertise. It also reviews factors to consider like case size and ability to evaluate changing technology. The document provides examples of the costs of document review using different platforms and methods. It highlights challenges like pricing models and keeping up with changing technology. Best practices for supervising outsourced work are also outlined.
This document provides best practices for digitizing collections. It discusses key questions to consider for a digitization project, the pros and cons of in-house vs outsourced digitization, documentation standards, staffing needs, costs, scanner types, file formats, naming conventions, and storage recommendations. The overall guidelines are to digitize at high resolution from original sources, create master files and derivatives for access, use open standards, and fully document the project for long-term preservation and usability of the digital files.
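Two of these guidelines, fixity documentation and a consistent naming convention for masters and derivatives, can be sketched with the standard library. The `_master`/`_access` naming pattern below is an invented example, not a cited standard; institutions define their own conventions.

```python
import hashlib

def fixity(data: bytes) -> str:
    """SHA-256 checksum, recorded at creation and re-verified over time
    to prove the digital file has not silently changed."""
    return hashlib.sha256(data).hexdigest()

def derivative_name(master_name: str, role: str = "access") -> str:
    """Derive e.g. 'coll001_0042_access.jpg' from 'coll001_0042_master.tif'
    under an assumed institutional naming convention."""
    stem = master_name.rsplit("_master.", 1)[0]
    ext = "tif" if role == "master" else "jpg"
    return f"{stem}_{role}.{ext}"

# A minimal project manifest pairing each master with its checksum:
manifest = {"coll001_0042_master.tif": fixity(b"...raw scan bytes...")}
```

Keeping such a manifest alongside the files is what makes later audits and migrations possible: any stored checksum that no longer matches flags a corrupted or altered file.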
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf (Chart Kalyan)
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
Driving Business Innovation: Latest Generative AI Advancements & Success Story (Safe Software)
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
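Whatever model produces the markup, its output should be checked for well-formedness before it enters a pipeline; a minimal guard with the standard library looks like this (the sample markup is invented for illustration):

```python
import xml.etree.ElementTree as ET

def check_well_formed(xml_text: str):
    """Return (True, root tag) if the text parses as XML,
    else (False, parser error message)."""
    try:
        root = ET.fromstring(xml_text)
        return True, root.tag
    except ET.ParseError as err:
        return False, str(err)

ok, tag = check_well_formed("<article><title>AI and XML</title></article>")
bad, msg = check_well_formed("<article><title>unclosed</article>")
```

Well-formedness is only the first gate; validating AI-generated content against an XSD or Schematron schema, as the presentation goes on to discuss, catches the subtler structural mistakes a parser alone cannot.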
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
5th LF Energy Power Grid Model Meet-up Slides (DanBrown980551)
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e, located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
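The kind of steady-state analysis such an engine performs can be illustrated with a toy DC power flow on a three-bus network. The reactances and loads below are invented numbers, and a real engine like Power Grid Model handles full AC behavior, losses, and far larger grids; the sketch only shows the core idea of solving for bus angles and reading off line flows.

```python
# Toy DC power flow: bus 1 is the slack bus, buses 2 and 3 carry loads.
x12 = x13 = x23 = 0.1        # line reactances (per unit), invented values
p2, p3 = -0.6, -0.4          # net injections at buses 2 and 3 (loads)

# Reduced susceptance matrix over buses 2 and 3 (slack angle fixed at 0):
b22 = 1 / x12 + 1 / x23
b23 = -1 / x23
b33 = 1 / x13 + 1 / x23

# Solve B * theta = P for the two unknown angles (Cramer's rule, 2x2):
det = b22 * b33 - b23 * b23
theta2 = (b33 * p2 - b23 * p3) / det
theta3 = (b22 * p3 - b23 * p2) / det

# Line flows follow from angle differences across each line:
f12 = (0 - theta2) / x12
f13 = (0 - theta3) / x13
f23 = (theta2 - theta3) / x23
slack = f12 + f13            # generation the slack bus must supply
```

Power balance checks out at every bus: the slack bus supplies exactly the 1.0 p.u. total load, and the flows into each load bus sum to its demand, which is the invariant a what-if analysis (line outage, added load) then perturbs.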
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many of those features trade security for convenience and capability. This best practices guide outlines steps users can take to better protect personal devices and information.
What do a Lego brick and the XZ backdoor have in common? (Speck&Tech)
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might seem to have in common only that both are building blocks, or dependencies, of creative and software projects. In reality, a Lego brick and the case of the XZ backdoor share much more than that.
Join the presentation to immerse yourself in a story of interoperability, standards, and open formats, and then discuss the important role contributors play in a sustainable open source community.
BIO: An advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in several LibreOffice-related events, migrations, and training activities. She previously worked on LibreOffice migrations and training courses for various public administrations and private organizations. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager, and when not pursuing her passion for computers and for Geeko she cultivates her curiosity about astronomy (the source of her nickname, deneb_alpha).
UiPath Test Automation using UiPath Test Suite series, part 6 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series, part 6. In this session, we will cover test automation with generative AI and OpenAI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI test automation with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and OpenAI
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Building Production Ready Search Pipelines with Spark and Milvus (Zilliz)
Spark is a widely used ETL tool for processing, indexing, and ingesting data into a serving stack for search. Milvus is a production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data, extract vector representations, and push the vectors to the Milvus vector database for search serving.
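The serving side of such a pipeline reduces to nearest-neighbor search over embeddings. The pure-Python cosine top-k below is a stand-in for what a vector database does at scale (minus its approximate indexing); the two-dimensional vectors and document IDs are invented for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(query, vectors, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = sorted(vectors.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

index = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0], "doc_c": [0.7, 0.7]}
# top_k([1.0, 0.1], index, k=2) -> ["doc_a", "doc_c"]
```

In the talk's architecture, Spark would populate `index` at ingest time with model-produced embeddings, and Milvus would replace the brute-force sort with an approximate index so the lookup stays fast at millions of vectors.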
Taking AI to the Next Level in Manufacturing.pdf (ssuserfac0301)
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
5. Ideas and approaches to help build your organization's AI strategy.
Unlock the Future of Search with MongoDB Atlas: Vector Search Unleashed.pdf (Malak Abu Hammad)
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
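For reference, an Atlas `$vectorSearch` aggregation stage takes roughly the following shape; the index name, field path, and query vector here are placeholders, and the exact options should be confirmed against the current Atlas documentation for your driver version.

```python
# Shape of a MongoDB Atlas $vectorSearch aggregation pipeline; index name,
# path, and query vector are placeholders for illustration.
pipeline = [
    {
        "$vectorSearch": {
            "index": "embedding_index",          # Atlas vector index name (assumed)
            "path": "embedding",                 # field holding stored vectors (assumed)
            "queryVector": [0.12, -0.03, 0.58],  # embedding of the user's query
            "numCandidates": 100,                # ANN candidates to consider
            "limit": 5,                          # results to return
        }
    },
    # Surface the similarity score alongside selected fields:
    {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
]
# Typically executed as: collection.aggregate(pipeline)
```

The `numCandidates`/`limit` split is the relevance-versus-latency knob: considering more candidates improves recall of the approximate search at the cost of extra work per query.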
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Skybuffer SAM4U tool for SAP license adoption (Tatiana Kojar)
Manage and optimize your license adoption and consumption with SAM4U, a free SAP software asset management tool for customers.
SAM4U delivers a detailed and well-structured overview of license inventory and usage through a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring a fixed Total Cost of Ownership (TCO) and exceptional service through the SAP Fiori interface.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU und die Lizenzen nach dem CCB- und CCX-Modell sind für viele in der HCL-Community seit letztem Jahr ein heißes Thema. Als Notes- oder Domino-Kunde haben Sie vielleicht mit unerwartet hohen Benutzerzahlen und Lizenzgebühren zu kämpfen. Sie fragen sich vielleicht, wie diese neue Art der Lizenzierung funktioniert und welchen Nutzen sie Ihnen bringt. Vor allem wollen Sie sicherlich Ihr Budget einhalten und Kosten sparen, wo immer möglich. Das verstehen wir und wir möchten Ihnen dabei helfen!
Wir erklären Ihnen, wie Sie häufige Konfigurationsprobleme lösen können, die dazu führen können, dass mehr Benutzer gezählt werden als nötig, und wie Sie überflüssige oder ungenutzte Konten identifizieren und entfernen können, um Geld zu sparen. Es gibt auch einige Ansätze, die zu unnötigen Ausgaben führen können, z. B. wenn ein Personendokument anstelle eines Mail-Ins für geteilte Mailboxen verwendet wird. Wir zeigen Ihnen solche Fälle und deren Lösungen. Und natürlich erklären wir Ihnen das neue Lizenzmodell.
Nehmen Sie an diesem Webinar teil, bei dem HCL-Ambassador Marc Thomas und Gastredner Franz Walder Ihnen diese neue Welt näherbringen. Es vermittelt Ihnen die Tools und das Know-how, um den Überblick zu bewahren. Sie werden in der Lage sein, Ihre Kosten durch eine optimierte Domino-Konfiguration zu reduzieren und auch in Zukunft gering zu halten.
Diese Themen werden behandelt
- Reduzierung der Lizenzkosten durch Auffinden und Beheben von Fehlkonfigurationen und überflüssigen Konten
- Wie funktionieren CCB- und CCX-Lizenzen wirklich?
- Verstehen des DLAU-Tools und wie man es am besten nutzt
- Tipps für häufige Problembereiche, wie z. B. Team-Postfächer, Funktions-/Testbenutzer usw.
- Praxisbeispiele und Best Practices zum sofortigen Umsetzen
12. Q: How long does it take to digitize everything? 1 metre = 7,000 scans; production = 10,000 scans per week. A: 431 years. Q: How many scans does digitization of 32 kilometres of archive yield? A: 224,000,000 scans
13. The number of documents to be digitized in an archival project quickly runs into the hundreds of thousands to millions. One-off and structural costs must remain manageable even at these enormous volumes. One-off and structural costs depend on: A. Technical aspects: quality standard for the scans / file size B. Work processes: organization of the reproduction process
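The arithmetic behind these slide figures can be checked with a quick calculation (assuming 52 production weeks per year):

```python
total_scans = 32_000 * 7_000   # 32 km of shelving x 7,000 scans per metre
weeks = total_scans / 10_000   # production rate: 10,000 scans per week
years = weeks / 52

print(f"{total_scans:,} scans")   # 224,000,000 scans
print(round(years), "years")      # 431 years
```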
16. We Scan: principles and image quality. We Store: compression and file size. We Do: workflow, tools and practical issues.
17. We scan — digitization at the Amsterdam City Archives in general. Goals of digitization projects vary from access to substitution of the originals. In every project the quality standard and method are set, depending on purpose and type of material. We always work on a project basis, and for all projects we have one workflow.
18. We scan 1. At large scale. The more scans being made, the lower the price per scan. Large-scale production is a prerequisite in order to keep production costs as low as possible.
19. We scan 3. A broad spectrum of document types. Documents digitized in this reproduction process can take the following forms: small and large sizes; bound and loose-leaf entities; card indexes; old and modern material; low- and high-contrast documents; text alone, and text and image together; hybrid forms.
20. We scan 4. For archival research from screen or print. Costs for producing and storing scans are determined to a high extent by the quality standard set for the scans: the higher the standard of quality, the higher the costs will be. The purpose of the scans is archival research using the web, straight from screen or print. In order to keep costs low it is prudent to let the standard of quality follow from the requirements the end user places on the scans: textual information legible in the originals must be legible in the scans.
21. We scan 4. For archival research from screen or print. Specified (basic) quality standard: reproduction of all significant information; reproduction of details that are not part of the textual information is not required. A quality higher than that inevitably pushes up both incidental and structural costs, but has no added value for the customer at all.
22. We scan — scan quality and legibility. Example: a very "light" original. The high-quality scan has an optimal tonal range and excellent flexibility, but poor legibility; the modified scan (contrast-enhanced) has excellent legibility, but a poor tonal range and little flexibility. Which one would you buy? Practical experience teaches that what is perceived as "good legibility" is very personal. We decided to solve this problem with a smart filter in the document viewer.
23. We scan 4. For archival research from screen or print. Skimping on the quality of scans (it could be better) is purely an economic decision, not one taken on principle. It does make sense to let the standard of quality follow from the purpose the end user has for the scans. Price comparison of scanning costs (rates of an external partner): legibility, auto-feed: $0.05; legibility: $0.30 – 0.75; high-end: $3 – 10.
24. We scan 5. For conservation and security. The scans in the scanning-on-request service are made for the purpose of access / archival research, not as a substitute for the originals. Conservation of the originals remains the major concern. Nevertheless, digitization does have a real conservation function: after digitization the originals can no longer be requested in the reading room, so damage or loss of the originals is ruled out.
25. We scan 6. Always complete files. A file can contain one to hundreds of documents. By definition the entire file is scanned, never just a selection of pages. There are a few reasons for this: the costs for scanning are not so much a factor of quantity, but rather of the manual processing involved; it would have to be indicated in the originals or the metadata which documents have been digitized; when shown in the Archiefbank, the user expects completeness; and when non-scanned pages have to be digitized later, the entire preparation process has to be gone through once again.
26. We scan 7. Contracting out the scanning to external partners. Contracting out the scanning was a logical choice. The in-house scan facilities are not designed for large-scale digitizing, and the complexity of the workflow and of the material to be scanned calls for specialized hardware and software, specialized set-ups, knowledge, and a very complex technical infrastructure. Investing in these only makes sense at very high production volumes, organized on a large scale.
27. We scan 7. Contracting out the scanning to external partners. Contracting out scanning is more than awarding a contract to a supplier. There are many scanning companies, and most do have experience in bulk processing, but not at this degree of complexity and diversity. This calls for intensive collaboration; the workflows of archive and digitizer also have to dovetail.
28. We scan — low costs. Customers think a low price is important, and archival research easily runs into the use of dozens to hundreds of documents. This means that costs for producing and storing scans have to be as low as possible. The price of an ordinary copy in our reading room should be the benchmark: 100 scans should not cost $100. The costs of purchasing scans online should be competitive with the travel costs of visiting our reading room.
29. We store — scans with a file size as small as possible. Storage costs are still considerable when producing large quantities of scans. In order to bring structural costs down, the file size of the scans has to be as small as possible. This can be achieved in three ways: 1. skimping on resolution; 2. skimping on bit depth / number of colors (only possible in formats like TIFF and PNG); 3. using (lossless or lossy) compression on the files. We use a combination of 1 and 3.
30. Resolution. The finer the grid used in scanning, the more information and the higher the level of detail. But whatever the resolution, a scan remains a strong simplification of reality: at some level of detail the individual "grid cells" will always become visible. For text documents the grid must be fine enough to match the details of the textual information: the dot on an "i" must still be distinguishable as such. Details in the structure of the paper, for example, need not be visible in the scan.
31. Resolution. Resolution is usually expressed in DPI (dots per inch), or, more accurately, PPI (pixels per inch). DPI thus says something about the information density per unit of length, and thereby about the theoretically achievable quality, but nothing at all about the objective quality of a scan. Both a €50 scanner from Aldi and a €50,000 high-end scanner can scan at 300 dpi, but the quality of the produced scans will clearly differ. The actual detail resolution of a scanner can be measured with test charts on which so-called line pairs are counted.
32. Resolution. The benchmark resolution is usually 300 dpi. This is based on the smallest letter "e" (1 mm) in print; not all documents contain details that small. The required resolution can be calculated with, among other methods, the so-called Quality Index: http://www.library.cornell.edu/preservation/tutorial/conversion/conversion-04.html
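As a sketch of the Quality Index idea behind the linked Cornell tutorial: for bitonal scanning the index is commonly given as QI = (dpi × 0.039h) / 3, where h is the height in millimetres of the smallest significant character and 0.039 converts millimetres to inches. Treat the exact constant and the quality thresholds as assumptions to verify against the tutorial.

```python
def quality_index(dpi: float, char_height_mm: float) -> float:
    """Quality Index for bitonal scanning; 0.039 converts mm to inches."""
    return dpi * 0.039 * char_height_mm / 3

# The benchmark case: a 1 mm letter "e" scanned at 300 dpi
print(round(quality_index(300, 1.0), 1))  # 3.9
```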
33. Resolution strongly determines file size. Resolution (A4, uncompressed): 300 dpi — 24 MB; 400 dpi — 44 MB; 800 dpi — 177 MB; 1600 dpi — 708 MB; 3200 dpi — 2.8 GB
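These file sizes follow directly from pixel dimensions and bit depth. A sketch for an uncompressed 24-bit A4 page (8.27 × 11.69 inches; sizes in binary megabytes, which reproduces the table values closely):

```python
def uncompressed_size_mb(dpi: int, bits_per_pixel: int = 24,
                         width_in: float = 8.27, height_in: float = 11.69) -> float:
    """Uncompressed size in MB (2**20 bytes) of a scanned page."""
    pixels = (width_in * dpi) * (height_in * dpi)
    return pixels * bits_per_pixel / 8 / 2**20

for dpi in (300, 400, 800, 1600, 3200):
    print(dpi, "dpi:", round(uncompressed_size_mb(dpi)), "MB")
# 400 dpi gives ~44 MB and 1600 dpi ~708 MB, matching the table
```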
35. Resolution — conclusion: at 150 dpi, files are small and most text is still perfectly legible. But is it wise to take this as the basis for digitization? A low resolution also means lower structural management costs, and in a few years the material could perhaps be rescanned with better technology. It is not sufficient, however, if in the future we want to deliver higher quality from these images, apply OCR, and/or convert to better compression and file formats. The choice depends on goals, resources, and volumes.
36. Color — color depth: bits and bytes. A pixel is a cell with a single color. The smallest unit of a digital file is a bit, which has the value 0 or 1. When a pixel consists of 1 bit, it can be black (0) or white (1). To define more colors per pixel, we must increase the number of bits per pixel. With 8 bits (each of which can take the value 0 or 1), 256 combinations, and thus 256 colors, are possible (for example 0 0 0 1 0 0 1 1). Most cameras use 8 bits per color channel (24 bits in total), which allows 16.7 million colors.
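The relation between bit depth and the number of representable colors is simply 2 raised to the number of bits per pixel:

```python
colors = {bits: 2 ** bits for bits in (1, 8, 24)}

print(colors[1])    # 2: black or white
print(colors[8])    # 256 colors
print(colors[24])   # 16777216 (~16.7 million; 8 bits per RGB channel)
```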
38. Compression — a method by which the information can be described more efficiently, so that file size decreases. To store: 48 letters. Peer Spel Spel Spel Spel Peer Peer Spel Spel Spel Peer Peer. Encode the words: P = Peer, S = Spel
39. Compression — result: store 12 letters (plus the coding table): P S S S S P P S S S P P (P = Peer, S = Spel)
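The Peer/Spel example is a simple dictionary (substitution) coding, and it is lossless: with the coding table, the original sequence can be reconstructed exactly. A minimal sketch:

```python
words = "Peer Spel Spel Spel Spel Peer Peer Spel Spel Spel Peer Peer".split()
table = {"Peer": "P", "Spel": "S"}        # coding table, stored with the data

encoded = [table[w] for w in words]       # 12 letters instead of 48
reverse = {v: k for k, v in table.items()}
decoded = [reverse[c] for c in encoded]

print(" ".join(encoded))   # P S S S S P P S S S P P
print(decoded == words)    # True: exactly reversible
```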
40. Compression. Two kinds of compression: A. Lossless (exactly reversible): no information is lost. Compare it to a pillow from which you press all the air before packing it; take the pillow out of the packaging and it becomes exactly the pillow it was before. B. Lossy (not exactly reversible): certain information is thrown away. Again we press the air out of the pillow, but because we want an even smaller package we also remove a few feathers. This need not be a problem, since the loss of a few feathers may not make the pillow any less comfortable in use. But the discarded feathers will not return when the pillow is unpacked.
41. Compression and information loss. A frequently heard claim: do not use lossy compression for storing images, because lossy compression causes information loss. Lossy compression does indeed cause information loss, but that does not by definition mean loss of meaningful information. In any case it is better to speak of loss of information relative to the uncompressed file: scanning itself is, relative to the original, inherently bound up with loss of information, even when lossless compression is applied.
43. Compression and sustainability. A frequently heard claim: compressed files have a greater chance of becoming corrupt than uncompressed files, so data compression should not be applied. Research has shown that this claim is not correct. Another approach to preservation is redundancy in storage, and compressed files lend themselves particularly well to this.
44. We store — resolution, compression and legibility: an example. 300 dpi, high-quality JPEG versus 200 dpi, low-quality JPEG. Scans with a file size as small as possible.
45. We store — scans with a file size as small as possible. Comparison of file format, compression, resolution and file size (500,000 scans; quality on the Photoshop scale):

Format     Compression         Resolution  Color    Avg file size  Total   %
TIFF       none                300 dpi     24 bits  22.1 MB        11 TB   100%
JPEG       lossy, quality 12   300 dpi     24 bits  7.5 MB         3.7 TB  34%
JPEG       lossy, quality 10   300 dpi     24 bits  2.1 MB         1.1 TB  10%
JPEG       lossy, quality 10   400 dpi     24 bits  3.3 MB         1.6 TB  15%
JPEG       lossy, quality 4    200 dpi     24 bits  255 KB         124 GB  1.1%
JPEG2000   lossless, Part 1    300 dpi     24 bits  12 MB          6 TB    55%
JPEG2000   lossy, Part 6       300 dpi     24 bits  120 KB         59 GB   0.5%
46. – 49. The same comparison table, highlighting in turn: TIFF uncompressed, JPEG (ps) 10, JPEG (ps) 4, and JPEG2000 lossless.
50. We store — (file) size still does matter! Comparison of storage costs for 500,000 images; avg size per scan uncompressed = 22.1 MB. Price rate: 1 TB of storage in a controlled e-repository environment at two separate locations, including IT costs: $7,000 (NLD, Nov 2009).

File format                   Storage  Costs 1 year  Costs 10 years
TIFF uncompressed             11 TB    $77,000       $770,000
JPEG 10                       1.1 TB   $7,700        $77,000
JPEG 4 (200 dpi)              124 GB   $868          $8,680
JPEG 2000 (Part 1, lossless)  6 TB     $42,000       $420,000
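These cost figures follow directly from the average file size and the quoted $7,000 per TB per year; a sketch (using decimal terabytes, which reproduces the slide's numbers):

```python
N_IMAGES = 500_000
PRICE_PER_TB_YEAR = 7_000   # USD; NLD rate, Nov 2009

def storage_cost(avg_mb: float, years: int = 1) -> float:
    terabytes = avg_mb * N_IMAGES / 1_000_000   # decimal MB -> TB
    return terabytes * PRICE_PER_TB_YEAR * years

print(round(storage_cost(22.1)))       # 77350: TIFF, per year (slide rounds 11.05 TB to 11 TB)
print(round(storage_cost(12.0, 10)))   # 420000: JPEG2000 lossless, over 10 years
```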
51. We Do — developing the reproduction process. Projects with different goals, document types and partners take place at the same time. A streamlined, standardized process is indispensable when digitizing on a large scale. Guidelines and best practices often take no account of these complicating factors and the number of scans to be produced. We developed a process in which large scale and flexibility are the starting points; all digitization projects follow this process.
52. We Do — developing the reproduction process. For all projects, at any moment, it has to be clear: what the current status of each unit to be digitized is; where each unit can be located; and what current and succeeding tasks are to be performed on each unit. This calls for workflow management with a user-friendly application, so we developed a simple but effective workflow application in-house.
53. In the following slides we focus on the weekly production of 10,000 scans in the digitizing-on-request service.
54. We Do — production of 10,000 scans on a weekly basis. 1. Requesting digitization. All public files can be requested for digitization via the finding aids in the Archiefbank, just by clicking the "digitize" button.
55. We Do 2. Providing order numbers. A unit to be digitized must be identifiable at each step of the handling process; the units therefore get a unique, meaningless order number. An order number is provided by the metadata management system and is the basis for: communication with the digitizer; scanning; assigning filenames; registration of filenames; and billing by the digitizer. In practice, all units to be digitized get an order ticket.
57. We Do 3. Collecting the originals. The workflow system generates a list of all originals to collect from the repositories. The list is sorted by repository / shelf to make retrieval efficient.
58. We Do 4. Checking the originals. All collected originals are stored in a special room, in which all checks are executed.
59. We Do 4. Checking the originals. Information about the originals in our management systems is not always complete, so a rough check of the originals takes place. A. Content: copyrights, publicity, privacy. B. Condition of the material: items in such a condition that digitizing or transport could cause damage, or packaged in a way that makes scanning in conventional set-ups impossible, do not qualify for the standard way of digitization. If an item falls into one of these categories, the request is rejected.
61. We Do 4. Checking the originals. Material preparation is kept to a minimum. We do: remove staples as a rule, and have small repairs executed by our restoration employees. We don't: the sequence of the originals as found in the repository is not checked or altered, and the originals are not numbered.
62. We Do Not number the originals. Numbering the originals has one advantage: the completeness of the scans (compared to the originals) can be guaranteed, because a missing number in a sequence of scans leads to the conclusion that one original has not been scanned. But this is only true when the numbering tallies exactly: numbers assigned twice lead to illogical end numbers (100 scans: scan 100 numbered as 99). Practical experiments with numbering showed that faultless numbering cannot be realized.
63. We Do Not number the originals. Securing completeness can be realized by other means: scanning the originals twice (low-quality reference scans and high-quality master files; # scans = 365 in both runs), or comparing scans to originals 1:1 after digitization.
65. We Do 6. / 7. Scanning and assigning filenames. After scanning, the scan operator or data manager has to assign filenames to the scans. It has to be perfectly clear which filenames these should be: filenames are the key between scans and metadata. As a rule filenames contain no meaningful information, because when the meaning changes, the filenames would have to change too.
66. Assigning filenames at the City Archives Amsterdam. A customer request (Archive 195, File 836) becomes order A20758 in the management systems, which produces an order ticket for scanning the order. Filenames are 12 characters long: the first 6 are the order number, the last 6 a serial number, giving the range A20758000001 – A20758999999. The scan report lists the assigned filenames (A20758000001 through A20758000005), and the filenames are registered in the management systems via import.
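The filename scheme (a 6-character order number plus a 6-digit serial, 12 characters in total) can be sketched as follows, using the order number A20758 from the slide:

```python
def filenames(order_number: str, count: int) -> list:
    """Meaningless, unique filenames: order number + zero-padded serial."""
    return [f"{order_number}{i:06d}" for i in range(1, count + 1)]

names = filenames("A20758", 5)
print(names[0], names[-1])            # A20758000001 A20758000005
assert all(len(n) == 12 for n in names)
```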
67. We Do 10. / 11. Checking scans and metadata. Scans and metadata are checked efficiently; where possible, checks are automated. An application from which all checks can be executed is in development. Basic checks (check — method): viruses — virus checker; data integrity — MD5 checksum comparison; file format validity — JHOVE; quality of scans — visual check of production scans and reference scans (depends on project); filenames and completeness — script.
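The automated data-integrity check (MD5 checksum comparison) can be sketched with Python's standard library; the sample bytes below are hypothetical stand-ins for a scan file:

```python
import hashlib

def intact(data: bytes, reported_md5: str) -> bool:
    """True when the checksum computed on arrival matches the one reported."""
    return hashlib.md5(data).hexdigest() == reported_md5

original = b"scan A20758000001 image bytes"    # stand-in for a scan file
checksum = hashlib.md5(original).hexdigest()   # e.g. from the digitizer's manifest

print(intact(original, checksum))              # True
print(intact(original + b"\x00", checksum))    # False: damaged in transfer
```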
68. We Do 13. / 14. Import of metadata and scans into the management systems. After approval of all checks, scans and metadata are imported into the management systems. The imports are executed automatically, on the basis of scripts and standard protocols for file transfer. After import, the "order for digitization" of each unit is completed.
69. We Do 18. Import metadata into the website. After import, the metadata are optimized for the search system. For exchange of finding aids we use EAD. The website is hosted at an external location; metadata are uploaded to the webserver by simple HTTP transfer, from any workstation at the archive, directly via the CMS of the website.
70. We Do 17. Import scans into the website — transport medium. Bandwidth of the internet connections at the archive is still too small for direct sFTP (or similar) upload of large quantities of scans to the webserver. It seems likely that this will change in the near future; until then, scans are transported on portable USB hard disks.
71. We Do 17. Import scans into the website — import. After connecting the hard disk to the server the import process starts. Some basic checks are executed on the scans, and derivatives for thumbnails and the zoom / contrast functionality are made.
72. We Do — request completed. The happy customer: when both scans and metadata have been imported, an email is automatically sent to the requester of the digitization. This email contains a link to the finding aid and thumbnails on the website. The requester can then decide whether or not to buy scans.
73. MARAC Conference, October 30, 2009 — request complete!
74. Costs and income, Archiefbank (2008). Costs: digitization projects € 200,000; webservices € 52,000; digitization on request € 140,000. Income: government € 40,000; project funding € 330,350; digitization on request € 100,000.
Editor's Notes
I will take you a step deeper into the work process of creating large amounts of scans. I'll tell you about the starting points and choices we have made, and I'll show you the results of some research we have done, particularly on image quality and file size. I'll also show you some back- and front-office tools from our website.