For Essay #3, please write about from either
Chronicle of a Death Foretold
Topic: Discuss animal imagery in Chronicle of a
Death Foretold. Concentrate on Santiago in
particular. Note his dream of birds, instances in
which Santiago is compared to animals, the butterfly
analogy in the discussion of Angela's accusation.
What part do Santiago Nasar's dogs play in the
novel?
1. Word Count 750
2. Double Spaced, Times New Roman 12pt.
3. At least 6 quotes from the book
4. Include Works Cited page
5. Turnitin is enabled
Remember, use a strong thesis statement toward
the end of your first paragraph and supportive
arguments to prove that statement throughout your
paper. Topic sentences in each paragraph should
be indicative of what you will discuss in that
paragraph. Make your conclusion soundly wrap up
your arguments about your thesis statement.
IT 380
Electronic Document and
Record Management
Systems
Unit 7 Metadata, Classification, and
Converting Manual Records
Instructor: Dr. Michelle Liu
METADATA
Topics
▪ Metadata concepts and standards
▪ Sources of metadata
▪ Applying metadata to records
▪ Automated metadata collection
Definition of Metadata
“Data describing context, content and structure of
records and their management through time.”
-Source: ISO15489
“Initially, metadata defines the record at its point of
capture, fixing the record into its business context and
establishing management control over it.”
-Source: ISO 23081
“Structured information that describes, explains,
locates, or otherwise makes it easier to retrieve, use,
or manage an information resource. Metadata is often
called data about data or information about
information.”
-Source: NISO (National Information Standards Organization)
Metadata
Metadata will have “properties”
These describe the characteristics and rules
Where do you encounter metadata
every day?
Metadata
Where else do you use metadata
every day?
What is Metadata?
▪ Provides information about a document or record
▪ Machine understandable
▪ Has well defined semantics and structure
▪ May include descriptive information about the context,
quality, condition, or characteristics of the data
▪ Metadata provides context for data and is used to
facilitate understanding of the data, its characteristics,
and its usage
▪ All the objects managed by an ERM system have
metadata, including volumes, files, and classes.
▪ Also referred to as Tags or Attributes
▪ Adding consistent tags to an unstructured piece of
data gives it context
What is Metadata?, cont’d
▪ For a document or record metadata could be
its author, its title, the issue date, and other
information which can usefully be associated
with it
▪ Defined in terms of units called elements,
fields, index fields, or profile fields.
▪ Some fields also support sub-elements or
attributes to differentiate different types of
similar fields.
▪ E.g., Date_paid, Date_enrolled,
Date_graduated
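The idea of sub-elements that tell similar fields apart can be sketched in Python. This is only an illustration, not a prescribed schema: the `MetadataField` class and the dates are hypothetical, and only the names Date_paid, Date_enrolled, and Date_graduated come from the slide.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MetadataField:
    """One metadata element, narrowed by a sub-element (hypothetical class)."""
    element: str      # base element, e.g. "Date"
    refinement: str   # sub-element that tells similar fields apart
    value: date

    @property
    def name(self) -> str:
        # Joins element and refinement the way the slide writes them: Date_paid
        return f"{self.element}_{self.refinement}"


# Made-up values for one student record.
fields = [
    MetadataField("Date", "enrolled", date(2020, 9, 1)),
    MetadataField("Date", "paid", date(2020, 9, 15)),
    MetadataField("Date", "graduated", date(2024, 5, 20)),
]

# Index by full name so each refined date can be looked up unambiguously.
by_name = {f.name: f.value for f in fields}
```

All three fields share the same base element, so a search on "Date" can still find every one of them, while the refinement keeps their meanings separate.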
Why Metadata?
▪ Supporting efficient retrieval
▪ Providing logical links between records and the
context of their creation, and maintaining them
in a structured, reliable and meaningful way
▪ Supporting the identification of the technological
environment used to create the record and
needed to access it
▪ Supporting efficient and successful migration of
records from one environment or computer
platform to another or any other preservation
strategy.
Why Metadata?
▪ Protecting records as evidence and ensuring
their accessibility and usability through time
▪ Facilitating the ability to understand records
▪ Supporting and ensuring the evidential value
of records
▪ Helping to ensure the authenticity, reliability,
and integrity of records
▪ Supporting and managing access, privacy,
and rights
Purpose of Metadata
▪ Identification / Distinction
[Title, Date, Publisher, Version, Type, etc.]
▪ Search / Retrieve / Browse
[Country, Region, Subject, Sector, Theme, Topic, etc.]
▪ Use Management
[Authorized By, Rights Management, Access Rights,
Location, Disclosure Status, etc.]
▪ Compliant Document Management
[Record Identifier, Retention Schedule, Relation, Disposal
Status, etc.]
Types of Metadata
Source: ISO 23081-1
Image source: http://archives.govt.nz/advice/continuum-resource-kit/continuum-publications-html/g14-technical-guide-implementing-recordkee
Metadata about the Record
▪ Provides minimal, but essential information
about an item or record
▪ May come directly from the file itself
▪ Examples: Author, Contributor, Creator,
Date, Identifier, Status, Title, or Document
Type.
Metadata about business rules,
policies, and mandates
▪ For “discovery” and context
▪ Making future search and retrieval
operations more effective
▪ More precise and targeted searches against
a range of criteria
▪ Facilitates collection of items into “virtual
files” that satisfy certain criteria
▪ Examples: Audience, Accessibility,
Addressee, Coverage, Description,
Language, Location, Publisher, Relation, or
Source
Metadata about Agents or Users
▪ Who created the record
▪ Who indexed it
▪ Who has accessed it to perform record-related
functions
▪ Who has extracted it
▪ Examples: Indexer, Scanner, User, Manager,
Owner, or Reviewer
Administrative Metadata
▪ About business activities or processes
▪ Provide general information on how to
manage a resource
▪ Often includes technical metadata intrinsic to
the file or its location
▪ Examples: File Type, File Date,
Compression Type, or Access Level
Metadata about Records
Management Activities
▪ Subset of administrative metadata that is
more specifically designed to support
records management activities
▪ Usually of most relevance to an
organization’s record managers.
▪ Actions such as changing metadata entries
for security, disposal and preservation are
the domain of records administration
▪ Examples: Aggregation, Digital signature,
Disposal, Mandate, Preservation, Rights
Business-specific Metadata
▪ It may be necessary to add additional
elements or refinements to meet particular
business needs.
▪ Any new elements need to be carefully
defined and specified so that the user
community understands their purpose
▪ Constrain the choice of values to a ‘pick-list’
or pre-determined encoding scheme for the
information to be held, to ensure consistency
of use.
Examples of Business-Specific
Metadata
▪ Case 1: A business organization determines
that existing date-related metadata is
confusing because all fields use the same
date. It decides to create and use refined
elements including “date.invoicepaid”,
“date.billed”, and “date.projcomplete”.
▪ Case 2: A university decides to introduce a
Student element to differentiate records of
staff from student records.
Mandatory vs. Optional Metadata
▪ Trade-off between too many mandatory
elements and too few
▪ Why?
▪ Records Management staff often want as
much mandatory metadata as possible to
achieve a metadata-rich repository.
▪ At odds with the user community, who are often
not prepared to spend the time and effort
entering it!
▪ A sensible compromise is necessary, as well
as the use of automation techniques
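One way to picture the compromise is a profile in which a small set of elements is mandatory for the user and the rest are filled in by the system. The sketch below is hypothetical throughout: the element names in `MANDATORY` and `AUTOMATED` and the `complete_profile` function are assumptions made for illustration, not part of any particular ERM product.

```python
from datetime import date

# Hypothetical profile: which elements users must supply and which the
# system can fill in on its own.
MANDATORY = {"Title", "Document Type"}
AUTOMATED = {"Date": lambda: date.today().isoformat()}


def complete_profile(user_entries: dict) -> dict:
    """Fill automated elements, then report any mandatory element still missing."""
    profile = dict(user_entries)
    for element, supplier in AUTOMATED.items():
        # Only fill the element if the user did not already provide it.
        profile.setdefault(element, supplier())
    missing = sorted(e for e in MANDATORY if not profile.get(e))
    if missing:
        raise ValueError(f"Missing mandatory metadata: {', '.join(missing)}")
    return profile
```

Keeping `MANDATORY` short and moving as much as possible into `AUTOMATED` is the trade-off the slide describes: the repository stays rich in metadata without asking users to type all of it.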
Metadata for Physical Objects
▪ Metadata entries in an ERM system can be
made not only for electronic records, but also
for physical objects
▪ A library of audio tapes, CD-ROMs containing
experimental data, or other collections of
physical objects
▪ Physical objects need a Location metadata
element if the objects can be moved about.
▪ Record-keeping information and actions can
apply to physical objects just as with
electronic items
▪ E.g., Retention periods and disposal
instructions
Sources of Metadata
▪ The record itself can provide much
metadata
▪ When the record is first created, it will have
its own intrinsic metadata relating to the
digital file itself: file date, file size, and so
forth.
▪ The record’s content can also provide
metadata
▪ A Word document will often have a title, author,
etc.
Sources of Metadata
▪ Document metadata/properties:
▪ Most Office Automation (OA)
software packages can associate
metadata with documents as they
are created or received.
▪ Example: MS Office permits the
entry of document “Properties”
Sources of Metadata
▪ Manual data entry
▪ Users are expected or required to enter
information into profile fields
▪ Profile fields could be in the document
properties themselves
▪ They could be in an ERMS, ECMS, or
imaging application.
▪ Data entry staff vs. all users
How metadata works for an
Individual Record
▪ Suppose you work as an intern at a local
IT company. You need to create a Word file
to document the details of a contract. When
you first create the contract document, it
has some intrinsic metadata:
▪ Title
▪ File format
▪ File size
▪ Date of creation
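The intrinsic metadata listed above can be read from the file system without any user input. The following is a minimal sketch using only the Python standard library; the function name `intrinsic_metadata` and the dictionary keys are hypothetical. Because a true creation time is not available on every platform, the sketch reports the last-modified time as an approximation.

```python
from datetime import datetime, timezone
from pathlib import Path


def intrinsic_metadata(path: Path) -> dict:
    """Collect metadata the file system already holds for a file."""
    info = path.stat()
    return {
        "title": path.stem,  # file name without its extension
        "file_format": path.suffix.lstrip(".").lower(),
        "file_size_bytes": info.st_size,
        # st_mtime is the last-modified time; it stands in for the creation
        # date here, which not every platform records.
        "date_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }
```

For example, calling `intrinsic_metadata(Path("Contract.docx"))` on an existing file would return the title "Contract" and the format "docx" along with the size and timestamp.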
Declare a Record
▪ The document is declared as a record later
▪ As part of the process, additional metadata is
added to the record depending on what type
of record it is, whether it is a vital record, its
location, etc.
Perspectives on Metadata
▪ The act of entering metadata values is often
called “Indexing”
▪ Different views and perspectives on records
management metadata
▪ The business perspective: metadata support
business processes;
▪ The records management perspective:
metadata support records management over
time
▪ The user view: metadata enable the retrieval
and support understanding and interpretation of
records.
Manual Metadata Entry
▪ The most obvious method
▪ Users enter metadata for the record: paper
or electronic
▪ Application may prescribe types and/or
values of metadata
▪ Controlled vocabularies
▪ Data masks
▪ Expensive!
Controlled Vocabularies
▪ One of the challenges with metadata entry and
usage:
▪ Ensuring enough freedom to users that the system is
usable
▪ Putting a control framework in place that ensures the
metadata is manageable
▪ Ensure that users use metadata fields and terms
consistently
▪ Controlled vocabularies: supporting tools which
should be based on collections of terms which
users use to describe aspects of a record, other
than its business context.
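The balance between user freedom and control can be sketched as a check applied at data entry. This is an illustration under assumptions: the `VOCABULARY` pick-lists and the `check_entry` function are hypothetical, and a real system would load its approved terms from a managed vocabulary store.

```python
# Hypothetical pick-lists of approved terms for controlled fields.
VOCABULARY = {
    "Document Type": {"Contract", "Invoice", "Report"},
    "Status": {"Draft", "Final"},
}


def check_entry(field: str, value: str) -> str:
    """Return the approved term for a controlled field, or raise if none matches."""
    allowed = VOCABULARY.get(field)
    if allowed is None:
        return value  # free-text field: no control applied
    # Match case-insensitively so users keep some freedom in how they type.
    lookup = {term.lower(): term for term in allowed}
    try:
        return lookup[value.strip().lower()]
    except KeyError:
        raise ValueError(f"{value!r} is not an approved term for {field}") from None
```

Normalizing the entry to the approved spelling, rather than storing what the user typed, is what keeps the terms consistent across records.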
Extraction from the Record
▪ Intrinsic metadata from the properties
▪ Recognition technologies
▪ OCR/IR
▪ Barcode recognition
▪ QR code
▪ Specialized recognition technologies
▪ Recognizing and extracting data from audio,
video, and other rich media types
Metadata from other data sources
▪ The idea is that the data is already stored in
a database somewhere
▪ Stored in logical structure
▪ Normalized and deduplicated
▪ The organization would do well to reuse that
data
▪ Manual or automated
Case Study
▪ An organization scanned project-related paper
documentation including contracts and
invoices.
▪ As part of this process the organization
performed data entry on more than 20 index
fields.
▪ The large number of fields and amount of
scanning being done caused them to look at
ways to automate the metadata capture
process.
▪ They decided on the following approach:
The Solution
The Result
▪ The organization went from approximately a 2%
error and rework rate to less than 0.01%
▪ Resulting in savings of more than $800,000 the
first year.
▪ Not bad for a system that cost less than $75,000
to acquire, integrate, and implement.
Automated Metadata Collection
▪ Software can help
▪ Document templates can contain code to
capture metadata
▪ Templates can also contain “bookmarks”,
“fields” and other features to “grab” metadata
▪ Software can also be used to look-up details
of the user
▪ E.g., LDAP (Lightweight Directory Access
Protocol) can capture certain details, such as the
current user’s name, job title, department, etc.
▪ An ERM system should automate the
capture of metadata values as much as
possible
Metadata for Disposed Records
▪ MoReq2 changed its terminology from
‘retention schedule’ to ‘retention and
disposition schedule’
▪ Some metadata will remain after the records
are destroyed.
▪ Additional metadata might be created
▪ Destruction_date
▪ “Reviewer” or other similar fields
Document Metadata: What is
Hidden?
So what’s the problem?
▪ Information Security Policy relates to
employees’ duty to protect the company’s
confidential information and IP.
▪ We can’t say we have control over this if we
don’t implement education and technology to
protect against it.
Education: What employees must
consider
Risk of Data Leaking
▪ Comments
▪ Hidden columns/rows
▪ Previous authors
▪ Track changes
▪ Versions
▪ Redacted text
▪ Reviewers
▪ Footnotes
▪ Small text
42
▪ Metadata
▪ Retention and disposal
▪ Access and use
▪ Documentation
▪ System Testing
▪ Non-electronic record handling
Compliance
▪ The system must manage and control
electronic records according to the standards
for compliance and the requirements for
legal admissibility and security, and must be
capable of demonstrating this compliance
▪ Compliance regulations vary from industry to
industry
▪ Legislation and regulation
▪ Industry standards
What is Classification?
▪ Simply put, grouping information together
▪ Think about how you’ve structured the files that
you’ve got on your computer into folders and
why you’ve done it that way.
▪ Formal definition from ISO 15489:
▪ “The systematic identification and arrangement
of business activities and/or records into
categories according to logically structured
conventions, methods, and procedural rules
represented in a classification system.”
Why Important?
▪ Two main approaches for effective access to
stored information:
▪ Classification: ‘Aggregate and organize'
▪ Search engine: 'Find by raw power'
▪ We want to be able to find all records related
to a particular topic, project, or client.
▪ Nobody wants to look at a “My Documents”
folder with 1,000 documents in it
▪ Easier to manage files and records in groups
▪ It is what we as a species do.
Why Important? - Identify Your
“Toxic” Data
Benefits of an Effective
Classification
▪ Linkages between individual records can be
provided easily, and these can be
accumulated to provide a continuous record
of an activity
▪ Ensure that records are named in a
consistent manner over time
▪ Assist in the retrieval of all records relating to
a particular function, topic or activity.
▪ Determining security protection and access
appropriate for sets of records
Benefits of Classification, Cont’d
▪ Allocating user permissions for access to, or
action on, particular groups of records
▪ Distributing responsibility for the
management of particular sets of records:
▪ Distributing records for action
▪ Determining appropriate retention periods
and disposition actions for records more
easily
Source: ISO 15489
For the “Data User” – has to be
obvious
Classification Scheme
▪ From ISO 11179-1:
▪ “Descriptive information for an arrangement or
division of objects into groups based on
characteristics, which the objects have in
common”
▪ Any structure an organization uses for
organizing, accessing, retrieving, storing &
managing its information.
▪ A business classification scheme (BCS) is a
classification scheme which is based on an
organization's business functions & activities
Awareness Programs
Organizational Taxonomy
▪ The most extensive and most complex
classification structure
▪ Enterprise-wide, including all departments
and functions and all information
repositories.
▪ There are prebuilt taxonomies available for a
number of different vertical industries.
Authenticity
▪ To be authentic in archives and records
management, a record must be genuine, or
be “what it claims to be”
▪ In order to trust that a record is authentic,
the user must be assured that the systems
that create, capture, and manage
electronic records maintain inviolate
records that are protected from accidental
or unauthorized alteration and from
deletion while the record still has value
Audit Trails
▪ A system audit trail is a record that tracks operations
performed on the system
▪ The audit trail documents the activities performed on
records and their metadata from creation to disposal
▪ The audit trail typically documents the activities of
creation, migration and other preservation activities,
transfers or the movement of records, modification,
deletion, defining access, and usage history
▪ The system must automatically capture the audit trail
▪ The audit trail data must be unalterable
▪ The audit trail must be logically linked to the records
it documents, so that users can review audit
information when they retrieve records
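The requirements above (automatic capture, unalterable entries, a link from each entry to its record) can be illustrated with a hash-chained log. This is one possible technique, chosen here for illustration and not named in the slides: it makes tampering detectable rather than impossible. The `AuditTrail` class and its method names are hypothetical, and timestamps are left out to keep the sketch short.

```python
import hashlib
import json


class AuditTrail:
    """Append-only log in which each entry's hash covers the previous entry."""

    _FIELDS = ("record_id", "action", "user", "previous")

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(body: dict) -> str:
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def record(self, record_id: str, action: str, user: str) -> None:
        """Append one entry, chained to the hash of the entry before it."""
        previous = self._entries[-1]["hash"] if self._entries else ""
        body = {"record_id": record_id, "action": action, "user": user,
                "previous": previous}
        self._entries.append({**body, "hash": self._digest(body)})

    def history(self, record_id: str) -> list:
        """Entries for one record, so audit data can be shown with the record."""
        return [e for e in self._entries if e["record_id"] == record_id]

    def verify(self) -> bool:
        """Recompute every hash; an edited or removed entry breaks the chain."""
        previous = ""
        for entry in self._entries:
            body = {k: entry[k] for k in self._FIELDS}
            if entry["previous"] != previous or entry["hash"] != self._digest(body):
                return False
            previous = entry["hash"]
        return True
```

In use, the system itself would call `record` on every operation, `history` would supply the audit view shown alongside a retrieved record, and `verify` would be run periodically to confirm that no entry has been changed.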
Demonstrating Compliance
Track Key Trends
Security and Control
▪ The system must allow only authorized
personnel to create, capture, update or
purge records, metadata associated with
records, files of records, classes in
classification schemes, and retention
schedules
▪ The system must control access to the
records according to well-defined criteria
Records Retention Schedule
▪ “A comprehensive instruction covering the
disposition of records to assure that they are
retained for as long as necessary based on
their administrative, fiscal, legal and historic
value.”
Source: UN ARMS
Retention and Disposition
▪ Specify periods of time an organization must
retain a document, based on the content of a
document and how it is used.
▪ Present a list of Record Series, categories of
documents with very similar purpose, and a
period of time an organization must retain the
records in each series.
▪ The system must provide for the automated
destruction of records in accordance with
authorized and approved records retention
schedules
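A retention schedule of this kind can be pictured as a lookup from record series to retention period, which the system applies to decide when a record becomes eligible for destruction. The sketch below is hypothetical: the series names, the periods in `RETENTION_YEARS`, and both function names are assumptions made for illustration, not values from any real schedule.

```python
from datetime import date

# Hypothetical record series and retention periods in years.
RETENTION_YEARS = {"Invoices": 7, "Contracts": 10}


def disposal_date(series: str, closed_on: date) -> date:
    """Date on which a record in the given series becomes eligible for destruction."""
    years = RETENTION_YEARS[series]
    try:
        return closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # Closed on 29 February and the target year is not a leap year.
        return closed_on.replace(year=closed_on.year + years, day=28)


def due_for_destruction(records: list, today: date) -> list:
    """Identifiers of records whose retention period has run out."""
    return [r["id"] for r in records
            if disposal_date(r["series"], r["closed_on"]) <= today]
```

A system would typically send the resulting list through an approval step before destroying anything, since the slide requires destruction to follow authorized and approved schedules.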
Retention Schedule Format
▪ Often a database
▪ Enables people to update
retention periods as regulations change
▪ Add new record series as the organization
expands its operations into new functional
areas
▪ Only records management staff and their
direct delegates should have access to the
retention schedule database
Preservation Strategies, Backups
and Recovery
▪ The system must incorporate a strategy or
plan for backing up and preserving records
▪ The system must ensure that records,
components of records, audit trails,
metadata, links to metadata or to files, and
classification schemes can be converted or
migrated to new system hardware,
software and storage media without loss of
vital information
System Testing
▪ The performance and reliability of system
hardware and software must be regularly
tested
▪ Most important component is backups
▪ Computer components do and will fail
▪ Human errors will occur
▪ Upgrades are always a point of vulnerability
Non-Electronic Records
▪ The system must be capable of classifying and
managing non-electronic, physical records and of
managing electronic and non-electronic records in an
integrated manner.
▪ The system must be able to classify, create and
retrieve audit information and other metadata, and
control access
▪ The system must be capable of defining security, and
applying retention and disposal schedules for non-
electronic records according to the same
requirements that have been defined for electronic
records
Systems Design
▪ Systems Development Lifecycle (SDLC)
▪ System concept: purpose, goals, scope
▪ Analysis: user/functional requirements
▪ Design
▪ data design: what information?
▪ software design: processed how?
▪ interface design: user interaction?
▪ Coding and testing: execute & evaluate
▪ Key issue: Systems do (only) what they’re
designed to – purpose, goals, scope,
requirements
Cost Benefits
▪ Costs up front to digitize your office
▪ Costs saved at “back end of process”
▪ No storage costs
▪ Can locate documents
▪ Ease in use of documents
▪ Saved paper costs (80% estimate)
▪ Immediate access to information
▪ Easy to reorganize information
▪ Better stewardship
Collaboration
▪ The need to collaborate electronically
▪ With employees
▪ With clients
▪ With partners
▪ With consultants
▪ With others, dependent on type of business
▪ Impact of globalization
▪ Need to reduce time frames for work
What Hardware Do We Need?
▪ High speed scanner:
▪ Key to successful conversion and operation
▪ Must be networked
▪ Must be able to save and send documents as
PDF
▪ Must be able to send documents directly to
email recipients
▪ Larger flat screen monitors (22 inches or
larger)
▪ Networked workstations
▪ Connected with the high speed scanner
Additional Hardware?
▪ Digital faxing
▪ Faxes arrive as emails and are sent as emails
▪ Get rid of the facsimile machine: use software to
convert or use a service
▪ Tablet computers
▪ Pen input
▪ Capacitive touch: using your finger to scroll
▪ OneNote: electronic notebook
▪ Smart Pens
▪ Capture notes electronically
▪ Record audio
What Software Do We Need?
▪ Adobe Acrobat (Latest Version):
▪ Portable document format reader
▪ PDF editor
▪ Bookmarks and nesting
▪ OCR functions: Acrobat 8 Pro and later
▪ Text on image = searchable PDF
▪ Document security
▪ Removal of metadata (E.g., Document Inspector in
MS Office 2007 and later)
▪ Signatures
▪ “Locking”
What Software Do We Need?
▪ Office products:
▪ Word
▪ Excel
▪ Access
▪ PowerPoint
▪ Now include ability to make pdf documents
▪ Adobe released details of standard
▪ Acrobat add ins:
▪ Autoink (http://www.evermap.com/autoink.asp)
▪ PDF Annotator(http://www.pdfannotator.com/en/)
▪ Xobni: email searching and organizing
Digitization
▪ Definition:
▪ Converting written and printed information into
electronic form
▪ Creation of computerized version of a printed
analog.
▪ Contents – text, image, audio, video, or a
combination of these (multimedia)
Capture all data in one system
▪ The recognition of printed or written text
characters by a computer
▪ involves analysis of the scanned-in image
▪ translation of the character image into character
codes, such as American Standard Code for
Information Interchange (ASCII)
▪ Being applied by libraries, businesses, &
government agencies
▪ to create text-searchable files for digital
collections
Imaging
▪ Scan paper, film to create electronic images
▪ Allows simultaneous access to records
▪ May reduce storage costs
▪ Can improve or enhance:
▪ Security
▪ Findability
Issues with Imaging
▪ Determine what to scan
▪ Records vs. non-records
▪ Backfile vs. day-forward
▪ Document preparation
▪ Indexing
▪ What to do with the originals
Reasons for Retaining Paper
Records
▪ Not cost effective to scan because of:
▪ Large volumes
▪ Document size
▪ Document condition
▪ Little activity
▪ Intrinsic value of record
▪ Legislation
OCR - Process
▪ The scanner or camera typically produced
TIFF images, but PDF is now common
▪ The software removes noise from the image
and starts recognizing patterns
▪ Recognized patterns become letters and numbers
▪ Unrecognized patterns are kept as images
Barcodes
▪ As far back as the 1960s, barcodes were used in
industrial work environments
▪ In the early 1970s, common barcodes started
appearing on grocery shelves
▪ To automate the process of identifying grocery
items, UPC barcodes were placed on products
Bar Code Systems
▪ An organization may determine a need to use a
bar code system with Record Management
Applications (RMA)
▪ Goal: manage physical records or
documents in a manner consistent with their
electronic counterparts
▪ The application stores a digital object that
represents the record
▪ It contains metadata such as where the physical record is
being stored, the appropriate disposition schedule for
the record, etc.
▪ Users need a barcode to identify which physical
records correspond to which digital objects
Storage of Records
▪ Records must be stored in such a way that they are
accessible and safeguarded against environmental
damage
▪ Typical paper documents may be stored in a filing
cabinet in an office
▪ Some organizations employ file rooms with specialized
environmental controls including temperature & humidity
▪ Vital records may need to be stored in a disaster-
resistant safe or vault to protect against fire, flood,
earthquakes and conflict
▪ In extreme cases, the item may require both disaster-
proofing and public access
▪ E.g. the original, signed US Constitution
Off-site vs On-Site Scanning
▪ Most on-going scanning and indexing is done on-site
▪ In addition to on-site storage of records, many
organizations operate their own off-site records
centers or contract with commercial records centers
▪ Off-site scanning often used to handle the backlog
as an EDRMS is initiated
▪ Often the value of the system is diminished if all the data
is not present
▪ Many companies specialize in scanning documents
▪ Scanning America
▪ archSCAN
Quality Assurance
▪ Concerned with assessing and ensuring that
▪ data is accurate and consistent
▪ the RMA is consistent with its requirements
▪ May include:
▪ requirements of protection for CIA
▪ establishment of a robust assessment process
▪ use of third-party assessments
▪ contract performance measures/incentives
▪ use of regulatory and contract enforcement
authority, including civil, criminal, and financial
penalties
Quality Assurance
▪ May Also Include:
▪ customer review and approval
▪ Not addressed in US law except for credit reports
▪ Fair Credit Reporting Act
▪ EU: Individuals have the right to access data
collected about them, to correct inaccurate or
incomplete data, and to have those corrections sent
to those who have received the data
▪ Directive 95/46/EC (European Privacy Directive)
Data in Document Management
▪ Much of a document management system’s
content is documents which have been
imaged
▪ Not searchable in their own right
▪ Two technologies available:
▪ Optical Character Recognition followed by full
text searching
▪ Indexing (addition of metadata) followed by
metadata searching
Copyright
▪ Traditional copyright laws apply
▪ Fair use: a copyright principle based on the belief that the public
is entitled to freely use portions of copyrighted
materials for purposes of commentary and criticism
▪ Unfortunately, if the copyright owner disagrees with
your fair use interpretation, the dispute will have to be
resolved by courts or arbitration
▪ The four factors for measuring fair use:
▪ the purpose and character of your use
▪ the nature of the copyrighted work
▪ the amount and substantiality of the portion taken
▪ the effect of the use upon the potential market
▪ Extended in the academic environment by the TEACH Act
Integrity Issues
▪ Need to guarantee the integrity of all documents
in the system
▪ System of record
▪ E-Discovery processes
▪ Version control
▪ Understanding when and why documents get
changed
▪ Check-in/checkout functionality
▪ Only one person can be modifying a document at
one time
▪ Access controls
▪ Who can upload documents to a repository
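Check-in/checkout and version control can be sketched together as a simple lock per document. The `Repository` class below is hypothetical and keeps everything in memory. It shows only the rule that one person can modify a document at a time, with each check-in counted as a new version.

```python
class Repository:
    """Tracks which user, if any, has each document checked out."""

    def __init__(self):
        self._checked_out = {}  # document id -> user holding the lock
        self._versions = {}     # document id -> number of versions checked in

    def check_out(self, doc_id: str, user: str) -> None:
        """Lock a document for one user; fail if someone else holds it."""
        holder = self._checked_out.get(doc_id)
        if holder is not None:
            raise PermissionError(f"{doc_id} is already checked out by {holder}")
        self._checked_out[doc_id] = user

    def check_in(self, doc_id: str, user: str) -> int:
        """Release the lock and return the new version number."""
        if self._checked_out.get(doc_id) != user:
            raise PermissionError(f"{doc_id} is not checked out by {user}")
        del self._checked_out[doc_id]
        self._versions[doc_id] = self._versions.get(doc_id, 0) + 1
        return self._versions[doc_id]
```

A second user who calls `check_out` while the document is locked gets an error naming the current holder, which is the behavior users typically see as a "document is checked out" message.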
Quiz 3 Terms
- Covers Units 5, 6, and 7
▪ Auditing
▪ Audit trail
▪ Authenticity
▪ Barcode
▪ NARA
▪ Non-electronic record
▪ OA (Office Automation)
▪ OCR
▪ QA (Quality Assurance)
▪ Retention schedule
▪ RMA (Record Management
Application)
▪ SDLC
▪ Types of metadata
▪ Version control
IT 380
Electronic Document and
Record Management
Systems
Unit 6: EDRMS Requirement Analysis
Instructor: Dr. Michelle Liu
Topics
▪ EDRMS Standards
▪ Overall Requirements
▪ Components of EDM
▪ Components of ERM
▪ Workflow
▪ Compliance
ERMS Standards
▪ DoD 5015.02
▪ Electronic Records Management Software Applications
Design Criteria Standard
▪ April 25, 2007
▪ ISO 16175-3
▪ Information and documentation – principles and functional
requirements for records in electronic office environments
▪ Part 3 – Guidelines and functional requirements for records
in business systems
▪ December 2010
▪ ISO/TR 13028
▪ Information and documentation – Implementation Guidelines
for Digitization of Records
▪ December 2010
▪ National Archives accepts DoD standard
International Standards
▪ ISO 16175
▪ Information and Documentation – Principles and functional
requirements for records in electronic office environments
▪ Part 1 – Overview and Statement of Purpose
▪ Part 2 – Guidelines and functional requirements for digital records
management systems
▪ Part 3 – Guidelines and functional requirements for records in
business systems
▪ ISO/TR 13028
▪ Information and documentation – Implementation guidelines for
digitization of records
▪ ISO 19005
▪ Document management – Electronic document file format for
long-term preservation
▪ Specifies PDF/A (PDF specialized for use in the archiving and
preservation of electronic documents)
Federal Standards
▪ DoD 5015.2-STD
▪ Defines the basic requirements, based on operational, legislative,
and legal needs, that must be met by records management
application (RMA) products that are acquired by the Department
of Defense (DoD) and its components
▪ This Standard sets forth mandatory baseline
functional requirements; defines required system
interfaces and search criteria; and describes the
minimum records management requirements that
must be met, based on current National Archives
and Records Administration (NARA) regulations
Federal Standards, Cont’d
▪ National Archives and Records Administration
(NARA)
▪ Managing partner of the E-Government Electronic Records
Management Initiative
▪ Endorsed DoD 5015.02-STD for all government records
offices
▪ Records Management Handbook
▪ Federal requirements for including records management in
agency electronic information systems
▪ Toolkit for Managing Electronic Records
▪ http://www.archives.gov/records-mgmt/toolkit/
▪ Records management language for contracts
▪ Guidance and resources for integrating records management
into electronic information systems
Requirements Standards
▪ Model Requirements for the Management of
Electronic Records (MoReq)
▪ Commissioned by the Interchange of Data
Software Administrations (IDA) Program of the
European Commission
▪ Multiple revisions
▪ MoReq2010® in 2011
▪ Products tested and certified as compliant
with the DoD 5015.2 Standard for
Recordkeeping Applications
▪ http://jitc.fhu.disa.mil/projects/rma/index.aspx
http://www.moreq.info/
Requirements: Paper vs. Electronic
▪ Requirements are not much different than
what we would like to see in an ideal paper
system
▪ Differences:
▪ In many cases the requirements need to be
automated and executable by the system
▪ Reflects the fact we can do a better job
documenting recordkeeping in an automated
environment
Overall Requirements
▪ Compliance
▪ Record capture
▪ Classification scheme
▪ Authenticity
▪ Audit trail
▪ Metadata
▪ Retention and disposal
▪ Access and use
▪ Documentation
▪ System Testing
▪ Non-electronic record handling
Let’s Brainstorm: Define
Requirements
▪ Develop a records management system for the
university’s Registrar’s department to collect
and preserve all the necessary records from
admission to graduation
▪ What significant questions need to be
asked to ensure that we can build an effective
and secure electronic document and records
system?
▪ …
Basic Components of EDMS
▪ Document Repository
▪ Integration with Desktop Applications
▪ Check-In and Check-Out
▪ Versioning
▪ Auditing
▪ Security
▪ Classification and Indexing
▪ Search and Retrieval
Document Repository
▪ All EDMS require a document repository
▪ System stores documents that are under its
management
▪ Documents should be stored in a centralized
location
▪ Will fail if users do not place documents in the
repository when they are created
▪ Make users save documents in the repository
▪ System also needs a database to store
information about the documents
▪ Folder structure:
▪ Established by system administrator
▪ Company structure, project, etc.
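The pairing of a repository with a database that stores information about the documents can be sketched as follows. This is a minimal illustration, assuming an in-memory SQLite database; the table layout and the helper names register and list_folder are hypothetical and not taken from any particular EDMS product.

```python
import sqlite3

# In-memory database standing in for the EDMS metadata store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        folder TEXT NOT NULL,
        filename TEXT NOT NULL,
        created_by TEXT NOT NULL,
        UNIQUE (folder, filename)
    )"""
)


def register(folder, filename, user):
    """Record that a document was saved into a repository folder."""
    cur = conn.execute(
        "INSERT INTO documents (folder, filename, created_by) VALUES (?, ?, ?)",
        (folder, filename, user),
    )
    conn.commit()
    return cur.lastrowid


def list_folder(folder):
    """Return the filenames stored in one folder, in alphabetical order."""
    rows = conn.execute(
        "SELECT filename FROM documents WHERE folder = ? ORDER BY filename",
        (folder,),
    )
    return [row[0] for row in rows]
```

The UNIQUE constraint is one possible way to keep two documents from occupying the same name in the same folder; a real system might handle that case with versioning instead.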
Integration with Desktop Applications
▪ EDMS needs to integrate with desktop
applications
▪ Allows users to save documents straight from
the application that created the document
▪ Vast majority of EDMS integrate with many
popular desktop applications
▪ MS Office
▪ Open Office
▪ Must be updated as these applications evolve
Check-In and Check-Out
▪ Controls who is editing a document and when it
is being edited
▪ Ensures that no more than one person edits a
document at any one time
▪ Once a document is checked out to one person,
all others only have read-only access
▪ Once edits are completed, the document is
checked back in
▪ Others can now access the updated document
▪ Someone else can now change it
▪ What information should you get if a document
is checked out?
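The check-in and check-out rules above can be sketched as a simple lock registry. This is an illustrative sketch only; the class names CheckoutRegistry and CheckoutError are hypothetical, and recording the holder and the check-out time is one possible answer to the question of what information to keep.

```python
from datetime import datetime, timezone


class CheckoutError(Exception):
    """Raised when a check-out or check-in request is not allowed."""


class CheckoutRegistry:
    """Tracks which user currently holds each document for editing."""

    def __init__(self):
        # document id -> (user, time of check-out)
        self._locks = {}

    def check_out(self, doc_id, user):
        if doc_id in self._locks:
            holder, since = self._locks[doc_id]
            raise CheckoutError(
                f"{doc_id} is checked out to {holder} since {since.isoformat()}"
            )
        self._locks[doc_id] = (user, datetime.now(timezone.utc))

    def check_in(self, doc_id, user):
        holder, _ = self._locks.get(doc_id, (None, None))
        if holder != user:
            raise CheckoutError(f"{doc_id} is not checked out to {user}")
        del self._locks[doc_id]

    def can_edit(self, doc_id, user):
        # Only the holder may write; everyone else is read-only.
        holder, _ = self._locks.get(doc_id, (None, None))
        return holder == user

    def status(self, doc_id):
        # Who holds the document and since when, or None if it is free.
        return self._locks.get(doc_id)
```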
Version Control
▪ After a document has been updated, there
needs to be a mechanism by which the
system can keep track of the changes made
to that document
▪ Assigning the document a version number
▪ Start with version 1.0
▪ Rules for numbering the next version: 1.1 or 2.0?
▪ System should allow authorized access to
previous versions of the document
▪ Read-only
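One possible numbering rule is sketched below: minor edits move 1.0 to 1.1, and major changes move 1.x to 2.0. The rule itself and the name VersionedDocument are assumptions made for illustration; an organization would define its own policy.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedDocument:
    """Keeps every saved version of a document, oldest first."""

    name: str
    # Each entry is ((major, minor), content).
    history: list = field(default_factory=list)

    def save(self, content, major_change=False):
        if not self.history:
            number = (1, 0)  # the first version is 1.0
        else:
            major, minor = self.history[-1][0]
            # Assumed rule: a major change starts a new x.0, otherwise bump minor.
            number = (major + 1, 0) if major_change else (major, minor + 1)
        self.history.append((number, content))
        return "{}.{}".format(*number)

    def read_version(self, label):
        # Read-only access: returns stored content and never alters history.
        for (major, minor), content in self.history:
            if f"{major}.{minor}" == label:
                return content
        raise KeyError(label)
```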
Auditing
▪ Audit trail: A chronological record of system activities that
is sufficient to enable the reconstruction, reviewing and
examination of the sequence of environments and activities
surrounding or leading to an operation, a procedure, or an
event in a transaction from its inception to final result
▪ Keeps track of which users made changes to a
document and when
▪ Allows authorized users to find out the changes that have
been made to the document since it was first created
▪ Who
▪ What
▪ When
▪ Auditing is important for ensuring the chain of custody in a
legal or forensic sense
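The who, what, and when of an audit trail can be sketched as an append-only log. This is a minimal sketch under stated assumptions; the class name AuditTrail and the field names are hypothetical, and a production system would also need to protect the log itself from modification.

```python
from datetime import datetime, timezone


class AuditTrail:
    """Append-only, chronological log of who did what and when."""

    def __init__(self):
        self._entries = []

    def record(self, user, action, doc_id, when=None):
        entry = {
            "who": user,
            "what": action,
            "document": doc_id,
            # Default to the current UTC time when no timestamp is supplied.
            "when": when or datetime.now(timezone.utc),
        }
        self._entries.append(entry)
        return entry

    def history(self, doc_id):
        # All events for one document, ordered by timestamp.
        events = [e for e in self._entries if e["document"] == doc_id]
        return sorted(events, key=lambda e: e["when"])
```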
Security
▪ Extremely important component in a properly
implemented system
▪ Needs to be tightly integrated with the
system
▪ Security access permissions at different
levels based on:
▪ User
▪ Individual document
▪ Individual user vs. group permissions
▪ Read/Write/No Access
▪ File/Folder/Group of Folders
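The permission levels above can be sketched as a small access check. The precedence used here (the most specific path wins, and a rule for an individual user beats a group rule on the same path) is an assumption chosen for illustration, as are the names Access and AccessControl.

```python
from enum import IntEnum


class Access(IntEnum):
    NONE = 0
    READ = 1
    WRITE = 2


class AccessControl:
    """Resolves a user's access to a folder or file path."""

    def __init__(self):
        self.groups = {}       # group name -> set of users
        self.user_rules = {}   # (user, path) -> Access
        self.group_rules = {}  # (group, path) -> Access

    def _parents(self, path):
        # Yields the path itself, then each enclosing folder.
        parts = path.strip("/").split("/")
        for i in range(len(parts), 0, -1):
            yield "/" + "/".join(parts[:i])

    def access_for(self, user, path):
        for p in self._parents(path):
            # A rule set on the individual user wins over group rules.
            if (user, p) in self.user_rules:
                return self.user_rules[(user, p)]
            levels = [
                level
                for (group, rule_path), level in self.group_rules.items()
                if rule_path == p and user in self.groups.get(group, set())
            ]
            if levels:
                return max(levels)
        return Access.NONE  # default when no rule applies: no access
```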
Classification and Indexing
▪ All documents should be classified and
indexed using metadata
▪ Main purpose is retrieval at a later date
▪ Typical metadata:
▪ Date created
▪ Subject
▪ Who created
▪ Document title
▪ Keywords
▪ Classification
▪ Other?
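The typical metadata fields listed above can be sketched as a record type plus a simple index from terms to documents. The names DocumentMetadata and MetadataIndex are hypothetical, and the choice to index keywords, subject, and classification is an assumption for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DocumentMetadata:
    title: str
    creator: str
    created: date
    subject: str
    classification: str
    keywords: list = field(default_factory=list)


class MetadataIndex:
    """Maps each index term to the ids of documents tagged with it."""

    def __init__(self):
        self._by_term = defaultdict(set)
        self.records = {}

    def add(self, doc_id, meta):
        self.records[doc_id] = meta
        # Index keywords, subject, and classification, case-insensitively.
        for term in [*meta.keywords, meta.subject, meta.classification]:
            self._by_term[term.lower()].add(doc_id)

    def lookup(self, term):
        return sorted(self._by_term.get(term.lower(), set()))
```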
Search and Retrieval
▪ The more intuitive the classification and indexing, the
easier it will be to locate documents using the search and
retrieval mechanism
▪ Should have multiple ways to locate a document:
▪ Browse the folder structure
▪ Basic search
▪ Advanced search
▪ Index terms vs. full text
▪ Must use Optical Character Recognition (OCR) to
get full text from paper documents
▪ Search should return list of documents with enough
information to select the right one
▪ What information?
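The difference between searching index terms and searching full text can be sketched as below. The function name search and the dictionary layout of each document are assumptions for illustration; returning the id, title, and creator is one possible answer to what information a result list should show.

```python
def search(documents, query, full_text=False):
    """Return (doc_id, title, creator) for each document matching query.

    documents maps doc_id -> dict with "title", "creator", "keywords", "text".
    """
    needle = query.lower()
    hits = []
    for doc_id, doc in documents.items():
        if full_text:
            # Full-text search: look inside the document body.
            matched = needle in doc["text"].lower()
        else:
            # Index-term search: match only the assigned keywords.
            matched = needle in (k.lower() for k in doc["keywords"])
        if matched:
            # Return enough information to pick the right document.
            hits.append((doc_id, doc["title"], doc["creator"]))
    return sorted(hits)
```

For scanned paper documents, the "text" field would only exist after OCR has been run, which is why full-text search depends on it.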
Components of ERMS
▪ Needs to have a repository where the
records are archived
▪ Physically, the records will be located in a
central location
▪ Must be protected so that records cannot be
changed
▪ Allows for a folder structure that supports the
archiving function and the maintenance of
security
Classification, Indexing, and
Metadata
▪ All records in the system need to be
categorized and indexed within the folder
structure
▪ Need to use metadata to ensure that
records are archived in a systematic manner
▪ Need to assist users to find their documents
much later in the future
▪ What indexing terms make sense?
▪ How much effort should be put in indexing?
Capturing and Declaring Records
▪ Needs a mechanism that automatically
captures and declares records; otherwise the
repository will not be complete
▪ Need to decide which documents form the
records
▪ Documents
▪ Email
▪ Letters
▪ Blog entries
▪ Text messages
▪ Others?
Record Security
▪ Need to employ stringent security around the
archiving of records
▪ Organization’s security
▪ Compliance
▪ Electronic records, with no paper record,
face special security challenges
▪ Ensuring they are not tampered with?
▪ What types of tampering can happen?
▪ System administrator must control security
by record type and by user
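One common way to detect tampering with an electronic record is to store a cryptographic digest when the record is declared and compare it later. The sketch below assumes SHA-256 from Python's standard hashlib module; the slides do not name a specific technique, and the names fingerprint and RecordVault are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """SHA-256 digest of a record's bytes, as a hex string."""
    return hashlib.sha256(content).hexdigest()


class RecordVault:
    """Stores records with the digest taken when they were declared."""

    def __init__(self):
        self._store = {}  # record id -> (content, digest at declaration)

    def declare(self, record_id, content: bytes):
        # Declared records are read-only, so a second declaration is refused.
        if record_id in self._store:
            raise ValueError(f"{record_id} is already declared and is read-only")
        self._store[record_id] = (content, fingerprint(content))

    def verify(self, record_id) -> bool:
        # True if the stored bytes still match the original digest.
        content, digest = self._store[record_id]
        return fingerprint(content) == digest
```

A digest only detects changes; it does not prevent them, so it complements rather than replaces access control by record type and by user.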
Auditing and Reporting
▪ Allow authorized users and administrators to
produce audit trails
▪ Information:
▪ Date created
▪ Create standard and ad-hoc reports to meet
potential future demand
Compliance with Standards
▪ Legislation
▪ Freedom of Information Act
▪ Privacy Act
▪ HIPAA
▪ Standards
▪ DoD 5015.2
▪ ISO 15489
▪ Must comply with the relevant legislation and
standards that apply to both the industry and
the country
Scanning and Imaging
▪ Need facilities to scan and image paper-based
documents
▪ Comprehensive electronic records
▪ Save space in file rooms
▪ Should allow the organization to scan
documents in batches
▪ Also need to index the documents
▪ Quality control is a primary concern
Collaboration
▪ Allow people and teams within the organization
to communicate and share information
▪ Work on documents together
▪ The need to collaborate electronically
▪ With employees
▪ With clients
▪ With partners
▪ With consultants
▪ With others, dependent on type of business
▪ Impact of globalization
▪ Need to reduce time frames for work
▪ Desirable rather than mandatory requirements
Workflow
▪ Also called business process management
(BPM)
▪ Manage the flow of information around an
organization
▪ Ensure necessary process is followed
▪ Approval process
▪ Scanning and indexing process
▪ Desirable rather than mandatory features
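Ensuring that the necessary process is followed can be sketched as a small state machine for an approval process. The states and allowed transitions below are hypothetical examples, not steps defined in the slides.

```python
# Allowed transitions for a hypothetical approval process.
TRANSITIONS = {
    "draft": {"submitted"},
    "submitted": {"approved", "rejected"},
    "rejected": {"draft"},
    "approved": set(),
}


class ApprovalWorkflow:
    """Moves a document through the approval steps in the permitted order."""

    def __init__(self):
        self.state = "draft"
        self.trail = ["draft"]

    def advance(self, new_state):
        # Refuse any step the process does not allow.
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state
        self.trail.append(new_state)
```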
IT 380
Electronic Document and
Record Management
Systems
Unit 5: Emerging Challenges to ERM
Information Governance
Instructor: Dr. Michelle Liu