From the MER Conference 2012
Speakers: Jason R. Baron, Esq. Dave Lewis, Ph.D.
2012 is the year we will see great strides by information professionals in using automation (in the form of "predictive" and "technology-assisted" search, filtering, and auto-classification) for the purpose of achieving efficiencies and cutting costs in records management as well as in legal settings.
The strategic use of these new methods is absolutely necessary given the massive, exponential increases in electronically stored information - in the form of records within corporate networks and repositories.
This session addresses the latest technological developments from two perspectives:
- A longtime advocate of smart technology in the public recordkeeping sector, and
- A leading information scientist.
The session includes a state-of-the-art overview of the latest developments in technology-assisted review, with an emphasis on how these technologies can and will enhance electronic records management by helping to end the era of excessive reliance on end-user RM.
You will learn:
- What technology-assisted review and predictive analytics are all about: using advanced search, filtering, and auto-classification as part of a defensible electronic records management program.
- How these technologies also add value to overall corporate information governance.
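To make the idea of auto-classification concrete, here is a minimal sketch. It assumes a simple keyword-scoring rule, which is an illustration only and not the method presented in the session; technology-assisted review tools typically rely on statistical models trained on reviewed examples. The category names, term lists, and the `classify` function are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical record categories and indicative terms (illustrative only).
CATEGORY_TERMS = {
    "contract": {"agreement", "party", "term", "signature", "vendor"},
    "invoice": {"invoice", "amount", "due", "payment", "remit"},
    "hr": {"employee", "benefits", "payroll", "leave", "hire"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(text, min_score=2):
    """Return (category, score) for a document.

    The score is the number of tokens that match a category's term list.
    Documents scoring below min_score are left 'unclassified' so that a
    person can review them instead of the rule guessing.
    """
    counts = Counter(tokenize(text))
    scores = {
        category: sum(counts[term] for term in terms)
        for category, terms in CATEGORY_TERMS.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] < min_score:
        return "unclassified", scores[best]
    return best, scores[best]


if __name__ == "__main__":
    print(classify("Invoice 1001: amount due on receipt, remit payment to vendor"))
    print(classify("Lunch on Friday?"))
```

The `min_score` threshold reflects one possible design choice: low-confidence items fall back to human review, so the end user handles only the documents the automation cannot place.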
The document summarizes a workshop on successful implementation of records management. It includes:
1) An agenda for the workshop covering topics like automating information lifecycle governance and implementing records management programs.
2) A background section on the presenter including his experience in records management, ECM, and IBM.
3) Case studies and examples of how organizations have saved millions through implementing information lifecycle governance programs that involve archiving unnecessary data, applying retention schedules, and enabling defensible disposal.
ConceptClassifier for SharePoint Turbo Charging the Public Sector
martingarland
The document discusses Concept Searching's conceptClassifier for SharePoint product. It provides examples of how several public sector organizations are using the product to improve search, records management, compliance, and information sharing through automatic metadata generation, classification, and taxonomy management of documents within SharePoint. Specific clients mentioned include the Defense Centers of Excellence, U.S. Army Records Management Declassification Agency, 711th Human Performance Wing, U.S. Air Force Human Performance Clearing House, Consumer Product Safety Commission, Care Quality Commission, Transport for London, European Bank for Reconstruction and Development, and several U.K. city councils.
“The Fountain of Truth” Web-based Contract Management for Starwood Hotels – TEAM Informatics
This document summarizes a case study of Starwood Hotels & Resorts Worldwide, Inc.'s implementation of Oracle WebCenter for web-based contract management. It describes drivers for replacing their legacy system with a new solution to unify disparate records repositories, support records retention policies, and reduce time spent searching. The solution included a content management platform to centrally manage contracts and related documents, integrate with other systems, and provide a single source of truth for legal and business teams. Key aspects of the implementation approach involved defining requirements, selecting a technology partner, and developing a multi-phased project plan.
The document summarizes a ribbon cutting ceremony for the Oregon Records Management Solution (ORMS) and Synergy Data Center partnership. It then provides details on how the ORMS Software as a Service (SaaS) billing model works, charging agencies per user each month even if all users haven't begun using the system. The monthly fees help cover upfront infrastructure costs and are significantly lower than alternatives. The document urges agencies to ensure employees use the system, as getting billed is not a SaaS issue but a management responsibility. It describes alternatives that pose higher risks and costs if electronic records are not properly managed.
Creating Data Hubs to Enhance Information Sharing
InnoTech
Data Bridge Management creates data hubs to enhance information sharing between organizations. It integrates data sharing and analytic systems through a centralized "Data Bridge" information sharing hub. This reduces losses from threats while saving lives and resources by enabling accurate information access and analysis. However, information sharing faces challenges around communication, access to data across different platforms and formats, and disconnected data "stovepipes". Data Bridge Management's solution is a brokered system that enforces agreements to enable secure access to various structured and unstructured data sources in a centralized way. It has implemented regional resiliency and emergency response hubs as well as proposed technology transfer and supply chain security hubs.
Collaboration & Social Media New Challenges For Records Management
Maurene Caplan Grey
Presentation was delivered as the keynote of the 27 Feb 2009, ARMA Northern VA chapter conference (http://www.armamar.org/nova/programs/ARMA%20NOVA%202009%20Seminar%20Brochure-c.pdf).
Integrating Information Protection Into Data Architecture & SDLC
DATAVERSITY
The document discusses how integrating data protection into software development life cycles (SDLC) can help close hidden gaps where data governance is often absent. It notes that many SDLCs skip critical data classification steps until late in the process, resulting in inconsistent data protection and governance gaps. The document proposes a parallel SDLC approach that classifies regulated data early and links it to compliance actions to design roles and controls for user entitlements.
J P Sathiadas, G N Wikramanayake (2003) "Document Management Techniques and Technologies" In:5th International Information Technology Conference, pp. 40-48. Infotel Lanka Society Ltd., Colombo, Sri Lanka: IITC Dec 1-7, ISBN: 955-8974-00-5
The document discusses enterprise content management (ECM). ECM refers to strategies, methods and tools used to capture, manage, store, preserve and deliver content and documents related to key organizational processes. ECM covers unstructured information like images, documents, and web pages that are processed by humans. The document also discusses concepts like metadata, classification, search and retrieval, security and integration that are important aspects of ECM.
The document discusses key concepts in enterprise content management (ECM). It defines ECM as strategies, methods and tools used to capture, manage, store, preserve, and deliver content and documents related to organizational processes. It outlines the main components of ECM systems including capture, storage, management, preservation, and delivery of content. It also discusses important related concepts such as metadata, classification, taxonomy development, and automated vs manual processes.
This chapter discusses database basics, anatomy, operations, and applications. It defines a database as a set of logically related files organized to minimize data redundancy and facilitate access by applications. Key points include:
- Databases store large amounts of information easily and allow flexible retrieval and organization of data.
- A database contains files which contain records made of fields. Fields have defined data types like text or numeric.
- Common database operations are browsing, querying, sorting, and generating reports, labels, and letters.
- Specialized database programs exist for contact managers, calendars, maps, and notes. Real-time databases now replace batch processing for immediate user interaction.
The document discusses considerations for email archiving solutions. It notes that email volume is growing significantly each year, driving the need for archiving. Organizations require archiving primarily for mailbox management, compliance, eDiscovery/litigation support, and knowledge management. When selecting an archiving solution, organizations should consider how it will help manage their overall email environment and infrastructure costs while ensuring important email data remains accessible and policies can be consistently applied. Cloud-based solutions may help address these goals more effectively than on-premises systems.
The document discusses several topics related to information management within government organizations. It begins by outlining the key considerations for a Canadian government RFI on cloud services, including policy, business, technical, procurement, pricing and security. It then discusses challenges of moving to the cloud and key capabilities needed for collaboration and content management. Several graphics show examples of infrastructure layouts, the variety of locations information can be stored, and the need to define user journeys to understand how people complete tasks. It emphasizes identifying "dangerous" user groups where compliance issues are most likely to occur to prioritize support and adoption of information management systems.
A Pragmatic Strategy for Oracle Enterprise Content Management (ECM)
Brian Huff
This is a new way of looking at how to manage unstructured content across your enterprise. It's what we call a "Pragmatic ECM Strategy," and is the focus of my second book.
DSS - ITSEC Conference - Protected-Networks - An Open Door May Tempt a Saint ...
Andris Soroka
This document discusses the importance of access governance in Windows environments. It notes that 65% of employees have unrestricted access to sensitive data and 80% of data loss originates from within companies, highlighting the security risks of improper access controls. The document advocates for using access rights management software to provide transparency into access rights, enable more efficient administration of rights, and ensure compliance. It introduces 8MAN software as a solution that provides information, documentation, and administration of Active Directory, file server, and SharePoint access rights.
The document discusses procedures for managing information and records in organizations. It covers categories like information path flows, records management systems, prioritizing jobs, and ensuring privacy and assigning passwords in multi-user environments. Specific examples discussed include using data flow paths to show information flows, procedures for printing records in a particular order, using relational databases and records management systems for different business types, and using project management software to organize tasks. The document also covers setting passwords on files, programs, workstations and networks to manage privacy and security in multi-user environments.
Presentation on electronic records management and archival issues. Originally presented at the Fall 2008 meeting of the Southeastern Wisconsin Archivists Group.
The document summarizes an electronic records management class which covered:
1) Responsibilities and challenges of managing electronic records including storage, formats, and ensuring access over time.
2) Storage media and database concerns like capacity, recovery, and security.
3) Reformatting records digitally and quality control standards.
4) Metadata and its importance for records.
5) Strategies for email management including determining records, organization, and archiving.
6) Enterprise content management systems and their use by Virginia agencies.
The document discusses database management systems (DBMS) and their advantages over traditional file-oriented data storage. It describes the key components of a DBMS, including the data definition language (DDL) used to define the database schema, the data manipulation language (DML) used to query and manipulate data, and database models like relational, hierarchical and network models. The document provides examples of how a sample education database could be structured in a relational model using tables, attributes, and relations.
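The split between DDL and DML described above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions: the `student`, `course`, and `enrollment` tables and their rows are hypothetical and are not taken from the summarized document's sample education database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: statements that define the schema (hypothetical education tables).
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL
);
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
""")

# DML: statements that insert and query the data held in that schema.
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany(
    "INSERT INTO course VALUES (?, ?)", [(10, "Databases"), (20, "Archives")]
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)", [(1, 10), (1, 20), (2, 10)]
)

# A query that follows the relations between the three tables.
rows = conn.execute("""
SELECT s.name, c.title
FROM enrollment AS e
JOIN student AS s ON s.student_id = e.student_id
JOIN course  AS c ON c.course_id  = e.course_id
ORDER BY s.name, c.title
""").fetchall()

for name, title in rows:
    print(name, "is enrolled in", title)
```

Keeping enrollments in their own table, rather than repeating course titles on each student row, is what lets the relational model avoid the redundancy typical of file-oriented storage.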
This document provides an overview of database environments and compares file processing and database approaches. It defines key terms like data, information, and database. The database approach integrates and shares data across an organization, reducing redundancy and inconsistencies compared to file processing. A database environment consists of several components including the database, database management system, application programs, user interface, and roles like administrators and developers.
A Pragmatic Strategy for Oracle Enterprise Content Management
Brian Huff
The document outlines seven steps of a pragmatic strategy for enterprise content management (ECM): 1) Create a center of excellence team to govern ECM initiatives, 2) Assess the current environment and label existing systems as strategic, tactical, or replaceable, 3) Consolidate content from replaceable systems into strategic repositories, 4) Federate control of tactical systems to the strategic repositories using federation tools, 5) Secure information wherever it exists using security tools like information rights management, 6) Unify structured and unstructured strategies using tools that extract structure and integrate systems, 7) Plan for the future with 3-year plans to assess storage needs as content volumes grow rapidly over time.
Richard (Dick) Fisher
Organizations are creating data records at a pace few could have imagined just five years ago - terabytes (1 trillion bytes) now and heading toward petabytes (1,000 terabytes) that may need to be archived or disposed of! This session uses the requirement for archiving and disposition of PeopleSoft records and data elements as one example, plus other real world requirements.
Read more: http://www.rimeducation.com/videos/rimondemand.php
The document discusses email and e-form management. It begins by outlining some of the challenges with email, such as lack of control and retrievability. It then defines email archiving as a way to address these issues by providing a system to archive all email messages. The document also discusses the benefits of archiving for compliance, productivity and e-discovery. It provides an overview of different technology approaches and implementation models for archiving email. Finally, it briefly introduces e-forms and discusses how they can improve processes by capturing data electronically.
The document summarizes the many different systems organizations use to manage information and documents, and the challenges of integrating them. It discusses paper management systems, shared drives, document imaging, electronic document management systems, records management systems, workflow systems, website content management, database systems, and cloud-based systems. It notes the risks of not having governance over these distributed systems and the loss of control over information. It recommends that organizations create an Information Manager role, develop a governance plan, map their information assets, and plan for records management. It suggests determining requirements through user input to develop use cases.
Ideate Framework (www.ideate.com). Presented at WS-REST 2011, Hyderabad India. A Resource-Oriented Framework that can dynamically coordinate resources in a virtual information layer to perform as services, without middleware indirection. Ideate delivers the Read-Write-Execute Web.
IS 3003 Chapter 6 1 The Globe and Mail It is the.docx
priestmanmable
IS 3003
Chapter 6
The Globe and Mail
It is the largest newspaper in Canada; it is considered the NYT and USA Today all in one; it wants to enjoy the subscribership of every household in Canada, but it has a problem with trying to house the information needed on all the homes in Canada
Its problem with storing and retrieving information started with the practice of storing everything on a mainframe that was hard to use for retrieving and analyzing; this led to large amounts of data relating to specific areas being downloaded to smaller computers which created pockets of data processing interests that became unwieldy; data was not integrated and difficult to analyze without the relationships to other data being available
There were inconsistent database systems in use, such as MS Access, SQL Server, Foxbase Pro, and even Excel; updating information was difficult because the latest information was back on the mainframe, and had to be downloaded again to get to new info
Getting to potential new subscribers was almost impossible; even information on existing subscribers was a problem security-wise, because it was stored in many places thus causing potential security breaches with inconsistent controls
In 2002, Globe and Mail acquired SAP NetWeaver BW, a data warehousing platform (platform versus program: platform is a capability to build several programs, whereas a program is just that); the general definition of NetWeaver is that it is an application builder from SAP for integrating business processes and databases from a number of sources while exploiting the leading Web services technologies;
All information is now aggregated within the SAP NetWeaver system in which applications for analyzing and using database items can be built and changed quickly; additionally, the database is available quickly for warehousing and mining purposes
The investment paid for itself in one year
This example points out the extreme importance of data management; management decision making was also enhanced by more effective use of available info
Boost in efficiency was caused by:
Making the Globe and Mail data easier to locate and assemble
The SAP NetWeaver system integrated info from all sources available to the paper
Duplications were eliminated, and synchronization of data sources was achieved
A database is based on computer files, records and items; it contains data on people, places, and things; the archetypical database is the phone book, which is a record of people who use phones that are listed (accessible to the public)
Most non-computer databases, namely paper records, rolodex files, folders, notes, etc. are sequential and can only be reviewed in a certain order
Data and file structures
Databases are usually organized as relational databases
These are 2-dimensional tables where each subject, e.g., customers, is organized as a set of entities (records), which contain things like names of each customer, address, ph ...
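As a rough illustration of a 2-dimensional table whose rows are records and whose columns are fields, the sketch below builds a small `customers` table with Python's built-in `sqlite3` module. The field names (`name`, `address`, `phone`) and the sample rows are assumptions made for the example; the notes above are cut off before they finish listing the fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be read by field name

# One table per subject; the columns below are hypothetical fields.
conn.execute("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT,
    address     TEXT,
    phone       TEXT
)
""")

# Each inserted tuple becomes one record (row) in the table.
conn.executemany(
    "INSERT INTO customers (name, address, phone) VALUES (?, ?, ?)",
    [
        ("A. Tremblay", "12 King St", "555-0101"),
        ("B. Singh", "98 Bay St", "555-0102"),
    ],
)

cur = conn.execute("SELECT * FROM customers ORDER BY customer_id")
field_names = [col[0] for col in cur.description]  # the table's columns
records = [dict(row) for row in cur.fetchall()]    # the table's rows

print(field_names)
for record in records:
    print(record)
```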
M12S23 - Right-sizing Your Information Footprint by Chucking Your Dead Data
MER Conference
Speakers: Randolph A. Kahn, Esq. & Jonathan Redgrave, Esq.
Today, most organizations have too much digital content that has outlived its usefulness. Every year, the quantity of this unusable content multiplies. So, there's no time like the present to get busy, get cleaner and get meaner.
Having a defensible methodology and using the right tool for the task allows organizations to right-size their "Information Footprint" without worrying about regulatory or legal consequences.
If your organization right-sizes its "Information Footprint", it will be much better off. Keeping old, unnecessary content will result in higher costs and risks. If the costs and risks are big enough, the case for proactively cleaning up the "Information Footprint" is very compelling.
Read more: http://www.rimeducation.com/videos/rimondemand.php
M12S21 - "Corporate Alzheimer's": The Impending Crisis in Accessing Digital R...
MER Conference
Speakers: Christine Ardern, Adrian Cunningham, Charles Dollar, Ph.D., Mariella Guercio, Ph.D., Kenneth Thibodeau, Ph.D.
Is your organization facing "Corporate Alzheimer's"?
Fast forward to 2020:
- Your organization's CDs are obsolete,
- Its social media has been replaced,
- The Cloud has evaporated, and
- The organization has restructured several times and the terminology it used in 2012 has changed.
The organization's information assets, however, are safe in tiered storage. Because the software and hardware have changed from what was used to create them, access to needed records and data is now limited and, in some instances, impossible.
This is Corporate Alzheimer's - the increasing inability over time to access an organization's long-term digital information - when we know it is there, but changes in computer hardware and software have made the needed information inaccessible/unreadable.
Read more: http://www.rimeducation.com/videos/rimondemand.php
More Related Content
Similar to M12S06 - Will Technology-Assisted Predictive Modeling and Auto-Classification End the 'End User' Burden in Records Management?
Ideate Framework (www.ideate.com). Presented at WS-REST 2011, Hyderabad India. A Resource-Oriented Framework that can dynamically coordinate resources in a virtual information layer to perform as services, without middleware indirection. Ideate delivers the Read-Write-Execute Web.
IS 3003Chapter 61The Globe and MailIt is the.docxpriestmanmable
IS 3003
Chapter 6
The Globe and Mail
It is the largest newspaper in Canada and is considered the NYT and USA Today all in one. It wants every household in Canada as a subscriber, but it has a problem housing the information it needs on all those homes.
Its problem with storing and retrieving information started with the practice of storing everything on a mainframe that was hard to use for retrieval and analysis. This led to large amounts of data relating to specific areas being downloaded to smaller computers, which created pockets of data processing interests that became unwieldy. The data was not integrated and was difficult to analyze without the relationships to other data being available.
There were inconsistent database systems in use, such as MS Access, SQL Server, Foxbase Pro, and even Excel; updating information was difficult because the latest information was back on the mainframe, and had to be downloaded again to get to new info
Getting to potential new subscribers was almost impossible; even information on existing subscribers was a problem security-wise, because it was stored in many places thus causing potential security breaches with inconsistent controls
In 2002, Globe and Mail acquired SAP NetWeaver BW, a data warehousing platform (platform versus program: platform is a capability to build several programs, whereas a program is just that); the general definition of NetWeaver is that it is an application builder from SAP for integrating business processes and databases from a number of sources while exploiting the leading Web services technologies;
All information is now aggregated within the SAP NetWeaver system in which applications for analyzing and using database items can be built and changed quickly; additionally, the database is available quickly for warehousing and mining purposes
The investment paid for itself in one year
This example points out the extreme importance of data management; management decision making was also enhanced by more effective use of available info
Boost in efficiency was caused by:
Making the Globe and Mail data easier to locate and assemble
The SAP NetWeaver system integrated info from all sources available to the paper
Duplications were eliminated, and synchronization of data sources was achieved
A database is based on computer files, records, and items, and contains data on people, places, and things; the archetypical database is the phone book, which is a record of people who use phones and are listed (accessible to the public)
Most non-computer databases, namely paper records, rolodex files, folders, notes, etc. are sequential and can only be reviewed in a certain order
Data and file structures
Databases are usually organized as relational databases
These are 2-dimensional tables where each subject, e.g., customers, is organized as a set of entities (records), which contain things like names of each customer, address, ph ...
Similar to M12S06 - Will Technology-Assisted Predictive Modeling and Auto-Classification End the 'End User' Burden in Records Management? (20)
M12S23 - Right-sizing Your Information Footprint by Chucking Your Dead DataMER Conference
Speakers: Randolph A. Kahn, Esq. & Jonathan Redgrave, Esq.
Today, most organizations have too much digital content that has outlived its usefulness. Every year, the quantity of this unusable content multiplies. So, there's no time like the present to get busy, get cleaner and get meaner.
Having a defensible methodology and using the right tool for the task allows organizations to right-size their "Information Footprint" without worrying about regulatory or legal consequences.
If your organization right-sizes its "Information Footprint", it will be much better off. Keeping old, unnecessary content results in higher costs and risks. If the costs and risks are big enough, the case for proactively cleaning up the "Information Footprint" is very compelling.
Read more: http://www.rimeducation.com/videos/rimondemand.php
M12S21 - "Corporate Alzheimer's": The Impending Crisis in Accessing Digital R...MER Conference
Speakers: Christine Ardern, Adrian Cunningham, Charles Dollar, Ph.D., Mariella Guercio, Ph.D., Kenneth Thibodeau, Ph.D.
Is your organization facing "Corporate Alzheimer's"?
Fast forward to 2020:
- Your organization's CDs are obsolete,
- Its social media has been replaced,
- The Cloud has evaporated, and
- The organization has restructured several times and the terminology it used in 2012 has changed.
The organization's information assets, however, are safe in tiered storage. Because the software and hardware has changed from what was used to create them, access to needed records and data now is limited and, in some instances, impossible.
This is Corporate Alzheimer's - the increasing inability over time to access an organization's long-term digital information - when we know it is there, but changes in computer hardware and software have made the needed information inaccessible/unreadable.
Read more: http://www.rimeducation.com/videos/rimondemand.php
M12S19 - S19 - CASE STUDY: e-RIM Success with Structured Data SystemsMER Conference
Speakers: Laurie Fischer, Kevin S. Joerling, & Michael S. McKenna
Today, the majority of an organization's business processes and functions are facilitated or supported by the use of electronic systems. In turn, many of these systems create, manage, and/or store electronic data that is "structured".
"Structured data" typically is created and stored according to a pre-defined data model and fits into relational tables, or can be stored in rows and columns.
Information stored in structured data systems often serves as the official evidence of the business process that the system facilitates. As such, this information needs to meet the retention and disposition requirements defined in the organization's retention schedule. The additional requirements for authentic, reliable and unchangeable records are especially challenging since most structured data systems were designed to store and process dynamic and non-redundant data.
Information Technology departments historically have taken the position that since "storage is cheap" the application of an organization's records retention schedule to structured data was not an efficient use of scarce IT resources.
Today, the volume of information and its legal discoverability present a compelling argument for change.
Read more: http://www.rimeducation.com/videos/rimondemand.php
M12S18 - Records and Information Management: What Healthcare Should be Learni...MER Conference
Speakers: Hon. Ronald J. Hedges, Linda Kloss, & Deborah Kohn, MPH RHIA FACHE
Healthcare organizations are making unprecedented investments in information technology to accelerate the transition from paper to electronic health records as a foundation for improving care delivery.
The health care industry is learning that implementing information management and communications technology does not ensure that information is complete, accurate, reliable, secure, or used appropriately.
In fact, research is revealing new data errors and other information-related unintended consequences can impede safe use of technology.
Read More: http://www.rimeducation.com/videos/rimondemand.php
M12S15 - CASE STUDY: Spoliation - The Actual Case As It Was To Be Argued in ...MER Conference
Speakers: Richard Cowen, Esq. Hon. Ronald J. Hedges Matthew Prewitt, Esq.
A key executive with access to his company's confidential and trade secret information downloads information from his company-owned laptop to flash drives, and then, the evening before resigning, uses a commercially available "wipe" to erase his laptop. He claims he didn't want his employer to see porn, personal messages and comments critical of his boss, and didn't realize he erased virtually everything. After leaving the company, and after suit had been filed, he throws away the flash drives, now claiming he did this so it would be clear he wouldn't have any information in his possession.
M12S13 - RIM for the Next Generation: A Call to ActionMER Conference
Speakers: Charles R. Booz, Julia Brickell, and Mike Salvarezza
The RIM paradigms of the past are fast becoming obsolete and unworkable. New perspectives and new approaches are required.
This session is a "Call to Action" - for a complete transformation of the practice of RIM - from regulations and laws to practices and policies.
The session begins by identifying four major changes that are redefining how business is conducted:
The emergence of a new generation of workers,
The proliferation of mobile technology,
The explosion of Social Media, and
The rapid advance of new and innovative technological capabilities.
Collectively, these four changes are rapidly and radically changing the world we live and work in. RIM leaders can and should be leading the charge to:
Change the way things are done,
Adjust legal, regulatory, and business expectations to better address new and different technologies, and
Incorporate cultural changes in the way business is conducted.
One approach is to build a prospective Information Governance model that is highly adaptable to changing circumstances and technologies - in order to avoid being trapped by the next paradigmatic fault underlying our basic RIM assumptions.
M12S11 - The Do's and Don'ts of Managing Social MediaMER Conference
Speakers: Sara Meaney Jesse Wilkins
Many organizations have moved, from experimenting with social media tools, to incorporating them into business processes. As a result, both commercial services and enterprise social content tools are significantly changing the ways organizations do business and interact with their constituents. As the amount of content generated by these tools increases, so does the need to manage them successfully.
Learn more: http://www.rimeducation.com/videos/rimondemand.php
M12S01 - The Information Tsunami: Where We Are and How to Move ForwardMER Conference
The document summarizes a presentation on the current state of electronic records and information management (eRIM) programs. It notes that many eRIM implementations are failing because records managers missed an opportunity to lead, and that senior management buy-in and business operations involvement are key to success. The presentation outlines the various stakeholders involved in eRIM and their sometimes conflicting interests. It recommends clarifying accountability for information management and addressing the effects of social media. Motivating senior leaders requires emphasizing value and risk, while business users must take responsibility for records and embrace necessary changes.
M12S09 - ERM Case Law: The Latest News, Trends, and IssuesMER Conference
1) The document discusses recent cases related to electronic records management (ERM) procedures and preservation of electronically stored information (ESI).
2) It provides examples of cases where proper ERM procedures were valuable, as well as some cases where failures to follow procedures resulted in sanctions.
3) The document also summarizes discussions on potential amendments to rules around preservation duties and spoliation sanctions.
M12S08 - Transforming RIM to 'Responsible Information Management'MER Conference
From the MER Conference 2012
Speaker: Karen Strong
For many years "RIM" has been the name of the function within organizations responsible for Records & Information Management. RIM represents the discipline of managing an organization's records and information according to standards, guidelines, laws, and regulations with a focus on compliance.
It is time, however, to re-think some of the traditional assumptions about Records & Information Management. This session details the transformational value in a change to "Responsible Information Management (RIM)".
Responsible Information Management re-sets the vision and direction of RIM and can put your organization on a path to adopt and sustain "best practice" information management behaviors.
Responsible Information Management takes into consideration the roles and responsibilities of every user who creates and receives information in their daily work activities with clear expectations regarding individual and organizational accountability.
In this session, learn:
The necessary changes in behavior at the individual and organizational level to achieve Responsible Information Management, and
The steps you can take to achieve the required business results of Responsible Information Management by leveraging best practices in organizational transformation and change management.
M12S05 - CASE STUDY: Leveraging Content Analytics to Kick-Start your Informat...MER Conference
This document summarizes a presentation about using content analytics to kickstart an information governance initiative. It discusses challenges organizations face with growing data volumes and regulatory obligations. It then describes how content assessment, using analytics, classification, and collection, can help organizations understand their information landscape, prioritize efforts, and enable defensible disposition of data. The presentation includes an example case study of how one large financial organization used these techniques.
From the MER Conference 2012
Speaker: Bruce Miller
The first Electronic Recordkeeping software emerged in 1991.
The US DoD 5015.2 standard has been in place since 1997, and is now undergoing its third major revision.
Some 60+ product certifications against the standard have been granted to date.
Today approximately 20 different products remain certified. These products continue to be sold around the world.
Yet successful deployment is nowhere near expectations.
Hear Bruce's unique perspective as he reviews the successes and frustrations of the very technology he invented and has evangelized for two decades.
In this session, learn:
- Why we have failed to realize the promise of ERM software,
- What went wrong,
- What have we achieved,
- Where have we failed to meet expectations,
- Why do we still seem unable to make it work, and
- Where we are now.
Bruce will make the case that in order to achieve the adoption rates we expect, change will have to come from all four stakeholder groups:
- RIM practitioners,
- Software vendors,
- IT managers, and
- Business leadership.
Hear Bruce's compelling vision for a way-forward roadmap for a more successful electronic recordkeeping future.
M12S07 - Retention & ESI - Paths to Success - Part TwoMER Conference
From MER Conference 2012
Speakers: Christine Burns and Carol Stainbrook
This session explains "why" your organization's technology selections impact "how" the updated retention schedules described in part one of this two-part session can be applied to electronically stored information (ESI). Learn reasonable and actionable approaches for embedding retention policies into e-mail, file shares and enterprise applications.
This session will address:
- Why "perfection" is often impractical, when it comes to applying retention policy to ESI and some reasonable alternatives to perfection.
- How the technologies for email, file shares, and other ESI affect the implementation of retention policies.
- When it may be necessary to choose different retention strategies for different technologies such as e-mail, file shares and enterprise applications.
- Considerations for applying retention policy to data in enterprise applications.
- Criteria to help prioritize where to begin when applying retention policy.
In this session you will learn how to tailor your organization's approach to retention schedules so they are reasonable, actionable and result in the orderly destruction of eligible information, given your organization's technology selections.
Walmart Business+ and Spark Good for Nonprofits.pdfTechSoup
"Learn about all the ways Walmart supports nonprofit organizations.
You will hear from Liz Willett, the Head of Nonprofits, about what Walmart is doing to help nonprofits, including Walmart Business and Spark Good. Walmart Business+ is a new offer for nonprofits that provides discounts and also streamlines nonprofits' order and expense tracking, saving time and money.
The webinar may also give some examples on how nonprofits can best leverage Walmart Business+.
The event will cover the following:
Walmart Business+ (https://business.walmart.com/plus) is a new shopping experience for nonprofits, schools, and local business customers that connects an exclusive online shopping experience to stores. Benefits include free delivery and shipping, a “Spend Analytics” feature, special discounts, deals and tax-exempt shopping.
Special TechSoup offer for a free 180 days membership, and up to $150 in discounts on eligible orders.
Spark Good (walmart.com/sparkgood) is a charitable platform that enables nonprofits to receive donations directly from customers and associates.
Answers about how you can do more with Walmart!"
This document provides an overview of wound healing, its functions, stages, mechanisms, factors affecting it, and complications.
A wound is a break in the integrity of the skin or tissues, which may be associated with disruption of the structure and function.
Healing is the body’s response to injury in an attempt to restore normal structure and functions.
Healing can occur in two ways: Regeneration and Repair
There are 4 phases of wound healing: hemostasis, inflammation, proliferation, and remodeling. This document also describes the mechanism of wound healing. Factors that affect healing include infection, uncontrolled diabetes, poor nutrition, age, anemia, the presence of foreign bodies, etc.
Complications of wound healing include infection, hyperpigmentation of the scar, contractures, and keloid formation.
Temple of Asclepius in Thrace. Excavation resultsKrassimira Luka
The temple and the sanctuary around were dedicated to Asklepios Zmidrenus. This name has been known since 1875 when an inscription dedicated to him was discovered in Rome. The inscription is dated in 227 AD and was left by soldiers originating from the city of Philippopolis (modern Plovdiv).
Chapter wise All Notes of First year Basic Civil Engineering.pptxDenish Jangid
Chapter wise All Notes of First year Basic Civil Engineering
Syllabus
Chapter-1
Introduction to objective, scope and outcome the subject
Chapter 2
Introduction: Scope and Specialization of Civil Engineering, Role of civil Engineer in Society, Impact of infrastructural development on economy of country.
Chapter 3
Surveying: Object Principles & Types of Surveying; Site Plans, Plans & Maps; Scales & Unit of different Measurements.
Linear Measurements: Instruments used. Linear Measurement by Tape, Ranging out Survey Lines and overcoming Obstructions; Measurements on sloping ground; Tape corrections, conventional symbols. Angular Measurements: Instruments used; Introduction to Compass Surveying, Bearings and Longitude & Latitude of a Line, Introduction to total station.
Levelling: Instrument used Object of levelling, Methods of levelling in brief, and Contour maps.
Chapter 4
Buildings: Selection of site for Buildings, Layout of Building Plan, Types of buildings, Plinth area, carpet area, floor space index, Introduction to building byelaws, concept of sun light & ventilation. Components of Buildings & their functions, Basic concept of R.C.C., Introduction to types of foundation
Chapter 5
Transportation: Introduction to Transportation Engineering; Traffic and Road Safety: Types and Characteristics of Various Modes of Transportation; Various Road Traffic Signs, Causes of Accidents and Road Safety Measures.
Chapter 6
Environmental Engineering: Environmental Pollution, Environmental Acts and Regulations, Functional Concepts of Ecology, Basics of Species, Biodiversity, Ecosystem, Hydrological Cycle; Chemical Cycles: Carbon, Nitrogen & Phosphorus; Energy Flow in Ecosystems.
Water Pollution: Water Quality standards, Introduction to Treatment & Disposal of Waste Water. Reuse and Saving of Water, Rain Water Harvesting. Solid Waste Management: Classification of Solid Waste, Collection, Transportation and Disposal of Solid. Recycling of Solid Waste: Energy Recovery, Sanitary Landfill, On-Site Sanitation. Air & Noise Pollution: Primary and Secondary air pollutants, Harmful effects of Air Pollution, Control of Air Pollution. . Noise Pollution Harmful Effects of noise pollution, control of noise pollution, Global warming & Climate Change, Ozone depletion, Greenhouse effect
Text Books:
1. Palancharmy, Basic Civil Engineering, McGraw Hill publishers.
2. Satheesh Gopi, Basic Civil Engineering, Pearson Publishers.
3. Ketki Rangwala Dalal, Essentials of Civil Engineering, Charotar Publishing House.
4. BCP, Surveying volume 1
Leveraging Generative AI to Drive Nonprofit InnovationTechSoup
In this webinar, participants learned how to utilize Generative AI to streamline operations and elevate member engagement. Amazon Web Services experts provided customer-specific use cases and dove into low/no-code tools that are quick and easy to deploy through Amazon Web Services (AWS).
🔥🔥🔥🔥🔥🔥🔥🔥🔥
I place in your hands one of the strongest study handouts I have designed
Skeletal system anatomy handout (Theory 3)
💀💀💀💀💀💀💀💀💀💀
This handout has several features:
1- It is translated in a way that suits all levels
2- It contains 78 illustrations, one for every term in the handout (every term!)
#فهم_ماكو_درخ (understanding, not rote memorization)
3- The accuracy of the writing and the images is very, very high
4- Some information is explained in great detail (information that students usually find obscure is nevertheless explained here in great detail)
5- The handout explains itself; it practically tells you "come and read me"
6- The first slide of the handout contains a map covering all the branches of skeletal system information mentioned in this handout
Finally, this handout is yours freely; all I ask is that you pray for my well-being and good health
Best of luck, my colleagues. Your colleague, Mohammed Al-Dhahabi 💊💊
🔥🔥🔥🔥🔥🔥🔥🔥🔥
How Barcodes Can Be Leveraged Within Odoo 17Celine George
In this presentation, we will explore how barcodes can be leveraged within Odoo 17 to streamline our manufacturing processes. We will cover the configuration steps, how to utilize barcodes in different manufacturing scenarios, and the overall benefits of implementing this technology.
Elevate Your Nonprofit's Online Presence_ A Guide to Effective SEO Strategies...TechSoup
Whether you're new to SEO or looking to refine your existing strategies, this webinar will provide you with actionable insights and practical tips to elevate your nonprofit's online presence.
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.pptHenry Hollis
The History of NZ 1870-1900.
Making of a Nation.
From the NZ Wars to Liberals,
Richard Seddon, George Grey,
Social Laboratory, New Zealand,
Confiscations, Kotahitanga, Kingitanga, Parliament, Suffrage, Repudiation, Economic Change, Agriculture, Gold Mining, Timber, Flax, Sheep, Dairying,
M12S06 - Will Technology-Assisted Predictive Modeling and Auto-Classification End the 'End User' Burden in Records Management?
1. Cohasset Associates, Inc.
NOTES
Will Technology-Assisted Predictive Modeling and Auto-
Classification End the ‘End-User’ Burden in
Records Management?
2012 Managing Electronic Records Conference
Chicago, IL
May 7, 2012
Jason R. Baron, Esq.
Director of Litigation
Office of General Counsel
National Archives and Records Administration
Dave Lewis, Ph.D.
David D. Lewis Consulting, LLC
Chicago, IL
A New Era of Government
“[P]roper records management is the backbone of open Government.”
President Obama’s Memorandum dated November 28, 2011
re “Managing Government Records”
http://www.whitehouse.gov/the-press-office/2011/11/28/presidential-memorandum-
managing-government-records
2012 Managing Electronic Records Conference 6.1
Reality:
The era of Big Data has just
begun….
Lehman Brothers Investigation
-- 350 billion page universe (3 petabytes)
-- Examiner narrowed collection by selecting
key custodians, using dozens of Boolean
searches
-- Reviewed 5 million docs (40 million pages
using 70 contract attorneys)
Source: Report of Anton R. Valukas, Examiner, In re Lehman Brothers Holdings Inc., et al., Chapter 11
Case No. 08-13555 (U.S. Bankruptcy Ct. S.D.N.Y. March 11, 2010), Vol. 7, Appx. 5, at
http://lehmanreport.jenner.com/.
Process Optimization Problem 1: The
transactional toll of user-based
recordkeeping schemes (“as is” RM)
…. and the need for
better, automated solutions ….
Impact of Technology on E-Records
Management: Snapshot 2012 (“As is”)
A universe of proprietary products exists in the
marketplace: document management and
records management applications (RMAs)
DoD 5015.2 version 3 compliant products
However, scalability issues exist
Agencies must prepare to confront significant
front-end process issues when transitioning to
electronic recordkeeping
Records schedule simplification is key
RM wish list for 2012….
RM’s “easy button”: the elusive goal of zero
extra keystrokes to comply with RM
requirements (capture)
A technology app that automatically tags
records in compliance with RM policies and
practices (categorize)
Supervised learning RM with minimal records
officer or end user involvement (learn)
Rule-based and role-based RM
Advanced search
Electronic Archiving As The
First Step
What is it?
100% snapshot of (typically) email, plus in some
cases other selected ESI applications
How does it differ from an RMA?
Goal is of preservation of evidence, not records
management per se
NARA Bulletin 2008-05
A Possible Path Forward?
Email archiving in short term, synced to existing
proprietary software on email system
Designation of key senior officials as creating
permanent records, consistent with existing records
schedules
Additional designations of permanent records by
agency component
“Smart” filters/categorical rules built in based on
content, to the extent feasible to do
Default are records in designated temporary record
buckets, disposed of under existing records
schedules.
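The path outlined above (designated senior officials, component-level designations, content-based filters, and a temporary default) can be pictured as a short rule cascade. The following is a minimal sketch under stated assumptions, not an implementation of any agency system: the addresses, component name, keywords, and the `Email` and `categorize` names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical inputs: every address, component, and keyword below is
# illustrative only and does not come from any real records schedule.
SENIOR_OFFICIALS = {"secretary@agency.example", "deputy@agency.example"}
PERMANENT_COMPONENTS = {"Office of the Historian"}
CONTENT_RULES = [("treaty", "permanent"), ("policy decision", "permanent")]


@dataclass
class Email:
    sender: str
    component: str
    body: str


def categorize(msg: Email) -> str:
    """Return 'permanent' or 'temporary' by applying the rules in priority order."""
    # 1. Key senior officials are designated as creating permanent records.
    if msg.sender.lower() in SENIOR_OFFICIALS:
        return "permanent"
    # 2. Additional designations made by agency component.
    if msg.component in PERMANENT_COMPONENTS:
        return "permanent"
    # 3. "Smart" content-based filters, to the extent feasible.
    text = msg.body.lower()
    for keyword, bucket in CONTENT_RULES:
        if keyword in text:
            return bucket
    # 4. Default: a temporary bucket disposed of under existing schedules.
    return "temporary"
```

Ordering the checks from role to component to content keeps the cheapest and least ambiguous rules first, so most messages never reach the content filters.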
A pyramid approach combines disposition policy with automated tools to bring FRA email under records management, preservation, and access
[Figure: pyramid with a "set-point" slider. Legend: permanent or top officials; temporary or staff and support.]
The position of the “set-point” for email capture depends on policy and resources: setting it higher allows use of tools now available to get 100% of email at lower volumes;* setting it lower means more records will be captured and smarter tools are needed to distinguish and disposition temporary- and non-record.
Implementing an email archiving policy is feasible now, since tools are readily available to capture 100% of email traffic at the individual or organizational level, in formats that can be archived.
How To Avoid A Train Wreck
With Email Archiving….
Capture E-mail But Utilize Records Management!
Functional Requirements for
Categorization Products in the Federal
workplace
Ease of use …. Scalability …. Archiving in native
formats….. Metadata preservation … Seamless integration
with existing software apps …. Versioning …. Compatibility
with big bucket records schedules …. Advanced search
capabilities …. Ease of training / machine learning using
records officers or end users …. Cost
Process Optimization Problem 2: The
Coming Age of Dark Archives (and the
inability to provide access)
Emerging New Strategies:
“Predictive Analytics”
Improved review and case assessment: cluster docs thru use of software with minimal human intervention at front end to code “seeded” data set
Slide adapted from Gartner Conference, June 23, 2010, Washington, D.C.
Language Processing Technologies
Information Retrieval
1. Classification
2. Retrieval / Search
Natural Language Processing
Question Answering
Summarization
Entity Recognition
Information Extraction
Machine Translation
...
Text Classification
Deciding which of
several groups a text
belongs to
Crudest form of
language
understanding...
...but often can be automated
with high accuracy
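To make "deciding which of several groups a text belongs to" concrete, here is a minimal sketch of a supervised text classifier. It uses multinomial Naive Bayes with add-one smoothing, which is one common technique chosen here as an assumption; the slides do not name a specific algorithm. The function names and the "record" / "non-record" labels in the usage note are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Learn per-class word counts from manually labeled (text, label) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return {"docs": doc_counts, "words": word_counts, "vocab": vocab}


def classify(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    total_docs = sum(model["docs"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["docs"].items():
        score = math.log(n_docs / total_docs)  # class prior
        total_words = sum(model["words"][label].values())
        for word in tokenize(text):
            if word in model["vocab"]:  # ignore words never seen in training
                count = model["words"][label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

A usage sketch: train on a handful of hand-labeled messages such as `("quarterly budget report for fiscal year", "record")` and `("lunch on friday anyone", "non-record")`, then call `classify(model, "revised budget contract")`. Real deployments would need far more training data than a toy example like this.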
Why Classify?
Reduce infinite variety of text...
...to finite set of classes...
...to specify an action for every possible input.
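The idea on this slide can be sketched in a few lines: once every text has been assigned to one of a finite set of classes, policy becomes a lookup table with one action per class. The class names, the actions, and the fallback below are hypothetical and are not taken from any actual records schedule.

```python
# Hypothetical classes and dispositions, for illustration only.
ACTIONS = {
    "permanent_record": "transfer to archives",
    "temporary_record": "retain, then dispose per schedule",
    "non_record": "dispose when no longer needed",
}

def action_for(assigned_class: str) -> str:
    """Return the policy action for a class.

    Anything outside the finite set falls back to manual review
    (an assumed default, not part of the slide).
    """
    return ACTIONS.get(assigned_class, "route to records officer for review")

print(action_for("temporary_record"))
```

The classifier has to cope with unlimited variety in the text, while the policy only has to cover as many cases as there are classes.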
Other Advantages of Text
Classification
Supervised learning:
Classifiers (rules) can be
learned by imitating manual
classifications
Straightforward numerical
measures of quality
(e.g., recall: 85% +/- 4%, precision: 75% +/- 3%)
Objective reason why a
decision was made
(the classification rule)
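As a rough illustration of the "numerical measures of quality" mentioned above, the sketch below computes precision and recall by comparing a classifier's decisions with manual ones, plus an approximate margin of error. The sample labels are invented, and the normal-approximation margin is one common choice, not necessarily how the figures on the slide were produced.

```python
import math

def precision_recall(manual, predicted):
    """Compare predicted labels with manual ones (True = document is in the class)."""
    tp = sum(1 for m, p in zip(manual, predicted) if m and p)       # correctly selected
    fp = sum(1 for m, p in zip(manual, predicted) if not m and p)   # wrongly selected
    fn = sum(1 for m, p in zip(manual, predicted) if m and not p)   # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def margin_95(proportion, n):
    """Approximate 95% margin of error for a proportion measured on n items."""
    return 1.96 * math.sqrt(proportion * (1 - proportion) / n) if n else 0.0

# Invented sample: manual review decisions vs. classifier decisions.
manual    = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False]
precision, recall = precision_recall(manual, predicted)
print(f"precision={precision:.0%} recall={recall:.0%}")
```

For precision, `n` in `margin_95` would be the number of documents the classifier selected; for recall, the number of documents that truly belong to the class.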
Variations on Classification
Binary vs. multiclass
Hierarchical
Probabilistic (e.g., 83% / 17%)
Graded / ordered / fuzzy
Defining Sets of Classes
Tradeoff among
Ideal classes to
implement policy
Classes you can teach
people to assign
Classes you can
teach software to assign
Be skeptical of automatic
discovery of classes
Text Retrieval Systems
AKA search engines,
semi-structured
databases, text
databases, etc.
Classification        Search
autonomous            interactive
long term             transitory
organizational        personal
structured            independent
Some Distinctions Among
Search Approaches
Exact Match vs. Ranked Retrieval vs. Browsing
"Concepts" vs. "Keywords"
Text Representations
Matching Aids
Exact Match Search
Query specifies conditions
document must meet
Example: budget AND Knoxville AND (revised OR preliminary)
Variants
Boolean
SQL
Faceted
Often (ambiguously) called
"keyword" search
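A minimal sketch of exact match search, using the example query from this slide: a document is either returned or not, with no ranking. The tokenizer and the three sample documents are assumptions made for illustration.

```python
import re

def terms(text):
    """Lowercased set of words in a document (a deliberately crude tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(text):
    """Exact match test for: budget AND Knoxville AND (revised OR preliminary)."""
    t = terms(text)
    return "budget" in t and "knoxville" in t and ("revised" in t or "preliminary" in t)

# Invented sample documents.
docs = [
    "Revised budget for the Knoxville office",
    "Preliminary Knoxville staffing plan",   # no "budget"
    "Knoxville budget, final version",       # neither "revised" nor "preliminary"
]
hits = [d for d in docs if matches(d)]
print(hits)
```

A document that misses any one condition is excluded entirely, however close it is otherwise.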
A Faceted Search Interface
Ranked Retrieval
Query specifies important
attributes of desired
documents
System statistically weights
those attributes
Results returned in order of
strength of match
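The three steps above can be sketched with a simple TF-IDF score, one common weighting scheme (the slide does not name a specific one). The smoothing in the logarithm, the tokenizer, and the sample documents are assumptions.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def rank(query, docs):
    """Return (score, doc) pairs, strongest match first, using a simple TF-IDF sum."""
    doc_counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    scored = []
    for doc, counts in zip(docs, doc_counts):
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for c in doc_counts if term in c)  # documents containing the term
            if df:
                # Rarer terms get a larger weight; (n + 1) is an assumed smoothing choice.
                score += counts[term] * math.log((n + 1) / df)
        scored.append((score, doc))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Invented sample documents.
docs = [
    "budget meeting notes",
    "revised budget for Knoxville",
    "holiday party schedule",
]
results = rank("Knoxville budget", docs)
for score, doc in results:
    print(f"{score:.2f}  {doc}")
```

Unlike exact match, a document containing only some of the query terms is still returned, just lower in the list.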
Statistical Evidence in Ranked
Retrieval
Corpus statistics
Word (and metadata) counts
Unsupervised learning
Clustering, LSI/LSA etc.
finds (maybe useless) patterns
Supervised learning
aka "relevance feedback"
learn indicators of user interest
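One classic way to "learn indicators of user interest" is a Rocchio-style update, sketched below. The slide does not name a technique, so Rocchio is a substitution for illustration; the parameter values and sample texts are assumptions.

```python
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def feedback(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update: shift query term weights toward documents the
    user marked relevant and away from those marked non-relevant."""
    weights = Counter()
    for term in tokenize(query):
        weights[term] += alpha
    for doc in relevant:
        for term, count in Counter(tokenize(doc)).items():
            weights[term] += beta * count / len(relevant)
    for doc in nonrelevant:
        for term, count in Counter(tokenize(doc)).items():
            weights[term] -= gamma * count / len(nonrelevant)
    # Keep only terms that still count as positive evidence.
    return {t: w for t, w in weights.items() if w > 0}

# Invented example: the user judged one hit relevant and one not.
new_query = feedback(
    "budget",
    relevant=["revised budget Knoxville"],
    nonrelevant=["budget airline tickets"],
)
print(new_query)
```

The reweighted query can then be fed back into a ranked retrieval system, so terms the user never typed ("revised", "Knoxville") start to influence the ordering.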
Browsing
Hierarchies
Networks
Clusters
Spaces / Maps / Dimensions
make great pictures / demos
unclear if useful for finding information
Visual Analysis Examples
(Presentation by Dr. Victoria Lemieux, Univ. British Columbia,
at Society of American Archivists Annual Mtg. 2010, Washington, D.C.)
With acknowledgments to Jeffrey Heer, Exploring Enron, http://hci.stanford.edu/jheer/projects/enron/,
Adam Perer, Contrasting Portraits, http://hcil.cs.umd.edu/trs/2006-08/2006-08.pdf,
and Fernanda Viegas, Email Conversations, http://fernandaviegas.com/email.html
What Evidence Can The
Search Software Use?
Words, phrases, etc.
Manually assigned categories
Metadata
Author, organization, creation date, change
date, access date, length, file type,...
Contextual information (links,
attachments,...)
What Resources Aid
Matching?
Linguistic analysis
At word level or higher
Clusters / spaces / ...
Thesauri / semantic nets /
concept maps / ...
Suited to your task?
Modifiable?
How is text determined to
belong to category?
Concepts v. Keywords
Supreme Court of Information Retrieval, Case No. 1-tfidf-0-2902, 2009
Search software marketing:
Them = keyword search = bad
Us = concept search = good
Reality:
Both terms have referred to dozens of
different technologies...
...including some of the same ones!
Conceptual search is an aspiration, not
a technology
Example of Boolean search string
from U.S. v. Philip Morris
(((master settlement agreement OR msa) AND NOT (medical
savings account OR metropolitan standard area)) OR s. 1415
OR (ets AND NOT educational testing service) OR (liggett
AND NOT sharon a. liggett) OR atco OR lorillard OR (pmi
AND NOT presidential management intern) OR pm usa OR
rjr OR (b&w AND NOT photo*) OR phillip morris OR batco
OR ftc test method OR star scientific OR vector group OR
joe camel OR (marlboro AND NOT upper marlboro)) AND
NOT (tobacco* OR cigarette* OR smoking OR tar OR
nicotine OR smokeless OR synar amendment OR philip
morris OR r.j. reynolds OR ("brown and williamson") OR
("brown & williamson") OR bat industries OR liggett group)
U.S. v. Philip Morris E-mail Winnowing
Process
20 million email records
200,000 hits based on keyword terms used (1%)
100,000 relevant emails
80,000 produced to opposing party
20,000 placed on privilege logs
A PROBLEM: only a handful entered as exhibits at trial
A BIGGER PROBLEM: the 1% figure does not scale
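The arithmetic behind the winnowing figures can be checked with a short sketch. The stage counts come from the slide; the one-billion-email collection is a hypothetical used to illustrate one reading of "does not scale", namely that a constant 1% hit rate still leaves millions of documents to review.

```python
# Stage counts as given on the slide.
stages = [
    ("email records", 20_000_000),
    ("hits based on keyword terms", 200_000),
    ("relevant emails", 100_000),
    ("produced to opposing party", 80_000),
    ("placed on privilege logs", 20_000),
]
total = stages[0][1]
share = {name: count / total for name, count in stages}
for name, count in stages:
    print(f"{count:>12,}  {name:<30} {count / total:.2%} of all email")

# Hypothetical larger collection: the same hit rate applied to 1 billion emails.
# Integer arithmetic avoids floating-point rounding.
hits_at_one_billion = 1_000_000_000 * 200_000 // 20_000_000
print(f"{hits_at_one_billion:,} hits to review at the same 1% rate")
```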
Judicial endorsement of predictive analytics
in document review by Judge Peck in Da
Silva Moore v. Publicis Groupe (S.D.N.Y. Feb.
24, 2012)
This opinion appears to be the first in which a Court
has approved of the use of computer-assisted review.
. . . What the Bar should take away from this Opinion
is that computer-assisted review is an available tool
and should be seriously considered for use in large-
data-volume cases where it may save the producing
party (or both parties) significant amounts of legal
fees in document review. Counsel no longer have to
worry about being the ‘first’ or ‘guinea pig’ for judicial
acceptance of computer-assisted review . . .
Computer-assisted review can now be considered
judicially-approved for use in appropriate cases.
Social Networking/Links Analysis Example
From Marc Smith
Posted on Flickr
Under Creative Commons License
Judicial second guessing of failure to use
e-search capabilities: Capitol Records v.
MP3 Tunes, 261 F.R.D. 44 (S.D.N.Y. 2009)
“In [a prior case] the Court notes its dismay that the
party opposing discovery of its ESI had organized its
files in a manner which seemed to serve no purpose
other than ‘to discourage audits. . .’ Similarly, in this
case, [the party] host[ed] no ediscovery software on
their servers and apparently are unable to conduct
centralized email searches of groups of users
without downloading them to a separate file and
relying on the services of an outside vendor.”
Judicial second guessing of failure to use
e-search capabilities: Capitol Records v.
MP3 Tunes (cont’d)
Court went on to add:
“The day undoubtedly will come when
burden arguments based on a large
organization’s lack of internal ediscovery
software will be received about as well as the
contention that a party should be spared from
retrieving paper documents because it had
filed them sequentially, but in no apparent
groupings, in an effort to avoid the added
expense of file folders or indices.”
Problem 3: Innovative
Thinking
The records management world of
tomorrow….
References
Background Law Review Referencing Autocategorization &
Advanced Search
J. Baron, “Law in the Age of Exabytes: Some Further Thoughts on
‘Information Inflation’ and Current Issues in E-Discovery
Search,” 17 Richmond J. Law & Technology (2011), see
http://law.richmond.edu
Latest “Predictive Coding” Case Law to follow in blogs online:
Da Silva Moore v. Publicis Groupe & MSL Group, 11 Civ. 1279
(S.D.N.Y.) (Peck, M.J.) (Opinion dated Feb. 24, 2012)
Kleen Products, LLC v. Packaging Corp. of America, 10 C 5711
(N.D. Ill.) (Nolan, M.J.)
Jason R. Baron
Director of Litigation
Office of General Counsel
National Archives and
Records Administration
(301) 837-1499
Email: jason.baron@nara.gov
Dave Lewis, Ph.D.
David D. Lewis Consulting, LLC
Chicago, IL
Email: consult@DavidDLewis.com
http://www.DavidDLewis.com