Challenges in Closing Information and Records Management Capability Gaps in SharePoint

  • conceptClassifier for SharePoint is fully integrated with SharePoint, Microsoft Office, Windows Server 2008 R2 FCI, FAST, and Microsoft Enterprise Search. The automatic extraction of compound terms enables the Subject Matter Expert (SME) to use the terms within the taxonomy generation process, reducing the time to build out and maintain taxonomies by 80%. Features:
      • Downloadable in 30 minutes, no programming required
      • Automatic classification and compound term metadata extraction
      • Classification technology uses concept extraction and compound term processing
      • Taxonomy-based and faceted navigation
      • Robust suite of tools to build and maintain taxonomies
      • Fully integrated with Content Types
      • Automatic classification from MS Office and Outlook
      • Taxonomy browse, faceted navigation, and preview functionality from the search interface
      • Can automatically classify from SharePoint, folders, and web sites, providing a single interface to all permissible content
      • Simple, intuitive interface designed for the SME
      • Fully SOA compliant, delivered as Web Parts, based on open standards
      • Integrates with Microsoft Office, Microsoft Records Center

    1. Challenges in Closing Information & Records Management Capability Gaps
    2. • Welcome and Introductions
       • Dave Sanchez of Concept Searching
       • Juan Celaya of COMPU-DATA
       • Case Study
       • Questions and Wrap Up
    3. Concept Searching, Inc.
       • Company founded in 2002
       • Product launched in 2003
       • Focus on management of structured and unstructured information
       • Privately held and profitable, no funding
       • Growth rate of 35% in 2008 and in excess of 100% for 2009
       • Founders and management team with company since inception
       • Technology
           • Automatic concept identification, content tagging, auto-classification, taxonomy management
           • Only statistical vendor that can extract conceptual metadata
       • 2009 and 2010 ‘100 Companies that Matter in KM’ (KMWorld Magazine)
       • KMWorld ‘Trend Setting Product’ of 2009
       • Locations: US, UK, & South Africa
       • Client base: Fortune 500/1000 organizations
       • Managed Partner under Microsoft global ISV Program: “go to partner” for Microsoft for auto-classification and taxonomy management
       • Microsoft Enterprise Search ISV, FAST Partner
       David Sanchez * davids@conceptsearching.com * 1 (713) 893-1743
       Don Miller * donm@conceptsearching.com * (408) 828-3400
    4. Problems
       • Lack of Information Transparency: e-Discovery and FOIA
           • Government and Private Sector directives to tag content for retrieval
           • Untagged Data Assets = Untapped Resources
           • Time Gap between Information Requests and Discovery is Directly Proportional to Volume of Data Assets
       • Non-Compliance with Records Management Policies
           • Sarbanes-Oxley and Government RM Retention Schedules
           • Record Declaration process is manual
           • Data Stored in Wrong Location & Information not Preserved in Accordance with Regulatory Guidelines
       • Increasing Volume of Unplanned Data Exposure Events
           • Privacy Act Program (PII), Protected Health Information (PHI), HIPAA, Payment Card Industry (PCI), etc.
           • Organizational Confidential and Sensitive Information
    5. By Sector (pie chart)
       • Medicine: 12%
       • Government: 19%
       • Business: 48%
       • Education: 21%
       Source: Open Security Foundation
    6. By Type (pie chart)
       Categories: Disposal, Email, Virus, Fraud, Web, Snail Mail, Unknown, Hack, Lost/Stolen Computers and Documents (45%)
       Source: Open Security Foundation
    7. Government (pie chart)
       Categories: Disposal, Email, Fraud, Virus, Web, Snail Mail, Hack, Unknown, Lost/Stolen Computers and Documents (49%)
       Source: Open Security Foundation
    8. Human Factors
       • Physical or Cognitive Properties of an Individual or Human Social Behavior which Influence Functioning of Technological Systems
       [Diagram labels: Access Rights; Records Retention Code; Server; Content with Metadata; Appropriate Tagging: Metadata, Retention Codes, and Rights Management Templates; Document Library 1; Document Library 2; Document Library 3; Document Library 4]
    9. Human Factors
       • Physical or Cognitive Properties of an Individual or Human Social Behavior which Influence Functioning of Technological Systems
       • Limiting Factor = Human Behavior
       [Diagram labels: Access Rights; Records Retention Code; Server; Content with Metadata; Appropriate Tagging: Metadata, Retention Codes, and Rights Management Templates; Document Library 1; Document Library 2; Document Library 3; Document Library 4]
    10. Alternatives
        • Customize system interface to force manual application of metadata
            • Pros: data assets now have metadata
            • Cons: high customization costs, increase in end-user labor costs, less end-user productivity, non-standardized application of metadata across enterprise
        • Hire temporary staff to add metadata to data assets
            • Pros: data assets now have metadata
            • Cons: temporary staff = $$$$$ and results in non-standardized tagging
        • Acknowledge that it is a problem and do nothing
    11. Solution: conceptClassifier for SharePoint
        [Diagram labels: Semantic Metadata Tagging; Increase Information Retrieval Precision for e-Discovery; Concept Classifier for SharePoint; Automatic Content Type Updating; SharePoint Security Services & Windows Rights Management; Records Retention Code Tagging; Appropriate Storage & Preservation; Document Library 1; Document Library 2; Document Library 3; Document Library 4]
    12. Live Demonstration
        • e-Discovery & FOIA (moss.conceptsearching.com)
            • Auto-classification to multiple vocabularies
            • Faceted Searching
            • Taxonomy Browsing
        • Records Management
            • Aligning Vocabulary to Records Retention Codes
            • Record Declaration Process: tagging documents with retention codes
        • Information Management: Data Privacy & Security Compliance
            • PII, PHI, and PCI tagging
            • Sensitive content (FOUO, Secret, Internal Use Only: contracts, labor rates, etc.)
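The PII and PCI tagging item above can be illustrated with a deliberately simple sketch. This is not how conceptClassifier works (the slides describe concept extraction, not pattern matching); it swaps in plain regular expressions plus a Luhn checksum to show what "tagging a document as PII or PCI" might look like in practice. The patterns, tag names, and the function `tag_sensitive_content` are all assumptions made for illustration.

```python
import re

# Hypothetical, simplified patterns; real detection needs far broader rules.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")        # US SSN-like strings
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")      # 13 to 16 digit runs


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:      # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return bool(digits) and total % 10 == 0


def tag_sensitive_content(text: str) -> set[str]:
    """Return the set of illustrative tags ("PII", "PCI") found in `text`."""
    tags = set()
    if SSN_PATTERN.search(text):
        tags.add("PII")
    for match in CARD_PATTERN.finditer(text):
        if luhn_valid(match.group()):   # filter out random digit runs
            tags.add("PCI")
            break
    return tags
```

The checksum step is there because a bare digit pattern would flag many ordinary numbers; a production tagger would presumably combine such checks with surrounding context.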
    13. conceptClassifier
        We Make Metadata Work For You
        • Automatic Conceptual Metadata Generation
        • Automated Classification
        • Taxonomy Development & Management
            • Proven to reduce taxonomy development time by 80%
        • Microsoft Integration
            • Runs natively in SharePoint 2007 and SharePoint 2010, Microsoft Office Applications, SharePoint Search and FAST, Windows Server 2008 R2 FCI
            • Fully integrated with SharePoint Content Types
        • Content Type Updater
            • Automatically changes the Content Type based on presence of organizationally defined metadata found within the document
            • Identification of confidential/privacy data
            • Ability to identify records based on the records retention schedule and route to the records center
        • Technology
            • Downloadable in 30 minutes, no programming required
            • Fully SOA compliant, delivered as Web Parts, based on open standards
            • Highly scalable
    14. David Sanchez * davids@conceptsearching.com * 1 (713) 893-1743
    15. Leveraging Metadata as an Enabling Asset
        • Uses Taxonomy Manager to create and manage organizational taxonomies, ontologies, and metadata environment;
        • Employs conceptClassifier for SharePoint as an Automated Metadata Population Service;
        • Applies content types based on metadata;
        • Uses content types derived from metadata to drive individual and group access to data assets using inherent SharePoint Security;
        • Uses content types derived from metadata to drive migration of data assets to proper document libraries where RMS templates are automatically applied to restrict data asset usage.
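The chain described above (metadata, then content type, then document library and RMS template) can be sketched as a small rule table. This is a minimal illustration under stated assumptions, not the conceptClassifier or SharePoint API: the `Route` type, the rule list, the tag names, and the library and template names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    """Where a document goes once its content type is decided (hypothetical)."""
    content_type: str
    library: str
    rms_template: Optional[str]   # None means no rights restriction applied


# Ordered rules: the first metadata tag found on the document wins.
# Tag, content type, library, and template names are invented for illustration.
RULES = [
    ("PII", Route("Privacy Record", "Document Library 1", "Restricted - Privacy")),
    ("Contract", Route("Contract", "Document Library 2", "Internal Use Only")),
]
DEFAULT_ROUTE = Route("Document", "Document Library 4", None)


def route_document(metadata_tags: set[str]) -> Route:
    """Pick a content type and destination from the tags on a document."""
    for tag, route in RULES:
        if tag in metadata_tags:
            return route
    return DEFAULT_ROUTE
```

Ordering the rules puts the most restrictive classification first, so a contract that also contains privacy data would land in the privacy library rather than the contracts library.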
    16. COMPU-DATA International, LLC
        Juan J. Celaya, President/CEO, Senior Business & IT Consultant
        jcelaya@cdlac.com * Office: 281.292.1333 * www.cdlac.com * blog.cdlac.com
        Preserving the Worlds Knowledge - Available Anytime Anywhere (SM)
        ©2010 COMPU-DATA International, LLC, All Rights Reserved
    17. Company Overview
        • Who are we? CDI is a successful information management integrator based in Spring, Texas (north of Houston) with offices in Miami, FL and Stafford, VA. We have been in business for over 22 years, with 18 of those focused on Content and Data Integration (CADI™), enterprise search, classification, capture and data management. We are a small business and designated as a certified Texas HUB contractor.
        • What do we do? Integration, software development, and reselling of best-of-breed products for ECM solutions focused on Search, Automatic Classification, Capture and Business Automation (Workflows). We work with Government and private industry customers in delivering successful departmental and enterprise solutions.
        • Who do we serve? Medium to large organizations in government, health care, manufacturing and oil industries.
    18. Presentation Overview
        During this presentation we will:
        • For the case study:
            1. Summarize the issues facing U.S. Army researchers and records managers.
            2. Describe our approach in resolving those issues within the constraints of a DoD environment and discuss the software tools that comprise the solutions.
            3. Discuss the challenges in identifying and managing millions of documents.
            4. Review how automatic classification and metadata tagging enhance search in this environment.
            5. Address business outcomes and benefits in automating processes.
        • For conceptClassifier:
            1. Describe how conceptClassifier is being applied as part of the JSRRC project.
            2. Present Concept Searching’s technologies also working outside of the SharePoint® environment.
    19. U.S. Army Challenges: Records Management
        • Army Records Management
            • Provide oversight and program management for the Army's Records Management Program.
            • Establish programs for records collection and preservation from garrison, training, contingency, and wartime operations.
            • Operate and sustain the Army Electronic Archive and provide the means to identify, collect, index and retrieve important Army records, in hard copy and electronic media.
            • Management Information Control (AR 335-15)
        • Records Schedule
            • Hundreds of Record Series with around 4,000 individual record instructions.
            • End users faced with myriad choices when categorizing records.
            • Results in improper classification.
            • Neglect to use schedule at all.
            • Affects retention durations.
            • Reduces impetus to retain record materials.
            • Reduced consistency in tagging records to schedules.
            • New rules and procedural training.
            • Hundreds of locations and data environments.
    20. U.S. Army Challenges: JSRRC
        • Joint Service Records Research Center (JSRRC)
            • Validates veterans’ war-related claims for the Veterans Administration.
            • Primarily on Post-Traumatic Stress Disorder (PTSD), but also Agent Orange exposure and others.
            • Reviews cases for ALL services from WWII to present day.
            • Requires research of, among other sources, DoD field documents that relate to the specific individual and event.
        • Literally tens of millions of documents.
            • No categorization or indexing of documents.
            • Plethora of data sources.
        • Today: millions of electronic files in multiple formats are being generated daily.
            • Usefulness of data: not determined.
            • Manual identification: not feasible.
        • For JSRRC: finding a needle in the haystack!
        • Goal: standardize and consolidate field & internally generated data, providing a common research interface.
    21. CDI’s Solution Philosophy
        • For Records Management
            • Combination of Army process changes and implementation of technology tools.
            • Streamline Records into fewer functional series.
            • End user has minimal or no role in categorizing record.
            • Utilize Army’s ARIMS & SharePoint to attribute initial metadata.
            • Utilize conceptClassifier & conceptTaxonomyManager to correctly identify appropriate disposition based on content and metadata.
        • For JSRRC
            • Develop ability to integrate documents and data from myriad disparate sources utilizing CADI™ framework.
            • Utilize conceptClassifier to classify Army documents into discrete, searchable segments.
            • Leverage the classification implementation to enhance search, allowing for better results for the end users.
            • Implement the infrastructure that can be leveraged to move forward with Records Management, FOIA and Declassification organizations at RMDA.
    22. Primary Solution Components
        • Base Infrastructure:
            • conceptClassifier
            • conceptTaxonomyManager
            • conceptSearch
        • Application Infrastructure:
            • DigitalAsset Finder™
        • Professional Services for integration and implementation of solution.
    23. JSRRC Solution
        • Consolidate existing data sources:
            • Access databases
            • Applications
            • Network shared drives
        • Prepare for future data sources:
            • Identify possible origins
            • Volume and formats
            • No standards in data delivery
            • Support special security needs
            • Stored in different locations
        • Identify types of metadata & documents that:
            • Must be standardized
            • Derive concepts & content
            • Used to identify data to information relationships
        • Create taxonomies
    24. JSRRC Solution
        Infrastructure built to support initial two environments:
        • Environment #1
            • Windows-based 3-server group.
            • DigitalAsset Finder™, conceptSearch with Distributed Query Server, conceptClassifier & conceptTaxonomyManager.
            • Initial configuration for support of 200 terabytes of indexable data.
            • Microsoft Office & other text-based files.
            • PDFs and searchable PDFs.
            • Image files (TIFF, JPG and others).
        • Environment #2
            • Windows-based server.
            • DigitalAsset Finder™, conceptSearch, conceptClassifier & conceptTaxonomyManager.
            • Currently supporting over 5 million records and growing.
            • Microsoft Office & other text-based files.
            • PDFs and searchable PDFs.
            • Image files (TIFF, JPG and others).
            • Structured data with no file reference.
    25. JSRRC Solution
        • Creation of taxonomies used to:
            • Enhance Search
            • Categorize or Identify documents
        • Some of the taxonomies created include:
            • Unit Names
            • Dates
            • Document Types (Names & Content)
            • Locations
        • Results include:
            • Consolidation of information into distinct groups, allowing a focused approach to the required research.
            • Controlled vocabulary that can be applied to the data sets as requirements evolve.
            • Access to information that previously was impossible to reach due to the resources needed to collate the raw data.
            • Collaboration among researchers increases as they share information by contributing their knowledge to existing data for future reference and retrieval.
    26. Data Process Pipeline
        [Diagram labels, in pipeline order]
        • File Filter: File type, File size, Folder name
        • File Process: PDF Generation, Archive content processing
        • Synchronizer
        • conceptClassifier: Classification, Metadata Assigned (Classification db)
        • conceptSearch: Search Indexes (independent of the data location)
        • DigitalAsset Finder™
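The File Filter stage of the pipeline above names three criteria: file type, file size, and folder name. A minimal sketch of such a filter follows. It is not DigitalAsset Finder™ code; the function name `passes_file_filter`, the allowed extensions, the size limit, and the excluded folder names are all assumptions chosen for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical filter settings; a real deployment would configure these.
ALLOWED_EXTENSIONS = {".doc", ".docx", ".pdf", ".txt", ".tif", ".jpg"}
MAX_SIZE_BYTES = 50 * 1024 * 1024          # assumed 50 MiB ceiling
EXCLUDED_FOLDERS = {"temp", "backup"}       # assumed folder names to skip


def passes_file_filter(path: str, size_bytes: int) -> bool:
    """Return True if a file should continue to the File Process stage."""
    file_path = PurePosixPath(path)
    # File type: compare the extension case-insensitively.
    if file_path.suffix.lower() not in ALLOWED_EXTENSIONS:
        return False
    # File size: reject anything above the configured ceiling.
    if size_bytes > MAX_SIZE_BYTES:
        return False
    # Folder name: reject files located under any excluded folder.
    folder_names = {part.lower() for part in file_path.parent.parts}
    return folder_names.isdisjoint(EXCLUDED_FOLDERS)
```

Filtering before classification keeps files that cannot be processed, or that should not be indexed, out of the later and more expensive stages.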
