Federal Enterprise Architecture Data and Information ...
Federal Enterprise Architecture Data and Information ... Presentation Transcript

  • Contribution to the FEA DRM Data Management Strategy. Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work. Brand Niemann and Ken Gill, November 26, 2003. DRAFT.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Introduction to Data Semantics
    • Information technology and practices have evolved from centralized data management systems to decentralized and distributed computing information exchanges. Increasingly mature and robust infrastructures for distributing information are helping to realize the idea that information can be available to anyone, anytime, anywhere.
    • The availability of increasing amounts of information presents the challenge of delivering the right information, to the right person, at the right time. The data must be relevant, meaningful and at the appropriate level of detail.
    • Data Semantics is the discipline that facilitates the delivery of the right information based on the requestor's requirements. Semantic agreement within Communities of Practice is essential to facilitating meaningful and effective data exchange.
    • Domain Data Harmonization Strategy
    • The vast majority of existing information systems have evolved over time with diverse requirements and different data models. As a result, the data stored in these systems vary in meaning, consistency, and quality. Data harmonization is the process by which a Community of Practice agrees to the meaning and format of the data residing in its information systems by applying common definitions, attributes, and values. Example outputs include data dictionaries and data registries.
    • There are several examples of current and ongoing Community of Practice data harmonization efforts (see use cases). The success of these efforts demonstrates several guiding principles.
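As a rough illustration of the harmonization idea above, the sketch below maps records from two systems onto one shared data dictionary. Everything in it is hypothetical and not taken from any actual Community of Practice: the system names, field names, code values, and the `harmonize` helper are assumptions made only to show how common definitions, attributes, and values might be applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    """One harmonized data element agreed on by the community."""
    name: str                  # common element name
    definition: str            # agreed meaning
    allowed_values: frozenset  # agreed value domain (empty means unconstrained)


# Hypothetical community data dictionary.
DATA_DICTIONARY = {
    "gender": DictionaryEntry(
        name="gender",
        definition="Gender of the person as recorded by the source agency.",
        allowed_values=frozenset({"F", "M", "U"}),
    ),
    "birth_date": DictionaryEntry(
        name="birth_date",
        definition="Date of birth in ISO 8601 format (YYYY-MM-DD).",
        allowed_values=frozenset(),
    ),
}

# Hypothetical per-system mappings: local field -> common name, local code -> common code.
SYSTEM_MAPPINGS = {
    "system_a": {
        "fields": {"sex": "gender", "dob": "birth_date"},
        "values": {"gender": {"female": "F", "male": "M", "unknown": "U"}},
    },
    "system_b": {
        "fields": {"GenderCode": "gender", "BirthDt": "birth_date"},
        "values": {"gender": {"1": "M", "2": "F", "9": "U"}},
    },
}


def harmonize(system: str, record: dict) -> dict:
    """Translate a system-specific record into the community's common terms."""
    mapping = SYSTEM_MAPPINGS[system]
    result = {}
    for local_name, value in record.items():
        common_name = mapping["fields"][local_name]
        entry = DATA_DICTIONARY[common_name]
        # Translate local codes where a value mapping exists; otherwise keep the value.
        value = mapping["values"].get(common_name, {}).get(value, value)
        if entry.allowed_values and value not in entry.allowed_values:
            raise ValueError(f"{value!r} is not an agreed value for {common_name}")
        result[common_name] = value
    return result


record_a = harmonize("system_a", {"sex": "female", "dob": "1970-01-31"})
record_b = harmonize("system_b", {"GenderCode": "2", "BirthDt": "1970-01-31"})
```

After harmonization the two records are expressed in the same terms and can be compared or exchanged directly, which is the practical payoff of agreeing on a dictionary first.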
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Data Harmonization Guiding Principles:
    • 1. Data harmonization is a process, not a project, and should begin as early as possible.
    • 2. Identify key internal and external stakeholders.
    • 3. Engage existing and potential partners.
    • 4. Understand and agree on scope of initiative.
    • 5. Define requirements.
    • 6. Review best practices.
    • 7. Select a methodology and appropriate tools.
    • 8. Identify relevant information exchanges and the data systems that support them.
    • 9. Concepts and definitions must be universally accepted within the Community of Practice.
    • 10. Publish work product so it can be consumed by practitioners and technologists.
    • Example: the Global Justice Information Sharing Initiative (Global) includes the:
    • Global Infrastructure/Standards Working Group (GI/SWG), which created the:
    • XML Structure Task Force (XSTF), which consists of:
    • Agencies (practitioners), commercial XML developers (implementers), technical support staff, and administrative support staff, which produced the:
    • “The Development of JXDD 3.0,” Draft 0.1, July 3, 2003 (Justice XML Data Dictionary Version 3.0).
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Goals of the JXDD 3.0 Development Effort:
      • Reference architecture and namespaces for a standard Justice XML Data Dictionary Schema specification.
      • Object-oriented data model, named types, extensibility.
      • Maximize use of standards and best practices:
        • ISO 11179;
        • Draft Federal XML Developers Guide;
        • Intelligence Community Metadata Language; and
        • others.
      • Metadata for content, registry support, and infrastructure support.
      • Value constraints: codes/enumerations, special semantics.
      • Fuller representation of relationships.
      • Incorporate a broader set of user requirements:
        • Data exchange requirements from several efforts; and
        • Functional requirements.
      • XML Schema version control.
      • Migration paths.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • In this age of eGovernment and Enterprise Architecture, whose objectives are increased collaboration, consolidation, and integration in order to transform government from organization-centric to citizen/customer-centric, some key questions to ask yourself and others are:
      • How “smart” is your data and information? and
      • Who are you collaborating with?
      • Some key goals are then:
        • “Smarter data” (put more effort into the data than the applications); and
        • More collaboration (on and by means of “smarter data”).
      • The Semantic Web is a machine-readable web of “smart data” and automated services that amplify the Web far beyond current capabilities.
        • Smart data is data that is application-independent, composable, classified, and part of a larger information ecosystem (ontology).
      • XML provides a simple yet robust mechanism for encoding semantic information (the meaning of data), and it shifts the “power” from the application to the data.
        • But simple XML metadata is not enough because it only provides syntactic interoperability.
        • Additional XML-based Ontology languages are being developed to encode semantic interoperability.
          • In the next ten years, we will see semantics to describe problems and business processes in specialized domains.
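The gap between syntactic and semantic interoperability described above can be sketched in a few lines. The two XML fragments, the tag names, and the `concept:` identifiers below are all hypothetical. Both documents parse without error (syntactic interoperability), yet they are only recognized as saying the same thing once their tags are mapped to shared concepts, which is the role an ontology language would play in practice.

```python
import xml.etree.ElementTree as ET

# Two well-formed XML fragments from hypothetical systems that use different tag names.
DOC_A = "<person><surname>Smith</surname></person>"
DOC_B = "<individual><familyName>Smith</familyName></individual>"

# Hypothetical shared vocabulary: each local tag is tied to an agreed concept.
CONCEPT_OF_TAG = {
    "person": "concept:Person",
    "individual": "concept:Person",
    "surname": "concept:FamilyName",
    "familyName": "concept:FamilyName",
}


def describe(xml_text: str) -> tuple:
    """Return (concept of the root element, {concept of child: text})."""
    root = ET.fromstring(xml_text)
    facts = {CONCEPT_OF_TAG[child.tag]: child.text for child in root}
    return CONCEPT_OF_TAG[root.tag], facts


# Syntactically both parse, but the raw tag names do not match.
tags_match = ET.fromstring(DOC_A).tag == ET.fromstring(DOC_B).tag

# Once mapped to shared concepts, the two documents carry the same meaning.
same_meaning = describe(DOC_A) == describe(DOC_B)
```

Here a plain dictionary stands in for the ontology; the point is only that the agreement on meaning lives outside either XML document.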
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Figure: quadrant diagram with the axes static resources vs. dynamic resources and interoperable syntax vs. interoperable semantics, showing the WWW, Web Services, the Semantic Web, and Semantic Web Services, together with an Enterprise Ontology and Web Services Registry.
    • Source: Derived in part from two separate presentations at the Web Services One Conference 2002 by Dieter Fensel and Dragan Sretenovic.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • RDF: Dublin Core Metadata and Relationships:
      • The Resource Description Framework (RDF) is an XML-based language to describe resources and is designed to create metadata about the “resource” as a standalone entity. The RDF model is often called a “triple” because it has three parts: (1) a resource; (2) a resource’s properties; and (3) the property values.
        • The knowledge representation community uses the grammatical parts of a sentence: (1) subject; (2) predicate; and (3) object.
      • RDF Schema is a language layer on top of RDF in what is called the “Semantic Web Stack”. Above RDF Schema are ontologies, and above that is the third and final web in Tim Berners-Lee’s three-part vision (collaborative web, Semantic Web, web of trust).
      • Ontology involves discovering categories and fitting objects into them in ways that make sense. When we make a list…we are categorizing - we are engaging in rudimentary ontology. By prioritizing items in a list, we are assigning relationships among various things. Ontology can be relatively simple, or it can be quite complex.
      • XML Topic Maps are popular implementations of taxonomies and have complementary characteristics to RDF.
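The triple model described on this slide can be sketched with ordinary Python tuples. This is only an illustration of the subject/predicate/object structure, not an RDF implementation: the `http://example.org/` namespace, the resource names, and the `match` helper are assumptions, and the sketch does not distinguish URIs from literals the way a real RDF store would.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property of the resource
    obj: str        # the property value (another resource or a literal)


EX = "http://example.org/"  # hypothetical namespace for illustration only

triples = [
    Triple(EX + "TheCompany", EX + "sells", EX + "Batteries"),
    Triple(EX + "TheCompany", EX + "name", "The Company"),  # literal value
    Triple(EX + "Batteries", EX + "isA", EX + "Product"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every part that is not None."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# "What does the company sell?" becomes a pattern with the object left open.
sold = [t.obj for t in match(triples, subject=EX + "TheCompany", predicate=EX + "sells")]
```

Because every statement has the same three-part shape, statements from different sources can be pooled into one list and queried with the same pattern, which is what makes the model attractive for linking data.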
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • RDF: Dublin Core Metadata and Relationships (continued):
      • The Dublin Core Metadata Initiative is a cross-disciplinary international effort to develop mechanisms for the discovery-oriented description of diverse resources in an electronic environment. The Dublin Core Element Set is a list of fifteen fixed elements that capture the essential aspects of a resource description. A complete list of Dublin Core metadata elements (e.g., creator, title, date) can be found at http://dublincore.org/documents/1999/07/02/dces/
      • Metadata can exist within the resource that it is describing (internal metadata), or it can exist in a separate file (external metadata) that is associated with the content file.
      • Three excellent resources are:
        • Practical RDF: Solving Problems with the Resource Description Framework, Shelley Powers, O’Reilly, July 2003;
        • The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, Wiley Technology Publishing, June 2003; and
        • XML Topic Maps: Creating and Using Topic Maps for the Web, Addison Wesley, July 2002.
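The external-metadata option mentioned on this slide can be illustrated with a small Dublin Core record built with Python's standard library. The `metadata` wrapper element and the `dublin_core_record` helper are assumptions made for this sketch; the namespace URI and the fifteen element names are those of the Dublin Core Element Set, and the sample values come from this presentation's title slide.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen elements of the Dublin Core Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_record(fields: dict) -> ET.Element:
    """Build an external metadata record; 'metadata' is a hypothetical wrapper element."""
    record = ET.Element("metadata")
    for element_name, value in fields.items():
        if element_name not in DC_ELEMENTS:
            raise ValueError(f"{element_name!r} is not a Dublin Core element")
        child = ET.SubElement(record, f"{{{DC_NS}}}{element_name}")
        child.text = value
    return record


record = dublin_core_record({
    "title": "Contribution to the FEA DRM Data Management Strategy",
    "creator": "Brand Niemann and Ken Gill",
    "date": "2003-11-26",
})
xml_text = ET.tostring(record, encoding="unicode")
```

Saved to its own file and associated with the content file, a record like this is external metadata; the same elements embedded in the resource itself would be internal metadata.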
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Figure: Key Ontology Components and RDF Triple Components.
      • RDF triple components: subject, predicate, object, and literal. Legend: URI; literal; property or association. Example sentence: “The company” (subject) “sells batteries” (predicate).
      • Ontology components shown: Person (birthdate: date; gender: char), Image, Leader, Organization, and Resource, connected by the relationships leads, is-A, works for, published, depiction, and knows.
    • Source: The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, Wiley Technology Publishing, June 2003.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work RDF: Semantic links - "Joining the Web" Source: Standards, Semantics and Survival, by Tim Berners-Lee, Director, World Wide Web Consortium, January 2003.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work Source: The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, Wiley Technology Publishing, June 2003.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Figure: The Ontology Spectrum, from weak to strong semantics:
      • Taxonomy: “is a classification of”.
      • Thesaurus: “has narrower meaning than”.
      • Conceptual model: “is subclass of”.
      • Local domain theory: “is disjoint subclass of, with transitivity property”.
    • Languages shown along the spectrum: Schema, XTM, RDF/S, UML, DAML+OIL, OWL.
    • Source: The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, Wiley Technology Publishing, June 2003.
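To make the stronger end of the Ontology Spectrum concrete, the sketch below shows what a transitive "is subclass of" relation and a disjointness constraint allow a program to infer. The class names echo the ontology components figure earlier in this presentation, but the specific hierarchy, the disjointness statement, and both helper functions are assumptions chosen for illustration; real ontology languages such as OWL express these constraints declaratively.

```python
# Hypothetical class hierarchy: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "Leader": {"Person"},
    "Person": {"Resource"},
    "Organization": {"Resource"},
}

# Hypothetical disjointness constraint: nothing is both a Person and an Organization.
DISJOINT = {frozenset({"Person", "Organization"})}


def superclasses(cls: str) -> set:
    """Follow subclass-of links transitively and return every ancestor class."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def may_share_instance(a: str, b: str) -> bool:
    """Return False when the classes (or any of their ancestors) are declared disjoint."""
    a_all = superclasses(a) | {a}
    b_all = superclasses(b) | {b}
    return not any(frozenset({x, y}) in DISJOINT for x in a_all for y in b_all)


leader_ancestors = superclasses("Leader")
leader_can_be_org = may_share_instance("Leader", "Organization")
```

A plain taxonomy could record that a Leader is a kind of Person, but only the stronger semantics let software conclude, without being told directly, that a Leader is also a Resource and can never be an Organization.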
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Figure: Emerging Vendors Landscape: Semantic Integration. Axis labels: expressivity and semantic power (XML, RDF, OWL) and enterprise support (data and schema management, validation, run-time engine, integration and orchestration).
    • Vendors shown: Ontology Works, enLeague, Ontoprise, Network Inference, Unicorn, SchemaLogic, Contivo, Celcorp, Vitria, MetaMatrix, Modulant, IGS, Miosoft.
    • Legend (current support / primary strength): S = structured information; U = unstructured information; S&U = supports both.
    • Source: Irene Polikoff, TopQuadrant, Positioning Semantic Technologies: The Emerging Vendor Landscape, September 8, 2003.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Design-Time Integration of Data Via Web Services Architecture Pilot (see Appendix: July 17, 2003). Components shown: XML Collaborator Registry, MetaBase, MOF Repository, disparate data sources, and Web Service. Steps:
      • (1) Import physical source metadata.
      • (2) Identify and model XML schema types.
      • (3) Import modeled XML elements and types into XML Registry.
      • (4) Define XML Schema using registered elements from MetaBase.
      • (5) Import XML Schema into MOF repository.
      • (6) Map virtual XML document to physical sources using Schema in MOF repository.
      • (7) Create and deploy Web Services for accessing integrated data.
      • (8) Register WSDL in UDDI Registry.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Suggestions provided to the AIC Governance and Components Subcommittees, August 25, 2003:
      • 1. While ISO 11179 is emphasized because of considerable legacy work and MOF is currently considered more useful, the Semantic Web technologies RDF and OWL have much more mature and capable data models for semantic integration and interoperability, and they provide a convergence of the four data communities (document, Web, database, and programming).
        • See http://www.w3.org/2003/08/owl-pressrelease
      • 2. Data Independence is step one (Michael Daconta's "Declaration of Data Independence" from the September 8th Conference on Semantic Technologies for eGov):
        • (a) Data is more important than applications.
        • (b) Data value increases with the number of connections it shares.
        • (c) Data about data can expand to as many layers as there are meanings.
        • (d) Data modeling harmony is the alignment of syntax, semantics, and pragmatics.
        • (e) Data and logic are the yin and yang of information processing.
        • (f) Data modeling makes the implicit explicit and the transparent apparent.
        • (g) Data standardization is not amenable to competition.
        • (h) Data modeling must be decentralized.
        • (i) Data relations must not be based on probability or luck.
        • (j) Data is truly independent when the next generation need not reinvent it.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Suggestions provided to the AIC Governance and Components Subcommittees, August 25, 2003 (continued):
      • 3. The Intelligence Community Metadata Working Group (IC MWG) as a DRM Governance Model (http://www.xml.saic.com/icml/)
        • (a) Established by the IC Chief Information Officer (CIO) Executive Council.
        • (b) Promulgates the April 2003 IC Policy requiring IC-wide use of the IC XML standards for metadata and metadata markup. Also identifies and harmonizes enterprise-level metadata and metadata markup standards.
        • (c) Developing community-wide standard XML metadata models including security-marking constructs that assist writers with application of Controlled Access Program Coordination Office (CAPCO) marking instructions. Subsequent standards will address digital signatures, encryption, and public key management.
        • (d) Developing an IC metadata registry and registry services.
        • (e) Currently working on the Terrorist Watchlist Person Data Exchange Standard XML Tags and Schema.
        • (f) Work is accomplished in regular Technical Exchange Meetings (TEM) and Team meetings.
      • 4. The Data and Information Reference Model has been, and currently is, the object of a series of pilot projects with the Communities of Practice. This work should continue and is required by H.R. 2458, the E-Government Act of 2002, SEC. 212. Integrated Reporting Study and Pilot Projects, (d) Pilot Projects To Encourage Integrated Collection And Management Of Data And Interoperability Of Federal Information Systems. Added November 25th.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • The Data and Information Reference Model has been (see Appendix), and currently is, the object of a series of pilot projects with the Communities of Practice:
      • Open GIS Consortium (OGC).
        • Information Communities and Semantics WG (ICS WG)
          • http://www.opengis.org/groups/?iid=50
      • Sustainability of Intergovernmental Exchange Networks (SIEN): Global (Justice), Environmental Information (EPA), and Health IT Sharing (Health).
        • Government Semantic XML Web Services Community of Practice (SWS-CoP)
          • http://web-services.gov/
      • Intelligence Community Metadata Working Group (IC MWG).
        • http://www.xml.saic.com/icml/
      • Semantic Interoperability Special Interest Group (SI-SIG).
        • To be announced.
      • E-Gov SmartServices
        • To join the group send an email to eGov_SmartServices-subscribe@yahoogroups.com with empty Subject and Body. You will then receive an email with a web link where you can select the subscription option.
      • Open International Forum on Business Ontology
        • ONTOLOG - collaborative work environment
          • http://ontolog.cim3.net/
      • More to come.
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Appendix – History of DRM Support Work:
      • January 22, 2003, Suggestions for the Federal Enterprise Architecture (FEA) Data and Information Reference Model (updated January 27, 2003), FEA Data and Information Reference Model (DRM).
        • http://web-services.gov/FEA-DRM12203.ppt
      • January 31, 2003, Topic Map Web Services for the FEA-PMO and FEA-DRM - Cognitive Topic Map Web Sites (CTW): Aggregating Information Across Individual Agencies and E-Gov Initiatives, Michel Biezunski, Coolheads Consulting, Proposed Pilot.
        • http://web-services.gov/mbegov213003.ppt
      • February 10 and 20, 2003, Distributed Components, Metadata Models, and Registries: Input to the Governance and Components Subcommittee Meetings and the FEA Data and Information Reference Model (DRM). See "The Distributed Components, Metadata Models, and Registries" ListServ Discussion Summary-Joe Chiusano, Booz Allen Hamilton, March 4, 2003.
        • http://web-services.gov/XML%20Web%20Services%20Working%20Group%2022003.ppt
        • http://web-services.gov/Distributed%20Components,%20Metadata%20Models,%20and%20Registries%20Thread%20-%20%2003-04-03.ppt
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Appendix – History of DRM Support Work (continued):
      • March 19, 2003, Strengthening the Federal Enterprise Architecture (FEA) Data and Information Reference Model (DRM) and Military Pilot Project (Federated Registries: Concepts and CONOPS) discussed at the XML Registry Pilot Team Meeting.
        • http://web-services.gov/FEA-DRM31903.ppt
        • http://xml.gov/agenda/rrt20030319.htm
      • April 4, 2003, Working Group provides Multiple Registries and Repositories to be federated with the XML Working Group's GSA-NIST XML Registry Pilot to support the CIO Council's Architecture and Infrastructure Committee (AIC) and the Data and Information Reference Model (DRM).
        • http://web-services.gov/Registries41003.ppt
      • April 21, 2003, Strengthening the Federal Enterprise Architecture (FEA) Data and Information Reference Model (DRM) for the DRM Offsite May 19, 2003.
        • http://web-services.gov/FEA-DRM42103.ppt
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Appendix – History of DRM Support Work (continued):
      • May 16, 2003, Extending the FEA DRM to Support the Joint Government Data & Information Reference Model (GDIRM), the Business Compliance One Stop E-Gov Initiative, ITIPS-II, & Component Technology Activities, input for the DRM Offsite May 19, 2003. Also Geospatial Interoperability Reference Model (GIRM).
        • http://web-services.gov/FEA-DRM51903.ppt
        • http://web-services.gov/EA%20for%20Geospatial3.ppt
      • July 17, 2003, Report at AIC Task Leaders Meeting on Governance Subcommittee Goal 3 Task Pilots: A Government Enterprise Component Registry and Repository Using Native XML Database Technology (for presentation on July 22 and 23) and Joint Government Data and Information Reference Model (IAC White Paper) for review, which includes the MetaMatrix-XML Collaborator Pilot Project (see pages 26-27).
        • http://web-services.gov/Components%20Repository72203.ppt
        • http://web-services.gov/030528_IAC_EA_SIG_Information_and_Data_Reference_Model_Body.pdf
      • September 8, 2003, “Semantic Technologies for eGov” Conference at the White House Conference Center. Proceedings and DVD recording are available.
        • http://www.topquadrant.com/conferences/tq_proceedings.htm
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Appendix – History of DRM Support Work (continued):
      • October 16, 2003, Founding Meeting of the Semantics SIG to establish a Community of Purpose. The group decided to develop a charter, mission statement, and white paper, and to foster "best practices". More information to follow.
      • October 20, 2003, Emerging Components Quarterly Conference at the White House Conference Center Featuring: Semantic Mapping Tools (Image Matters: userSmarts and Ontology Manipulation Toolkit).
      • Also to be presented November 19-20, 2003, at the Geography Awareness Week and GIS Day 2003, Mellon Auditorium, Washington, DC, 14th and Constitution Avenue.
        • http://www.componenttechnology.org/Emerging/Oct202003Conference/Agenda/
        • http://web-services.gov/GISdayBrand111903.doc
        • http://web-services.gov/brief-userSmartsOverview-031020.ppt
        • http://www.fgdc.gov/gisday2003/
      • February 4, 2004, E-Gov Web-Enabled Government 2004 Conference, Session 2-4: Understanding Semantic Web Technology, Brand Niemann and Jim Hendler.
        • http://www.e-gov.com
  • Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work
    • Review Comments and Suggestions:
      • A word of praise: I've had some Unisys folks help to add their ideas and revisions to these paragraphs and several of the Unisys Architects have called out your section as being "right on the money" and "very impressive". Davis Roberts, Unisys.
      • Just writing to let you know that your slide presentation (and the work you are doing) is great! Thank you. I look forward to meeting you, and collaborating with you in the not-too-distant future. I just came upon a very good presentation by Brand Niemann and Ken Gill that is part of their contribution to the US Federal Enterprise Architecture ("FEA") Data and Information Reference Model ("DRM") data management strategy. The work of some of our community members (Mike Daconta/Leo Obrst/Kevin Smith, Jack Park/Sam Hunting), and even our [ontolog-forum] community of practice, has been referenced in there too. Let's keep up the good work here ... we definitely look forward to closer collaboration with the eGov/FEA folks in the future. Peter Yim [email_address], Organization: CIM Engineering, Inc. To: [email_address], Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/, Shared Files: http://ontolog.cim3.net/file/, Community Wiki: http://ontolog.cim3.net/wiki/
      • See “Data Models in a National Infrastructure Handling Reporting Obligations: Norwegian Experience and Opportunities,” Version 1.01, August 2003, by Per Myrseth, IBM Norway, 14 pp.
      • See “An Overview of SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms),” College of American Pathologists, 2003, 32 slides.