Curation of information and knowledge

Published in: Business, Technology

  1. 1. Curation of Information & Knowledge http://commons.wikimedia.org/wiki/File:Wentletrap_001.jpg © 2011 Jorn Bettin Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
  2. 2. Converting raw data and tacit knowledge into Relevant Information and Explicit Knowledge
  3. 3. Quality http://commons.wikimedia.org/wiki/File:JonWoodApril2007Texas.jpg
  4. 4. Complexity ...
  5. 5. Relevance ... http://commons.wikimedia.org/wiki/File:Administrative_burden.JPG
  6. 6. Understandability ...
  7. 7. Value of Knowledge is hard to communicate
     • It’s not tangible
     • It’s not raw data
     • Much of it is tacit
     http://commons.wikimedia.org/wiki/File:Cloud_computing_icon.svg
  8. 8. Measuring Quality of Information
     Relevant dimensions
       1. Accuracy
       2. Currency
       3. Completeness
       4. Security
       5. Reliability
       6. Unambiguity
       7. Findability
       8. Traceability
       9. Simplicity
       10. Usability
  9. 9. Accuracy
     Why does it matter?
       • Information is used for operational and strategic decision making
       • It must be trustworthy
     How is it measurable?
       • Define acceptable tolerance intervals
     How can it be improved?
       • Focus on relevant information and eliminate irrelevant information
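
The measurability bullet on the Accuracy slide can be read as a simple interval check. The following is only a minimal sketch under that reading, not something taken from the slides: the names `Tolerance`, `within_tolerance` and `accuracy_rate`, and the idea of comparing a reported value against a trusted reference value, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerance:
    """Acceptable deviation from a reference value (hypothetical structure)."""
    lower: float  # largest acceptable shortfall, as a non-negative number
    upper: float  # largest acceptable excess, as a non-negative number


def within_tolerance(reported: float, reference: float, tol: Tolerance) -> bool:
    """Return True if the reported value lies inside the tolerance interval."""
    return reference - tol.lower <= reported <= reference + tol.upper


def accuracy_rate(pairs, tol: Tolerance) -> float:
    """Share of (reported, reference) pairs that fall inside the interval."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no measurements supplied")
    hits = sum(within_tolerance(rep, ref, tol) for rep, ref in pairs)
    return hits / len(pairs)
```

A rate computed this way only says how often values stay inside the agreed interval; choosing the interval itself remains a business decision.
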
  10. 10. Currency
      Why does it matter?
        • Information is used for operational and strategic decision making
        • It must be timely
      How is it measurable?
        • Define acceptable temporal delays
      How can it be improved?
        • Increase the level of automated system integration
        • Invest in adequate computing and network infrastructure
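
An acceptable temporal delay, as named on the Currency slide, could be checked as follows. This is a hedged sketch: the function name `is_current`, the `max_delay` parameter, and the use of a last-updated timestamp as the measure of age are assumptions, not part of the original deck.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_current(last_updated: datetime,
               max_delay: timedelta,
               now: Optional[datetime] = None) -> bool:
    """Return True if the information is no older than the acceptable delay.

    Both timestamps are expected to be timezone-aware; mixing aware and
    naive datetimes raises a TypeError in Python.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_updated <= max_delay
```

Passing `now` explicitly keeps the check reproducible, which helps when the acceptable delay is audited later.
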
  11. 11. Completeness
      Why does it matter?
        • Information is used for operational and strategic decision making
        • It must be sufficiently free of gaps
      How is it measurable?
        • Specify the sources of each piece of information
        • Distinguish between mandatory and optional information for decision making
      How can it be improved?
        • Focus on relevant information and eliminate irrelevant information
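
The distinction between mandatory and optional information on the Completeness slide can be sketched as a small report. Everything named here (`CompletenessReport`, `check_completeness`, the field names in the usage note) is hypothetical, and treating `None` or an empty string as a gap is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Mapping


@dataclass
class CompletenessReport:
    missing_mandatory: List[str]  # mandatory fields that are absent or empty
    optional_coverage: float      # share of optional fields that are filled

    @property
    def usable(self) -> bool:
        """A record is usable for a decision when no mandatory field is missing."""
        return not self.missing_mandatory


def _present(record: Mapping[str, Any], name: str) -> bool:
    # Assumption: None and the empty string both count as a gap.
    return record.get(name) not in (None, "")


def check_completeness(record: Mapping[str, Any],
                       mandatory: Iterable[str],
                       optional: Iterable[str] = ()) -> CompletenessReport:
    missing = [name for name in mandatory if not _present(record, name)]
    optional = list(optional)
    filled = sum(_present(record, name) for name in optional)
    coverage = filled / len(optional) if optional else 1.0
    return CompletenessReport(missing, coverage)
```

For example, a record with `customer` and `amount` filled but no `due_date` would be reported as unusable if all three are declared mandatory.
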
  12. 12. Security
      Why does it matter?
        • To enforce information ownership
        • To ensure compliance with privacy legislation
        • To prevent theft of information
      How is it measurable?
        • Strength of authentication mechanisms
        • Strength of encryption mechanisms
        • Level of alignment between role based access control and job descriptions
      How can it be improved?
        • Introduce stronger authentication and encryption
        • Remove ambiguities from job descriptions
  13. 13. Reliability
      Why does it matter?
        • To avoid outages
        • To prevent disasters
      How is it measurable?
        • Define the acceptable minimum availability of each information source
      How can it be improved?
        • Use software designs that tolerate temporary outages of required/external services
        • Invest in system and data centre replication technology
  14. 14. Unambiguity
      Why does it matter?
        • To minimise communication errors
        • To prevent wrong decisions
        • To prevent disasters
      How is it measurable?
        • Count the homonyms in each role-specific context
      How can it be improved?
        • Establish a comprehensive registry of concepts
        • Use concept names that are tailored to the role-specific context
        • Use semantic identities instead of names when communicating information
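
Counting homonyms per role-specific context, as the Unambiguity slide proposes, could look like the sketch below. It assumes that each usage of a name has already been tagged with a context and a semantic identity; the triple layout and the function name `count_homonyms` are illustrative assumptions, not the author's tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def count_homonyms(usages: Iterable[Tuple[str, str, str]]) -> Dict[str, int]:
    """Count names that denote more than one concept within a context.

    Each usage is a (context, name, semantic_id) triple. Names are compared
    case-insensitively, which is an assumption of this sketch.
    """
    meanings: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for context, name, semantic_id in usages:
        meanings[(context, name.lower())].add(semantic_id)

    # Start every context at zero so unambiguous contexts are reported too.
    counts = {context: 0 for context, _name in meanings}
    for (context, _name), ids in meanings.items():
        if len(ids) > 1:
            counts[context] += 1
    return counts
```

A name that means different things in two different contexts is not counted, which matches the slide's focus on ambiguity within one role-specific context.
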
  15. 15. Findability
      Why does it matter?
        • To enable staff to find relevant information
        • To speed up decision making
        • To prevent disasters
      How is it measurable?
        • Count how often staff need to talk to colleagues to find information that is stored in an information system
      How can it be improved?
        • Provide advanced support for queries
        • Make the query engine aware of the role-specific context
        • Allow query by information category, by container, by name, and by semantic identity
  16. 16. Traceability
      Why does it matter?
        • To speed up root cause analysis of errors
        • To speed up the learning curve for new staff
        • To meet legal & regulatory compliance needs
      How is it measurable?
        • Count how often staff need to talk to colleagues or need to resort to ad-hoc search for tracing the source of an error
      How can it be improved?
        • Consistent use of information categories and containers
        • Automatic tagging of information with temporal & spatial meta data
        • Adherence to retention constraints
  17. 17. Simplicity
      Why does it matter?
        • To accommodate human cognitive limits
        • To prevent wrong decisions
        • To prevent disasters
      How is it measurable?
        • Collect artefact complexity metrics
      How can it be improved?
        • Intuitive representations that are developed in collaboration with domain experts
        • As needed, role-specific representations
        • Provide an explicit modularisation mechanism for all artefacts
  18. 18. Usability
      Why does it matter?
        • Intuitive user/system interaction
        • Device independent information access
        • To discourage use of non-compliant tools
      How is it measurable?
        • Validation by average users
      How can it be improved?
        • Consistency of representations across devices
        • Use of high-quality icons that are developed in collaboration with domain experts
        • Ensure adequate reliability
  19. 19. Knowledge Curation http://commons.wikimedia.org/wiki/File:Wentletrap_001.jpg
  20. 20. Knowledge Repositories
  21. 21. Examples
      A language artefact is a non-hardware artefact
        • information content of pheromones
        • information content of body language
        • live music
        • live speech
        • information content in traditional symbolic notations
        • program/diagram/hypertext/database content
        • information content of recorded sound/pictures/videos
        • information content of genetic material
      http://commons.wikimedia.org/wiki/File:Photo_with_histogram.JPG
  22. 22. Definition
      A language artefact
        • is a container of information
        • is instantiated by a specific actor (human or system)
        • is consumed by at least one actor (human or system)
        • represents a natural unit of work (for the instantiating & consuming actors)
        • may contain links to other artefacts
        • has a state and a lifecycle
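
The definition of a language artefact above maps fairly directly onto a small data structure. The sketch below is one possible reading only: the class name, the field names, and in particular the three lifecycle states and their transitions are hypothetical, since the slide says only that an artefact "has a state and a lifecycle".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class State(Enum):
    # Hypothetical lifecycle states; the slides do not name any.
    DRAFT = "draft"
    PUBLISHED = "published"
    RETIRED = "retired"


# Hypothetical lifecycle: which state may follow which.
_TRANSITIONS = {
    State.DRAFT: {State.PUBLISHED},
    State.PUBLISHED: {State.RETIRED},
    State.RETIRED: set(),
}


@dataclass
class LanguageArtefact:
    content: str                  # the contained information
    producer: str                 # the instantiating actor (human or system)
    consumers: List[str]          # at least one consuming actor
    links: List["LanguageArtefact"] = field(default_factory=list)
    state: State = State.DRAFT

    def __post_init__(self) -> None:
        if not self.consumers:
            raise ValueError("an artefact needs at least one consumer")

    def advance(self, new_state: State) -> None:
        """Move along the lifecycle, rejecting transitions it does not allow."""
        if new_state not in _TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state
```

The "natural unit of work" criterion is not modelled here because it depends on the roles involved rather than on the artefact's structure.
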
  23. 23. Communication
  24. 24. Definition
      Software is an arbitrary set of language artefacts
  25. 25. Software Producers
      [Figure labels: software developers; software systems & other humans]
  26. 26. 1st-Level Categorisation
      [Figure labels: meta data; operational data]
  27. 27. Definitions
      The categories (= meta data) must be relevant to the organisation
      Data, Information, Knowledge
        • uncategorised data has very little value
        • categorised data is valuable information
        • information combined with an understanding of its usage context is valuable knowledge
  28. 28. Value Chain
      [Figure: elements labelled A, B, C, D, E, F connected by "produce" and "consume" links]
  29. 29. Learning
      Elements of knowledge acquisition
        • Collaboration
        • Exploration
        • Observation
        • Validation
        • Abstraction
        • Modularisation
        • Representation
  30. 30. Collaboration
      “We are smarter than me”
      Jean-Marie Favre, Software Anthropologist
  31. 31. Exploration
      Raw data acquired by exploration is essential for understanding an unknown domain
        • Data can be analysed and categorised
        • Lack of data only leads to speculation
  32. 32. Observation
      Connecting the dots – building a mental model
        • Associating information with time, space, and other attributes of origin
        • Noticing possible associations between different pieces of information
      [Figure label: Tacit]
      http://commons.wikimedia.org/wiki/File:Knowledge,_observation_and_reality.svg
  33. 33. Validation
      Confirming observations
        • Using the scientific method
        • By comparing with observations from others
        • By involving domain experts from related disciplines
        • Remember: we are smarter than me!
  34. 34. Abstraction
      Look for Commonalities
        • Avoid repetition
        • Identify patterns
        • Remember: KISS!
      Photographer Kurt Salzmann - www.salzmaenner.ch
  35. 35. Modularisation
      Modules preserve Simplicity
        • Rely on role-based separation of concerns
        • Modules must correspond to a natural unit of work
        • Roles and modular artefacts represent the building blocks of value chains
        • Optimise within the organisational context of customers, suppliers, and available skills
      http://commons.wikimedia.org/wiki/File:Modular_origami.jpg
  36. 36. Representation
      Modelling is about clarity
        • Balancing act between simplicity and not compromising the desired intent
        • Focus is on human cognitive abilities & limits
        • As needed use multiple syntax elements (visual containers, symbols, text, mathematical expressions)
        • Borrow syntax from established languages, or design syntax in close collaboration with the user community
  37. 37. Code
      All models are code
      a system of symbols used for
        • identification
        • classification in the sense of grouping
      a system of signals used to send messages
      a set of conventions governing behaviour
      Modelling is meta coding to improve clarity of code
  38. 38. Examples
      Class : Mammal, with dateOfBirth
      Class : Dog, with isPoliceDog
      Class : Cat
      Dog : Jack {1/5/03, yes}
      Dog : Susie {1/2/00, no}
      Cat : Coco {4/3/07}
      Cat : Peter {10/9/98}
      [Figure multiplicities: [2] [2] [*] [*]]
      http://commons.wikimedia.org/wiki/
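
The class diagram on the Examples slide can be written out as code, which makes the point of the preceding slide (all models are code) tangible. This is a sketch of one reading of the garbled diagram: it assumes `dateOfBirth` belongs to Mammal and `isPoliceDog` to Dog, and it keeps the dates as plain strings because the slide does not say whether they are day/month or month/day.

```python
from dataclasses import dataclass


@dataclass
class Mammal:
    name: str
    date_of_birth: str  # kept exactly as written on the slide


@dataclass
class Dog(Mammal):
    is_police_dog: bool


@dataclass
class Cat(Mammal):
    pass


# The four instances shown on the slide.
jack = Dog("Jack", "1/5/03", True)
susie = Dog("Susie", "1/2/00", False)
coco = Cat("Coco", "4/3/07")
peter = Cat("Peter", "10/9/98")
```

Each instance sits exactly one level below its class, which is the single level of instantiation that the later Semantic Modelling slides describe as a limit of object-oriented models.
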
  39. 39. Communication Costs
      Not all code is a model
        • a system of signals that includes a translation of messages to deal with someone else’s syntax
        • a system of symbols used for classification in the sense of obfuscation or encryption
      http://commons.wikimedia.org/wiki/File:Encryption_-_decryption.svg
  40. 40. Today
      Software suffers from the same problems as way back when natural language evolved to enrich the exchange between humans
      Increasingly the artefacts exchanged between humans are neither hardware nor natural language (encoded in speech or symbolic notation)
      All language artefacts share the problems of natural language: unanticipated interpretations
  41. 41. Minimising Unanticipated Interpretation
      Requires collaboration and good will between artefact producers & all consumers
      Associating information with its usage context
      Respecting the notational and terminological preferences of all parties
      Assigning a unique semantic identity to each piece of information (= concept)
      http://commons.wikimedia.org/wiki/File:Discussion.jpg
  42. 42. Semantic Modelling C A B
  43. 43. Semantic Modelling
      1. Identification of concepts and assignment of semantic identities
      2. Modelling
      3. Naming of concepts in as many terminologies as required by artefact producers and consumers
      [Figure labels: Models; Semantic Domains; "next" between the steps]
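
The three steps above (identify, model, name) can be illustrated with a toy registry in which a concept is known by its identity first and by names only afterwards. This is a hedged sketch, not the author's tooling or any existing product: the class `ConceptRegistry`, its methods, and the use of random UUIDs as semantic identities are all assumptions.

```python
import uuid
from typing import Dict, List, Optional, Set, Tuple


class ConceptRegistry:
    """Toy registry: identities first, relationships second, names last."""

    def __init__(self) -> None:
        self._names: Dict[str, Dict[str, str]] = {}  # id -> {terminology: name}
        self._links: Set[Tuple[str, str]] = set()    # (source id, target id)

    def identify(self) -> str:
        """Step 1: create a concept and assign it a semantic identity."""
        sid = str(uuid.uuid4())
        self._names[sid] = {}
        return sid

    def relate(self, source: str, target: str) -> None:
        """Step 2: model a relationship between two concepts by identity."""
        if source not in self._names or target not in self._names:
            raise KeyError("both concepts must be identified first")
        self._links.add((source, target))

    def name(self, sid: str, terminology: str, name: str) -> None:
        """Step 3: name the concept in one terminology."""
        self._names[sid][terminology] = name

    def name_in(self, sid: str, terminology: str) -> Optional[str]:
        return self._names[sid].get(terminology)

    def resolve(self, terminology: str, name: str) -> List[str]:
        """Find the identities that carry this name in this terminology."""
        return [sid for sid, names in self._names.items()
                if names.get(terminology) == name]

    def related(self, source: str) -> List[str]:
        return sorted(t for s, t in self._links if s == source)
```

Because relationships refer to identities rather than names, renaming a concept in one terminology leaves the model untouched.
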
  44. 44. Semantic Modelling
        • Based on the mathematics of model theory & denotational semantics
        • Constitutes a solid foundation for information engineering & knowledge curation
        • Not the same as modelling with the Resource Description Framework (Semantic Web)
        • Not the same as classical entity-relationship modelling
        • Not the same as object-oriented modelling
      [Figure labels: Models; Semantic Domains]
  45. 45. Semantic Modelling
        • Focuses on the meaning of information in a concrete usage context
        • Converts tacit knowledge into explicit knowledge for use by humans and software tools
        • The Resource Description Framework only partially implements denotational semantics
        • Entity-relationship schemas lack a mechanism for modularity
        • Object-oriented models are limited to one level of instantiation
      [Figure labels: Models; Semantic Domains]
  46. 46. Model Theory
      Without delving into the formal mathematical details, the significance of model theory is best appreciated intuitively by considering the following observations:
        • Formal linguistics as pioneered by Noam Chomsky in the 1950s and 1960s can be expressed as a special case of model theory.
        • The work of model theorists goes back to the beginning of the 20th century, and was motivated by mathematicians who were concerned about potential logical inconsistencies in the mathematical symbol system and the conventions governing its use.
        • The resulting research into symbol systems has led to a mathematical theory that can be used to formalise any symbol system, not limited to the languages invented by humans, and including the genetic code.
        • The pictures produced on flip charts and white boards constitute domain specific languages as well, and with the help of their authors, sets of pictures can easily be formalised mathematically, using a specialised software tool for semantic modelling.
  47. 47. Semantic Domains
      [Figure labels: A, B, C, D, E, F]
  48. 48. Modular Models
      separation of concerns
      Modules preserve Simplicity
        • Roles and modular artefacts represent the building blocks of value chains
        • Optimise within the organisational context of customers, suppliers, and available skills
      [Figure labels: A, B, C, D, E, F; role based unit of work]
  49. 49. Connected Semantic Domains
      [Figure labels: A, B, C, D, E, F]
  50. 50. Shared Language
      [Figure labels: ab, ac, ad, de, df]
  51. 51. Jargon = Words + Symbols
      [Figure labels: ef, de, df, ad, cf, ac, bc, ab]
  52. 52. Perspective
      [Figure labels: Jargon df; D; F; View Point]
  53. 53. Reflexive Jargon
      DSML = Domain Specific Modelling Language
      [Figure labels: DSML f; F; View Point]
  54. 54. Jargons develop on top of Shared Semantic Subdomains
      [Figure labels: domains A, B, C, D, E, F; shared subdomains ab, ac, bc, ad, cf, de, df, ef]
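
One way to read the two-letter labels in the figures above is as the overlap between two semantic domains, for example "ab" as the part shared by A and B. Under that assumption, and the further assumption that a domain can be represented as a set of concept identifiers, the shared subdomains can be sketched as pairwise set intersections. The function name `shared_subdomains` and the sample data in the usage note are hypothetical.

```python
from itertools import combinations
from typing import Dict, Hashable, Set


def shared_subdomains(domains: Dict[str, Set[Hashable]]) -> Dict[str, Set[Hashable]]:
    """Return the non-empty pairwise overlaps between semantic domains.

    Keys follow the slide's labelling: domains "A" and "B" share "ab".
    """
    shared = {}
    for first, second in combinations(sorted(domains), 2):
        common = domains[first] & domains[second]
        if common:
            shared[(first + second).lower()] = common
    return shared
```

With domains A = {1, 2}, B = {2, 3} and C = {4}, only "ab" is reported, so on this reading a jargon between A and B has something to build on while A and C share no concepts yet.
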
  55. 55. More Information
        • Knowledge Reconstruction & Risk Management: http://jornbettin.com
        • Gmodel Team Blog: the-software-artefact.blogspot.com
        • The Role of Artefacts: tiny.cc/artefacts
        • From Muddling to Modelling: tiny.cc/muddleToModel
        • Model Oriented Domain Analysis: tiny.cc/domainanalysis
      Thank you
      Jorn Bettin, jbettin @ ibrs.com.au, +61 424 758 540, www.ibrs.com.au
