Your SlideShare is downloading. ×
0
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
A01-Openness in knowledge-based systems
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

A01-Openness in knowledge-based systems

448

Published on

The role of openness in knowle

The role of openness in knowle

Published in: Technology, Education
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
448
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
7
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide
  • I will have more to say about the importance of broad knowledge later
  • Humans are very facile at generating explanations. Confabulation is what happens when the process is disconnected from relevant sources of information. Split brain patients explaining why the other hemisphere did something, or Capgrassymdome
  • I believe that both of these goals are within reach in the next generation.
  • Transcript

    • 1. The Role of Openness in Creating a Mind for Life
    • 2. Open Source, AI, & Biology
      An AI breakthrough can come from an application in biology
      It is imperative that this be open source
      Some steps toward (and questions about) creating an open source AI for understanding life
    • 3. The first artificial mind will think about molecular biology
      “You can’t think about thinking without thinking about thinking about something.”
      Seymour Papert, 1974
      “A thorough study of Human Physiology is, in itself, an education broader and more comprehensive than much that passes under that name. There is no side of the intellect which it does not call into play, no region of human knowledge into which either its roots, or its branches, do not extend.”
      Thomas Huxley,1893
    • 4. Why AI hasn’t succeeded (yet)
      People know a lot about the world implicitly
      Conversing with a partnerwho doesn’t know these basic things is very frustrating
      50 years of failing to capture this “common sense” information computationally suggests:
      Lack of explicit enumeration makes capture very expensive (encyclopedias don’t have it!)
      Still no idea of the extent of this knowledge
    • 5. People don’t have implicit knowledge of molecular biology
      • Everything anyone knows about MolBio comes from some combination of:
      Textbooks
      Scientific publications
      Databases (e.g. NCBI)
      Experiments done in one’s own lab
      • There is no elicitation barrier to capturing everything known about molecular biology
    • Why would biologists care?
      They have to understand genome-scale data, in the context of all that is already known.
      Magic bullets and biomarkers are not enough
      The idea of finding a single marker of disease state, and addressing it with a specifically targeted drug is not panningout as well as hoped.
    • 6. X
      J.J. Hornberg et al. / BioSystems 83 (2006) 81–90
      Homeostatic networks foil single markers and drugs
      outcome
      target
    • 7. Networks change through time
      Mjolsness, Sharp, Reinitz, A Connectionist Model of Development J. Theoretical Bio 1991
    • 8. Understanding the data
      “We are close to having a $1,000 genome sequence, but this may be accompanied by a $1,000,000 interpretation.” - Bruce Korf, president American College of Medical Genetics
      Not only is the cost of sequencing essentially free, but big computers and big storage are cheap, too. What will keep us busy for the next 50 years is understanding the data” - Russ Altman, chair of Biomedical Engineering at Stanford
    • 9. The Hard Problem
      Given a set of genomic regions, variants, gene products, and/or concentrations empirically involved in a defined phenotype…
      Produce:
      An explanation of the reasons that those genomic regions / variants / products / concentrations are (or are not) relevant to the phenotype
      Evidence to support the explanation(s)
      Alternative explanations
      Reasons to prefer one explanation over another
    • 10. Answering Why? questions
      Fundamental to human cognitive development
      Amazing human facility
      Even to confabulation
      Causal explanation is central to science
      The only question “big data”doesn’t seem to be enoughto answer (cfRamachandran & Hovy, 2002)
    • 11. Abductive inference
      “However man may have acquired his faculty of divining the ways of Nature, it has certainly not been by a self-controlled and critical logic. Even now he cannot give any exact reason for his best guesses…. For though it goes wrong oftener than right, yet the relative frequency with which it is right is on the whole the most wonderful thing in our constitution.”
      The Essential Peirce: Selected Philosophical Writings v. 2 p. 217
    • 12. “Two paradoxes are better than one; they may even suggest a solution” –Edward Teller
      Molecular Systems Biology
      +
      Artificial Intelligence
    • 13. Explanation is hard
      Not just about the connection between an explanation and the thing explained, but must also be “consonant” with other explanations.
      Knowledge is key
      Have to know many other explanations.
      Need “judgment” to compare the qualities of alternative explanations.
      Racunas & Shah’s HyBrow system, but required extensive manually represented knowledge
      A “complete enough” knowledge-base?
    • 14. Knowledge-based Computational Biology
      Widespread use, e.g.
      Simulation systems (e.g. BioCyc)
      Question answering systems (e.g. AskHermes or Watson Medicine)
      High-throughput result analysis (e.g. GOEAST, Ontologizer)
      Hypothesis generation / testing (e.g. HyQue)
      Anything that uses an ontology
      Annotations (e.g. GOA)
      Cross-species comparisons
      NCBO
    • 15. KB for explanation
      Knowledge base quality
      Correctness, timeliness (tracking changes)
      Completeness
      A constantly receding goal, that obviously cannot be achieved, but is important anyway
      Need to cover the material in
      Textbooks
      Journal articles
      Databases
    • 16. Explanatory inference
      Even if all the relevant knowledge were available in computationally tractable form…
      We need inferential methods to
      Identify possible explanations of complex biological phenomena (symbolic?)
      Compare alternative explanations in the light of existing evidence (numeric?)
      History of explanatory inference in AI is suggestive, but key open problems remain
    • 17. Why does openness matter?
      Productivity:
      Attacking hard problems efficiently
      Rapid assimilation of effective methods
      Building on (not ignoring) each other’s results
      Equity:
      Access to scientists with low budgets
      Distribution to the widest possible community
      Ethics:
      Transparency for AI is a moral value
    • 18. Transparency is a moral value
      AI matters – lots of social concerns about loss of control, etc. 2001, Robopocolypse
      AI is cheap to replicate, and will diverge (if you can build one mind, building millions more is easy). Too important to be private
      Technological development in the face of such broad social concern requires earning the trust of the society
    • 19. Getting there
      Build on track records of openness
      OBO &Community-curated Ontologies
      Semantic Web / OWL / SPARQL / SWRL
      Open Access Publishing
      Linked Life Data
      Breaking down barriers
      Infrastructure
      Incentives
    • 20. Opening a Bazzar
      To get the productivity advantage, infrastructure matters
      Technical infrastructure to share, compare and integrate code
      Social infrastructure to work together to solve hard problems
      Motivation
      Competition
      Cooperation
    • 21. Confronting the temptations of being proprietary
      The temptations:
      Potential future payoff
      Avoid effort to conform to the infrastructure
      Fear of not being able to improve in the future
      Competition errors
      Wrong task / evaluation / supplied data
      Poor process (timing, execution, infrastructure)
      Doesn’t evolve toward worthy end
    • 22. Goals
      Participation from many, previously disparate communities
      Bio focused: BioCreative, BioNLP,
      Comp Ling: ACL Shared Tasks, CONLL
      NIST: TREC, TAC
      A living, open source collection of useful, modular, repurposable, state of the art software for understanding biomedical texts
      Major advances in AI
    • 23. Facilitating an OS community
      Providing Resources
      Software (UIMA, U-COMPARE)
      Compute power
      Training data (CRAFT, Analysis of analysts)
      Signal Events
      Series of competitions based on CRAFT
      Incentives
      Prizes for significant achievements
    • 24. http://bionlp-corpora.sourceforge.net/CRAFT/http://bionlp.sourceforge.net
    • 25. Remaining challenges
      Pubmed Central and open access
      Corporate ownership (Ontotext & LLD)
      Semantic compatibility of various sources
      UMLS breadth vs. BFO logic
      Sharing inference methods & rules
      Rule syntax (SWRL) is not enough.
      DL inference is not enough
      UIMA equivalent?
    • 26. How to participate
      Help design CRAFT competitions
      Confront publishers about PMC bulk downloads
      Help define inferential benchmarks

    ×