Chance Discovery Employing Text Mining on Scientific Subjects
Student project to employ text mining techniques for chance discovery in a scientific or medical context. Two case studies are offered: poisonous & venomous animals, and dental & arterial plaques.

Published in: Education, Technology
  • http://uniken.unsw.edu.au/features/science-serendipity
  • http://www.springer.com/computer/theoretical+computer+science/book/978-3-540-00549-0
  • http://www.pbs.org/wgbh/nova/body/accidental-discoveries.html (Accidental Discoveries)
  • http://www.thefreedictionary.com/Accidental+Discoveries
  • http://www.cc.gatech.edu/~agray/6240spr11/
  • http://datamining.typepad.com/data_mining/2006/04/visualizing_tex_1.html
  • http://www.nactem.ac.uk/assist/ (an example in the social sciences; these techniques are also used to evaluate the capacities of industrial companies via what they put on their webpages)
  • http://www.chm.bris.ac.uk/motm/ttx/ttx.htm
  • http://atvb.ahajournals.org/search?fulltext=Ann+Progulske+Fox%2C+2008+bacteria+&submit=yes&x=0&y=0
  • Image sources: http://www.cardiobuzz.com/2010_05_01_archive.html http://users.forthnet.gr/ath/abyss/dep1211.htm
  • Textming chancediscovery

    1. Chance Discovery Employing Text Mining on Scientific Subjects
    2. What is Chance Discovery?
    3. Chance discovery means discovering chances - the breaking points in systems, the marketing windows in business, etc. It involves determining the significance of some piece of information about an event and then using this new knowledge in decision making. The techniques developed combine data mining methods for finding rare but important events with knowledge management, groupware, and social psychology. (Theoretical Computer Science, Springer.com)
       ser·en·dip·i·ty: 1. The faculty of making fortunate discoveries by accident. 2. The fact or occurrence of such discoveries. 3. An instance of making such a discovery.
       Fortuitous accidents. Accidents in medicine: The idea sends chills down your spine as you conjure up thoughts of misdiagnoses, mistakenly prescribed drugs, and wrongly amputated limbs. Yet while accidents in the examining room or on the operating table can be regrettable, even tragic, those that occur in the laboratory can sometimes lead to spectacular advances, life-saving treatments, and Nobel Prizes. (PBS NOVA)
    4. “It takes years of study to create a chance discovery,” writes Ashley Hay, author of The Science of Serendipity. Can we use text mining techniques to speed up this process?
    5. What Do We Use Text Mining For?
    6. There are so many different things included in text mining, from social network dynamics to searching the web.
    7. The importance of the corpus: using a corpus. Here on the left is an example in the social sciences, but these techniques are also used to evaluate the capacities of industrial companies via what they put on their webpages, and so on.
    8. Case Studies Proposed
    9. To give you some concrete situations to deal with, two examples are suggested: Poisonous or Venomous Animals, and The Curious Case of Dental & Arterial Plaques.
    10. Most venomous animals appear to produce their own toxins. Not the blue-ringed octopus:
    11. Its venom contains a neurotoxin produced by bacteria.
    12. This toxin is tetrodotoxin,
    13. synthesized by several bacterial species, including strains of the family Vibrionaceae, q.v., Pseudomonas sp., and Photobacterium phosphoreum.
    14. The venom is stored in its salivary glands.
    15. Not an isolated example: many animals host bacteria to provide poisons.
    16. Manually identified keywords or keyphrases. Categories: Placement of the venom,
    17. Phrases associated with the toxin, venom, poison
    18. Structures associated with hosting the bacteria. Possible results: Modified salivary glands
    19. The poison is actually a cocktail of chemicals
    20. Gland, duct, mucus, etc.
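The keyphrase categories above could be held in a small lexicon and counted in a text. The following is only a minimal sketch under stated assumptions: the category names, the phrase lists, and the function `count_categories` are hypothetical illustrations and are not taken from the slides.

```python
import re
from collections import Counter

# Hypothetical category lexicon. The phrases are illustrative assumptions,
# chosen so that no phrase is contained in another (avoids double counting).
CATEGORIES = {
    "placement": ["salivary gland", "duct", "mucus"],
    "toxin": ["toxin", "venom", "poison", "tetrodotoxin", "neurotoxin"],
    "host_structure": ["bacteria", "bacterium", "symbiont"],
}

def count_categories(text, categories=CATEGORIES):
    """Count whole-word, case-insensitive phrase matches per category.

    An optional trailing "s" is allowed so simple plurals also match.
    """
    lowered = text.lower()
    counts = Counter()
    for category, phrases in categories.items():
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase) + r"s?\b"
            counts[category] += len(re.findall(pattern, lowered))
    return counts

if __name__ == "__main__":
    sample = ("The venom is stored in its salivary glands. "
              "Its venom contains a neurotoxin produced by bacteria.")
    print(dict(count_categories(sample)))
```

Whole-word matching is a deliberate choice here: it keeps "toxin" from also matching inside "neurotoxin", so each phrase is counted once under its own entry.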
    21. What will you discover? Should you make separate corpuses for before and after the source of the poison was correctly identified as bacterial? Once you have begun, manually identify key phrases, add synonyms, and see the patterns that result; then you can try to automate this process. How can this system be optimized? When you have some results, there is also the challenge of putting them into perspective. If the corpus of the king cobra has many phrases similar to that of the blue-ringed octopus, what does this mean?
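One possible way to put two corpuses into perspective is to reduce each to the set of key phrases it contains, after folding synonyms together, and then measure the overlap. The sketch below uses Jaccard similarity, which is a swapped-in technique and not one named in the slides. The synonym table, the function names, and the two one-sentence "corpuses" are invented for illustration.

```python
import re

# Hypothetical synonym table: canonical key phrase -> surface variants.
SYNONYMS = {
    "gland": ["salivary gland", "venom gland"],
    "neurotoxin": ["neurotoxin", "tetrodotoxin", "ttx"],
    "bacterial origin": ["bacteria", "bacterium", "bacterial"],
    "duct": ["duct"],
}

def key_phrases_present(text, synonyms=SYNONYMS):
    """Return the canonical key phrases with at least one variant in the text."""
    lowered = text.lower()
    found = set()
    for canonical, variants in synonyms.items():
        if any(re.search(r"\b" + re.escape(v) + r"s?\b", lowered)
               for v in variants):
            found.add(canonical)
    return found

def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

if __name__ == "__main__":
    # Toy stand-ins for real corpuses (invented sentences).
    octopus = "Tetrodotoxin is produced by bacteria living in the salivary glands."
    cobra = "The venom gland of the king cobra delivers a neurotoxin through a duct."
    shared = key_phrases_present(octopus) & key_phrases_present(cobra)
    print(sorted(shared))
    print(jaccard(key_phrases_present(octopus), key_phrases_present(cobra)))
```

A high score only says that the two texts use similar vocabulary. Whether that points to a shared mechanism is the interpretive question the slide raises, and the same comparison could be run on the before and after corpuses.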
    22. The Curious Case of Dental & Arterial Plaques. For many years medicine has known that dental plaque is caused by oral bacteria. However, in 2008 University of Florida researchers cornered the bacterial ringleaders of gum disease inside human artery-clogging plaque; see "Human Atherosclerotic Plaque Contains Viable Invasive Actinobacillus actinomycetemcomitans and Porphyromonas gingivalis" by Emil V. Kozarov, Brian R. Dorn, Charles E. Shelburne, William A. Dunn Jr, and Ann Progulske-Fox.
    23. What does this mean? If these two have the same cause, perhaps other conditions where the keyword "plaque" appears are also bacterial in origin, or microbes are otherwise implicated.
    24. Again, more or less the same protocol: Manually search texts to identify keyphrases
    25. Include synonyms; use ScienceDirect, Google Scholar, etc.
    26. Find other medical conditions that refer to plaque or use similar phrases
    27. Create corpuses, manipulate searches, automate
    28. Analyze the results and present them
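The automation step of the protocol above might begin with something as simple as pulling out the sentences where "plaque" co-occurs with a microbial term, grouped by condition. This is a hedged sketch only: the term list, the function `plaque_cooccurrence`, the document labels, and the sample texts are all hypothetical, and the sentence splitter is deliberately naive.

```python
import re
from collections import defaultdict

# Hypothetical list of microbial terms; extend it with synonyms as needed.
MICROBE_TERMS = ("bacteria", "bacterial", "microbe", "microbial", "biofilm")

def plaque_cooccurrence(docs, target="plaque", terms=MICROBE_TERMS):
    """Map each document label to sentences mentioning both the target
    word (singular or simple plural) and at least one microbial term."""
    term_set = set(terms)
    hits = defaultdict(list)
    for label, text in docs.items():
        # Naive splitter: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            has_target = target in words or target + "s" in words
            if has_target and words & term_set:
                hits[label].append(sentence.strip())
    return dict(hits)

if __name__ == "__main__":
    # Toy texts standing in for per-condition corpuses (invented).
    docs = {
        "dental": "Dental plaque is caused by oral bacteria. Brushing helps.",
        "arterial": "Viable bacteria were found in arterial plaque.",
        "other": "Plaques were observed. No cause was reported.",
    }
    for label, sentences in plaque_cooccurrence(docs).items():
        print(label, sentences)
```

Conditions that never appear in the result are as informative as those that do: they are the places where a bacterial link either does not exist or has not yet been written about.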
    29. Good luck! Contact me if you have any questions: yuknachris(at)yahoo.com
