Text mining exercise
Lars Juhl Jensen
1.
Text mining exercise
Lars Juhl Jensen
2.
the task
3.
named entity recognition
4.
human proteins
5.
link proteins to diseases
6.
what I have done
7.
information retrieval
8.
two diseases
9.
prostate cancer
10.
schizophrenia
11.
two sets of documents
12.
62,755 abstracts
13.
65,588 abstracts
14.
one directory with each set
15.
one file with each abstract
16.
dictionary
17.
tab-delimited file
18.
human proteins
19.
22,523 entities
20.
synonyms
21.
from many databases
22.
orthographic variation
23.
prefixes and postfixes
24.
automatically generated
25.
2,726,495 names
26.
tagdir program
27.
flexible matching
28.
upper- and lower-case
29.
spaces and hyphens
30.
tab-delimited output
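The flexible matching described on slides 26 to 30 can be sketched roughly as follows. This is not the tagdir program itself, only a minimal illustration of dictionary lookup that ignores case, spaces, and hyphens. The two-column layout (entity identifier, then name), the function names, and the `max_tokens` limit are all assumptions made for this example.

```python
import re

TOKEN = re.compile(r"[A-Za-z0-9]+")

def normalize(name):
    """Lower-case a name and drop spaces and hyphens, so that
    'IL-2', 'IL 2' and 'il2' all map to the same key."""
    return re.sub(r"[\s\-]+", "", name.lower())

def load_dictionary(lines):
    """Build a lookup from normalized name to entity identifiers.
    Assumes tab-delimited lines of the form 'entity<TAB>name'."""
    index = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        entity, name = fields[0], fields[1]
        index.setdefault(normalize(name), set()).add(entity)
    return index

def tag(text, index, max_tokens=4):
    """Return (start, end, matched_text, entity) for every run of up to
    max_tokens consecutive tokens whose normalized form is in the index."""
    tokens = [(m.start(), m.end()) for m in TOKEN.finditer(text)]
    hits = []
    for i in range(len(tokens)):
        for j in range(i, min(i + max_tokens, len(tokens))):
            start, end = tokens[i][0], tokens[j][1]
            span = text[start:end]
            for entity in sorted(index.get(normalize(span), ())):
                hits.append((start, end, span, entity))
    return hits
```

Writing each hit with `"\t".join(map(str, hit))` would give one tab-delimited output line per match. A real tagger over millions of names would need a faster matching strategy than this nested loop.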
31.
what you will do
32.
named entity recognition
33.
find unfortunate names
34.
create “black list”
35.
information extraction
36.
co-mentioning
37.
within documents
38.
link proteins to diseases
39.
link between the diseases
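One way to approach co-mentioning within documents is sketched below, under stated assumptions. Because each document set was retrieved for one disease, a protein is treated here as linked to that disease in proportion to the number of documents in the set that mention it. Proteins that are well supported in both sets then suggest a link between the diseases. The input format (pairs of document identifier and protein), the function names, and the `min_docs` threshold are hypothetical.

```python
from collections import Counter

def document_counts(mentions):
    """Count, for each protein, how many distinct documents mention it.
    `mentions` is an iterable of (document_id, protein) pairs."""
    unique_pairs = set(mentions)  # repeated mentions in one document count once
    return Counter(protein for _, protein in unique_pairs)

def shared_proteins(counts_a, counts_b, min_docs=2):
    """Return proteins mentioned in at least min_docs documents of both
    sets, strongest first (ranked by the smaller of the two counts)."""
    common = [
        protein for protein in counts_a
        if counts_a[protein] >= min_docs and counts_b.get(protein, 0) >= min_docs
    ]
    return sorted(common, key=lambda p: (-min(counts_a[p], counts_b[p]), p))
```

Raw document counts favor proteins that are mentioned often everywhere, so a more careful analysis would compare each count against a background frequency.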
40.
41.
a helping hand
42.
“black list”
43.
100+ matches
44.
10+ matches
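The match-count thresholds above suggest a simple way to find candidates for the black list: count how often each name was matched and inspect the most frequent ones by hand. The sketch below assumes the tagger output has already been reduced to a list of matched names, or to (name, entity) pairs. The function names and the toy data in the usage note are illustrative only.

```python
from collections import Counter

def frequent_names(matched_names, min_matches=100):
    """Return (name, count) pairs for names matched at least min_matches
    times, most frequent first. Names are compared case-insensitively."""
    counts = Counter(name.lower() for name in matched_names)
    return [(name, n) for name, n in counts.most_common() if n >= min_matches]

def apply_black_list(hits, black_list):
    """Drop (name, entity) hits whose name is on the black list."""
    blocked = {name.lower() for name in black_list}
    return [hit for hit in hits if hit[0].lower() not in blocked]
```

For example, if a name that doubles as an ordinary English word shows up among the names with 100+ matches, adding it to the black list and calling `apply_black_list` removes those hits before any co-mention counting. Frequent names are only candidates: a frequently matched name can also be a correct and important one, so each still needs a manual check.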
45.
46.
wrap up
47.
prostate cancer
48.
FOLH1
49.
schizophrenia
50.
Glutamate carboxypeptidase II
51.
same protein
52.
synonyms matter
53.
“black list” is crucial
54.
text mining is quite simple
55.
diseases.jensenlab.org