Focusing on the health vertical, consistent with the Open Knowledge Network initiative. Also see a relevant review of the Contextualized Knowledge Graph portal: https://www.slideshare.net/ntkimvinh7/ckg-portal-a-knowledge-publishing-proposal-for-open-knowledge-network
Women Who Code-HSV Event:
'An Introduction to Machine Learning and Genomics'. Dr. Lasseigne will introduce the R programming language and the foundational concepts of machine learning with real-world examples including applications in the field of genomics with an emphasis on complex human disease research.
Brittany Lasseigne, PhD, is a postdoctoral fellow in the lab of Dr. Richard Myers at the HudsonAlpha Institute for Biotechnology and a 2016-2017 Prevent Cancer Foundation Fellow. Dr. Lasseigne received a BS in biological engineering from the James Worth Bagley College of Engineering at Mississippi State University and a PhD in biotechnology science and engineering from The University of Alabama in Huntsville. As a graduate student, she studied the role of epigenetics and copy number variation in cancer, identifying novel diagnostic biomarkers and prognostic signatures associated with kidney cancer. In her current position, Dr. Lasseigne’s research focus is the application of genetics and genomics to complex human diseases. Her recent work includes the identification of gene variants linked to ALS, characterization of gene expression patterns in schizophrenia and bipolar disorder, and development of non-invasive biomarker assays. Dr. Lasseigne is currently focused on integrating genomic data across cancers with functional annotations and patient information to explore novel mechanisms in cancer etiology and progression, identify therapeutic targets, and understand genomic changes associated with patient survival. Based upon those analyses, she is creating tools to share with the scientific community.
Combining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs - Paul Groth
A look at how thinking about Web data and the sources of semantics can help drive decisions on combining latent and explicit knowledge. Examples from Elsevier and many pointers to related work.
CFPB's public data platform and the academic research community - Desiree Zamora Garcia
A talk given in 2013 at the CFPB. Getting data and sharing data is a pain. What does the workflow look like now, and what *could* it look like with the help of CFPB's public data platform?
BigDataEurope: Project Introduction @ Year #1 Workshops - BigData_Europe
An overview of the BDE project's objective, as presented in the introduction (with some variations) in each of the 1st Year series of workshops (seven: one per societal challenge).
Workshop #1 Year Schedule available at: http://www.big-data-europe.eu/first-round-of-bigdataeurope-workshops-announced/
BigDataEurope - Big Data & Secure Societies - BigData_Europe
Big Data and the Secure Societies domain (vis-à-vis the respective H2020 Societal Challenge): opportunities, challenges, and requirements. As presented and discussed at the public launch of the BigDataEurope project.
BDE-SC6 Hangout - “Insight into Virtual Currency Ecosystems” - BigData_Europe
The third SC6 webinar was held on 16 February 2017. It was organised by the Consortium of European Social Science Data Archives (CESSDA) from Norway and the Semantic Web Company (SWC) from Austria. The theme of the webinar was “Insight into Virtual Currency Ecosystems”, presented by Dr. Bernhard Haslhofer, Data Scientist at the Austrian Institute of Technology.
The MD Anderson / IBM Watson Announcement: What does it mean for machine lear... - Health Catalyst
It’s been over six years since IBM’s Watson amazed all of us on Jeopardy, but it has yet to deliver similar breakthroughs in healthcare. The headline in last week’s Forbes article read, “MD Anderson Benches IBM Watson In Setback For Artificial Intelligence In Medicine.” Is it really a setback for the entire industry or not? Health Catalyst’s EVP for Product Development, Dale Sanders, believes that the challenges are unique to IBM’s machine learning strategy in healthcare. If they adjust that strategy and better manage expectations about what’s possible for machine learning in medicine, the future will be brighter for Watson, their clients, and AI in healthcare in general. Watson’s success is good for all of us, but its failure is bad for all of us, too.
Join Dale as he discusses:
The good news: Machine learning technology is accelerating at a rate beyond Moore’s Law. Dale believes that machine learning algorithms and models are doubling in capability every six months.
The bad news: The healthcare data ecosystem is not nearly as rich as many would believe, and certainly not as rich as that used to train Watson for Jeopardy. Without high-volume, high-quality data, Watson’s potential and the constant advances in machine learning algorithms will hit a glass ceiling in healthcare.
The best news: By adjusting strategy and expectations, there are still plenty of opportunities to do great things with machine learning by using the current data content in healthcare, while we build out the volume and breadth of data we need to truly understand the patient at the center of the healthcare picture… and you don’t need an army of PhD data scientists to do it.
How to Become a Data Science Company instead of a company with Data Scientist... - Ruth Kearney
The journey of becoming a data science company is more about culture and thinking than about hiring and up-skilling individuals. At Novartis, while we are hiring data scientists and spending a lot of time on training and learning related to data science, the destination for us is cultural change, which is required to make us a data science company.
Head of the AI Hub Dublin, Ashwini Mathur will share practical insights into the Novartis journey and how each employee plays a part. He will talk about the value of using the language of data science throughout the organisation and how this takes them one step closer to becoming a data science company.
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ... - Tom Plasterer
As scientists in the life sciences, we are trained to pursue singular goals around a publication, a validated target, or a drug submission. Our failure rates are exceedingly high, especially as we move closer to patients in the attempt to collect sufficient clinical evidence to demonstrate the value of novel therapeutics. This wastes resources as well as the time of patients depending on us for the next breakthrough.
Edge Informatics is an approach to ameliorating these failures. Using technical and social solutions together, knowledge can be shared and leveraged across the drug development process. This is accomplished by making data assets discoverable, accessible, self-described, reusable, and annotatable. The Open PHACTS project pioneered this approach and has provided a number of the technical and social solutions that enable Edge Informatics. A number of pre-competitive consortia and some content providers have also embraced this approach, facilitating networks of collaborators within and outside a given organization. Taken together, this fosters more accurate, timely, and inclusive decision-making.
CODATA International Training Workshop in Big Data for Science for Researcher... - Johann van Wyk
Presentation at the NeDICC Meeting on 16 July 2014. Feedback from the CODATA International Training Workshop in Big Data for Science for Researchers from Emerging and Developing Countries, Beijing, China, 5-20 June 2014.
Using social media to develop your scientific career - Daniel Quintana
These slides outline how you can harness social media to boost your professional profile, collaboration, information gathering, and public outreach. Practical information includes how to establish an online presence, effectively use Twitter and other useful platforms (e.g., blogs, Linkedin), and best manage the deluge of online information.
First presented at NORMENT, KG Jebsen Centre for Psychosis Research, University of Oslo on the 8th of October, 2014
How HudsonAlpha Innovates on IT for Research-Driven Education, Genomic Medici... - Dana Gardner
Transcript of a discussion on how HudsonAlpha leverages modern IT infrastructure and big data analytics to power research projects as well as pioneering genomic medicine findings.
This workshop is a hands-on introduction to machine learning with R and was presented on December 8, 2017 at the University of South Carolina for the 2017 Computational Biology Symposium held by the International Society for Computational Biology Regional Student Group-Southeast USA.
Keynote Analytics Week, Boston, MA November 7, 2014
Big Data is in its infancy and is opening the door to profound change - Grand Opportunities (Accelerating Scientific Discovery) and Grand Challenges to be addressed over the next decade. We explore the premise that Data Science is to data-intensive discovery as the Scientific Method is to scientific discovery, leading us to potential Laws and Limits of Data Science, and then to Best Practices.
Lifelogging - A long-term data analytics challenge - Cathal Gurrin
A talk delivered at the DBTA workshop on Lifelogging and Long-term Digital Preservation in Lugano, November 2015. The talk introduces lifelogging and the concept of the digital self. It highlights some potential advantages of lifelogging and suggests the technologies that we need to develop (or have developed) to realise these advantages. Finally, it concludes with some insights based on my nine years of practitioner experience.
Big Data Europe SC6 WS #3: PILOT SC6: CITIZEN BUDGET ON MUNICIPAL LEVEL, Mart... - BigData_Europe
Presentation at the Big Data Europe SC6 workshop #3 on 11.9.2017 in Amsterdam, co-located with the SEMANTiCS2017 conference: BDE Pilot Societal Challenge 6: Citizen Budget on Municipal Level, by Martin Kaltenboeck (Semantic Web Company, SWC).
Big Data Europe SC6 WS #3: Big Data Europe Platform: Apps, challenges, goals ... - BigData_Europe
Talk at Big Data Europe SC6 workshop #3, held on 11.9.2017 in Amsterdam, co-located with the SEMANTiCS2017 conference: The Big Data Europe Platform: Apps, challenges, goals, by Aad Versteden, TenForce.
Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie... - BigData_Europe
Where we are and are going for Big Data in OpenScience
Keynote talk at the Big Data Europe SC6 Workshop on 11.9.2017 in Amsterdam co-located with SEMANTiCS2017: The perspective of European official statistics by Fernando Reis, Task-Force Big Data, European Commission (Eurostat).
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A... - BigData_Europe
Slides for the keynote talk at Big Data Europe workshop #3 on 11.9.2017 in Amsterdam, co-located with the SEMANTiCS2017 conference, by Ron Dekker, Director of CESSDA: European Open Science Agenda: where we are and where we are going?
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport... - BigData_Europe
Slides of the keynote at the 3rd Big Data Europe SC6 Workshop, co-located with SEMANTiCS2017 in Amsterdam (NL): The European Research Data Landscape: Opportunities for CESSDA, by Peter Doorn, Director of DANS; Chair, Science Europe W.G. on Research Data; Chair, CESSDA ERIC General Assembly.
BDE SC3.3 Workshop - Options for Wind Farm performance assessment and Power f... - BigData_Europe
Options for Wind Farm Performance Assessment and Power Forecasting (Mr. A. Kyritsis, ALTSOL/TERNA), at the BigDataEurope Workshop, Amsterdam, November 2017.
Big Data Europe: Workshop 3 SC6 Social Science: THE IMPORTANCE OF METADATA & ... - BigData_Europe
Big Data Europe: Workshop 3 SC6 Social Science, 11.09.2017 in Amsterdam, co-located with SEMANTiCS2017, titled: The Importance of Metadata & Big Data in Open Science. Slides by Ivana Versic (CESSDA) and Martin Kaltenböck (SWC).
Richard's entangled adventures in wonderland - Richard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel Prize in Physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, "super-determinism" is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of the allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think, however, that it is a smokescreen, and the slogan "lost in math" comes to mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
Multi-source connectivity as the driver of solar wind variability in the heli... - Sérgio Sacani
The ambient solar wind that fills the heliosphere originates from multiple sources in the solar corona and is highly structured. It is often described as high-speed, relatively homogeneous plasma streams from coronal holes and slow-speed, highly variable streams whose source regions are under debate. A key goal of ESA/NASA's Solar Orbiter mission is to identify solar wind sources and understand what drives the complexity seen in the heliosphere. By combining magnetic field modelling and spectroscopic techniques with high-resolution observations and measurements, we show that the solar wind variability detected in situ by Solar Orbiter in March 2022 is driven by spatio-temporal changes in the magnetic connectivity to multiple sources in the solar atmosphere. The magnetic field footpoints connected to the spacecraft moved from the boundaries of a coronal hole to one active region (12961) and then across to another region (12957). This is reflected in the in situ measurements, which show the transition from fast to highly Alfvénic then to slow solar wind that is disrupted by the arrival of a coronal mass ejection. Our results describe solar wind variability at 0.5 au but are applicable to near-Earth observatories.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep... - University of Maribor
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN - Sérgio Sacani
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
(May 29th, 2024) Advancements in Intravital Microscopy: Insights for Preclini... - Scintica Instrumentation
Intravital microscopy (IVM) is a powerful tool utilized to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been accomplished using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed tissue imaging, IVM allows for ultra-fast, high-resolution imaging of cellular processes over time and space in their natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provides insights into the progression of disease, response to treatments, or developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM Technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system’s unique features and user-friendly software enable researchers to probe fast dynamic biological processes such as immune cell tracking and cell-cell interaction, as well as vascularization and tumor metastasis, in exceptional detail. This webinar will also give an overview of IVM being utilized in drug development, offering a view into the intricate interaction between drugs/nanoparticles and tissues in vivo and allowing for the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancement of novel therapeutic strategies.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a... - Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides a means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich in features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization.
To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ... - Sérgio Sacani
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters spanning 0.4-0.9 µm) and novel JWST images with 14 filters spanning 0.8-5 µm, including 7 medium-band filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data at >2.3 µm to construct an ultradeep image, reaching as deep as ≈31.4 AB mag in the stack and 30.3-31.0 AB mag (5σ, r = 0.1″ circular aperture) in individual filters. We measure photometric redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts z = 11.5-15. These objects show compact half-light radii of R_1/2 ∼ 50-200 pc, stellar masses of M⋆ ∼ 10^7-10^8 M⊙, and star-formation rates of SFR ∼ 0.1-1 M⊙ yr^-1. Our search finds no candidates at 15 < z < 20, placing upper limits at these redshifts. We develop a forward-modeling approach to infer the properties of the evolving luminosity function without binning in redshift or luminosity, which marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results, and that the luminosity function normalization and UV luminosity density decline by a factor of ∼2.5 from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical models for the evolution of the dark matter halo mass function.
5. Data growth is exponential
Number of articles indexed by MEDLINE (PubMed) per year: http://dan.corlan.net/medline-trend.html
Southan and Cameron, Beyond the tsunami: developing the infrastructure to deal with life sciences data, The Fourth Paradigm: Data-Intensive Scientific Discovery, Microsoft Corp., 2009.
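The growth claim on this slide can be checked with a quick log-linear fit: exponential growth appears as a straight line in log space, and the slope gives the annual growth rate. The yearly counts below are illustrative placeholders, not the actual MEDLINE series (which is available at the linked medline-trend page).

```python
import math

# Illustrative (not actual MEDLINE) yearly article counts.
years = [2000, 2002, 2004, 2006, 2008, 2010]
counts = [528000, 576000, 634000, 706000, 790000, 890000]

# Least-squares fit of log(count) = a + b * (year - year0).
n = len(years)
xs = [y - years[0] for y in years]
ys = [math.log(c) for c in counts]
xbar = sum(xs) / n
ybar = sum(ys) / n
b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
    sum((x - xbar) ** 2 for x in xs)

growth = math.exp(b) - 1          # fractional growth per year
doubling_time = math.log(2) / b   # years for the literature to double
```

With these placeholder numbers the fit gives roughly 5% growth per year, i.e. a doubling time on the order of 13 years; the real MEDLINE curve can be substituted directly into `counts`.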
6. Data is a source of knowledge
Gillam et al., The healthcare singularity and the age of semantic medicine, The Fourth Paradigm: Data-Intensive Scientific Discovery, Microsoft Corp., 2009.
7. One just needs to make the connection
The Swanson case: Fish oil and Raynaud's syndrome
• Public knowledge since 1975: Raynaud's syndrome is associated with high blood viscosity, platelet aggregability, and vasoconstriction.
• Public knowledge since 1984: Fish oil leads to reductions in blood lipids, platelet aggregability, blood viscosity, and vascular reactivity.
• Swanson puts the two together in 1986: Can dietary fish oil ameliorate or prevent Raynaud's syndrome? He supports his hypothesis with relevant literature.
• DiGiacomo confirms the hypothesis in 1989.
Vision: Create big data machinery that helps produce and support more such cases.
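Swanson's reasoning can be sketched as a tiny program (the "A-B-C" model of literature-based discovery): if the literature links A to B and, separately, B to C, but never A to C directly, then A-C is a candidate hypothesis worth testing. The term associations below are a toy stand-in for real MEDLINE co-occurrence data.

```python
# Toy topic -> associated-concept sets, mimicking what one would extract
# from MEDLINE abstracts. Illustrative data only.
literature = {
    "raynaud's syndrome": {"blood viscosity", "platelet aggregability",
                           "vasoconstriction"},
    "fish oil": {"blood viscosity", "platelet aggregability",
                 "blood lipids", "vascular reactivity"},
}

def candidate_links(assoc):
    """Yield (a, c, shared) for topic pairs with no direct link in the
    corpus but at least one shared intermediate concept (Swanson's A-B-C)."""
    terms = list(assoc)
    for i, a in enumerate(terms):
        for c in terms[i + 1:]:
            directly_linked = c in assoc[a] or a in assoc[c]
            shared = assoc[a] & assoc[c]
            if shared and not directly_linked:
                yield (a, c, shared)

hypotheses = list(candidate_links(literature))
```

On this toy corpus the only candidate pair is (Raynaud's syndrome, fish oil), linked through blood viscosity and platelet aggregability: exactly Swanson's 1986 inference.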
8. BioASQ vision
• 2 articles are published in biomedical journals every minute!
• Make sure this knowledge is used to the benefit of patients
• Need to make it accessible to biomedical experts
• Search is not effective enough
• Push research in automated question answering
• A challenge for such systems can achieve a multiplying effect
13-15. Talking to BioASQ experts
“as I'm growing older . . . I spend more time in front of the computer but I learn less. . . . the complexity has increased, the variety has increased and my time has been reduced.”
“When I do research I use IT stuff all the time, I'm looking for papers and data... I'm also doing statistical analysis”
“PubMed and all this of course, we really depend on that. We cannot work if we don't search in those.”
“The bulk of information, that's the main problem. . . . if someone has some extra time and starts reading the results of a search then this might never end!”
“Sometimes you get irrelevant results. That's the main problem.”
“There is abundance of structured information . . . Unfortunately not all structured databases are included into one.”
“I am looking at least into twenty different places for the same protein.”
“. . . since I use a number of different programs I forget them by the time I want to use them again and I have to remember them once more.”
16. Putting big data to work
(Vision) Information systems that act like peers to human experts:
• understand the information need of the expert
• represent the need in machine-readable format
• match it to the information and data available in various sources
• provide a comprehensive and comprehensible response, with supporting material
(Big data added value) Integration of information from many sources and large-scale semantic indexing.
(Outlook) There is a long way ahead, but the impact of even marginal progress on public health can be very significant!
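As a rough illustration of the "large-scale semantic indexing" step, the sketch below tags free text with controlled-vocabulary concept IDs by naive surface-form matching. The tiny concept dictionary and the example abstract are illustrative stand-ins for a real MeSH-based indexer such as NLM's.

```python
# Illustrative concept dictionary: concept ID -> known surface forms.
concept_dictionary = {
    "D009369": {"tumor", "neoplasm", "cancer"},
    "D008403": {"machine learning", "statistical learning"},
    "D005819": {"genomics", "genome"},
}

def semantic_index(text, dictionary):
    """Return the set of concept IDs whose surface forms occur in the text.

    Naive substring matching; a production indexer would use tokenization,
    disambiguation, and learned ranking on top of the vocabulary."""
    lowered = text.lower()
    return {cid for cid, forms in dictionary.items()
            if any(form in lowered for form in forms)}

abstract = "We apply machine learning to cancer genomics data."
labels = semantic_index(abstract, concept_dictionary)
```

Once every document carries such concept labels, sources that disagree on free-text wording can be joined and queried through the shared vocabulary, which is the integration step the slide describes.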
17. Where do we stand?
• Big data is getting linked
• We have a range of tools for analysing and indexing such data
• BDE is set to bring the pieces together
• Challenges such as BioASQ push research further; NLM improved their MeSH indexing engine by 5% in the first year of BioASQ!
• IBM Watson to be put into use by 14 US cancer research institutes
• Robotic science assistants are making their appearance; “Adam” generates functional genomics hypotheses about the yeast Saccharomyces cerevisiae