Invasive Aspergillosis (IA) is a serious fungal infection and a major cause of mortality in patients undergoing
allogeneic stem cell transplantation or chemotherapy for acute leukaemia. Large amounts of data are collected during the treatment of high-risk haematology patients and we
propose leveraging such data to produce more accurate predictions of IA diagnosis. We describe here the
application of machine learning techniques to predict the probability of IA, which can be used to enhance the
interpretation of biomarker results.
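One way a predicted probability could sharpen the reading of a biomarker result is to treat it as a pre-test probability and update it with Bayes' rule. The sketch below is only an illustration of that idea, not the method used in this work: the function name, the pre-test probability, and the sensitivity and specificity values are all hypothetical.

```python
def post_test_probability(pre_test: float, sensitivity: float,
                          specificity: float, positive: bool) -> float:
    """Update a pre-test disease probability with one test result (Bayes' rule)."""
    if positive:
        # Probability mass of diseased and non-diseased patients testing positive.
        diseased = sensitivity * pre_test
        healthy = (1.0 - specificity) * (1.0 - pre_test)
    else:
        # Probability mass of diseased and non-diseased patients testing negative.
        diseased = (1.0 - sensitivity) * pre_test
        healthy = specificity * (1.0 - pre_test)
    return diseased / (diseased + healthy)


# Hypothetical values, chosen only for illustration.
model_probability = 0.10  # assumed output of a risk model for one patient
print(post_test_probability(model_probability, 0.80, 0.90, positive=True))
print(post_test_probability(model_probability, 0.80, 0.90, positive=False))
```

The same positive result moves a low-risk patient and a high-risk patient to very different post-test probabilities, which is why a patient-specific prior can matter.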
This document discusses methods for detecting plant diseases using information technology. It describes direct detection methods like serological techniques including ELISA and molecular methods like PCR. Indirect detection methods include imaging techniques like fluorescence and hyperspectral imaging, as well as spectroscopic techniques. Various biosensors for disease detection are also outlined, such as bacteriophage-based biosensors, affinity biosensors using antibodies or DNA, and DNA/RNA-based affinity biosensors. Early detection of plant diseases using these IT-based methods can help control diseases and reduce agricultural losses.
Predicting the Response to Hepatitis C Therapy by Simone Romano
Working with medical doctors, we implemented novel data mining techniques to predict the Sustained Virological Response (SVR) to hepatitis C treatment. In order to make the models more interpretable, we used Probability Estimation Trees (PETs).
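A Probability Estimation Tree is a decision tree whose leaves return class probabilities rather than hard labels. A minimal sketch of the idea follows, assuming a single split and a Laplace-corrected leaf estimate (a common choice for PETs, though not confirmed as the one used here). The feature values, labels and threshold are hypothetical.

```python
from collections import Counter


def laplace_probability(class_count, leaf_size, n_classes=2):
    """Laplace-corrected class frequency, which avoids estimates of exactly 0 or 1."""
    return (class_count + 1) / (leaf_size + n_classes)


def fit_stump(values, labels, threshold):
    """Split on value <= threshold and store the class counts of each leaf."""
    leaves = {"left": Counter(), "right": Counter()}
    for value, label in zip(values, labels):
        side = "left" if value <= threshold else "right"
        leaves[side][label] += 1
    return leaves


def predict_probability(leaves, value, threshold, positive_label=1):
    """Return the smoothed probability of the positive class in the matching leaf."""
    leaf = leaves["left" if value <= threshold else "right"]
    return laplace_probability(leaf[positive_label], sum(leaf.values()))


# Hypothetical data: one numeric feature and a binary response label (1 = responder).
feature = [1, 2, 3, 4, 5, 6, 7, 8]
response = [1, 1, 1, 0, 1, 0, 0, 0]
stump = fit_stump(feature, response, threshold=4.5)
print(predict_probability(stump, 2, threshold=4.5))
print(predict_probability(stump, 7, threshold=4.5))
```

A full PET would grow many such splits; the point here is only that each leaf reports a probability a clinician can read directly.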
Food is one of the basic needs of human beings. With the population increasing day by day, it has become important to grow enough crops to feed it. Agriculture's share in the livelihood of rural India is about 58%. Over time, however, plants are being affected by many kinds of diseases, which cause great harm to agricultural production. Monitoring plant diseases is very difficult: it requires a tremendous amount of work, expertise in plant diseases, and excessive processing time. Hence, image processing is used to detect plant diseases by simply capturing images of the leaves and comparing them with the available data sets. Emerging technologies such as image processing address such issues very effectively. In this project, four consecutive stages are used to identify the type of disease: pre-processing, leaf segmentation, feature extraction and classification. This paper aims to support farmers in an efficient way.
This document provides an overview of mathematical modeling of infectious diseases. It discusses how to build deterministic compartmental models using systems of differential equations. It also covers fitting models to data by estimating parameters, and analyzing uncertainty and sensitivity using techniques like Latin hypercube sampling, partial rank correlation coefficients, and Fourier amplitude sensitivity testing. The goal is to understand disease transmission dynamics and evaluate the impact of interventions through mathematical modeling.
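As an illustration of a deterministic compartmental model, the sketch below integrates the classic SIR equations (dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I) with a simple forward Euler step. The choice of the SIR model, the Euler scheme and every parameter value are assumptions made for this example, not taken from the summarized document.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the SIR equations with forward Euler and return (t, S, I, R) tuples."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i / n * dt  # flow from S to I during this step
        new_recoveries = gamma * i * dt         # flow from I to R during this step
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((step * dt, s, i, r))
    return trajectory


# Hypothetical parameters: beta / gamma = 3 is the basic reproduction number here.
result = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=100)
peak_time, _, peak_infected, _ = max(result, key=lambda row: row[2])
print(f"peak of {peak_infected:.0f} infected at day {peak_time:.1f}")
```

Fitting such a model to data then amounts to searching for the beta and gamma that make the simulated curve match observed case counts.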
This document provides a summary and details of Madhavi Tippani's experience and qualifications. She has over 5 years of experience in biomedical engineering, with skills in programming, data analysis, and medical imaging. Currently she is a Research Programmer analyzing corneal imaging data and studying corneal regeneration. Her experience also includes projects involving image and signal processing, biomedical devices, and statistical analysis.
Allergic Broncho Pulmonary Aspergillosis (ABPA) by Dr. Tinku Joseph
HRCT is more sensitive than CXR in detecting bronchiectasis and other pulmonary changes in ABPA. It helps establish the diagnosis and assess disease severity and response to treatment.
HEART DISEASES PREDICTION USING MACHINE LEARNING ALGORITHM by PoojaSri45
Implemented a machine learning project aimed at predicting heart diseases using various algorithms and techniques. Developed as a part of academic or professional endeavor, the project demonstrates proficiency in data preprocessing, feature selection, model training, and evaluation.
This document presents the aim and methodology of a study that aims to develop a machine learning model to predict measles outbreaks. The study will collect a large, diverse dataset from various health sources to train models. It will preprocess the data, select features, train and evaluate models, and deploy the best model in a web app. The model is expected to accurately predict measles likelihood and outbreaks by identifying important risk factors from the extensive dataset. The results could help control measles spread, especially in under-resourced areas.
The document summarizes a study on using an HPV mRNA test to detect high-risk HPV infections in 302 patients. The study found the mRNA test had a low invalid rate and provided useful results. In patients with abnormal cytology results of ASCUS or LSIL, 66% and 56% respectively tested negative for high-risk HPV using the mRNA test. The test also detected active HPV infections in 10 out of 48 patients with unclear cytology results. The mRNA test was found to be a fast and reliable method for HPV detection and typing.
Exploiting NLP for Digital Disease Informatics by Nigel Collier
These are the slides from my talk at the Department of Computer Science at Sheffield University. The talk covers broad ground in my experience of applying natural language processing to knowledge discovery from various media including social media, news and the scientific literature.
Role of the Laboratory in Antimicrobial Resistance Data by Anuj Sharma
The document discusses the role of microbiology laboratories in collecting, analyzing, and circulating antimicrobial resistance data. It outlines how laboratories provide antibiograms, which summarize local bacterial susceptibility patterns to guide empiric antibiotic therapy. The data can also be used for quality improvement, infection control, outbreak detection, and surveillance of resistance trends over time. The document recommends following Clinical and Laboratory Standards Institute guidelines for generating high quality antibiograms and discusses how data can be managed and shared using software tools like WHONET.
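The core arithmetic of an antibiogram, the percentage of tested isolates that are susceptible for each organism and antibiotic pair, can be sketched as follows. The record layout, the function name and the default minimum of 30 isolates are assumptions for this illustration; the guideline itself should be consulted for the actual inclusion criteria, and the sketch does not deduplicate repeat isolates from the same patient.

```python
from collections import defaultdict


def build_antibiogram(isolates, min_isolates=30):
    """Return {(organism, antibiotic): percent susceptible} for well-sampled pairs.

    isolates: iterable of (organism, antibiotic, result), result being 'S', 'I' or 'R'.
    """
    counts = defaultdict(lambda: [0, 0])  # pair -> [susceptible, tested]
    for organism, antibiotic, result in isolates:
        entry = counts[(organism, antibiotic)]
        entry[1] += 1
        if result == "S":
            entry[0] += 1
    return {
        pair: round(100.0 * susceptible / tested, 1)
        for pair, (susceptible, tested) in counts.items()
        if tested >= min_isolates  # skip pairs with too few isolates to be reliable
    }


# Hypothetical laboratory records.
records = (
    [("E. coli", "ciprofloxacin", "S")] * 24
    + [("E. coli", "ciprofloxacin", "R")] * 8
    + [("K. pneumoniae", "ciprofloxacin", "S")] * 5
)
print(build_antibiogram(records))
```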
This document discusses the transition to personalized genomic medicine and some of the challenges involved. It describes how genomic data constitutes "big data" due to the large amount and complexity. While sequencing costs are decreasing, there are still difficulties in analyzing and managing the genomic data. Successive filtering approaches and knowledge databases are proposed to help identify disease-causing variants and link them to therapies.
Nowadays, pharma research faces challenges in deciphering the molecular basis of disease initiation, progression and establishment, as well as in assessing the performance of drug molecules at these phases of disease development. Emerging molecular tools based on next-generation sequencing (NGS) have proven to be a key method for creating genome-wide landscapes of gene mutations, gene expression and gene regulation events. Although NGS is a powerful tool for molecular research, it has its own technical challenges. A few major challenges of NGS-based pharmacogenomics are summarized below.
The study reviewed the relationship between dietary and supplemental antioxidants and prostate cancer risk. Antioxidants examined included vitamin E, selenium, vitamin C, carotenoids, and polyphenols from coffee and tea. The evidence for effects of vitamin E and selenium on prostate cancer risk was inconsistent. While some studies found protective effects of selenium at low baseline levels, others found no effect. Studies of vitamin C, carotenoids, and polyphenols like green tea provided inconclusive or no evidence of relationships with prostate cancer risk.
Sk microfluidics and lab on-a-chip-ch6 by stanislas547
This document discusses cancer diagnostics and monitoring using microfluidic lab-on-a-chip technologies. It describes how integrating DNA/protein separation, detection, and analysis into microfluidic chips could allow for frequent, non-invasive testing of cancer biomarkers in blood or other bodily fluids. This would enable more precise monitoring of cancer treatment effectiveness and earlier detection of recurrence compared to standard techniques. The document outlines approaches involving microfluidic separation channels coupled to molecular detection and proposes a credit card-sized disposable chip sensor integrated with a small control unit for point-of-care cancer screening and monitoring.
This document evaluates several supervised machine learning algorithms for classifying gene expression data from microarray experiments. It describes analyzing two gene expression datasets, the leukemia and DLBCL datasets, using k-nearest neighbors, naive Bayes, decision trees, and support vector machines with and without feature selection. The results show that support vector machines achieved the best performance overall, and that feature selection improved the accuracy of all the algorithms.
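To make the evaluation setup concrete, the sketch below implements one of the listed algorithms, k-nearest neighbours, and scores it with leave-one-out accuracy. The toy two-feature samples and class names are hypothetical stand-ins for expression profiles, and leave-one-out is an assumed evaluation protocol, not necessarily the one used in the summarized study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the majority label among the k training samples closest to query."""
    neighbours = sorted(zip(train_x, train_y),
                        key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, labels, k=3):
    """Hold out each sample in turn, classify it from the rest, and average the hits."""
    correct = 0
    for index, (sample, label) in enumerate(zip(samples, labels)):
        rest_x = samples[:index] + samples[index + 1:]
        rest_y = labels[:index] + labels[index + 1:]
        correct += knn_predict(rest_x, rest_y, sample, k) == label
    return correct / len(samples)


# Hypothetical expression levels of two selected genes for six samples.
samples = [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
labels = ["class_a", "class_a", "class_a", "class_b", "class_b", "class_b"]
print(leave_one_out_accuracy(samples, labels, k=3))
```

Feature selection would reduce each sample to a few informative genes before this step, which is where the reported accuracy gains would come from.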
The document discusses the strengths, weaknesses, opportunities, and threats (SWOT) of using whole genome sequencing (WGS) for surveillance and diagnostics of zoonotic bacteria. It provides a case study of using WGS to track the nosocomial transmission of Pseudomonas aeruginosa between patients and the hospital water supply. WGS was able to identify transmission routes and microevolution of the bacteria with single nucleotide resolution. However, challenges include the need for robust and standardized analysis methods as well as experimental design considerations. Overall, WGS provides opportunities for improved outbreak tracking, classification, and diagnostics if its strengths are leveraged and weaknesses addressed.
The document provides an overview of a company called Miroculus that has developed an accurate, easy to use, and affordable microRNA detection platform. Some key points:
1. Miroculus has 4 full-time employees and has labs in Heidelberg and an office in Mexico City. They are developing a platform to detect circulating microRNAs which can be diagnostic biomarkers for diseases like cancer.
2. Their platform includes an accurate bioassay that can detect microRNAs from plasma samples, a low-cost device to run the bioassay, and data analytics algorithms. This allows quantitative and qualitative molecular monitoring for disease in a simple and affordable way.
3. They have prototypes of the bioassay,
This document summarizes the work of three teams revising the M39 standard on antibiograms. Team 1 is reviewing and expanding the current M39 document. Team 2 is defining antimicrobial resistance surveillance programs and providing three approaches. Team 3 is discussing how to incorporate data from automated susceptibility testing instruments, laboratory information systems, and electronic health records into antibiograms. The teams will draft their sections and submit a completed draft for review at the next meeting in January 2019. Companion articles on each section will also be written.
IRJET- Survey Paper on Oral Cancer Detection using Machine Learning by IRJET Journal
This document discusses several papers on using machine learning techniques for oral cancer detection. It first provides background on oral cancer and the importance of early detection. It then summarizes five research papers that used different machine learning and data mining approaches for oral cancer classification and detection, including using algorithms like Naive Bayes, J48, and SVM on clinical datasets, as well as analyzing oral microbiome data using metagenomics and machine learning models. The goal is to evaluate machine learning as a domain for early oral cancer detection by analyzing patient datasets and developing predictive and classification rules.
Lab-on-a-Chip for cancer diagnostics and monitoring by stanislas547
This document discusses lab-on-a-chip technology for cancer diagnostics and monitoring. It describes how lab-on-a-chip allows miniaturization of diagnostic tools to fit on a small chip. Examples are given of chips that can detect cancer markers from small samples of blood or other bodily fluids. The document outlines how lab-on-a-chip could provide frequent, non-invasive monitoring of cancer markers to guide treatment and detect recurrence. However, challenges remain in developing control units and integrating all necessary functions like fluid handling and molecular analysis onto a single chip.
This document summarizes Paolo Vineis' presentation on measuring the exposome. It discusses:
1. Defining the exposome as the totality of environmental exposures from conception onward, including measuring internal exposures through biomarkers in biological samples.
2. Challenges in exposome research like limited biobanked samples, single spot samples, lack of life-course cohorts, and feasibility of extensive exposure assessment and omics measurements.
3. The "meet-in-the-middle" approach which integrates epidemiology, exposure assessment, omics, and bioinformatics to study cancer risk factors using samples from existing cohorts.
This document explains how cancer management can be made more effective using an integrated approach. From collecting data across reports to predicting tumor growth and helping users manage their condition through diet or therapy, the platform can be used to constantly track and monitor outcomes.
The document discusses several use cases for applying data mining and machine learning techniques in healthcare and biomedical research. Three examples are:
1) Early diagnosis of cancers like lung cancer and breast cancer through predictive modeling of patient data to detect cancers at earlier stages when survival rates are higher.
2) Predicting patient responses to drug therapies for cancers like breast cancer by combining different types of molecular profiling data using techniques like support vector machines and random forests.
3) Using imaging data and temporal analysis of metrics like medication purchases to better understand and predict chronic diseases like diabetes and associated health complications.
Improving Prediction Accuracy Results by Using Q-Statistic Algorithm in High ... by rahulmonikasharma
Classification problems involving high-dimensional data with a small number of observations are becoming more common, especially in microarray data. The increasing amount of text data on internet sites affects clustering analysis. Text clustering is a useful analysis technique for partitioning a huge amount of data into clusters. The main drawback affecting text clustering is the presence of uninformative and sparse features in text documents. A broad class of boosting algorithms can be interpreted as performing coordinate-wise gradient descent to minimize some potential function of the margins of a data set. This paper proposes a new evaluation measure, the Q-statistic, that incorporates the stability of the selected feature set in addition to the prediction accuracy. We then propose the Booster of an FS algorithm, which boosts the value of the Q-statistic of the algorithm applied.
This document provides advice on different career paths for entrepreneurs including startup founders, indie hackers, info product creators, and personal branding. It discusses the pros and cons of different industries and strategies for each path. It also summarizes the author's own attempts at various online business ventures, noting lessons learned around customer development, defining the product or service, and getting early traction before building out projects. Overall, the document encourages pursuing work you find meaningful and avoiding "bullshit jobs" that lack purpose.
Guest lecture I gave for Deakin University in Melbourne. It talks about my personal experience and the startup world: startups can be grouped into two categories, those following the VC-funded approach and those following the indie hacking and bootstrapped approach.
Similar to Enhancing Diagnostics for Invasive Aspergillosis using Machine Learning
Measuring Dependency via Intrinsic Dimensionality (ICPR 2016)Simone Romano
Here I present a novel measure of dependency between variables: the Intrinsic Dimensional Dependency (IDD). IDD = 1 when there exist a constant number of 1-dimensional manifolds between the variables analysed.
A Framework to Adjust Dependency Measure Estimates for Chance Simone Romano
Winner of the best paper award at the SIAM International Conference on Data Mining.
Estimating the strength of dependency between two variables is fundamental for exploratory analysis and many other applications in data mining. For example: non-linear dependencies between two continuous variables can be explored with the Maximal Information Coefficient (MIC); and categorical variables that are dependent to the target class are selected using Gini gain in random forests. Nonetheless, because dependency measures are estimated on finite samples, the interpretability of their quantification and the accuracy when ranking dependencies become challenging. Dependency estimates are not equal to 0 when variables are independent, cannot be compared if computed on different sample size, and they are inflated by chance on variables with more categories. In this paper, we propose a framework to adjust dependency measure estimates on finite samples. Our adjustments, which are simple and applicable to any dependency measure, are helpful in improving interpretability when quantifying dependency and in improving accuracy on the task of ranking dependencies. In particular, we demonstrate that our approach enhances the interpretability of MIC when used as a proxy for the amount of noise between variables, and to gain accuracy when ranking variables during the splitting procedure in random forests.
In this presentation, I discuss the topics I covered during my PhD:
Dependency measures between variables are fundamental for a number of important applications in machine learning. They are ubiquitously used: for feature selection, as splitting criteria in random forest, for clustering comparison and validation, to infer biological networks, to list a few. Nonetheless there exist a number of problems when dependencies are estimated on finite data: detection, quantification, and ranking of dependencies are challenging.
This thesis proposes a series of contributions to improve performances on each of the 3 goals above. During the seminar I will demonstrate that:
- Adjusted measures can improve on the tasks of quantification and ranking. In particular, I will discuss some adjustments applied to the Maximal Information Coefficient (MIC), random forests, and clustering comparisons;
- A measure based on mutual information and randomisation we designed is competitive on the tasks of detection and ranking of relationships. We named this measure the Randomised Information Coefficient (RIC) and tested it on the applications of biological network inference and multi-variable feature selection.
My Entry to the Sportsbet/CIKM competitionSimone Romano
The Sportsbet/CIKM competition (http://sportsbetcikm15.com) is a data mining and machine learning challenge: use data about Australian Football League (AFL) matches already played to predict future ones. These slides are related to the entry I submitted to the competition.
The cost of acquiring information by natural selectionCarl Bergstrom
This is a short talk that I gave at the Banff International Research Station workshop on Modeling and Theory in Population Biology. The idea is to try to understand how the burden of natural selection relates to the amount of information that selection puts into the genome.
It's based on the first part of this research paper:
The cost of information acquisition by natural selection
Ryan Seamus McGee, Olivia Kosterlitz, Artem Kaznatcheev, Benjamin Kerr, Carl T. Bergstrom
bioRxiv 2022.07.02.498577; doi: https://doi.org/10.1101/2022.07.02.498577
The technology uses reclaimed CO₂ as the dyeing medium in a closed loop process. When pressurized, CO₂ becomes supercritical (SC-CO₂). In this state CO₂ has a very high solvent power, allowing the dye to dissolve easily.
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Leonel Morgado
Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...Sérgio Sacani
Context. With a mass exceeding several 104 M⊙ and a rich and dense population of massive stars, supermassive young star clusters
represent the most massive star-forming environment that is dominated by the feedback from massive stars and gravitational interactions
among stars.
Aims. In this paper we present the Extended Westerlund 1 and 2 Open Clusters Survey (EWOCS) project, which aims to investigate
the influence of the starburst environment on the formation of stars and planets, and on the evolution of both low and high mass stars.
The primary targets of this project are Westerlund 1 and 2, the closest supermassive star clusters to the Sun.
Methods. The project is based primarily on recent observations conducted with the Chandra and JWST observatories. Specifically,
the Chandra survey of Westerlund 1 consists of 36 new ACIS-I observations, nearly co-pointed, for a total exposure time of 1 Msec.
Additionally, we included 8 archival Chandra/ACIS-S observations. This paper presents the resulting catalog of X-ray sources within
and around Westerlund 1. Sources were detected by combining various existing methods, and photon extraction and source validation
were carried out using the ACIS-Extract software.
Results. The EWOCS X-ray catalog comprises 5963 validated sources out of the 9420 initially provided to ACIS-Extract, reaching a
photon flux threshold of approximately 2 × 10−8 photons cm−2
s
−1
. The X-ray sources exhibit a highly concentrated spatial distribution,
with 1075 sources located within the central 1 arcmin. We have successfully detected X-ray emissions from 126 out of the 166 known
massive stars of the cluster, and we have collected over 71 000 photons from the magnetar CXO J164710.20-455217.
The binding of cosmological structures by massless topological defectsSérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field
equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational
field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin
spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling
concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect
light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is
mitigated, at least in part.
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...PsychoTech Services
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...Advanced-Concepts-Team
Presentation in the Science Coffee of the Advanced Concepts Team of the European Space Agency on the 07.06.2024.
Speaker: Diego Blas (IFAE/ICREA)
Title: Gravitational wave detection with orbital motion of Moon and artificial
Abstract:
In this talk I will describe some recent ideas to find gravitational waves from supermassive black holes or of primordial origin by studying their secular effect on the orbital motion of the Moon or satellites that are laser ranged.
The debris of the ‘last major merger’ is dynamically youngSérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the
‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor
collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the
MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space,
because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia
DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations
at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based
on a simple phase-mixing model, the observed number of caustics are consistent with a merger that occurred 1–2 Gyr ago.
We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative
measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data
1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’
did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within
the last few Gyr, consistent with the body of work surrounding the VRM.
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfSelcen Ozturkcan
Ozturkcan, S., Berndt, A., & Angelakis, A. (2024). Mending clothing to support sustainable fashion. Presented at the 31st Annual Conference by the Consortium for International Marketing Research (CIMaR), 10-13 Jun 2024, University of Gävle, Sweden.
Current Ms word generated power point presentation covers major details about the micronuclei test. It's significance and assays to conduct it. It is used to detect the micronuclei formation inside the cells of nearly every multicellular organism. It's formation takes place during chromosomal sepration at metaphase.
PPT on Direct Seeded Rice presented at the three-day 'Training and Validation Workshop on Modules of Climate Smart Agriculture (CSA) Technologies in South Asia' workshop on April 22, 2024.
8.Isolation of pure cultures and preservation of cultures.pdf
Enhancing Diagnostics for Invasive Aspergillosis using Machine Learning
1. Introduction Results Description Conclusions
HISA Big Data 2014 – April 3rd 2014 ( #BD14 )
Enhancing Diagnostics for Invasive Aspergillosis using
Machine Learning
Simone Romano
simone.romano@unimelb.edu.au
@ialuronico
James Bailey1
Lawrence Cavedon1,2,3
Orla Morrissey4,5
Monica Slavin6,7
Karin Verspoor1,2
1The University of Melbourne, Dept. of Computing and Information Systems
2NICTA (National ICT Aust.) VRL
3School of Computer Science and IT, RMIT University
4Alfred Health 5Monash University
6Peter MacCallum Cancer Centre 7Melbourne Health
Simone Romano The University of Melbourne
Enhancing Diagnostics for Invasive Aspergillosis using Machine Learning
2. Introduction Results Description Conclusions
Introduction
Invasive Aspergillosis
Challenging Big Data Task
Results
Diagnostic Model
Description
Machine Learning for Diagnosis
Diagnosis of Invasive Aspergillosis
Conclusions
Summary
Future Work
3. Introduction Results Description Conclusions
Invasive Aspergillosis
Invasive Aspergillosis (IA)
Serious fungal infection and major cause of
mortality in patients undergoing allogeneic
stem cell transplantation or chemotherapy
for acute leukaemia.
Figure: Pulmonary IA (source: http://en.wikipedia.org/wiki/Aspergillosis).
Facts
34–43% mortality rate;
culture methods have low sensitivity: only 40–50% of IA cases are identified;
each IA case adds 7 days of hospital stay and $30,957 in additional costs.
4. Introduction Results Description Conclusions
Invasive Aspergillosis
Diagnosis and Treatment
Cases are classified as ProvenIA, ProbableIA, or PossibleIA.
Current criteria for diagnosing IA are:
1. microbiology, risk factors, and CT scan findings;
2. improved biomarkers such as Aspergillus PCR and Galactomannan
(GM), tested twice a week.
positive biopsy OR (positive CT scan AND single positive PCR/GM)
⇒ ProvenIA
≥ 2 consecutive positive PCR/GM within a 2-week time frame
⇒ ProbableIA
Problem
A single positive biomarker might be a false positive
⇒ unnecessary, harmful treatment.
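The labelling rules above can be sketched in Python. This is a minimal illustration, not the trial's actual procedure: the function name `classify_case`, its arguments, and the reading of "single positive PCR/GM" as "at least one positive test" are assumptions, and PossibleIA is left out because its criteria are not stated here.

```python
from datetime import date, timedelta


def classify_case(positive_biopsy, positive_ct, tests):
    """Label a case from biopsy, CT, and biomarker results.

    tests: list of (date, is_positive) PCR/GM results (hypothetical format).
    Returns "ProvenIA", "ProbableIA", or None when neither rule fires.
    """
    tests = sorted(tests)
    any_positive = any(is_positive for _, is_positive in tests)

    # positive biopsy OR (positive CT scan AND a positive PCR/GM)
    if positive_biopsy or (positive_ct and any_positive):
        return "ProvenIA"

    # two consecutive positive tests no more than 2 weeks apart
    for (d1, p1), (d2, p2) in zip(tests, tests[1:]):
        if p1 and p2 and d2 - d1 <= timedelta(days=14):
            return "ProbableIA"
    return None
```

A single positive test with no other evidence returns None here, which is exactly the situation the model in this talk tries to resolve.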
5. Introduction Results Description Conclusions
Challenging Big Data Task
Big Data task
In a randomised controlled trial comparing two different strategies for
diagnosing IA, a large amount of data was collected from 240 patients
between Sept. 2005 and Nov. 2009 at six Australian centres.
Objective: leverage such data to produce more accurate predictions of
IA with Machine Learning techniques.
Are we really dealing with Big Data?
6. Introduction Results Description Conclusions
All patients were tracked for 26 weeks, providing rich longitudinal data from
daily and weekly tests for each patient.
240 patients × 26 weeks × 7 days = 43,680 records.
Bed-side interpretation is a challenging task!
8. Introduction Results Description Conclusions
Diagnostic Model
Model
Our training set is a collection of 358 single positive biomarker tests that
precede the earliest label of IA.
Figure: timeline starting when transplant/chemotherapy begins (1st to 5th month), marking positive biomarkers and infection.
Just 29 of the positive biomarkers were associated with a Proven IA or
Probable IA label within a week (329 false positives);
Built a model that outputs the probability of infection within a week;
Validated with a patient-level cross-validation framework.
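Patient-level cross-validation means that all tests from one patient land in the same fold, so the model is never evaluated on a patient it has already seen. A minimal sketch, assuming a hypothetical list `patient_ids` with one entry per positive test; the number of folds is an assumption, as the slides do not state it.

```python
from collections import defaultdict


def patient_level_folds(patient_ids, n_folds=5):
    """Yield (train_idx, test_idx) pairs that never split a patient.

    patient_ids[i] is the patient who produced test i (hypothetical input).
    """
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    patients = sorted(by_patient)
    for k in range(n_folds):
        # every n_folds-th patient goes to the held-out fold
        held_out = set(patients[k::n_folds])
        test_idx = [i for p in patients if p in held_out for i in by_patient[p]]
        train_idx = [i for p in patients if p not in held_out for i in by_patient[p]]
        yield train_idx, test_idx
```

Splitting by test instead of by patient would leak information, because several of the 358 positive tests can come from the same patient.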
9. Introduction Results Description Conclusions
Diagnostic Model
Figure: ROC curve of the diagnostic model (x-axis: 1 − TNR, y-axis: TPR), AUC = 0.63.
The AUC is not too good, but the model is good at classifying negatives!
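For reference, the quantities on the ROC plot can be computed as below. This is a generic sketch of the standard definitions, not the code behind the slides; `scores` and `labels` are hypothetical inputs (model probabilities and 0/1 IA labels).

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def roc_point(scores, labels, threshold):
    """Return (1 - TNR, TPR) when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    return fp / n_neg, tp / n_pos
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why 0.63 is described as modest.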
12. Introduction Results Description Conclusions
Diagnostic Model
Result
Setting a low threshold on the model output probability to achieve a high
NPV (100%), we were able to identify 95 tests (26.5%) that do not
lead to an IA infection within a week (TNR = 28.9%).
⇒ Doctors can avoid starting treatment in 26.5% of cases!
avoid over-treatment;
reduce drug toxicity;
reduce antifungal drug costs
(e.g. Amphotericin B: $8,260 per patient per week).
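The reported figures follow from the counts given earlier (358 tests, 29 true positives, 329 false positives, 95 tests ruled out with no missed infection). The sketch below shows how NPV, TNR, and the ruled-out fraction are computed for a given threshold; the function name and the score values in the usage note are placeholders chosen only to reproduce those counts, not model output.

```python
def rule_out_metrics(scores, labels, threshold):
    """Metrics for calling a test negative when its score is below threshold.

    labels[i] is truthy if test i was followed by IA within a week.
    """
    tn = fn = negatives = 0
    for score, has_ia in zip(scores, labels):
        if not has_ia:
            negatives += 1
        if score < threshold:
            if has_ia:
                fn += 1  # infection missed by the rule-out
            else:
                tn += 1  # correctly ruled out
    ruled_out = tn + fn
    return {
        "npv": tn / ruled_out if ruled_out else float("nan"),
        "tnr": tn / negatives if negatives else float("nan"),
        "ruled_out_fraction": ruled_out / len(labels),
    }
```

With 29 infected tests scored 0.5, 95 uninfected tests scored 0.01, 234 uninfected tests scored 0.5, and a threshold of 0.1, this gives NPV = 95/95 = 100%, TNR = 95/329 ≈ 28.9%, and a ruled-out fraction of 95/358 ≈ 26.5%.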
14. Introduction Results Description Conclusions
Machine Learning for Diagnosis
Classification Models
Logistic regression;
Decision trees;
Random forest.
Figure: random forest. The training set is resampled several times, a random tree is grown on each resample, and the trees' predictions are combined by voting.
Random forest because:
It has the capability to work with heterogeneous features
(categorical/continuous);
It can work with many features.
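The resample, grow, and vote idea can be illustrated with a deliberately tiny stand-in. This is not the model from the talk: each "tree" here is a single-split stump on one randomly chosen numeric feature, all names are hypothetical, and a real random forest grows full trees and also handles categorical features.

```python
import random


def fit_stump(rows, labels, feature):
    """Pick the threshold and orientation with the best training accuracy."""
    best = None
    for t in sorted({r[feature] for r in rows}):
        for left, right in ((0, 1), (1, 0)):
            correct = sum(
                (left if r[feature] <= t else right) == y
                for r, y in zip(rows, labels)
            )
            if best is None or correct > best[0]:
                best = (correct, feature, t, left, right)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Grow n_trees stumps, each on a bootstrap resample and a random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # resampling with replacement
        feature = rng.randrange(n_features)         # random feature choice
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest


def predict_proba(forest, row):
    """Fraction of stumps voting for class 1."""
    votes = sum(left if row[f] <= t else right for f, t, left, right in forest)
    return votes / len(forest)
```

The vote fraction is what makes it possible to output a probability and then place a low rule-out threshold on it.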
15. Introduction Results Description Conclusions
Diagnosis of Invasive Aspergillosis
Features to use
Known at baseline: gender, age, BMI, smoking status, etc.
Daily tested: neutrophil count, body temperature, amount of
administered steroids, haemoglobin, platelets, white cell count, urea,
creatinine, ALT, AST, GGT, bilirubin, LDH, etc.
16. Introduction Results Description Conclusions
Very heterogeneous features!!!
17. Introduction Results Description Conclusions
Diagnosis of Invasive Aspergillosis
Heterogeneous Features
Features constant along the treatment: Age, Gender, etc.
Features that varied over time: neutrophil count, temperature,
corticosteroid doses, etc.
When we have a positive biomarker test, we can use information from the
recent past to predict IA. We define the recent past as the values in the
3-week window prior to a single positive test result.
Figure: temperature (36.5 to 38.5) plotted against date (May to Jul), with the window marked.
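Selecting the window can be sketched as follows. The mapping from date to measured value and the function name are hypothetical, and excluding the day of the positive test itself is an assumption.

```python
from datetime import date, timedelta


def recent_past(measurements, test_date, window_days=21):
    """Return the measurements from the 3 weeks before a positive test.

    measurements: dict mapping date -> value (hypothetical format).
    The test day itself is excluded (assumption).
    """
    start = test_date - timedelta(days=window_days)
    return {d: v for d, v in measurements.items() if start <= d < test_date}
```

Only the values returned by this step feed the time-varying features.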
18. Introduction Results Description Conclusions
Diagnosis of Invasive Aspergillosis
Features that varied over time
Duration features: we count the number of days the value of each
parameter lay within a particular range. For example, we divide the
temperature measurements into the intervals [36,37],
(37,38], (38,39], (39,40], and greater than 40 (>40) degrees
Celsius, and count the number of days the temperature fell in
each interval;
Trajectories: we select two days in the 3-week window preceding a
positive test and compute the mean value, the standard
deviation, and the relative difference between those values. We
do this for all possible intervals in the window.
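Both feature families can be sketched for a single parameter. The slide leaves some details open, so the following are assumptions: values below 36 are ignored, the mean and standard deviation are taken over all values between the two selected days (inclusive), and the relative difference is (last − first) / first. The function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev

# upper bounds of the temperature intervals named in the text
TEMPERATURE_BINS = [("[36,37]", 37.0), ("(37,38]", 38.0),
                    ("(38,39]", 39.0), ("(39,40]", 40.0)]


def duration_features(daily_values):
    """Count the days a value fell in each interval."""
    counts = {name: 0 for name, _ in TEMPERATURE_BINS}
    counts[">40"] = 0
    for v in daily_values:
        if v > 40.0:
            counts[">40"] += 1
        elif v >= 36.0:  # values below 36 are ignored (assumption)
            for name, upper in TEMPERATURE_BINS:
                if v <= upper:
                    counts[name] += 1
                    break
    return counts


def trajectory_features(daily_values):
    """Mean, std, and relative difference for every pair of days (i < j)."""
    features = {}
    for i, j in combinations(range(len(daily_values)), 2):
        segment = daily_values[i:j + 1]
        first, last = daily_values[i], daily_values[j]
        rel_diff = (last - first) / first if first else float("nan")
        features[(i, j)] = (mean(segment), pstdev(segment), rel_diff)
    return features
```

For a 21-day window this yields 21 × 20 / 2 = 210 intervals per parameter, which is one reason a learner that copes with many features is needed.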
20. Introduction Results Description Conclusions
Summary
Summary
Target: enhance biomarker-based diagnostics for Invasive
Aspergillosis;
Method: random forest on heterogeneous features, with
duration features and trajectory features;
Validation: patient-level cross-validation;
Results: setting a low threshold on the output probability gives NPV =
100% and TNR = 28.9%. Antifungal therapy can be safely avoided
in 26.5% of cases, saving around $8K per patient
per week.
21. Introduction Results Description Conclusions
Future Work
Future Work
make the model more accurate in predicting when a positive test is
associated with an immediate infection to trigger the antifungal
treatment earlier in time;
search for alternative diagnoses when the outcomes are equally
probable according to the model;
make the model output more interpretable to clinical practitioners,
e.g. by identifying the trajectories in the data which generate a low
or high probability of IA.
22. Introduction Results Description Conclusions
Future Work
Thank you.
Questions?