This document summarizes the current state of using machine learning to predict traits and behaviors from brain images. It discusses typical machine learning workflows and a favorite predictive model called the Brain Basis Set. It reviews what traits have been successfully predicted from brain images so far. It also discusses characteristics of successful predictive models, the role of large datasets, and ways prediction could be improved, such as through better data preprocessing and addressing bias. Throughout, it emphasizes the importance of transparency, reproducibility, and collaboration.
Key Insights Of Using Deep Learning To Analyze Healthcare Data | Workshop Fro...Michael Batavia
In this presentation, I present how to properly discover, analyze and find trends in various types of healthcare data in order to utilize machine learning algorithms to predict future trends in the data. This presentation directly discusses the implications of data analysis in predicting benign and malignant cancers but the same techniques in this presentation can be applied to any other types of data in the real world.
For a more in-depth presentation, please watch the video presentation of this slideshow linked here: https://youtu.be/gXSl2iWcJ00
A talk given by Eugene Dubossarsky on predictive analytics at the Big Data Analytics meetup in Sydney this month. The talk is available at http://www.youtube.com/watch?v=aG16YSFgtLY
This was the presentation for the Microsoft Community Technology Update of 2016. The idea was to introduce to people the concept of Machine Learning and its easy to get started if you are keen. My objective was also to communicate how some of the algorithms work and they require no more than basic understanding of Math to get going, sometimes not even that.
The algorithms we covered were, Support Vector Machines (SVM), Decision Tree using R2D3 and Neural Networks for classification. We used the Tensorflow Playground to help understand the Neural Network and Deep Learning concepts.
I gave an analogy of how Machine Learning process is like making a smoothie where your algorithm is a recipe, your data are your ingredients, your computer is your blender and your smoothie is the model that you developed. I used the same example to convey the concept of Training Validation and Testing. Coverage of Type 1 and Type 2 errors together with the metrics of Recall and Precision was covered as well. Finally I closed the session with what are some good resources to get started with Machine Learning for all skill levels. There are references to websites, courses, kaggle competition, podcasts, cheat sheets and books.
If you have heard about machine learning and want to try out some of it, please check this out. In this article I am just trying to jot down few basics and must know stuff to kick start in this field. The objective of this compilation; to trigger the interest in this field of data analytics and to demystify the abstract concept. This article is not for the advanced data scientists, this is for the beginners or those who want a quick refresher.
Key Insights Of Using Deep Learning To Analyze Healthcare Data | Workshop Fro...Michael Batavia
In this presentation, I present how to properly discover, analyze and find trends in various types of healthcare data in order to utilize machine learning algorithms to predict future trends in the data. This presentation directly discusses the implications of data analysis in predicting benign and malignant cancers but the same techniques in this presentation can be applied to any other types of data in the real world.
For a more in-depth presentation, please watch the video presentation of this slideshow linked here: https://youtu.be/gXSl2iWcJ00
A talk given by Eugene Dubossarsky on predictive analytics at the Big Data Analytics meetup in Sydney this month. The talk is available at http://www.youtube.com/watch?v=aG16YSFgtLY
This was the presentation for the Microsoft Community Technology Update of 2016. The idea was to introduce to people the concept of Machine Learning and its easy to get started if you are keen. My objective was also to communicate how some of the algorithms work and they require no more than basic understanding of Math to get going, sometimes not even that.
The algorithms we covered were, Support Vector Machines (SVM), Decision Tree using R2D3 and Neural Networks for classification. We used the Tensorflow Playground to help understand the Neural Network and Deep Learning concepts.
I gave an analogy of how Machine Learning process is like making a smoothie where your algorithm is a recipe, your data are your ingredients, your computer is your blender and your smoothie is the model that you developed. I used the same example to convey the concept of Training Validation and Testing. Coverage of Type 1 and Type 2 errors together with the metrics of Recall and Precision was covered as well. Finally I closed the session with what are some good resources to get started with Machine Learning for all skill levels. There are references to websites, courses, kaggle competition, podcasts, cheat sheets and books.
If you have heard about machine learning and want to try out some of it, please check this out. In this article I am just trying to jot down few basics and must know stuff to kick start in this field. The objective of this compilation; to trigger the interest in this field of data analytics and to demystify the abstract concept. This article is not for the advanced data scientists, this is for the beginners or those who want a quick refresher.
This Edureka Sentiment Analysis tutorial will help you understand all the basics of Sentiment Analysis algorithm along with examples. This tutorial also has an interesting demo on Sentiment Analysis in R - El Clasico Sentiment Analysis. Below are the topics covered in this tutorial:
1. What is Machine Learning?
2. Why Sentiment Analysis?
3. What is Sentiment Analysis?
4. How Sentiment Analysis Works?
5. Sentiment Analysis Demo - El Clasico
6. Sentiment Analysis Use Case
This was the first introductory course on getting started with Azure Machine Learning that was held by Microsoft User Group on 13th September 2017. These were the slides that were presented in the session. The excel file is available separately and can be downloaded from here (https://1drv.ms/x/s!AmjwdE_MMESksHOrlLXP400eHb3p). We covered on how decision trees are created and extended the concept to building Random Forest algorithm.
This was the first introductory course on getting started with Azure Machine Learning that was held by Microsoft User Group on 13th September 2017 and 25th October 2017. These were the slides that were presented in the session. We covered the Decision Tree and Support Vector Machine (SVM) based algorithms:
1. Two-Class /Multiclasss Decision Forest
2. Two-Class /Multiclasss Decision Jungle
3. Two-Class Boosted Decision Tree
4. Two-Class Support Vector Machine (SVM)
5. Two-Class Locally Deep Support Vector Machine (LDSVM)
We also looked at how to look at Feature Importance using "Permutation Feature Importance" module
We also looked at "Partition and Sample" module and how to use it together with "Cross Validate Model" module.
Azure Boot Camp 2017 getting started with azure machine learningSetu Chokshi
This presentation was done at the Azure Boot Camp 2017 in Singapore held on 23rd April 2017.
This presentation covers an general introduction to machine learning and Azure Machine Learning Platform. To introduce the various features of the platform we make use of the Matchbox Recommender to build a collaborative filtering on the MovieLens dataset. We also obtain the state of art Normalized Discounted Cumulative Gain (NDCG) of 0.92 without supplementing any additional user data or movie tags. We just use the movie ratings as the base.
I also show how to build an intuition for the recommendation systems using an Excel Worksheet and show how the latent factors in the recommendation systems are built and what the algorithm does to obtain an optimal solution. The link to the Excel Worksheet will be made available after the session is over.
Have you always wanted to add predictive capabilities to your application, but haven’t been able to find the time or the right technology to get started? Everybody wants to build smart apps, but only a few are Data Scientists. We had the same issue inside Amazon, so we created a Machine Learning engine that Developers can easily use. The same approach is now available in the AWS cloud. We demonstrate how to use Amazon Machine Learning (Amazon ML) to create machine learning models, deploy them to production, and obtain predictions in real-time. We then demonstrate how to build a complete smart application using Amazon ML, Amazon Kinesis, and AWS Lambda. We walk you through the process flow and architecture, demonstrate outcomes, and then dive into the implementation. In this session, you learn how to use Amazon ML as well as how to integrate Amazon ML into your applications to take advantage of predictive analysis in the cloud.
Prediction Analysis in Clinical and Basic NeuroscienceCameron Craddock
Talk given at the Resting State and Brain Connectivity 2016 conference symposium "The Emerging Field of Predictive Analytics in Neuroimaging: Applications, Challenges and Perspectives"
Using Bioinformatics Data to inform Therapeutics discovery and developmentEleanor Howe
Diamond Age Data Science and Zafgen, Inc, co-present on their work in using bioinformatics data effectively in the context of a small therapeutics company.
Eleanor Howe, PhD, CEO of Diamond Age, presents on the different types of computational biologist, the characteristics of a good bioinformatics team, and the pluses and minuses of using deep learning/AI in a discovery biology context.
Huseyin Mehmet, VP of Discovery Research at Zafgen, describes his team's work with Diamond Age and uses their capabilities to inform Zafgen's drug development. He discusses the needs of biotech companies for a diverse, experience bioinformatics team.
1. Introduction and how to get into Data
2. Data Engineering and skills needed
3. Comparison of Data Analytics for statistic and real time streaming data
4. Bayesian Reasoning for Data
In a world of data explosion, the rate of data generation and consumption is on the increasing side,
there comes the buzzword - Big Data.
Big Data is the concept of fast-moving, large-volume data in varying dimensions (sources) and
highly unpredicted sources.
The 4Vs of Big Data
● Volume - Scale of Data
● Velocity - Analysis of Streaming Data
● Variety - Different forms of Data
● Veracity - Uncertainty of Data
With increasing data availability, the new trend in the industry demands not just data collection but making an ample sense of acquired data - thereby, the concept of Data Analytics.
Taking it a step further to further make futuristic prediction and realistic inferences - the concept
of Machine Learning.
A blend of both gives a robust analysis of data for the past, now and the future.
There is a thin line between data analytics and Machine learning which becomes very obvious
when you dig deep.
This Edureka Sentiment Analysis tutorial will help you understand all the basics of Sentiment Analysis algorithm along with examples. This tutorial also has an interesting demo on Sentiment Analysis in R - El Clasico Sentiment Analysis. Below are the topics covered in this tutorial:
1. What is Machine Learning?
2. Why Sentiment Analysis?
3. What is Sentiment Analysis?
4. How Sentiment Analysis Works?
5. Sentiment Analysis Demo - El Clasico
6. Sentiment Analysis Use Case
This was the first introductory course on getting started with Azure Machine Learning that was held by Microsoft User Group on 13th September 2017. These were the slides that were presented in the session. The excel file is available separately and can be downloaded from here (https://1drv.ms/x/s!AmjwdE_MMESksHOrlLXP400eHb3p). We covered on how decision trees are created and extended the concept to building Random Forest algorithm.
This was the first introductory course on getting started with Azure Machine Learning that was held by Microsoft User Group on 13th September 2017 and 25th October 2017. These were the slides that were presented in the session. We covered the Decision Tree and Support Vector Machine (SVM) based algorithms:
1. Two-Class /Multiclasss Decision Forest
2. Two-Class /Multiclasss Decision Jungle
3. Two-Class Boosted Decision Tree
4. Two-Class Support Vector Machine (SVM)
5. Two-Class Locally Deep Support Vector Machine (LDSVM)
We also looked at how to look at Feature Importance using "Permutation Feature Importance" module
We also looked at "Partition and Sample" module and how to use it together with "Cross Validate Model" module.
Azure Boot Camp 2017 getting started with azure machine learningSetu Chokshi
This presentation was done at the Azure Boot Camp 2017 in Singapore held on 23rd April 2017.
This presentation covers an general introduction to machine learning and Azure Machine Learning Platform. To introduce the various features of the platform we make use of the Matchbox Recommender to build a collaborative filtering on the MovieLens dataset. We also obtain the state of art Normalized Discounted Cumulative Gain (NDCG) of 0.92 without supplementing any additional user data or movie tags. We just use the movie ratings as the base.
I also show how to build an intuition for the recommendation systems using an Excel Worksheet and show how the latent factors in the recommendation systems are built and what the algorithm does to obtain an optimal solution. The link to the Excel Worksheet will be made available after the session is over.
Have you always wanted to add predictive capabilities to your application, but haven’t been able to find the time or the right technology to get started? Everybody wants to build smart apps, but only a few are Data Scientists. We had the same issue inside Amazon, so we created a Machine Learning engine that Developers can easily use. The same approach is now available in the AWS cloud. We demonstrate how to use Amazon Machine Learning (Amazon ML) to create machine learning models, deploy them to production, and obtain predictions in real-time. We then demonstrate how to build a complete smart application using Amazon ML, Amazon Kinesis, and AWS Lambda. We walk you through the process flow and architecture, demonstrate outcomes, and then dive into the implementation. In this session, you learn how to use Amazon ML as well as how to integrate Amazon ML into your applications to take advantage of predictive analysis in the cloud.
Prediction Analysis in Clinical and Basic NeuroscienceCameron Craddock
Talk given at the Resting State and Brain Connectivity 2016 conference symposium "The Emerging Field of Predictive Analytics in Neuroimaging: Applications, Challenges and Perspectives"
Using Bioinformatics Data to inform Therapeutics discovery and developmentEleanor Howe
Diamond Age Data Science and Zafgen, Inc, co-present on their work in using bioinformatics data effectively in the context of a small therapeutics company.
Eleanor Howe, PhD, CEO of Diamond Age, presents on the different types of computational biologist, the characteristics of a good bioinformatics team, and the pluses and minuses of using deep learning/AI in a discovery biology context.
Huseyin Mehmet, VP of Discovery Research at Zafgen, describes his team's work with Diamond Age and uses their capabilities to inform Zafgen's drug development. He discusses the needs of biotech companies for a diverse, experience bioinformatics team.
1. Introduction and how to get into Data
2. Data Engineering and skills needed
3. Comparison of Data Analytics for statistic and real time streaming data
4. Bayesian Reasoning for Data
In a world of data explosion, the rate of data generation and consumption is on the increasing side,
there comes the buzzword - Big Data.
Big Data is the concept of fast-moving, large-volume data in varying dimensions (sources) and
highly unpredicted sources.
The 4Vs of Big Data
● Volume - Scale of Data
● Velocity - Analysis of Streaming Data
● Variety - Different forms of Data
● Veracity - Uncertainty of Data
With increasing data availability, the new trend in the industry demands not just data collection but making an ample sense of acquired data - thereby, the concept of Data Analytics.
Taking it a step further to further make futuristic prediction and realistic inferences - the concept
of Machine Learning.
A blend of both gives a robust analysis of data for the past, now and the future.
There is a thin line between data analytics and Machine learning which becomes very obvious
when you dig deep.
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org.
Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories.
Why Data Mining?
What Is Data Mining?
Data Mining: On What Kind of Data?
Data Classification
What is Sentiment Classification?
Importance of Sentiment classification
Twitter for Sentiment Classification
Problem Statement
Goal of this Classifications
Method to be used
Conclusion
Dear students get fully solved assignments
Send your semester & Specialization name to our mail id :
“ help.mbaassignments@gmail.com ”
or
Call us at : 08263069601
This talk goes over what Data Science is and how
you can start working with data in
your role. This is for everyone interested in Data
Science who might be unsure about how to
start working with data. Learn the core
concepts of Data Science and how you can
start learning data science pain-free!
These slides are from a presentation on understanding Machine Learning at a high level. The talk touches on linear regression, neural networks, and how Deep Learning fits into Machine Learning.
BIG DATA AND MACHINE LEARNING
Big Data is a collection of data that is huge in volume, yet growing exponentially with time. It is a data with so large size and complexity that none of traditional data management tools can store it or process it efficiently. Big data is also a data but with huge size.
This pdf is about the Schizophrenia.
For more details visit on YouTube; @SELF-EXPLANATORY;
https://www.youtube.com/channel/UCAiarMZDNhe1A3Rnpr_WkzA/videos
Thanks...!
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.Sérgio Sacani
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...Scintica Instrumentation
Intravital microscopy (IVM) is a powerful tool utilized to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been accomplished using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed tissue imaging, IVM allows for the ultra-fast high-resolution imaging of cellular processes over time and space and were studied in its natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provide insights into the progression of disease, response to treatments or developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system’s unique features and user-friendly software enables researchers to probe fast dynamic biological processes such as immune cell tracking, cell-cell interaction as well as vascularization and tumor metastasis with exceptional detail. This webinar will also give an overview of IVM being utilized in drug development, offering a view into the intricate interaction between drugs/nanoparticles and tissues in vivo and allows for the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancements of novel therapeutic strategies.
Richard's aventures in two entangled wonderlandsRichard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
Seminar of U.V. Spectroscopy by SAMIR PANDASAMIR PANDA
Spectroscopy is a branch of science dealing the study of interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflect spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light received by the analyte.
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
Mammalian Pineal Body Structure and Also Functions
The current state of prediction in neuroimaging
1. The current state of
prediction in neuroimaging
Saige Rutherford
@being_saige
www.beingsaige.com
2. Road Map
• Quick review of typical ML workflow + my favorite predictive
model
• Which traits and behaviors can we predict from brain
images?
• What do various successful predictive models have in
common?
• What does a “successful” predictive model look like?
• How does big data fit in, is there hope for smaller datasets?
• Where is there room to improve brain-behavior predictive
models?
6. Favorite predictive model: Brain Basis Set
Basis Set = Chosen # of top components from PCA decomposition of subjects x features matrix
aka principle component regression
7. Phenotype BBS CPM
General Executive 0.44 0.42
Processing Speed 0.39 0.23
Penn Progressive
Matrices
0.30 0.32
ASR Externalizing 0.24 0.03
ASR Internalizing 0.20 0.04
ASR Attention 0.21 0.00
NEO-Openness 0.18 0.11
NEO-
Conscientiousness
0.19 0.15
NEO-Extroversion 0.13 0.04
NEO-Agreeableness 0.19 0.10
NEO-Neuroticism 0.00 0.05Number of Components Used to Predict
MeanCorrelationbetweenPredicted&
ObservedPhenotype
Sripada et al. Scientific Reports (2019)
100 held out unrelated subjects10-fold Cross Validation
9. Successful Predictive Modeling
Test your prediction model in “the wild”
Sripada et. al Molecular Psychiatry (2019)
Ex. controlling for confounds (motion, demographics,
medication), different cross validation splits.
This shows more believable and realistic results!
Rozycki et. al Schizophrenia Bulletin (2017)
10. Successful Predictive Modeling
Impact of region-definition method on
prediction accuracy
Impact of connectivity
parameterization on prediction
accuracy
Impact of classifier choice on
prediction accuracy
https://www.sciencedirect.com/science/article/pii/S1053811919301594Dadi et al. Neuroimage (2019)
12. What not to do
don’t be this guy
1. Be a research troll
Research Troll: Someone who is overly protective of their
data, unwilling to share data and well-documented code.
2. No out of sample test set or cross validation
13. What not to do
Make bold claims about one model/method being the best…
You know what they say when you assume…
You’re probably wrong, and someone will publicly prove this to
you in a Twitter thread
14. Big Datasets are taking over…
Where does my “small” data fit in?
15. Big Datasets are taking over…
Where does my “small” data fit in?
Big data can be act as a “discovery” data
set.
Use HCP, ABCD, or UKBiobank to find a
brain basis set then get expression
scores of these components in your
dataset.
Use pretrained models from big data,
treat your dataset as a true out of sample
test set.
Externalizing
Internalizing
Attention
Model
Externalizing*
Multi-Task Learning, Transfer Learning
16. Contributing to Big Data
Federated Learning: allows us to train models on distributed datasets that you cannot
directly access.
https://blog.openmined.org/federated-learning-differential-privacy-and-encrypted-computation-for-medical-imaging/
https://arxiv.org/pdf/1610.05492.pdf
https://ai.googleblog.com/2017/04/federated-learning-collaborative.html
Federated Learning tutorial using brain age prediction model coming soon
17. How can we improve prediction?
Put in the (hard) work to prepare your data properly…
Tangential point about preprocessing fMRI data
Haak, Marquand, Beckman, Neuroimage 2017
18. Lots of papers pointing to this same idea…
Don’t use a fixed atlas!
https://cdn.elifesciences.org/articles/44890/elife-44890-v2.pdf
https://cdn.elifesciences.org/articles/32992/elife-32992-v1.pdf
https://www.ncbi.nlm.nih.gov/pubmed/25598050
https://www.sciencedirect.com/science/article/pii/S1053811917305463
https://www.biorxiv.org/content/10.1101/431833v2https://www.ncbi.nlm.nih.gov/pubmed/29878084
19. How can we improve prediction?
Most of machine learning is about good data hygiene.
UNDERSTAND YOUR DATA!
https://twitter.com/justmarkham/status/1155840938356432896
pip install pandas_profiling
import pandas_profiling
df.profile_report()
20. Patient or
healthy
control?
Think deeply before you turn a continuous
trait into a categorical trait.
Dimensional neuroimaging: our ability to
place a brain scan into a succinct, yet highly
comprehensive and informative reference
system, dimensions of which will reflect
patterns associated with normal or pathologic
brain structure or function.
21. How can we improve prediction?
Bias in neuroimaging data…we need to do better at acknowledging it.
Big Data != Population Data
Does ML reveal the true nature of relationships, unconstrained by any bias or human influence?
The answer is an unequivocal No.
23. Take Home Messages
There is not one perfect prediction framework to rule them all.
Machine Learning No Free Lunch theorem: no machine learning
method is better than the others, on average, over a broad family of
problems.
Embrace and collaborate with big data.
Big data: multi-task learning, share your models
Small data: transfer learning, use pre-trained models
Focus on transparency and reproducibility
24. Learning Resources
This is a research process, not a final offering.
OHBM 2019 talks on ML:
https://www.pathlms.com/ohbm/courses/12238/sections/15843/video_presentations/138295
https://www.pathlms.com/ohbm/courses/12238/sections/15843/video_presentations/138032
https://www.pathlms.com/ohbm/courses/12238/sections/15843/video_presentations/138231
https://www.pathlms.com/ohbm/courses/12238/sections/15843/video_presentations/138291
https://www.pathlms.com/ohbm/courses/12238/sections/15843/video_presentations/138219
Gael Varoquaux talks:
https://www.slideshare.net/GaelVaroquaux/functionalconnectome-biomarkers-to-meet-clinical-needs
https://www.slideshare.net/GaelVaroquaux/machine-learning-on-non-curated-data-154905090
Machine learning in neuroimaging: Progress and challenges. Neuroimage. 2019 August 15.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6499712/pdf/nihms-1025732.pdf
Learn a new Pandas trick everyday: https://www.dataschool.io/python-pandas-tips-and-tricks/
25.
26. Thank you!
All who have supported/inspired me on my learning journey.
Mike Angstadt, Chandra Sripada, Jenna Wiens, Daniel Kessler, Aman
Taxali, Bennet Fauber, Marlena Duda, GirlsWhoCode Organization, Ivy Tso,
Soo-Eun Chang, Steve Taylor, the entire University of Michigan community!
@being_saige
www.beingsaige.com
Editor's Notes
How should nodes be chosen? How many nodes are needed for brain-imaging based diagnosis?
How should weights of brain functional connectomes be represented?
What classifiers should be used? Should linear or non-linear models be preferred? Spare or non-sparse models be used? With or without feature selection?
. We study the prediction score of each pipeline relative to the mean across pipelines on each fold. This relative measure discards the variance in scores due to folds or datasets.
•Regions defined functionally (with dictionary learning or ICA) give best prediction.
•Prefer tangent-space parametrization of connectomes to full or partial correlation.
•Non-sparse linear classifiers are best for supervised learning.
Machine learning 101: a model that fits the data well doesn't necessarily generalize well.
In the era of big data, generalization should be tested in separate samples, or else using split-sample approaches in which one split is kept completely hidden until the very final application of a model.
Transfer learning example: knowledge gained while learning to recognize cars could apply when trying to recognize trucks.
Big Data people: think about multi-task learning, which creates more generalizable models. Also make sure you always share your saved models for people who might now have access to the big data.
Small data people: think transfer learning. If you can get access to the saved models that big data people share you can use them to test on your data, even if the model wasn’t explicitly train to predict the exact phenotype you are using.
Collaborative machine learning without centralized datasets.
One theoretical region where there is one mode of organization (it could encode task activation, connectivity, stimulus response) running in one particular direction and in the same area there is a second mode of organization running in a perpendicular direction. Taking measurements directly would mean that we are taking the superposition of this organization and we would wrongly infer that things are organized along this diagonal. When you parcellate this data you get a completely wrong ROI atlas definition which does not at all respect the underlying data. We know this is true in motor cortex and primary visual cortex. Lots of works suggests this presence in other regions of the brain.
Majority of machine learning in clinical neuroscience focused on classifying patients from healthy controls. Although this is a good starting point, its practical value is very limited, since those patients are presumably already “correctly” classified via simpler clinical examinations, hence they are used as ground truth.
In the computer-science based machine learning community, the discussion of bias in predictive models is widely acknowledged and discussed. At MLHC this past summer, the ending panel spent 2 hours discussing biases and ways of overcoming them.