Spatial statistics can be used in epidemiology to analyze spatial point patterns of disease cases and controls. Common models include the homogeneous Poisson process, which describes complete spatial randomness, and the inhomogeneous Poisson process, which allows the intensity to vary over space. Summary statistics such as the first-order intensity function λ(s) and the second-order K-function quantify intensity and clustering in a point pattern. A case-control study compares these statistics between case and control patterns to test for non-random spatial variation in disease risk, and Monte Carlo simulation is used to compute p-values for hypotheses about relative risk and clustering.
Spatial Point Processes and Their Applications in Epidemiology
1. Spatial Statistics for Epidemiology
—Spatial Point Processes
By Liu Xu U086105E
Supervisor: Prof Loh Wei Liem
Department of Statistics and Applied Probability
National University of Singapore
15 March 2012
5. Example: cancer cases
marks: extra information attached to points, categorical/continuous
6. Example: the Milky Way galaxy
7. Types of point patterns
[Figure: three example patterns — regularity (repulsion), CSR, clustering (attraction)]
8. Aim: describe and model “pattern”
Are points randomly located?
• If so, find a statistical model to describe the “randomness”;
• If not, …
9. Models: spatial point processes
• A spatial point process is a stochastic process X which generates a countable set of events in a defined space.
• A spatial pattern x = {x1, x2, …, xn} on an observation region W generated from a spatial point process is a realization of the process.
• We only consider point processes in 2-D space.
• The locations of any objects can be modelled: plants, animals, cells, stars, disease cases, earthquakes, …
10. Models: spatial point processes
• Notation:
W: study region in R²
N(A): number of events inside subregion A, A ⊆ W
|A|: area of region A
s: a generic location in W
ds: infinitesimal region centered at s
• Assumptions on spatial point processes:
i. Locally finite: the number of events in any bounded region is finite
ii. At any location s, there is either one event or no event at all
11. HPP
A spatial point process in a bounded region W in R² is a homogeneous Poisson process (HPP) if:
i. For every subregion A of W, N(A) ~ Poi(λ|A|), where 0 < λ < ∞ is a constant called the intensity (homogeneity).
ii. If A1 and A2 are two disjoint subregions of W, then N(A1) and N(A2) are independent (independence).
• The standard model for complete spatial randomness (CSR);
• Can be generalized to more complicated models;
• A reference process when analyzing the spatial characteristics of a specific pattern.
12. IPP
A spatial point process in a bounded region W in R² is an inhomogeneous Poisson process (IPP) if:
i. For every subregion A of W, N(A) ~ Poi(∫_A λ(s) ds), where 0 < λ(s) < ∞ is the intensity at s.
ii. If A1 and A2 are two disjoint subregions of W, then N(A1) and N(A2) are independent (independence).

[Diagram: the IPP is a generalization of the HPP; the HPP is the special case with constant λ(s) ≡ λ.]
13. Simulation from Poisson processes
Two Poisson process realizations on the unit square, each having the same expected number of events = 100.
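A minimal sketch of how such realizations can be generated (not from the slides; Python with numpy assumed, helper names hypothetical). The HPP draws N ~ Poi(λ|W|) and places the points i.i.d. uniformly on W; the IPP is obtained by thinning an HPP whose rate λmax dominates λ(s):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_hpp(lam, rng):
    """HPP on the unit square: N ~ Poi(lam * |W|) with |W| = 1,
    then points i.i.d. uniform given N."""
    n = rng.poisson(lam)
    return rng.uniform(size=(n, 2))

def simulate_ipp(intensity, lam_max, rng):
    """IPP on the unit square by thinning an HPP with dominating rate
    lam_max >= sup_s intensity(s): keep each point with prob intensity(s)/lam_max."""
    pts = simulate_hpp(lam_max, rng)
    keep = rng.uniform(size=len(pts)) < intensity(pts) / lam_max
    return pts[keep]

# Two realizations with the same expected number of events (100):
hpp = simulate_hpp(100.0, rng)
ipp = simulate_ipp(lambda p: 400.0 * p[:, 0] * p[:, 1], 400.0, rng)  # E[N] = 100
```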
14. Summary statistics: first-order
The first-order intensity of a spatial point process is

λ(s) = lim_{|ds|→0} E[N(ds)] / |ds|

• Interpretation: the expected number of events per unit area. For a small region ds, λ(s)|ds| approximates the probability of an event in ds.
• The intensity may be constant (homogeneous) or may vary from location to location (inhomogeneous). If the process is homogeneous, estimate the intensity by

λ̂ = N(W) / |W|
15. Estimate λ(s) in inhomogeneous case
• Estimating the intensity of a spatial point pattern is similar to estimating a bivariate probability density.
• How to estimate a bivariate density? Given an i.i.d. sample (y1, …, yn) of a bivariate random variable Y, an estimate of the density f(·) of Y at y is

f̂(y) = (1 / (n h²)) Σ_{i=1}^{n} K((y − yi) / h)

where K(·) is the kernel and h is the bandwidth.
• The kernel-smoothed intensity of a point pattern x = {x1, …, xn} at location s is

λ̂(s) = (1 / h²) Σ_{i=1}^{n} K((s − xi) / h)

where the bandwidth h is chosen based on some cross-validation criterion.
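As an illustration (not part of the deck), the following sketch implements this estimator with a bivariate Gaussian kernel and a fixed bandwidth rather than a cross-validated one:

```python
import numpy as np

def kernel_intensity(s, x, h):
    """lambda_hat(s) = h^-2 * sum_i K((s - x_i)/h) with a bivariate Gaussian
    kernel; s is (m, 2) query locations, x is the (n, 2) point pattern.
    No edge correction; h is fixed here rather than cross-validated."""
    d2 = ((s[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)   # (m, n) squared distances
    K = np.exp(-d2 / (2.0 * h**2)) / (2.0 * np.pi)             # Gaussian kernel values
    return K.sum(axis=1) / h**2

# Evaluate on a grid over the unit square (pattern 'ipp' from the earlier sketch):
g = np.linspace(0.0, 1.0, 50)
grid = np.array([(u, v) for u in g for v in g])
lam_hat = kernel_intensity(grid, ipp, h=0.1)
```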
16. Kernel smoothed intensity of IPP
[Figure: kernel-estimated intensity for a point pattern simulated from an IPP with λ(s) = 400xy on [0, 1] × [0, 1].]
17. Summary statistics: second-order
The second-order properties of a point process involve the relationship between the numbers of events at different locations.
• The second-order intensity of a spatial point process is

λ2(si, sj) = lim_{|dsi|, |dsj| → 0} E[N(dsi) N(dsj)] / (|dsi| |dsj|)

• A point process is called stationary if λ2(si, sj) = λ2(si − sj), i.e. the second-order intensity depends only on the difference si − sj.
• A stationary point process is isotropic if λ2(si, sj) = λ2(||si − sj||), i.e. it depends only on the distance between si and sj.
18. K-function

If a point process is stationary and isotropic, the K-function of the process is defined by

λK(r) = E[number of further events within distance r from an arbitrary event]

Two properties of the K-function:
• For an HPP, λK(r) = λπr², thus Kp(r) = πr²
• K(r) is invariant to random thinning.

Def. random thinning: each event of a point process X is either retained or deleted with retention probability p, independently of the other events. The resulting point process X′ contains a subset of the events of the original process X.

19. Comparing estimated K-functions of simulated point patterns

CSR: K(r) = πr²
clustered: K(r) > πr²
regular: K(r) < πr²
20. Estimation of K(r)

K̂(r) = Ê[number of further events within distance r of an arbitrary event] / λ̂

The naive estimate is negatively biased near the boundary of W, so an edge correction is applied.
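A sketch of the naive estimator just described (assumptions: Python/numpy, unit-square window, no edge correction, so values for points near the boundary come out low):

```python
import numpy as np

def k_hat(x, r, area=1.0):
    """Naive K_hat(r) = (average # of further events within r) / lambda_hat,
    with lambda_hat = n / area. No edge correction, so it is negatively
    biased for points near the boundary of W."""
    n = len(x)
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))   # pairwise distances
    np.fill_diagonal(d, np.inf)                                   # exclude each point itself
    counts = (d[:, :, None] <= r[None, None, :]).sum(axis=1)      # further events within r
    return area * counts.mean(axis=0) / n

r = np.linspace(0.01, 0.25, 25)
print(k_hat(hpp, r) - np.pi * r**2)   # near zero under CSR ('hpp' from the first sketch)
```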
21. Application in epidemiology
John Snow (15 March 1813 – 16 June 1858) is considered to be one of the fathers of epidemiology because of his work tracing the source of a cholera outbreak in Soho, London, in 1854.
22. Case-control study
Goal: compare the spatial distribution of disease cases with the underlying population.
• Null hypothesis: equal spatial distribution
• Controls: selected to represent population heterogeneity

[Diagram: the incidence of disease reflects population density, the overall risk of disease, and other risk factors, e.g. distance from a point source.]

Do disease cases occur randomly among the population?
23. Case-control data consist of two point patterns:
• the locations of n1 cases of a particular disease, {x1, x2, …, xn1}
• the locations of n0 controls, {xn1+1, …, xn1+n0}
in a study region W over a defined period of time. The total number of data points is n = n1 + n0.

Assumption:
• Cases come from an IPP with intensity λ1(s)
• Controls come from another, independent IPP with intensity λ0(s)
24. Spatial risk

relative risk: ρ(s) = λ1(s) / λ0(s)
estimated relative risk: ρ̂(s) = λ̂1(s) / λ̂0(s)
H0: ρ(s) ≡ ρ0 = n1 / n0 (constant risk over W)
test statistic: T = Σ_{i=1}^{n} [ρ(xi) − ρ0]²
estimated test statistic: T̂ = Σ_{i=1}^{n} [ρ̂(xi) − ρ̂0]²
significance: Monte Carlo test
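One possible rendering of ρ̂ and T̂ (illustrative only; it reuses the hypothetical kernel_intensity sketch from the slide-15 example):

```python
import numpy as np

def rho_hat(s, cases, controls, h):
    """rho_hat(s) = lambda1_hat(s) / lambda0_hat(s), reusing the hypothetical
    kernel_intensity sketch (Gaussian kernel, common bandwidth h)."""
    return kernel_intensity(s, cases, h) / kernel_intensity(s, controls, h)

def t_stat(cases, controls, h):
    """T_hat = sum over all n data points of [rho_hat(x_i) - n1/n0]^2."""
    pts = np.vstack([cases, controls])
    rho0 = len(cases) / len(controls)
    return float(((rho_hat(pts, cases, controls, h) - rho0) ** 2).sum())
```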
25. Spatial clustering

K0(r) → amount of clustering due to the population
K1(r) → amount of clustering due to the population plus the effect of other possible risk factors
D(r) = K1(r) − K0(r) → the amount of clustering that is not due to the population

estimate: D̂(r) = K̂1(r) − K̂0(r)
H0: D(r) = 0
test statistic: D = Σ_{k=1}^{m} D(rk) / var[D(rk)]
significance: Monte Carlo test
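For the clustering side, D̂(r) follows directly from the K̂ sketch given for slide 20 (again illustrative, not the deck's code):

```python
import numpy as np

def d_hat(cases, controls, r, area=1.0):
    """D_hat(r) = K1_hat(r) - K0_hat(r), reusing the hypothetical k_hat sketch.
    Values above 0 suggest clustering of cases beyond that of the population."""
    return k_hat(cases, r, area) - k_hat(controls, r, area)
```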
26. Monte Carlo test

1). simulation with random labelling at the jth iteration, j = 1, 2, …, 99
• randomly select n1 points from the n data points and label the selected points as “case”; label the remaining n0 points as “control”
• with the relabelled data, estimate the kernel smoothers λ̂1j(x), λ̂0j(x) and ρ̂j(x) at every data point
• estimate K̂1j(r) and K̂0j(r) and compute D̂j(r) at a set of discrete distances {r1, r2, …, rm}

2). test statistics
• for each j, compute T̂j = Σ_{i=1}^{n} [ρ̂j(xi) − ρ̂0]²
• compute the variance of D̂(rk) for each k = 1, 2, …, m, then get D̂j = Σ_{k=1}^{m} D̂j(rk) / var[D̂(rk)]

3). p-values

p1 = [1 + Σ_{j=1}^{99} I{T̂j ≥ T̂}] / (99 + 1)
p2 = [1 + Σ_{j=1}^{99} I{D̂j ≥ D̂}] / (99 + 1)
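Putting the pieces together, a minimal sketch of the random-labelling test for the T statistic (the D̂-based test works the same way; this reuses the hypothetical t_stat helper above):

```python
import numpy as np

def mc_pvalue(cases, controls, h, n_sim=99, seed=None):
    """Monte Carlo test of H0: rho(s) = n1/n0 via random labelling.
    p = [1 + #{ T_hat_j >= T_hat }] / (n_sim + 1)."""
    rng = np.random.default_rng(seed)
    t_obs = t_stat(cases, controls, h)            # observed statistic
    pts = np.vstack([cases, controls])
    n1 = len(cases)
    exceed = 0
    for _ in range(n_sim):
        perm = rng.permutation(len(pts))          # random relabelling of all n points
        t_j = t_stat(pts[perm[:n1]], pts[perm[n1:]], h)
        exceed += int(t_j >= t_obs)
    return (1 + exceed) / (n_sim + 1)
```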