Adapt-then-Combine (ATC) diffusion strategy
• Error analysis:
The algorithm is stable in mean if
• Steady state mean variance:
• Assuming small probability of missing, we have
• Smoothing filters to estimate the variance
• Relation between perfect and imperfect estimates is given as
The estimate is biased with respect to
• To compensate for the bias, we associate the following individual cost with each agent:
where is a symmetric matrix to be chosen.
To have an unbiased estimate, i.e.,
The minimum cost:
Assumption 3: The covariance matrix of the regressor is diagonal.
Under Assumption 3:
Mohammad Reza Gholami¹, Erik G. Ström¹, and Ali H. Sayed²
¹ Department of Signals and Systems, Chalmers University of Technology, Gothenburg, SE-412 96, Sweden
² Electrical Engineering Department, University of California, Los Angeles, CA 90095, USA
Emails: {moreza,erik.strom}@chalmers.se, sayed@ee.ucla.edu
In many fields, and especially in the medical and social sciences and in various
recommender systems, data are often gathered through clinical studies or targeted
surveys. Participants are generally reluctant to respond to all questions in a survey or
they may lack information to respond adequately to the questions. The data collected
from these studies tend to lead to linear regression models where the regression
vectors are only known partially: some of their entries are either missing completely or
replaced randomly by noisy values. There are also situations where it is not known
beforehand which entries are missing or censored. There have been many useful
studies in the literature on techniques to perform estimation and inference with
missing data. In this work, we examine how a connected network of agents, with each
one of them subjected to a stream of data with incomplete regression information, can
cooperate with each other through local interactions to estimate the underlying model
parameters in the presence of missing data. We explain how to modify traditional
distributed strategies through regularization in order to eliminate the bias introduced
by the incomplete model. We also examine the stability and performance of the
resulting diffusion strategy and provide simulations in support of the findings. We
consider two applications: one dealing with a mental health survey and the other
dealing with a household consumption survey.
Diffusion Estimation over Cooperative Networks with Missing Data
Abstract
System Model
• Consider a connected network. Each agent senses wide-sense stationary data
that satisfy the following linear regression model:
Assumption 1: The regression and the noise processes are each spatially independent and
temporally white. In addition,
• The model for the incomplete regressor: (1)
Assumption 2: Random variables are independent of each other.
• Optimal estimator (minimum-mean-square error):
Perfect: Missing data:
The minimum cost for the perfect scenario:
(www.asl.ee.ucla.edu)
Simulation Results
• In data gathering procedures, it is common that some components of the data are
missing or left unobserved, e.g., a participant may be reluctant to answer some
questions in a clinical study.
• Data can be missing in a random or deterministic fashion.
• Two techniques to deal with missing data are imputation, which biases the
estimate, and deletion, which degrades the performance.
• This work studies the missing data problem over a network of agents, each of
which is subjected to a stream of data with incomplete regression information, and
examines how the agents can cooperate with each other to estimate the underlying
model parameters in the presence of missing data.
• In this study, we consider a linear regression model.
• We adjust the traditional diffusion strategies through (de)regularization in order to
mitigate the bias introduced by imputation.
• We consider two applications: one dealing with a mental health survey and the other
dealing with a household consumption survey.
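The bias caused by imputation, and its removal through (de)regularization, can be illustrated with a small single-agent experiment. This is only a sketch under stated assumptions, not the poster's exact formulation: each regressor entry is assumed to be replaced, independently with probability p, by zero-mean noise of variance sigma_xi**2; the regressor covariance is assumed diagonal (as in Assumption 3); and the correction matrix T = p * sigma_xi**2 * I is derived from those assumptions. The function name simulate_bias and all parameter values are hypothetical.

```python
import numpy as np

def simulate_bias(n=200_000, p=0.2, sigma_u=1.0, sigma_xi=0.5, sigma_v=0.1, seed=0):
    """Compare a naive least-squares fit on incomplete regressors with a
    de-regularized fit. All modelling choices here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    w_true = np.array([1.0, -0.5, 0.25, 2.0])        # hypothetical model parameters
    m = w_true.size

    # Complete data: d = u w + v, with regressor covariance sigma_u^2 * I (diagonal).
    U = sigma_u * rng.standard_normal((n, m))
    d = U @ w_true + sigma_v * rng.standard_normal(n)

    # Assumed missing-data model: each entry is replaced by noise with probability p.
    missing = rng.random((n, m)) < p
    noise = sigma_xi * rng.standard_normal((n, m))
    U_bar = np.where(missing, noise, U)

    # Sample moments of the incomplete regressor.
    R_bar = U_bar.T @ U_bar / n
    r_bar = U_bar.T @ d / n

    # Naive estimate: treats the incomplete regressor as if it were complete (biased).
    w_naive = np.linalg.solve(R_bar, r_bar)

    # De-regularized estimate: minimizes E|d - u_bar w|^2 - w^T T w,
    # with T chosen (under the assumptions above) so that the bias vanishes.
    T = p * sigma_xi**2 * np.eye(m)
    w_comp = np.linalg.solve(R_bar - T, r_bar)
    return w_true, w_naive, w_comp

if __name__ == "__main__":
    w_true, w_naive, w_comp = simulate_bias()
    print("naive error:      ", np.linalg.norm(w_naive - w_true))
    print("compensated error:", np.linalg.norm(w_comp - w_true))
```

Under these assumptions the naive estimate shrinks toward zero by a factor that depends on p and the two variances, while subtracting T from the sample covariance restores the true parameters up to sampling error.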
Introduction
Bias Compensation
Distributed Algorithm
• Household Consumption:
• Mental Health Survey:
Adaptive Systems Laboratory
Estimation of Regularization Parameter
Ncoop: Non Cooperative
MATC: Modified ATC
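
The Adapt-then-Combine (ATC) diffusion strategy named on the poster can be sketched as follows. The adapt and combine steps follow the standard ATC diffusion LMS recursion; the extra term W @ T in the adapt step is an assumed way of realizing the (de)regularized cost, and is not claimed to reproduce the poster's MATC update exactly. The missing-data model (entries replaced by noise with probability p), the choice T = p * sigma_xi**2 * I, the function name atc_diffusion, and all parameter values are hypothetical.

```python
import numpy as np

def atc_diffusion(A, p=0.2, mu=0.002, n_iter=10_000, sigma_xi=0.5,
                  sigma_v=0.1, compensate=True, seed=0):
    """ATC diffusion LMS over a network with incomplete regressors.

    A[l, k] is the weight agent k assigns to agent l; each column sums to one.
    With compensate=True, an assumed de-regularization term T is added.
    """
    rng = np.random.default_rng(seed)
    n_agents = A.shape[0]
    w_true = np.array([1.0, -0.5, 0.25, 2.0])        # hypothetical model parameters
    m = w_true.size
    T = p * sigma_xi**2 * np.eye(m) if compensate else np.zeros((m, m))

    W = np.zeros((n_agents, m))                       # row k holds agent k's estimate
    for _ in range(n_iter):
        # Each agent draws one sample of the linear regression model.
        U = rng.standard_normal((n_agents, m))
        d = U @ w_true + sigma_v * rng.standard_normal(n_agents)

        # Assumed missing-data model: entries replaced by noise with probability p.
        missing = rng.random((n_agents, m)) < p
        U_bar = np.where(missing, sigma_xi * rng.standard_normal((n_agents, m)), U)

        # Adapt: local stochastic-gradient step on E|d - u_bar w|^2 - w^T T w.
        err = d - np.sum(U_bar * W, axis=1)
        Psi = W + mu * (U_bar * err[:, None] + W @ T)

        # Combine: w_k = sum over l of a_lk * psi_l.
        W = A.T @ Psi
    return w_true, W

if __name__ == "__main__":
    n_agents = 10
    A = np.full((n_agents, n_agents), 1.0 / n_agents)  # fully connected, uniform weights
    for flag in (False, True):
        w_true, W = atc_diffusion(A, compensate=flag)
        print("compensate =", flag, "error:", np.linalg.norm(W.mean(axis=0) - w_true))
```

Setting A to the identity matrix removes the combine step and gives a non-cooperative baseline for comparison.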
