In this talk we present a framework for splitting data assimilation problems based upon the model dynamics. This is motivated by assimilation in the unstable subspace (AUS) and center manifold and inertial manifold techniques in dynamical systems. Recent efforts based upon the development of particle filters projected into the unstable subspace will be highlighted.
Individualized treatment rules (ITRs) assign treatments according to individual patients' characteristics. Despite recent advances in the estimation of ITRs, much less attention has been given to uncertainty assessment for the estimated rules. We propose a hypothesis testing procedure for estimated ITRs from a general framework that directly optimizes overall treatment benefit equipped with sparse penalties. Specifically, we construct a local test for low-dimensional components of high-dimensional linear decision rules. The procedure applies to observational studies by taking into account the additional variability from the estimation of the propensity score. Theoretically, our test extends the decorrelated score test proposed in Ning and Liu (2017, Ann. Stat.) and is valid regardless of whether model selection consistency for the true parameters holds. The proposed methodology is illustrated with numerical studies and a real data example on electronic health records of patients with Type-II Diabetes.
Equational axioms for probability calculus and modelling of Likelihood ratio ... (Advanced-Concepts-Team)
Based on the theory of meadows, an equational axiomatisation is given for probability functions on finite event spaces. Completeness of the axioms is stated, with some pointers to how that is shown. Then a simplified model of courtroom subjective probabilistic reasoning is provided in terms of a protocol with two proponents: the trier of fact (TOF, the judge) and the moderator of evidence (MOE, the scientific witness). The idea is then outlined of performing a step of Bayesian reasoning by applying a transformation of the subjective probability function of TOF on the basis of different pieces of information obtained from MOE. The central role of the so-called Adams transformation is outlined. A simple protocol is considered where MOE transfers to TOF first a likelihood ratio for a hypothesis H and a potential piece of evidence E, and thereupon the additional assertion that E holds true. As an alternative, a second protocol is considered where MOE transfers two successive likelihoods (the quotient of the two being the mentioned ratio), followed by the factuality of E. It is outlined how the Adams transformation describes information processing on the TOF side in both protocols, and that the resulting probability distribution is the same in both cases. Finally, it is indicated how the Adams transformation also allows the required update of subjective probability on the MOE side, so that both sides in the protocol may be assumed to comply with the demands of subjective probability.
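The likelihood-ratio update that both protocols implement can be sketched numerically (a plain Bayes-in-odds-form illustration, not the meadows axiomatisation or the Adams transformation itself; the function name and the numbers are illustrative):

```python
def update_with_lr(p_h, lr):
    # posterior odds = prior odds * likelihood ratio (Bayes' rule in odds form)
    odds = p_h / (1.0 - p_h) * lr
    return odds / (1.0 + odds)

# Protocol 1: MOE transfers the likelihood ratio for H directly.
# Protocol 2: MOE transfers two successive likelihoods; only their quotient
# matters, so both protocols leave TOF with the same posterior for H given E.
prior = 0.30
post_1 = update_with_lr(prior, 8.0)
post_2 = update_with_lr(prior, 0.8 / 0.1)   # same ratio, sent as two likelihoods
```

Both calls return the same posterior, which illustrates the claimed equivalence of the two protocols.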
This talk will report briefly on some findings from the problem of picking the weights for a weighted function space in QMC. Then it will be mostly about importance sampling. We want to estimate the probability \mu of a union of J rare events. The method uses n samples, each of which picks one of the rare events at random, samples conditionally on that rare event happening, and counts the total number of rare events that happen. It was used by Naiman and Priebe for scan statistics, Shi, Siegmund and Yakir for genomic scans, and Adler, Blanchet and Liu for extrema of Gaussian processes. We call it ALOE, for 'at least one event'. The ALOE estimate is unbiased, and we find that it has a coefficient of variation no larger than \sqrt{(J + J^{-1} - 2)/(4n)}. The coefficient of variation is also no larger than \sqrt{(\bar\mu/\mu - 1)/n}, where \bar\mu is the union bound. Our motivating problem comes from power system reliability, where the phase differences between connected nodes have a joint Gaussian distribution and the J rare events arise from unacceptably large phase differences. In the grid reliability problems, even some events defined by 5772 constraints in 326 dimensions, with probability below 10^{-22}, are estimated with a coefficient of variation of about 0.0024 with only n = 10,000 sample values. In a genomic context, the rare events become false discoveries. There we are interested in the possibility of a large number of simultaneous events, not just one or more. Some work with Kenneth Tay will be presented on that problem.
Joint work with Yury Maximov and Michael Chertkov (Los Alamos National Laboratory) and Kenneth Tay (Stanford).
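The sampling scheme can be illustrated on a toy problem where the J "rare" events are overlapping intervals for a uniform variable (illustrative only; the abstract's setting is Gaussian, and the events here are not rare, but the mixture-and-count estimator is the same):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy problem: P(X in union of intervals), X ~ U(0, 1); the union is [0, 0.6]
events = [(0.0, 0.3), (0.2, 0.5), (0.4, 0.6)]
p = np.array([b - a for a, b in events])   # individual event probabilities
p_bar = p.sum()                            # union bound, here 0.8

n = 20000
j = rng.choice(len(events), size=n, p=p / p_bar)       # pick an event with prob ∝ p_j
lo = np.array([events[i][0] for i in j])
hi = np.array([events[i][1] for i in j])
x = rng.uniform(lo, hi)                                 # sample conditionally on that event
S = sum(((a <= x) & (x <= b)).astype(int) for a, b in events)  # count events that happen
mu_hat = p_bar * np.mean(1.0 / S)          # unbiased estimate of P(union)
```

Dividing by S exactly cancels the over-counting of points that lie in several events, which is what makes the estimate unbiased.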
ABC with data cloning for MLE in state space models (Umberto Picchini)
An application of the "data cloning" method for parameter estimation via MLE, aided by Approximate Bayesian Computation. The relevant paper is http://arxiv.org/abs/1505.06318
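The data-cloning idea can be sketched in a toy conjugate model (this is not the paper's ABC algorithm; the model and numbers are illustrative): cloning the data K times raises the likelihood to the K-th power, so the resulting posterior concentrates on the MLE.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy model: y_i ~ N(theta, 1) with a flat prior. With K clones of the data,
# the "cloned" posterior is N(ybar, 1/(K*n)): it piles up on the MLE ybar.
y = rng.normal(2.0, 1.0, size=50)
n, K = len(y), 100
draws = rng.normal(y.mean(), np.sqrt(1.0 / (K * n)), size=100_000)
```

In non-conjugate state space models the cloned posterior is not available in closed form, which is where MCMC or, as in the paper, ABC comes in.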
Knowledge of cause-effect relationships is central to the field of climate science, supporting mechanistic understanding, observational sampling strategies, experimental design, model development and model prediction. While the major causal connections in our planet's climate system are already known, there is still potential for new discoveries in some areas. The purpose of this talk is to make this community familiar with a variety of available tools to discover potential cause-effect relationships from observed or simulation data. Some of these tools are already in use in climate science, others are just emerging in recent years. None of them are miracle solutions, but many can provide important pieces of information to climate scientists. An important way to use such methods is to generate cause-effect hypotheses that climate experts can then study further. In this talk we will (1) introduce key concepts important for causal analysis; (2) discuss some methods based on the concepts of Granger causality and Pearl causality; (3) point out some strengths and limitations of these approaches; and (4) illustrate such methods using a few real-world examples from climate science.
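As a toy illustration of the Granger-causality idea mentioned in (2) (a sketch with simulated data, not a climate analysis): x Granger-causes y if past values of x improve the prediction of y beyond what past values of y alone achieve.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    # y depends on its own past and on lagged x
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

# restricted model: y_t ~ 1 + y_{t-1}; unrestricted model adds x_{t-1}
Y = y[1:]
A = np.column_stack([np.ones(n - 1), y[:-1]])
B = np.column_stack([A, x[:-1]])
rss_r = np.sum((Y - A @ np.linalg.lstsq(A, Y, rcond=None)[0]) ** 2)
rss_u = np.sum((Y - B @ np.linalg.lstsq(B, Y, rcond=None)[0]) ** 2)
# a large drop in residual sum of squares signals Granger causality
```

In practice one would turn the drop in residual sum of squares into an F statistic and choose lag orders carefully; this sketch only shows the core comparison of nested predictive models.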
The main machine learning algorithms are built upon various mathematical foundations such as statistics, optimization, and probability. Will this also hold true for Artificial Intelligence? In this presentation, I will showcase some recent examples of interactions between machine learning and mathematics.
Colloquium @ CEREMADE (October 3, 2023)
Talk presented at the workshop "Imaging With Uncertainty Quantification (IUQ)", September 2022:
https://people.compute.dtu.dk/pcha/CUQI/IUQworkshop.html
We consider a weakly supervised classification problem: a classification problem where the target variable can be unknown or uncertain for some subset of samples. This problem appears when labeling is impossible, time-consuming, or expensive; noisy measurements and lack of data may prevent accurate labeling. Our task is to build an optimal classification function. For this, we construct and minimize a specific objective function, which includes the fitting error on labeled data and a smoothness term. Next, we use covariance and radial basis functions to define the degree of similarity between points. The subsequent process involves the repeated solution of an extensive linear system with the graph Laplacian operator. To speed up this solution process, we introduce low-rank approximation techniques. We call the resulting algorithm WSC-LR. We then use the WSC-LR algorithm to analyze CT brain scans to recognize ischemic stroke. We also compare WSC-LR with other well-known machine learning algorithms.
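A minimal sketch of the labeled-fit-plus-smoothness objective described above (illustrative 1-D data; the low-rank acceleration and the actual WSC-LR details are omitted):

```python
import numpy as np

rng = np.random.default_rng(5)
# two 1-D clusters; only one labeled point per cluster
X = np.concatenate([rng.normal(-2, 0.3, 30), rng.normal(2, 0.3, 30)])
y = np.zeros(60)
labeled = np.zeros(60, dtype=bool)
labeled[0], labeled[30] = True, True
y[0], y[30] = -1.0, 1.0

W = np.exp(-(X[:, None] - X[None, :]) ** 2)   # RBF similarity between points
L = np.diag(W.sum(axis=1)) - W                # graph Laplacian
# minimize sum over labeled points of (f_i - y_i)^2  +  lam * f^T L f
lam = 0.1
A = np.diag(labeled.astype(float)) + lam * L
f = np.linalg.solve(A, labeled * y)           # labels propagate to unlabeled points
```

The smoothness term f^T L f forces nearby points to share labels, so the two labeled points are enough to classify both clusters; the repeated solution of systems like A f = b is what the low-rank techniques in the talk are meant to accelerate.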
Talk at the modcov19 CNRS workshop in France, presenting our article "COVID-19 pandemic control: balancing detection policy and lockdown intervention under ICU sustainability".
How to sell pi coins at a high rate quickly (DOT TECH)
Where can I sell my pi coins at a high rate.
Pi has not yet launched on any exchange, but one can easily sell his or her pi coins to investors who want to hold pi until the mainnet launch.
This means crypto whales want to hold pi. And you can get a good rate for selling pi to them. I will leave the telegram contact of my personal pi vendor below.
A vendor is someone who buys from a miner and resells it to a holder or crypto whale.
Here is the telegram contact of my vendor:
@Pi_vendor_247
What is the future of the Pi Network currency? (DOT TECH)
The future of the Pi cryptocurrency is uncertain, and its success will depend on several factors. Pi is a relatively new cryptocurrency that aims to be user-friendly and accessible to a wide audience. Here are a few key considerations for its future:
Message: @Pi_vendor_247 on telegram if you want to sell PI COINS.
1. Mainnet Launch: As of my last knowledge update in January 2022, Pi was still in the testnet phase. Its success will depend on a successful transition to a mainnet, where actual transactions can take place.
2. User Adoption: Pi's success will be closely tied to user adoption. The more users who join the network and actively participate, the stronger the ecosystem can become.
3. Utility and Use Cases: For a cryptocurrency to thrive, it must offer utility and practical use cases. The Pi team has talked about various applications, including peer-to-peer transactions, smart contracts, and more. The development and implementation of these features will be essential.
4. Regulatory Environment: The regulatory environment for cryptocurrencies is evolving globally. How Pi navigates and complies with regulations in various jurisdictions will significantly impact its future.
5. Technology Development: The Pi network must continue to develop and improve its technology, security, and scalability to compete with established cryptocurrencies.
6. Community Engagement: The Pi community plays a critical role in its future. Engaged users can help build trust and grow the network.
7. Monetization and Sustainability: The Pi team's monetization strategy, such as fees, partnerships, or other revenue sources, will affect its long-term sustainability.
It's essential to approach Pi or any new cryptocurrency with caution and conduct due diligence. Cryptocurrency investments involve risks, and potential rewards can be uncertain. The success and future of Pi will depend on the collective efforts of its team, community, and the broader cryptocurrency market dynamics. It's advisable to stay updated on Pi's development and follow any updates from the official Pi Network website or announcements from the team.
How can I sell pi coins after successfully completing KYC? (DOT TECH)
Pi coins have not yet launched on any exchange 💱, which means they are not swappable; the current pi displayed on CoinMarketCap is the IOU version of pi. You can learn all about that in my previous post.
RIGHT NOW THE ONLY WAY you can sell pi coins is through verified pi merchants. A pi merchant is someone who buys pi coins and resells them to exchanges and crypto whales looking to hold massive quantities of pi coins before the mainnet launch.
This is because pi network is not doing any pre-sale or ICO offerings; the only way to get pi coins is by buying from miners. So a merchant facilitates the transactions between the miners and these exchanges holding pi.
My friends and I have sold more than 6000 pi coins successfully with this method. I will be happy to share the contact of my personal pi merchant, the one I trade with; if you have your own merchant, you can trade with them. This is for those who are new.
Message: @Pi_vendor_247 on telegram.
I wouldn't advise selling all of your pi coins. Leave at least a portion so it's a win-win during open mainnet. Have a nice day pioneers ♥️
#kyc #mainnet #picoins #pi #sellpi #piwallet
#pinetwork
The European Unemployment Puzzle: implications from population aging (GRAPE)
We study the link between the evolving age structure of the working population and unemployment. We build a large new Keynesian OLG model with a realistic age structure, labor market frictions, sticky prices, and aggregate shocks. Once calibrated to the European economy, we quantify the extent to which demographic changes over the last three decades have contributed to the decline of the unemployment rate. Our findings yield important implications for the future evolution of unemployment given the anticipated further aging of the working population in Europe. We also quantify the implications for optimal monetary policy: lowering inflation volatility becomes less costly in terms of GDP and unemployment volatility, which hints that optimal monetary policy may be more hawkish in an aging society. Finally, our results also propose a partial reversal of the European-US unemployment puzzle due to the fact that the share of young workers is expected to remain robust in the US.
What website can I sell pi coins on securely? (DOT TECH)
Currently there is no website or exchange that allows buying or selling of pi coins.
But you can still easily sell pi coins by reselling them to exchanges/crypto whales interested in holding thousands of pi coins before the mainnet launch.
Who is a pi merchant?
A pi merchant is someone who buys pi coins from miners and resells them to these crypto whales and holders of pi.
This is because pi network is not doing any pre-sale; the only way exchanges can get pi is by buying from miners, and pi merchants stand in between the miners and the exchanges.
How can I sell my pi coins?
Selling pi coins is really easy, but first you need to migrate to the mainnet wallet before you can do that. I will leave the telegram contact of my personal pi merchant to trade with.
Telegram:
@Pi_vendor_247
Currently pi network is not tradable on binance or any other exchange because we are still in the enclosed mainnet.
Right now the only way to sell pi coins is by trading with a verified merchant.
What is a pi merchant?
A pi merchant is someone verified by the pi network team and allowed to barter pi coins for goods and services.
Since pi network is not doing any pre-sale, the only way exchanges like binance/huobi or crypto whales can get pi is by buying from miners, and a merchant stands in between the exchanges and the miners.
I will leave the telegram contact of my personal pi merchant. My friends and I have traded more than 6000 pi coins successfully.
Telegram
@Pi_vendor_247
Poonawalla Fincorp and IndusInd Bank Introduce New Co-Branded Credit Card (nickysharmasucks)
The unveiling of the IndusInd Bank Poonawalla Fincorp eLITE RuPay Platinum Credit Card marks a notable milestone in the Indian financial landscape, showcasing a successful partnership between two leading institutions, Poonawalla Fincorp and IndusInd Bank. This co-branded credit card not only offers users a plethora of benefits but also reflects a commitment to innovation and adaptation. With a focus on providing value-driven and customer-centric solutions, this launch represents more than just a new product—it signifies a step towards redefining the banking experience for millions. Promising convenience, rewards, and a touch of luxury in everyday financial transactions, this collaboration aims to cater to the evolving needs of customers and set new standards in the industry.
Turin Startup Ecosystem 2024 - Ricerca sulle Startup e il Sistema dell'Innov... (Quotidiano Piemontese)
Turin Startup Ecosystem 2024
A study by il Club degli Investitori, in collaboration with ToTeM Torino Tech Map and with the support of ESCP Business School and Growth Capital.
How to sell pi coins in South Korea profitably (DOT TECH)
Yes. You can sell your pi network coins in South Korea or any other country by finding a verified pi merchant.
What is a verified pi merchant?
Since pi network has not yet launched on any exchange and no pre-sale or ICO offerings are done for pi, the only way you can sell pi coins is by selling to a verified pi merchant.
Since there is no pre-sale, the only way exchanges can get pi is by buying from miners, so a pi merchant facilitates these transactions by acting as a bridge between both sides.
How can i find a pi vendor/merchant?
Well, for those who haven't traded with a pi merchant or who don't already have one, I will leave the telegram id of my personal pi merchant who I trade pi with.
Telegram: @Pi_vendor_247
#pi #sell #nigeria #pinetwork #picoins #sellpi #Nigerian #tradepi #pinetworkcoins #sellmypi
How to swap pi coins to withdrawable foreign currency (DOT TECH)
As of my last update, Pi is still in the testing phase and is not tradable on any exchanges.
However, Pi Network has announced plans to launch its Testnet and Mainnet in the future, which may include listing Pi on exchanges.
The current method for selling pi coins involves exchanging them with a pi vendor who purchases pi coins for investment reasons.
If you want to sell your pi coins, reach out to a pi vendor; they buy pi coins from anyone looking to sell, from any country around the globe.
Below is the contact information for my personal pi vendor.
Telegram: @Pi_vendor_247
USDA Loans in California: A Comprehensive Overview.pptx (marketing367770)
USDA Loans in California: A Comprehensive Overview
If you're dreaming of owning a home in California's rural or suburban areas, a USDA loan might be the perfect solution. The U.S. Department of Agriculture (USDA) offers these loans to help low-to-moderate-income individuals and families achieve homeownership.
Key Features of USDA Loans:
Zero Down Payment: USDA loans require no down payment, making homeownership more accessible.
Competitive Interest Rates: These loans often come with lower interest rates compared to conventional loans.
Flexible Credit Requirements: USDA loans have more lenient credit score requirements, helping those with less-than-perfect credit.
Guaranteed Loan Program: The USDA guarantees a portion of the loan, reducing risk for lenders and expanding borrowing options.
Eligibility Criteria:
Location: The property must be located in a USDA-designated rural or suburban area. Many areas in California qualify.
Income Limits: Applicants must meet income guidelines, which vary by region and household size.
Primary Residence: The home must be used as the borrower's primary residence.
Application Process:
Find a USDA-Approved Lender: Not all lenders offer USDA loans, so it's essential to choose one approved by the USDA.
Pre-Qualification: Determine your eligibility and the amount you can borrow.
Property Search: Look for properties in eligible rural or suburban areas.
Loan Application: Submit your application, including financial and personal information.
Processing and Approval: The lender and USDA will review your application. If approved, you can proceed to closing.
USDA loans are an excellent option for those looking to buy a home in California's rural and suburban areas. With no down payment and flexible requirements, these loans make homeownership more attainable for many families. Explore your eligibility today and take the first step toward owning your dream home.
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf (pchutichetpong)
The U.S. economy is continuing its impressive recovery from the COVID-19 pandemic and not slowing down despite recurring bumps. The U.S. savings rate reached its highest recorded level, 34%, in April 2020, and Americans seem ready to spend. The sectors that had been hurt the most by the pandemic's reduced consumer spending, like retail, leisure, hospitality, and travel, are now experiencing massive growth in revenue and job openings.
Could this growth lead to a “Roaring Twenties”? As quickly as the U.S. economy contracted, experiencing a 9.1% drop in economic output relative to the business cycle in Q2 2020, the largest in recorded history, it has rebounded beyond expectations. This surprising growth seems to be fueled by the U.S. government’s aggressive fiscal and monetary policies, and an increase in consumer spending as mobility restrictions are lifted. Unemployment rates between June 2020 and June 2021 decreased by 5.2%, while the demand for labor is increasing, coupled with increasing wages to incentivize Americans to rejoin the labor force. Schools and businesses are expected to fully reopen soon. In parallel, vaccination rates across the country and the world continue to rise, with full vaccination rates of 50% and 14.8% respectively.
However, it is not completely smooth sailing from here. According to M Capital Group, the main risks that threaten the continued growth of the U.S. economy are inflation, unsettled trade relations, and another wave of Covid-19 mutations that could shut down the world again. Have we learned from the past year of COVID-19 and adapted our economy accordingly?
“In order for the U.S. economy to continue growing, whether there is another wave or not, the U.S. needs to focus on diversifying supply chains, supporting business investment, and maintaining consumer spending,” says Grace Feeley, a research analyst at M Capital Group.
While the economic indicators are positive, the risks are coming closer to manifesting and threatening such growth. The new variants spreading throughout the world, Delta, Lambda, and Gamma, are vaccine-resistant and muddy the predictions made about the economy and health of the country. These variants bring back the feeling of uncertainty that has wreaked havoc not only on the stock market but the mindset of people around the world. MCG provides unique insight on how to mitigate these risks to possibly ensure a bright economic future.
NO1 Uk Divorce problem uk all amil baba in karachi,lahore,pakistan talaq ka m... (Amil Baba Dawood bangali)
Contact Dawood Bhai: just call +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours, with a 101% guarantee and with systematic astrology. If you want any personal or professional advice, you can also call us on +92322-6382012 for ONLINE LOVE PROBLEM and all other types of daily-life problems. CALL or WHATSAPP us on +92322-6382012 and get solutions to all these problems here from Amil Baba DAWOOD BANGALI.
#vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell #aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabaingujranwalan #amilbabainislamabad
Falcon stands out as a top-tier P2P Invoice Discounting platform in India, bridging esteemed blue-chip companies and eager investors. Our goal is to transform the investment landscape in India by establishing a comprehensive destination for borrowers and investors with diverse profiles and needs, all while minimizing risk. What sets Falcon apart is the elimination of intermediaries such as commercial banks and depository institutions, allowing investors to enjoy higher yields.
Falcon Invoice Discounting: Optimizing Returns with Minimal Risk
Side 2019 #9
1. Arthur Charpentier, SIDE Summer School, July 2019
# 9 Updates & Missing Values
Arthur Charpentier (Université du Québec à Montréal)
Machine Learning & Econometrics
SIDE Summer School - July 2019
@freakonometrics freakonometrics freakonometrics.hypotheses.org 1
2. Machine Learning, Practical Issues

Two important practical issues:
• what if we cannot access the entire dataset?
• what if there is an update? (new observation or new variable)

Consider the case where datasets are located on various servers and cannot be downloaded (e.g. hospitals), but one can run functions and obtain outputs; see Wolfson et al. (2010, DataSHIELD) or http://www.datashield.ac.uk/

Consider a regression model y = Xβ + ε.
3. Machine Learning, Practical Issues

Use the QR decomposition of X: X = QR, where Q is an orthogonal matrix, Q^T Q = I. Then

\beta = [X^T X]^{-1} X^T y = R^{-1} Q^T y

Consider m blocks - map part:

y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}
\quad\text{and}\quad
X = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_m \end{pmatrix}
  = \begin{pmatrix} Q_1^{(1)} R_1^{(1)} \\ Q_2^{(1)} R_2^{(1)} \\ \vdots \\ Q_m^{(1)} R_m^{(1)} \end{pmatrix}
4. Machine Learning, Practical Issues

Consider the QR decomposition of R^{(1)} - step 1 of reduce part:

R^{(1)} = \begin{pmatrix} R_1^{(1)} \\ R_2^{(1)} \\ \vdots \\ R_m^{(1)} \end{pmatrix} = Q^{(2)} R^{(2)},
\quad\text{where}\quad
Q^{(2)} = \begin{pmatrix} Q_1^{(2)} \\ Q_2^{(2)} \\ \vdots \\ Q_m^{(2)} \end{pmatrix}

Define - step 2 of reduce part:

Q_j^{(3)} = Q_j^{(1)} Q_j^{(2)} \quad\text{and}\quad V_j = Q_j^{(3)T} y_j

and finally set - step 3 of reduce part:

\beta = [R^{(2)}]^{-1} \sum_{j=1}^m V_j
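The map and reduce steps above can be sketched with numpy (a toy single-machine version of the blocked computation; sizes and data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 300, 4, 3
X = rng.normal(size=(n, p))
beta_true = np.arange(1.0, p + 1)
y = X @ beta_true + 0.1 * rng.normal(size=n)

blocks = np.array_split(np.arange(n), m)
# map part: QR of each block, X_j = Q_j^(1) R_j^(1)
QRs = [np.linalg.qr(X[b]) for b in blocks]
# reduce, step 1: QR of the stacked R_j^(1)
R1 = np.vstack([R for _, R in QRs])
Q2, R2 = np.linalg.qr(R1)
Q2s = np.split(Q2, m)                      # blocks Q_j^(2), each p x p
# reduce, steps 2-3: V_j = (Q_j^(1) Q_j^(2))^T y_j, then beta = [R^(2)]^-1 sum_j V_j
V = sum((QRs[j][0] @ Q2s[j]).T @ y[blocks[j]] for j in range(m))
beta = np.linalg.solve(R2, V)
```

Each server only ever ships a small p × p triangular factor and a length-p vector, which is why this decomposition suits the cannot-download-the-data setting.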
5. Online Learning

Let T_n = {(y_1, x_1), ..., (y_n, x_n)} denote the training dataset, with y ∈ Y.

Learning: a learning algorithm is a map A : T_n → Y.

Online Learning: a pure online learning algorithm is a sequence of recursive algorithms
(i) m_0 is the initialization
(ii) for k = 1, 2, ..., m_k = A(m_{k-1}, (y_k, x_k))

Recall that the risk is R(m) = E[\ell(Y, m(X))].

As in gradient boosting, consider some approximation G of the gradient of R(m),

m_k = m_{k-1} + \gamma_k G(m_{k-1}, (y_k, x_k))
6. Update Formulas

• Update with a new observation, as in Riddell (1975, Recursive Estimation Algorithms for Economic Research).

Let X_{1:n} denote the matrix of covariates, with n observations (rows), and x_{n+1} denote a new one. Recall that

\beta_n = [X_{1:n}^T X_{1:n}]^{-1} X_{1:n}^T y_{1:n} = C_n^{-1} X_{1:n}^T y_{1:n}

Since C_{n+1} = X_{1:n+1}^T X_{1:n+1} = C_n + x_{n+1} x_{n+1}^T, then

\beta_{n+1} = \beta_n + C_{n+1}^{-1} x_{n+1} [y_{n+1} - x_{n+1}^T \beta_n]

This updating formula is also called a differential correction, since it is proportional to the prediction error. Note that the residual sum of squares can also be updated, with

S_{n+1} = S_n + \frac{1}{d} [y_{n+1} - x_{n+1}^T \beta_n]^2, \quad d = 1 + x_{n+1}^T C_n^{-1} x_{n+1}
7. Online Learning

Online Learning for OLS:

\beta_{n+1} = \beta_n + C_{n+1}^{-1} x_{n+1} [y_{n+1} - x_{n+1}^T \beta_n]

is a recursive formula, but it requires storing all the data (and inverting a matrix at each step). Good news:

[A + BCD]^{-1} = A^{-1} - A^{-1} B \, [D A^{-1} B + C^{-1}]^{-1} \, D A^{-1}

so

C_{n+1}^{-1} = C_n^{-1} - \frac{C_n^{-1} x_{n+1} x_{n+1}^T C_n^{-1}}{1 + x_{n+1}^T C_n^{-1} x_{n+1}}

We have an algorithm of the form: for k = 1, 2, ..., m_k = A(m_{k-1}, (y_k, C_k, x_k)), for some matrix C_k.
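The recursive least squares update, with the Sherman-Morrison identity replacing the matrix inversion, can be sketched as follows (illustrative data; the warm-up size n0 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=n)

# initialize on the first n0 observations, then update one row at a time
n0 = 20
C_inv = np.linalg.inv(X[:n0].T @ X[:n0])          # C_n^-1
beta = C_inv @ X[:n0].T @ y[:n0]
for i in range(n0, n):
    x = X[i]
    # Sherman-Morrison: C_{n+1}^-1 from C_n^-1, no matrix inversion needed
    C_inv = C_inv - np.outer(C_inv @ x, C_inv @ x) / (1.0 + x @ C_inv @ x)
    # differential correction, proportional to the prediction error
    beta = beta + C_inv @ x * (y[i] - x @ beta)
```

After the loop, beta matches the batch OLS fit on all n observations, while each step costs only O(p²).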
8. Online Learning

Online Learning for OLS:

\beta_{n+1} = \beta_n + C_{n+1}^{-1} x_{n+1} [y_{n+1} - x_{n+1}^T \beta_n]

is also a gradient-type algorithm, since

\nabla_\beta \, (y_{n+1} - x_{n+1}^T \beta)^2 = -2 \, x_{n+1} [y_{n+1} - x_{n+1}^T \beta]

One might consider using a scalar step \gamma_{n+1} \in \mathbb{R} instead of C_{n+1}^{-1} (a p × p matrix). Polyak-Ruppert averaging suggests using \gamma_n = n^{-\alpha} with \alpha \in (1/2, 1) to ensure convergence.
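A scalar-step version with Polyak-Ruppert averaging, on simulated data (the damping constant 0.1 in front of n^{-α} is an illustrative stabilizer, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
p = 3
beta_true = np.array([1.0, -2.0, 0.5])
beta = np.zeros(p)
avg = np.zeros(p)                  # Polyak-Ruppert average of the iterates
alpha = 0.7
N = 100_000
for n in range(1, N + 1):
    x = rng.normal(size=p)
    yn = x @ beta_true + 0.1 * rng.normal()
    gamma = 0.1 * n ** (-alpha)            # scalar step, gamma_n ∝ n^-alpha
    beta = beta + gamma * x * (yn - x @ beta)
    avg += (beta - avg) / n                # running average of beta_1..beta_n
```

Averaging the noisy iterates is what recovers fast convergence despite the slowly decaying step size.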
9. Update Formulas

• Update with a new variable

Let X_{1:k} denote the matrix of covariates, with k explanatory variables (columns), and x_{k+1} denote a new one. Recall that

\beta_k = [X_{1:k}^T X_{1:k}]^{-1} X_{1:k}^T y

Then \beta_{k+1} = (\tilde\beta_k, b_{k+1})^T, where

\tilde\beta_k = \beta_k - \frac{[X_{1:k}^T X_{1:k}]^{-1} X_{1:k}^T x_{k+1} \; x_{k+1}^T P_k^\perp y}{x_{k+1}^T P_k^\perp x_{k+1}}

with P_k^\perp = I - X_{1:k} (X_{1:k}^T X_{1:k})^{-1} X_{1:k}^T, while

b_{k+1} = \frac{x_{k+1}^T P_k^\perp y}{x_{k+1}^T P_k^\perp x_{k+1}}

If x_{k+1} is orthogonal to the previous variables, X_{1:k}^T x_{k+1} = 0, then \tilde\beta_k = \beta_k. Observe that P_k^\perp y = \varepsilon_k, the residual vector.
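The new-variable update can be checked numerically against a full regression on all k+1 columns (illustrative data):

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 80, 3
X = rng.normal(size=(n, k))          # X_{1:k}
x_new = rng.normal(size=n)           # the new column x_{k+1}
y = rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_k = XtX_inv @ X.T @ y
P = np.eye(n) - X @ XtX_inv @ X.T    # projector P_k^perp (P @ y gives the residuals)
b_new = (x_new @ P @ y) / (x_new @ P @ x_new)     # coefficient of the new column
b_old = beta_k - XtX_inv @ X.T @ x_new * b_new    # corrected old coefficients
beta_full = np.append(b_old, b_new)
```

The coefficient b_new regresses the residuals of y on the residuals of x_{k+1}, which is the Frisch-Waugh flavor of the formula.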
10. Missing Values

"There are two kinds of model in the world : those who can extrapolate from incomplete data..."

From the Tropical Atmosphere Ocean (TAO) dataset, see VIM::tao
11. Missing Values

With the lm function, rows with missing values (in y or x) are deleted. To deal with them, one should understand the mechanism leading to missing values.

Expectation-Maximization, see Dempster et al. (1977, Maximum Likelihood from Incomplete Data via the EM Algorithm).

Consider a mixture model dF(y) = p_1 \, dF_{\theta_1}(y) + p_2 \, dF_{\theta_2}(y), i.e. there is \Theta \in \{1, 2\} (with p_j = P[\Theta = j]) such that

y_i = \begin{cases} y_{1,i} \text{ with } Y_1 \sim F_{\theta_1}, & \text{if } \Theta = 1 \\ y_{2,i} \text{ with } Y_2 \sim F_{\theta_2}, & \text{if } \Theta = 2 \end{cases}

see mixtools::normalmixEM for Gaussian mixtures
12. Observable and Non-Observable Heterogeneity

Mixture distribution (with two classes):
• if \theta = A, Y \sim N(\mu_A, \sigma_A^2)
• if \theta = B, Y \sim N(\mu_B, \sigma_B^2)

f(y) = p_A f_A(y) + p_B f_B(y)

5 parameters to estimate, no interpretation of the mixture parameter \theta.

[Figure: density of height (in cm), fitted two-component mixture]
13. Observable and Non-Observable Heterogeneity

One categorical variable (e.g. gender):
• if gender = M, Y \sim N(\mu_M, \sigma_M^2)
• if gender = F, Y \sim N(\mu_F, \sigma_F^2)

f(y) = p_M f_M(y) + p_F f_F(y)

4 parameters to estimate (p_M and p_F are known), clear interpretation of the mixture parameter.

[Figure: density of height (in cm), mixture with known group proportions]
14. Expectation-Maximization

EM for Mixtures
(i) start with initial values \theta_{1,0}, \theta_{2,0} and p_{j,0}
(ii) for k = 1, 2, ...

E step: \gamma_{k,j,i} = \frac{p_{j,k-1} \, dF_{\theta_{j,k-1}}(y_i)}{p_{1,k-1} \, dF_{\theta_{1,k-1}}(y_i) + p_{2,k-1} \, dF_{\theta_{2,k-1}}(y_i)}

M step: use ML techniques with weights \gamma_{k,j,i}.

M step with a Gaussian mixture:

\mu_{j,k} = \frac{\sum_i \gamma_{k,j,i} \, y_i}{\sum_i \gamma_{k,j,i}}
\quad\text{and}\quad
\sigma_{j,k}^2 = \frac{\sum_i \gamma_{k,j,i} \, [y_i - \mu_{j,k}]^2}{\sum_i \gamma_{k,j,i}}
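The E and M steps for a two-component Gaussian mixture can be sketched as follows (the sorted-halves initialization and the iteration count are illustrative choices):

```python
import numpy as np

def em_two_gaussians(y, n_iter=200):
    # (i) crude initialization: split the sorted sample in half
    y = np.sort(np.asarray(y, dtype=float))
    mu = np.array([y[: len(y) // 2].mean(), y[len(y) // 2 :].mean()])
    sig2 = np.array([y.var(), y.var()])
    p = np.array([0.5, 0.5])
    phi = lambda v, m, s2: np.exp(-(v - m) ** 2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)
    for _ in range(n_iter):
        # E step: responsibilities gamma[j, i] of component j for point i
        num = p[:, None] * phi(y[None, :], mu[:, None], sig2[:, None])
        gamma = num / num.sum(axis=0)
        # M step: weighted ML estimates
        w = gamma.sum(axis=1)
        mu = gamma @ y / w
        sig2 = (gamma * (y[None, :] - mu[:, None]) ** 2).sum(axis=1) / w
        p = w / len(y)
    return p, mu, sig2
```

This is the same algorithm mixtools::normalmixEM implements in R, restricted to two univariate components.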
15. Arthur Charpentier, SIDE Summer School, July 2019
Expectation - Maximization
Expectation - Maximization
E step (expectation) : compute Q(θ, θk) = E[ log f(Y | θ) | yobs, θk ]
M step (maximization) : θk+1 = argmaxθ Q(θ, θk)
Stochastic EM (for Mixtures)
(i) start with initial values θ1,0 and θ2,0, pj,0
(ii) for k = 1, 2, · · ·
E step : γk,j,i = pj,k−1 dFθj,k−1(yi) / [ p1,k−1 dFθ1,k−1(yi) + p2,k−1 dFθ2,k−1(yi) ]
S step : generate ξk,i in {1, 2} with probabilities γk,1,i and γk,2,i
M step : compute ML estimate θk,j on sample {yi : ξk,i = j}
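The same ingredients give a stochastic EM sketch: only the S step changes, drawing hard labels ξ_{k,i} instead of keeping weights (initial values are again assumptions):

```python
import numpy as np

def stochastic_em(y, mu, sigma, p, n_iter=200, seed=1):
    """Stochastic EM: E step as before, S step draws ξ_i ∈ {1, 2} with
    probabilities (γ_{1,i}, γ_{2,i}), M step is plain ML on each subsample."""
    rng = np.random.default_rng(seed)
    mu, sigma, p = (np.asarray(v, dtype=float).copy() for v in (mu, sigma, p))
    for _ in range(n_iter):
        dens = np.exp(-0.5 * ((y[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        gamma = p * dens
        gamma /= gamma.sum(axis=1, keepdims=True)      # E step
        xi = rng.random(len(y)) < gamma[:, 0]          # S step: hard labels
        for j, mask in enumerate([xi, ~xi]):           # M step per subsample
            mu[j], sigma[j], p[j] = y[mask].mean(), y[mask].std(), mask.mean()
    return mu, sigma, p

rng = np.random.default_rng(2)
y = np.concatenate([rng.normal(165, 6, 4000), rng.normal(178, 7, 6000)])
mu, sigma, p = stochastic_em(y, mu=[160, 185], sigma=[5, 5], p=[0.5, 0.5])
```

Unlike plain EM, the iterates keep fluctuating; the final iterate is close to the ML solution but not identical across seeds.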
Missing Values : Single Imputation
Classical idea : Principal Component Analysis (PCA)
Approximate the n × p matrix X with a lower-rank matrix,
Xs = argmin{Y : rank(Y) ≤ s} ‖X − Y‖² = Us Λs^{1/2} Vs⊤
(using Singular Value Decomposition)
One can consider PCA with missing values, based on weighted least squares
Xs = argmin{Y : rank(Y) ≤ s} ‖W ∘ (X − Y)‖²
where W is the n × p weight matrix with Wi,j = 1 if xi,j is observed and Wi,j = 0 if xi,j is missing, see Gabriel & Zamir
(1979, Lower rank approximation of matrices by least squares with any choice of weights) or Kiers
(1997, Weighted least squares fitting using ordinary least squares algorithms)
Missing Values : Single Imputation
Iterative PCA
(i) if xi,j is missing, set Wi,j = 0, and start from a zero imputation,
x^1_{i,j} = Wi,j · x^0_{i,j} + (1 − Wi,j) · 0
(ii) for k = 1, 2, · · ·
• Xs = argmin{Y : rank(Y) ≤ s} ‖W ∘ (X^k − Y)‖²
• x^{k+1}_{i,j} = Wi,j · x^k_{i,j} + (1 − Wi,j) · x̂i,j, where x̂i,j is the (i, j) entry of Xs
[Figure: two-dimensional illustration of the iterative PCA imputation]
Connections with the fixed effects model, xi,j = Σ_{k=1}^s fi,k uj,k + εi,j with εi,j ∼ N(0, σ²),
and the random effects model, xi = Γzi + εi with εi ∼ N(0, σ²I) and zi ∼ N(0, I)
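The loop above can be sketched compactly (hard imputation with a rank-s SVD fit; this is the core of what missMDA::imputePCA iterates, modulo centring and regularization):

```python
import numpy as np

def iterative_pca_impute(X, s, n_iter=200):
    """Iterative PCA: W_{i,j} = 0 where x_{i,j} is missing; start from a
    zero imputation, then alternate the rank-s SVD fit X_s with
    x^{k+1}_{i,j} = W_{i,j} x_{i,j} + (1 - W_{i,j}) xhat_{i,j}."""
    W = ~np.isnan(X)                       # observed-entry indicator
    Xk = np.where(W, X, 0.0)               # x^1: missing cells set to 0
    for _ in range(n_iter):
        U, d, Vt = np.linalg.svd(Xk, full_matrices=False)
        Xs = (U[:, :s] * d[:s]) @ Vt[:s]   # rank-s approximation
        Xk = np.where(W, X, Xs)            # keep observed, refresh missing
    return Xk
```

On a noiseless rank-1 matrix with a couple of hidden entries, this recovers them almost exactly.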
Missing Values : Single Imputation
The iterative PCA is simply using EM on the fixed effects model,
xi,j = Σ_{k=1}^s fi,k uj,k + εi,j with εi,j ∼ N(0, σ²), i.e. X (n × p) = F (n × s) U⊤ (with U of size p × s)
The log-likelihood is here
log L(F, U, σ²) = −(np/2) log(2πσ²) − (1/(2σ²)) ‖X − F U⊤‖²
E step : compute E[Xi,j | Xobs, Fk, Uk, σ²k] (imputation)
M step : maximize the log-likelihood,
Uk+1 = Xk⊤ Fk (Fk⊤ Fk)⁻¹ and Fk+1 = Xk Uk (Uk⊤ Uk)⁻¹
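The two M-step updates are alternating least squares; a sketch (here F is updated with the freshly computed U, whereas the slide's indexing uses Uk — either ordering converges):

```python
import numpy as np

def als_m_step(X, F, U):
    """One M step for the model X ≈ F U^T:
    U <- X^T F (F^T F)^{-1}, then F <- X U (U^T U)^{-1}."""
    U = X.T @ F @ np.linalg.inv(F.T @ F)
    F = X @ U @ np.linalg.inv(U.T @ U)
    return F, U

# On a complete rank-s matrix these updates reach an exact factorisation
rng = np.random.default_rng(4)
X = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 8))   # exactly rank 2
F, U = rng.normal(size=(20, 2)), rng.normal(size=(8, 2))
for _ in range(10):
    F, U = als_m_step(X, F, U)
```

With missing values, each M step would be preceded by the E-step imputation of the unobserved cells.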
Missing Values : Single Imputation
One can use regularized iterative PCA. So far, we used (SVD) Xs = Us Λs^{1/2} Vs⊤, i.e.
Xi,j = Σ_{k=1}^s √λk Ui,k Vj,k
Following Efron & Morris (1972, Limiting the Risk of Bayes and Empirical Bayes Estimators),
consider a shrinkage version
Xi,j = Σ_{k=1}^s [(λk − σ²)/λk] √λk Ui,k Vj,k = Σ_{k=1}^s [√λk − σ²/√λk] Ui,k Vj,k
where σ² = n(λ_{s+1} + · · · + λp) / (np − p − ns − ps + s² + s)
See package missMDA
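A sketch of the shrunk reconstruction (the scaling convention for the λk, and hence for the σ² estimate, varies between references; here the λk are taken as the squared singular values of X, which is an assumption):

```python
import numpy as np

def shrunk_rank_s(X, s):
    """Rank-s reconstruction with Efron–Morris-type shrinkage: each singular
    value sqrt(λ_k) is replaced by sqrt(λ_k) - σ²/sqrt(λ_k), with σ² built
    from the trailing eigenvalues λ_{s+1}, ..., λ_p as on the slide."""
    n, p = X.shape
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    lam = d ** 2                                    # λ_k (squared singular values)
    sigma2 = n * lam[s:].sum() / (n * p - p - n * s - p * s + s ** 2 + s)
    d_shrunk = d[:s] - sigma2 / d[:s]               # sqrt(λ_k) - σ²/sqrt(λ_k)
    return (U[:, :s] * d_shrunk) @ Vt[:s]
```

The shrunk fit is always strictly inside the plain rank-s fit (smaller Frobenius norm) as soon as the trailing eigenvalues are non-zero.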
Missing Values : Single Imputation
One can use soft-thresholding PCA. Following Hastie & Mazumder (2015, Matrix Completion
and Low-Rank SVD),
Xi,j = Σ_{k=1}^s (√λk − λ)₊ Ui,k Vj,k
solution of
Xs = argmin{Y : rank(Y) ≤ s} ‖W ∘ (X − Y)‖² + λ‖Y‖∗
where the penalty ‖Y‖∗ is the nuclear norm of Y (the sum of its singular values).
Complicated to select λ...
See package softImpute
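A sketch of the corresponding iteration, in the spirit of softImpute's algorithm (not its actual API): complete the matrix, take the SVD, soft-threshold the singular values, re-impute.

```python
import numpy as np

def soft_impute(X, lam, n_iter=200):
    """Soft-thresholded SVD iteration: compute the SVD of the completed
    matrix, shrink each singular value to (d_k - λ)₊, and refresh the
    missing cells with the resulting low-rank fit."""
    W = ~np.isnan(X)
    Xk = np.where(W, X, 0.0)
    for _ in range(n_iter):
        U, d, Vt = np.linalg.svd(Xk, full_matrices=False)
        Z = (U * np.maximum(d - lam, 0.0)) @ Vt   # (d_k - λ)₊
        Xk = np.where(W, X, Z)
    return Z
```

On a fully observed rank-1 matrix, the result is simply the matrix rescaled by (1 − λ/d1), which gives a quick sanity check.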
Missing Values : Single Imputation
One can also use k-nearest neighbors,
with missMDA::imputePCA(y,ncp=1) and VIM::kNN(y,k=5)
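The kNN idea can be sketched as follows (same principle as VIM::kNN, not its implementation: distances are computed on the commonly observed coordinates, then each gap is filled from the k closest donor rows):

```python
import numpy as np

def knn_impute(X, k=5):
    """For each row with missing cells, rank the other rows by Euclidean
    distance on the commonly observed coordinates, and fill each gap with
    the mean of that column over the k nearest donor rows."""
    X = X.copy()
    obs = ~np.isnan(X)
    for i in np.where(~obs.all(axis=1))[0]:
        d = np.full(len(X), np.inf)
        for j in range(len(X)):
            shared = obs[i] & obs[j]
            if j != i and shared.any():
                d[j] = np.sqrt(((X[i, shared] - X[j, shared]) ** 2).mean())
        nn = np.argsort(d)[:k]                 # the k nearest donors
        for col in np.where(~obs[i])[0]:
            X[i, col] = np.nanmean(X[nn, col])
    return X
```

VIM::kNN additionally handles mixed variable types and weighted aggregation, which this toy version omits.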
Missing Values : Multiple Imputation
It aims to allow for the uncertainty about the missing data by creating several different
plausible imputed data sets (see Sterne et al. (2009, Multiple imputation for missing data)).
Reference: Rubin (2007, Multiple Imputation for Nonresponse in Surveys)
The idea is to generate N possible values for each missing value, see Honaker, King & Blackwell
(2010, Amelia) and the Amelia package, using bootstrap samples, or van Buuren (2018, Multivariate
Imputation by Chained Equations) with mice, using bootstrap and regression
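A toy sketch of the bootstrap-and-regression mechanism (the single linear predictor and all numerical values are assumptions; mice and Amelia are far more general):

```python
import numpy as np

def multiple_impute(y, x, N=5, seed=0):
    """Generate N completed copies of y: each time, refit y ~ x on a
    bootstrap sample of the complete cases, then draw the missing y_i from
    the fitted line plus residual noise, so the N data sets differ."""
    rng = np.random.default_rng(seed)
    miss = np.isnan(y)
    cc = np.where(~miss)[0]                       # complete cases
    completed = []
    for _ in range(N):
        idx = rng.choice(cc, size=len(cc))        # bootstrap sample
        slope, intercept = np.polyfit(x[idx], y[idx], 1)
        resid_sd = np.std(y[idx] - (intercept + slope * x[idx]))
        yi = y.copy()
        yi[miss] = intercept + slope * x[miss] + rng.normal(0, resid_sd, miss.sum())
        completed.append(yi)
    return completed

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)
y = 2 * x + rng.normal(0, 0.1, 200)
y[:20] = np.nan                                   # hide 20 responses
imps = multiple_impute(y, x, N=5)
```

The analysis of interest is then run on each of the N completed data sets and the results pooled (Rubin's rules).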
“The idea of imputation is both seductive and dangerous. It is seductive because it can lull the
user into the pleasurable state of believing that the data are complete after all, and it is
dangerous because it lumps together situations where the problem is sufficiently minor that it
can be legitimately handled in this way and situations where standard estimators applied to the
real and imputed data have substantial biases.” — Dempster & Rubin (1983, Incomplete Data in
Sample Surveys)
Missing Values : Gaussian process regression (and kriging)
Extrapolation or interpolation ?
x   y        x   y
1   y1       1   y1
2   y2       2   ?
3   ?        3   y3

(left: extrapolation beyond the observed x; right: interpolation between them)

Assume (y1, y2, y3)⊤ ∼ N(0, Σ), with

Σ = ( σ1,1 σ1,2 σ1,3 ; σ2,1 σ2,2 σ2,3 ; σ3,1 σ3,2 σ3,3 )

Splitting the vector into an observed part y and an unobserved part y⋆,

( y, y⋆ )⊤ ∼ N( 0, [ Σ, Σ⋆ ; Σ⋆⊤, Σ⋆⋆ ] )

(y⋆ | y) ∼ N(µ⋆, Σ⋆|y) where
µ⋆ = Σ⋆⊤ Σ⁻¹ y
Σ⋆|y = Σ⋆⋆ − Σ⋆⊤ Σ⁻¹ Σ⋆
see Roberts et al. (2012, Gaussian Processes for Time Series) or Rasmussen & Williams (2006,
Gaussian Processes for Machine Learning)
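The conditional formulas translate directly into code (the squared-exponential kernel below is an illustrative assumption; any valid covariance function works):

```python
import numpy as np

def gp_conditional(x_obs, y_obs, x_new, kernel, noise=1e-8):
    """Conditional Gaussian used in GP regression: with (y, y*) jointly
    N(0, [[Σ, Σ*], [Σ*ᵀ, Σ**]]), return µ* = Σ*ᵀ Σ⁻¹ y and
    Σ** − Σ*ᵀ Σ⁻¹ Σ*. A small noise term keeps Σ invertible."""
    S = kernel(x_obs[:, None], x_obs[None, :]) + noise * np.eye(len(x_obs))
    S_star = kernel(x_obs[:, None], x_new[None, :])        # Σ*
    S_ss = kernel(x_new[:, None], x_new[None, :])          # Σ**
    A = np.linalg.solve(S, S_star)                         # Σ⁻¹ Σ*
    mu = A.T @ y_obs
    cov = S_ss - S_star.T @ A
    return mu, cov

# Example with a squared-exponential kernel (illustrative choice)
k = lambda a, b: np.exp(-0.5 * (a - b) ** 2)
x = np.array([1.0, 2.0])
y = np.array([0.3, -0.1])
mu, cov = gp_conditional(x, y, np.array([3.0]), k)
```

At an observed point the conditional mean reproduces the observation and the conditional variance collapses to (almost) zero, which is the kriging interpolation property.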