This document summarizes a research paper that proposes a new metric learning method, Capped Trace Norm regularization, which introduces low-rank regularization to learn Mahalanobis distances. The method caps the singular values so that the regularizer is less sensitive to changes in large singular values. Experiments on synthetic and face data demonstrate that it outperforms other metric learning baselines, learning a more accurate distance metric with better generalization.
5. Background: low-rank metrics and existing methods
If M is low-rank, the Mahalanobis distance is defined in a low-dimensional space.
Rank minimization is NP-hard [2] ⇒ use a tractable approximation instead.
Trace Norm regularization
▶ Minimizes the sum of all singular values of M:
  Reg(M) = Σ_s σ_s(M).
▶ A change in one large singular value affects the whole regularizer.
Fantope regularization
▶ Minimizes the sum of the k smallest singular values (σ sorted in decreasing order):
  Reg(M) = Σ_{s=d−k+1}^{d} σ_s(M).
▶ Sensitive to the choice of the hyper-parameter k.
6. Proposed Method
The proposed method uses Capped Trace Norm regularization.
Capped Trace Norm regularization
▶ Only penalizes the singular values that are smaller than ε:
  Reg(M) = Σ_s min(σ_s(M), ε).
▶ Reduces the effect of changes in large singular values.
▶ More stable with respect to its hyper-parameter than Fantope regularization.
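The three regularizers differ only in how they aggregate the singular values. A minimal NumPy sketch (the function names are ours, not from the paper):

```python
import numpy as np

def trace_norm(M):
    """Trace norm: sum of all singular values of M."""
    return np.linalg.svd(M, compute_uv=False).sum()

def fantope_reg(M, k):
    """Fantope regularizer: sum of the k smallest singular values."""
    s = np.linalg.svd(M, compute_uv=False)  # returned in decreasing order
    return s[-k:].sum()

def capped_trace_norm(M, eps):
    """Capped trace norm: each singular value is capped at eps, so values
    already above eps contribute only a constant eps each."""
    s = np.linalg.svd(M, compute_uv=False)
    return np.minimum(s, eps).sum()

# A diagonal PSD matrix makes the singular values explicit.
M = np.diag([5.0, 3.0, 0.5, 0.1])
print(trace_norm(M))              # 5 + 3 + 0.5 + 0.1 = 8.6
print(fantope_reg(M, k=2))        # 0.5 + 0.1 = 0.6
print(capped_trace_norm(M, 1.0))  # 1 + 1 + 0.5 + 0.1 = 2.6
```

Note that doubling the largest singular value changes the trace norm but leaves the Fantope and capped values untouched, which is exactly the insensitivity the slide describes.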
7. Proposed Method
Optimization problem:

  min_{M ∈ S_+^d}  Σ_{(i,j,k,l)∈A} [ δ_{ijkl} + ⟨M, x_ij x_ij^⊤ − x_kl x_kl^⊤⟩ ]_+  +  (γ/2) Σ_s min(σ_s(M), ε),

where the first term is the degree of violation of the quadruplet-wise constraints, the second is the regularization term, and
  A = { (i, j, k, l) : d_M(x_k, x_l) ≥ d_M(x_i, x_j) + δ_{ijkl} }.
⇒ This objective function is non-convex.
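The objective above can be written out directly; a small NumPy sketch under our own naming (`quadruplet_objective` is hypothetical), using d_M(x_i, x_j) = (x_i − x_j)^⊤ M (x_i − x_j):

```python
import numpy as np

def quadruplet_objective(M, X, quads, gamma, eps):
    """Non-convex objective from the slide: hinge loss over quadruplet-wise
    constraints plus the capped trace norm penalty.
    quads is a list of (i, j, k, l, delta), meaning we want
    d_M(x_k, x_l) >= d_M(x_i, x_j) + delta."""
    loss = 0.0
    for i, j, k, l, delta in quads:
        x_ij = X[i] - X[j]
        x_kl = X[k] - X[l]
        # <M, x_ij x_ij^T - x_kl x_kl^T> = d_M(x_i, x_j) - d_M(x_k, x_l)
        violation = delta + x_ij @ M @ x_ij - x_kl @ M @ x_kl
        loss += max(violation, 0.0)          # [.]_+ hinge
    sv = np.linalg.svd(M, compute_uv=False)
    return loss + 0.5 * gamma * np.minimum(sv, eps).sum()

# Toy check: one satisfied constraint, so only the regularizer contributes.
M = np.eye(2)
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [3.0, 0.0]])
quads = [(0, 1, 2, 3, 1.0)]   # d(x2,x3)=9 >= d(x0,x1)+1=2, hinge inactive
print(quadruplet_objective(M, X, quads, gamma=2.0, eps=0.5))  # 1.0
```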
8. Proposed Method: Algorithm
Singular value decomposition of M:
  M = U Σ U^⊤,  with singular vectors u_s and singular values σ_s.
Define D:
  D = (1/2) Σ_{s=1}^{k} σ_s^{−1} u_s u_s^⊤,
where k is the number of singular values smaller than ε.
Using D, the problem is transformed into the convex optimization

  min_{M ∈ S_+^d}  Σ_{(i,j,k,l)∈A} [ ξ_{(i,j,k,l)} + ⟨M, x_ij x_ij^⊤ − x_kl x_kl^⊤⟩ ]_+  +  (γ/2) Tr(M^⊤ D M),

where D is held fixed.
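The identity behind this reweighting can be checked numerically: with D built exactly as above, Tr(M^⊤ D M) equals half the sum of the singular values below ε. A small NumPy sketch (the median choice of ε is ours, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
M = A @ A.T                               # random symmetric PSD matrix
eps = np.median(np.linalg.eigvalsh(M))    # illustrative cap

# For symmetric PSD M, the SVD coincides with the eigendecomposition.
sigma, U = np.linalg.eigh(M)
small = sigma < eps                       # the k singular values below eps
D = 0.5 * (U[:, small] / sigma[small]) @ U[:, small].T

# Tr(M^T D M) = (1/2) * sum of the singular values below eps,
# i.e. the quadratic term reproduces the "capped" part of the penalty.
lhs = np.trace(M.T @ D @ M)
rhs = 0.5 * sigma[small].sum()
print(np.isclose(lhs, rhs))               # True
```

This is why fixing D and re-solving is sensible: each convex subproblem penalizes exactly the small singular values of the previous iterate.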
9. Proposed Method: Algorithm
The convex subproblem is solved with Proximal Gradient Descent.
Key points:
▶ The authors prove the convergence of their optimization algorithm.
▶ k is the hyper-parameter; ε is adaptively determined from it.
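The slide gives no further detail, but one plausible shape for a single step is a gradient step on the subproblem followed by projection onto the PSD cone (the proximal map of the S_+^d constraint). A hedged sketch with hypothetical helper names; step-size and stopping rules are not from the paper:

```python
import numpy as np

def project_psd(M):
    """Proximal step for the M in S_+^d constraint: project onto the
    PSD cone by clipping negative eigenvalues to zero."""
    w, V = np.linalg.eigh((M + M.T) / 2.0)   # symmetrize first
    return (V * np.maximum(w, 0.0)) @ V.T

def pgd_step(M, X, quads, D, gamma, lr):
    """One proximal gradient step on the convex subproblem with D fixed."""
    grad = gamma * (D @ M)        # gradient of (gamma/2) Tr(M^T D M), D symmetric
    for i, j, k, l, delta in quads:
        x_ij = X[i] - X[j]
        x_kl = X[k] - X[l]
        if delta + x_ij @ M @ x_ij - x_kl @ M @ x_kl > 0:   # active hinge term
            grad += np.outer(x_ij, x_ij) - np.outer(x_kl, x_kl)
    return project_psd(M - lr * grad)

# The projection clips an indefinite matrix back to the PSD cone:
print(project_psd(np.diag([1.0, -1.0])))   # diag(1, 0)
```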
10. Experiment: Synthetic Data
Data:
1. Generate T ∈ S_+^d with rank(T) = e.
2. Generate quadruplet-wise constraints satisfied under the Mahalanobis distance of T, and split them into training data A, validation data V, and test data T.
Setting:
▶ d = 100
▶ e = 10
▶ |A| = |V| = |T| = 10^4
▶ γ is tuned over {10^−2, 10^−1, 1, 10, 10^2}
▶ k is tuned from 5 to 20
Compared methods:
▶ ML: no regularization
▶ ML+Trace: Trace Norm regularization
▶ ML+Fantope: Fantope regularization
▶ ML+capped: proposed method
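A data setup of this kind is easy to reproduce; the sketch below is our reading of the slide (the number of base points and the orientation trick are our simplifications, and margins are omitted):

```python
import numpy as np

def make_synthetic(d=100, e=10, n_points=1000, n_quads=10_000, seed=0):
    """Generate a rank-e ground-truth metric T and quadruplets oriented
    to be consistent with the Mahalanobis distance induced by T."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((d, e))
    T = B @ B.T                              # T in S_+^d with rank(T) = e
    X = rng.standard_normal((n_points, d))
    quads = []
    for _ in range(n_quads):
        i, j, k, l = rng.choice(n_points, size=4, replace=False)
        d_ij = (X[i] - X[j]) @ T @ (X[i] - X[j])
        d_kl = (X[k] - X[l]) @ T @ (X[k] - X[l])
        if d_ij > d_kl:                      # orient so (k, l) is the farther pair
            i, j, k, l = k, l, i, j
        quads.append((i, j, k, l))
    return T, X, quads
```

Splitting the resulting list into thirds then gives the A / V / T partition described above.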
11. Experiment: Synthetic Data
Accuracy and rank(M):

Method       | Accuracy | rank(M)
ML           | 85.62%   | 53
ML + Trace   | 88.44%   | 41
ML + Fantope | 95.50%   | 10
ML + capped  | 95.43%   | 10

Table 1: Synthetic experiment results.
⇒ Fantope regularization and Capped Trace Norm regularization both recover the true rank (e = 10) and outperform the other methods.
12. Experiment: Synthetic Data
[Figure: accuracy (%) versus the hyper-parameter k (rank k from 6 to 20) for ML+Trace, ML+Fantope, and the proposed method.]
⇒ The proposed method mostly outperforms Fantope regularization.
⇒ The proposed method is more stable than Fantope regularization.
13. Experiment: Labeled Faces in the Wild
Task: decide whether two face images show the same person.
Data:
▶ 13,233 images of 5,749 people.
▶ SIFT features are used.
Setting:
▶ Pairwise constraints are used.
▶ γ is tuned over {10^−2, 10^−1, 1, 10, 10^2}
▶ k is tuned over {30, 35, 40, 45, 50, 55, 60, 65, 70}
Compared methods:
▶ IDENTITY: Euclidean distance
▶ MAHALANOBIS: traditional Mahalanobis distance
▶ KISSME [3]
▶ ITML [4]
▶ LDML [5]
15. Experiment: Labeled Faces in the Wild
[Figure: accuracy (%) versus the hyper-parameter k (rank k from 30 to 70) for ML+Trace, ML+Fantope, and the proposed method.]
⇒ The proposed method obtains better results than metric learning with Fantope regularization.
16. Conclusion
▶ The authors proposed a novel low-rank regularization, Capped Trace Norm regularization.
▶ They proposed an algorithm for the optimization problem and proved its convergence.
▶ Experimental results show that their method outperforms state-of-the-art metric learning methods.
17. References
[1] Zhouyuan Huo, Feiping Nie, and Heng Huang. Robust and effective metric learning using capped trace norm: Metric learning via capped trace norm. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1605–1614. ACM, 2016.
[2] Aurélien Bellet, Amaury Habrard, and Marc Sebban. A survey on metric learning for feature vectors and structured data. arXiv preprint arXiv:1306.6709, 2013.
[3] Martin Koestinger, Martin Hirzer, Paul Wohlhart, Peter M. Roth, and Horst Bischof. Large scale metric learning from equivalence constraints. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2288–2295. IEEE, 2012.
[4] Jason V. Davis, Brian Kulis, Prateek Jain, Suvrit Sra, and Inderjit S. Dhillon. Information-theoretic metric learning. In Proceedings of the 24th International Conference on Machine Learning, pp. 209–216. ACM, 2007.
[5] Matthieu Guillaumin, Jakob Verbeek, and Cordelia Schmid. Is that you? Metric learning approaches for face identification. In 2009 IEEE 12th International Conference on Computer Vision, pp. 498–505. IEEE, 2009.