Spectral methods for linear systems with random inputs
1. Spectral methods for linear systems with random inputs: A parameterized matrix view
David F. Gleich, Sandia National Laboratories
with Paul Constantine @ Sandia and Gianluca Iaccarino @ Stanford
2. Spectral methods for linear systems with random inputs: A parameterized matrix view
First: linear systems
Second: random inputs
Third: parameterized matrices
Fourth: spectral methods
3. David F. Gleich (Sandia) Parameterized Matrices 3 / 38
Computational Science
Discretizing Reality
Start with physical model
Discretize space and time
Arrive at a linear system or eigenvalue problem
4. David F. Gleich (Sandia) Parameterized Matrices 4 / 38
Computational Science
Discretizing Reality
5. David F. Gleich (Sandia) Parameterized Matrices 5 / 38
Computational Science
Discretizing Reality
Ax = b
6. David F. Gleich (Sandia) Parameterized Matrices 6 / 38
Matrices at this workshop
A
Random Gaussian
Random sums of independent matrices
Random adjacency
matrices
7. David F. Gleich (Sandia) Parameterized Matrices 7 / 38
Fireflies and Jellybeans, Creative Commons
∇ · (∇u) = f
8. David F. Gleich (Sandia) Parameterized Matrices 8 / 38
Fireflies and Jellybeans, Creative Commons
∇ · (α(s, x)∇u) = f
(K0 + s1 K1 + s2 K2 + ⋯) u = f
9. David F. Gleich (Sandia) Parameterized Matrices 9 / 38
My favorite model: PageRank
1. follow out-edges uniformly with probability α, and
2. randomly jump according to v with probability 1 − α; we'll assume v = (1/n)e.
[Slide shows a 6-node example graph with its transition matrix P and v = (1/6)e.]
This induces a Markov chain model
(αP + (1 − α)ve^T) x(α) = x(α)
or the linear system
(I − αP) x(α) = (1 − α)v
10. David F. Gleich (Sandia) Parameterized Matrices 10 / 38
The PageRank Random Variable
[Density plot of the PageRank random variable over α ∈ [0, 1], with a fitted InfBeta(3.2, 2.0, 1.9e−05, 0.0019) curve.]
11. David F. Gleich (Sandia) Parameterized Matrices 11 / 38
Parameterized Matrices
Better Discretized Reality
A(s)x(s) = b(s)
12. David F. Gleich (Sandia) Parameterized Matrices 12 / 38
Parameterized Matrices
Better Discretized Reality
A(s)x(s) = b(s)
s - independent random variables/parameters
bounded, analytic, non-singular
13. David F. Gleich (Sandia) Parameterized Matrices 13 / 38
A Parameterized Matrix View of Uncertainty Quantification
Setup
A(s)x(s) = b(s), s ∈ D
f̄ = ∫_D f ds
Will my cookies burn?
Questions
E[x(s)] = 〈x(s)〉
Std[x(s)]
P{x(s) ≥ γ}
x(s) ≈ y(s), where y(s) is faster to evaluate
Fireflies and Jellybeans, Creative Commons
14. David F. Gleich (Sandia) Parameterized Matrices 14 / 38
Uncertainty Quantification
At this workshop
Richmond
Unknown sensor array locations.
Schehr
Where are the vicious walkers?
Antonsen
Uncertain component structure.
Assumed "totally" random
15. David F. Gleich (Sandia) Parameterized Matrices 15 / 38
A new type of sensitivity analysis
Ulam Networks on the Chirikov Map
Chirikov map: y_{t+1} = η y_t + k sin(x_t + θ_t); x_{t+1} = x_t + y_{t+1}
Ulam network: 1. divide phase space into uniform cells; 2. form P based on trajectories.
[Panels: log(E[x(α)]) and log(Std[x(α)])/log(E[x(α)]) for α ∼ Beta(2, 16); white is larger, black is smaller.]
Google matrix, dynamical attractors, and Ulam networks, Shepelyansky and Zhirov, arXiv
David F. Gleich (UBC) Random sensitivity Sandia 23 / 37
16. David F. Gleich (Sandia) Parameterized Matrices 16 / 38
Improved web-spam classification
Webspam application
Hosts of uk-2006 are labeled as spam, not-spam, other
              P      R      F      FP     FN
Baseline      0.694  0.558  0.618  0.034  0.442
Beta(0.5,1.5) 0.695  0.561  0.621  0.034  0.439
Beta(1,1)     0.698  0.562  0.622  0.033  0.438
Beta(2,16)    0.699  0.562  0.623  0.033  0.438
Note: Bagged (10) J48 decision tree classifier in Weka; mean of 50 repetitions of 10-fold cross-validation on 4948 non-spam and 674 spam hosts (5622 total).
Becchetti et al. Link analysis for Web spam detection, 2008.
17. David F. Gleich (Sandia) Parameterized Matrices 17 / 38
Solutions are rational or analytic
A(s)x(s) = b(s)
x_i(s) = det(A_i(s)) / det(A(s))
where A_i = A(s) with the ith column replaced by b(s)
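The rational form of the solution components (Cramer's rule) can be checked numerically. The 2×2 matrix below is a hypothetical stand-in for A(s), evaluated at a single parameter value:

```python
import numpy as np

def cramer_component(A, b, i):
    """x_i = det(A_i) / det(A), where A_i is A with column i replaced by b."""
    Ai = A.copy()
    Ai[:, i] = b
    return np.linalg.det(Ai) / np.linalg.det(A)

# Hypothetical parameterized matrix, evaluated at one value of s:
s = 0.3
A = np.array([[1.0 + s, s], [s, 1.0]])
b = np.array([2.0, 1.0])

x = np.linalg.solve(A, b)
for i in range(2):
    # Cramer's rule agrees with a direct solve, component by component.
    assert np.isclose(x[i], cramer_component(A, b, i))
```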
19. Spectral methods for linear systems with random inputs: A parameterized matrix view
First: linear systems
Second: random inputs
Third: parameterized matrices
Fourth: spectral methods
20. David F. Gleich (Sandia) Parameterized Matrices 20 / 38
Spectral Methods
Approximate a function in a polynomial basis!
In UQ, known as
polynomial chaos
generalized polynomial chaos
stochastic Galerkin
stochastic collocation
21. David F. Gleich (Sandia) Parameterized Matrices 21 / 38
Spectral Fourier Coefficients
{π_i, i ∈ N}: an orthonormal polynomial basis.
f(s) = Σ_{i=0}^∞ 〈f, π_i〉 π_i(s)
〈f, π_i〉: Fourier coefficients
Truncating this representation yields the best approximation in a mean sense.
But how do we compute them?
Computable Polynomial Approx.

{π_i, i ∈ N}: an orthonormal polynomial basis.

f(s) = Σ_{i=0}^∞ f_i π_i(s),    f_i = ⟨f π_i⟩

Approximate ⟨f π_i⟩ with m-point Gauss quadrature: the pseudo-spectral method.
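A minimal sketch of this recipe for a scalar function, assuming normalized Legendre polynomials and the uniform probability measure on [−1, 1] (the example f is an arbitrary analytic choice, not from the talk): the coefficients ⟨f π_i⟩ are approximated with an m-point Gauss rule, and for analytic f they decay geometrically.

```python
import numpy as np
from numpy.polynomial import legendre

def orthonormal_legendre(i, s):
    """pi_i(s) = sqrt(2i + 1) P_i(s): orthonormal w.r.t. the uniform
    probability measure ds/2 on [-1, 1]."""
    return np.sqrt(2 * i + 1) * legendre.legval(s, np.eye(i + 1)[i])

def fourier_coeffs(f, N, m):
    """Approximate f_i = <f pi_i> with m-point Gauss-Legendre quadrature."""
    lam, w = legendre.leggauss(m)
    w = w / 2.0                      # normalize to the probability measure
    return np.array([w @ (f(lam) * orthonormal_legendre(i, lam))
                     for i in range(N)])

f = lambda s: 1.0 / (2.0 - s)        # analytic on [-1, 1]
c = fourier_coeffs(f, N=12, m=30)
s0 = 0.3
# evaluate the truncated expansion at s0
f_trunc = sum(ci * orthonormal_legendre(i, s0) for i, ci in enumerate(c))
```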
Gaussian Quadrature

∫_a^b f(x) dω(x) ≈ Σ_{i=1}^m f(λ_i) ω_i

An m-point quadrature rule will exactly integrate all polynomials of degree 2m − 1.
All ω_i > 0, and all a < λ_i < b.
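These three properties (exactness up to degree 2m − 1, positive weights, nodes strictly inside the interval) are easy to verify numerically; a sketch using numpy's Gauss-Legendre rule:

```python
import numpy as np
from numpy.polynomial import legendre

# m-point Gauss-Legendre rule on [-1, 1]
m = 5
lam, w = legendre.leggauss(m)

# exactness up to degree 2m - 1 = 9:
# int_{-1}^{1} s^8 ds = 2/9 and int_{-1}^{1} s^9 ds = 0
exact_deg8 = w @ lam**8
exact_deg9 = w @ lam**9

# positive weights, nodes strictly inside the interval
all_positive = np.all(w > 0)
inside = np.all((lam > -1) & (lam < 1))
```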
Pseudospectral Methods for PMEs

A(s) x(s) = b(s)

x(s) ≈ Σ_{i=0}^{N−1} x_i π_i(s) = X π(s)

x_i = Σ_j x(λ_j) π_i(λ_j) ω_j

“X = x(Λ) D Q”
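A sketch of the pseudo-spectral method for a parameterized matrix equation (numpy, Legendre basis, uniform measure on [−1, 1]; the 2×2 system and ε = 0.5 are assumptions): solve the system at each quadrature node, then weight-and-sum to form the coefficient matrix X.

```python
import numpy as np
from numpy.polynomial import legendre

def pi_basis(N, s):
    """Orthonormal Legendre values pi_i(s), i = 0..N-1 (uniform measure on [-1, 1])."""
    return np.array([np.sqrt(2*i + 1) * legendre.legval(s, np.eye(i + 1)[i])
                     for i in range(N)])

def pseudospectral(solve, N, m):
    """Coefficients x_i = sum_j x(lambda_j) pi_i(lambda_j) w_j; columns of X."""
    lam, w = legendre.leggauss(m)
    w = w / 2.0
    sols = np.column_stack([solve(l) for l in lam])   # x(lambda_j) as columns
    return sols @ (pi_basis(N, lam) * w).T            # n x N coefficient matrix

# Hypothetical 2x2 parameterized system (eps = 0.5 is an assumption)
eps = 0.5
solve = lambda s: np.linalg.solve(np.array([[1 + eps, s], [s, 1.0]]),
                                  np.array([2.0, 1.0]))
X = pseudospectral(solve, N=15, m=30)
x_approx = X @ pi_basis(15, 0.4)          # evaluate X pi(s) at s = 0.4
```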
Galerkin Approximations for PMEs

A(s) x(s) = b(s)

x(s) ≈ Σ_{i=0}^{N−1} x_i π_i(s) = X π(s)

⟨(A(s) X π(s) − b(s)) π(s)^T⟩ = 0
⟨A(s) X π(s) π(s)^T⟩ = ⟨b(s) π(s)^T⟩
⟨π(s) π(s)^T ⊗ A(s)⟩ vec(X) = ⟨π(s) ⊗ b(s)⟩

But how do we compute them?
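One way to make the Galerkin system concrete is to assemble ⟨π π^T ⊗ A⟩ and ⟨π ⊗ b⟩ by quadrature; a numpy sketch under illustrative assumptions (a 2×2 linear A(s), ε = 0.5, Legendre basis):

```python
import numpy as np
from numpy.polynomial import legendre

# Hypothetical example: A(s) = [[1 + eps, s], [s, 1]], b = [2, 1]
eps, N, M, n = 0.5, 12, 40, 2
A = lambda s: np.array([[1 + eps, s], [s, 1.0]])
b = np.array([2.0, 1.0])

lam, w = legendre.leggauss(M)
w = w / 2.0                               # uniform probability measure on [-1, 1]
Pi = np.array([np.sqrt(2*i + 1) * legendre.legval(lam, np.eye(N)[i])
               for i in range(N)])        # Pi[i, j] = pi_i(lambda_j)

G = np.zeros((N * n, N * n))              # quadrature estimate of <pi pi^T (x) A>
rhs = np.zeros(N * n)                     # quadrature estimate of <pi (x) b>
for j in range(M):
    pj = Pi[:, j]
    G += w[j] * np.kron(np.outer(pj, pj), A(lam[j]))
    rhs += w[j] * np.kron(pj, b)

# vec(X) stacks the columns x_i; recover X (n x N) in Fortran order
X = np.linalg.solve(G, rhs).reshape((n, N), order='F')

# compare the Galerkin approximation X pi(s0) with the exact solution
s0 = 0.4
pis0 = np.array([np.sqrt(2*i + 1) * legendre.legval(s0, np.eye(N)[i])
                 for i in range(N)])
err = np.linalg.norm(X @ pis0 - np.linalg.solve(A(s0), b))
```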
Comparison results

[Figure: nested ellipses of analyticity with semi-axis sums ρ1 and ρ2 around the interval [−1, 1].]

Let ρ be the sum of semi-axes of the ellipse (hyperellipse) of analyticity.
Both methods converge: C_p ρ^{−N} vs. C_g ρ^{−N}.
Is it even worth it?
Convergence of approximation

Example:

[ 1+ε   s ] [ x_0(s) ]   [ 2 ]
[  s    1 ] [ x_1(s) ] = [ 1 ]

x_0(s) = (2 − s) / (1 + ε − s²)
x_1(s) = (1 + ε − 2s) / (1 + ε − s²)

[Figure: L2 error versus order of approximation for ε = 0.2, 0.4, 0.6, 0.8; the error decays geometrically from 10^0 to below 10^{-10} by order 30, faster for larger ε.]

Convergence rate: ρ = √(1 + ε) + √ε (singularities at s = ±√(1 + ε)).
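A sketch that checks the geometric decay numerically for the component x_0(s) = (2 − s)/(1 + ε − s²) of this example (the choice ε = 0.4 and the quadrature size are assumptions): doubling the order should shrink the L2 error by roughly a factor ρ^{-5}.

```python
import numpy as np
from numpy.polynomial import legendre

def l2_error(eps, N, m=60):
    """L2 (mean-square) error of the order-N Legendre expansion of
    x0(s) = (2 - s) / (1 + eps - s^2), computed with m-point quadrature."""
    x0 = lambda s: (2.0 - s) / (1.0 + eps - s**2)
    lam, w = legendre.leggauss(m)
    w = w / 2.0
    pis = np.array([np.sqrt(2*i + 1) * legendre.legval(lam, np.eye(i + 1)[i])
                    for i in range(N)])
    c = pis @ (w * x0(lam))               # Fourier-Legendre coefficients
    resid = x0(lam) - c @ pis             # truncation residual at the nodes
    return np.sqrt(w @ resid**2)

eps = 0.4
rho = np.sqrt(1 + eps) + np.sqrt(eps)     # sum of semi-axes of the ellipse of analyticity
e5, e10 = l2_error(eps, 5), l2_error(eps, 10)
ratio = e10 / e5                          # roughly rho**-5
```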
A Gautschi-Golub comparison

Quadrature:

∫_a^b f(x) dω(x) ≈ Σ_{j=1}^m f(λ_j) ω_j = e_1^T f(J_m) e_1

where J_m is the m × m Jacobi matrix for ω. J is tridiagonal, and encodes the three-term recurrence.
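A sketch of this Golub-Welsch connection for the Legendre weight (the concrete entries k/√(4k² − 1) are the standard Legendre recurrence coefficients): the Jacobi matrix is tridiagonal, its eigenvalues are the Gauss nodes, the squared first components of its eigenvectors are the (normalized) weights, and the rule equals e_1^T f(J_m) e_1 for polynomial f.

```python
import numpy as np
from numpy.polynomial import legendre

def jacobi_legendre(m):
    """m x m Jacobi matrix for the Legendre weight on [-1, 1]:
    zero diagonal, off-diagonal beta_k = k / sqrt(4k^2 - 1)."""
    beta = np.array([k / np.sqrt(4.0 * k * k - 1.0) for k in range(1, m)])
    return np.diag(beta, 1) + np.diag(beta, -1)

m = 5
J = jacobi_legendre(m)
lam, V = np.linalg.eigh(J)            # eigenvalues = Gauss nodes
w = V[0, :] ** 2                      # weights (normalized to sum to 1)

# the quadrature rule equals e1^T f(J_m) e1; try f(x) = x^4,
# whose mean over [-1, 1] is 1/5
e1 = np.eye(m)[0]
quad = w @ lam ** 4
matfun = e1 @ np.linalg.matrix_power(J, 4) @ e1
```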
A Gautschi-Golub comparison

Pseudo-spectral: A(J_m) vec(X) = b(J_m) e_1 — this solution truncates the expansion.
Galerkin: [A(J_∞)]_m vec(X) = [b(J_∞)]_m e_1 — this solution truncates the operator.

(The notation [·]_m means: take the leading m × m block of ·.)

Computational Implication
Given ⟨π(s) π(s)^T ⊗ A(s)⟩ vec(X) = ⟨π(s) ⊗ b(s)⟩, approximate ⟨π(s) π(s)^T ⊗ A(s)⟩ and ⟨π(s) ⊗ b(s)⟩ with Gauss quadrature?
NO! Equivalent to [A(J_m)]_m ⇒ same answer.
NOTE! Both are equal for linear A(s) and “low-degree” polynomial b(s).
Computing the Galerkin solution

The i,j block of ⟨π(s) π(s)^T ⊗ A(s)⟩ is ⟨A(s) π_i(s) π_j(s)⟩.

IDEA: use an M > m point quadrature. If A(s) is a polynomial of degree d, then if

M > (m + m + d) / 2    (“M >” not precise)

the solution will be exact. If A(s) is an analytic function with a rapidly converging expansion, a large M will be close.
Numerically integrated Galerkin

The i,j block of ⟨π(s) π(s)^T ⊗ A(s)⟩ is ⟨A(s) π_i(s) π_j(s)⟩.

Integrate each block with an M-point quadrature. After much munging with quadrature rules:

⟨π(s) π(s)^T ⊗ A(s)⟩_M = (Q ⊗ I) A(Λ) (Q ⊗ I)^T

where Q is m × M with orthogonal rows (weighted rows of J’s eigenvectors), and A(Λ) = diag(A(λ_1), ..., A(λ_M)).

All we need is a function for A(·).
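The factorization can be verified directly in a few lines (numpy sketch; the linear A(s), ε = 0.5, and the sizes are assumptions): with Q_{ij} = √(ω_j) π_i(λ_j), the quadrature-assembled Galerkin matrix equals (Q ⊗ I) A(Λ) (Q ⊗ I)^T, and Q has orthonormal rows when M ≥ N.

```python
import numpy as np
from numpy.polynomial import legendre

# Sizes and the linear A(s) are illustrative assumptions
eps, N, M, n = 0.5, 6, 10, 2
A = lambda s: np.array([[1 + eps, s], [s, 1.0]])

lam, w = legendre.leggauss(M)
w = w / 2.0
Pi = np.array([np.sqrt(2*i + 1) * legendre.legval(lam, np.eye(N)[i])
               for i in range(N)])        # Pi[i, j] = pi_i(lambda_j)
Q = Pi * np.sqrt(w)                       # Q is N x M; Q Q^T = I when M >= N

QI = np.kron(Q, np.eye(n))
ALam = np.zeros((M * n, M * n))           # block diagonal A(lambda_1), ..., A(lambda_M)
for j in range(M):
    ALam[j*n:(j+1)*n, j*n:(j+1)*n] = A(lam[j])
G_fact = QI @ ALam @ QI.T                 # (Q (x) I) A(Lambda) (Q (x) I)^T

# direct quadrature assembly of <pi pi^T (x) A> for comparison
G_direct = sum(w[j] * np.kron(np.outer(Pi[:, j], Pi[:, j]), A(lam[j]))
               for j in range(M))
```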
Numerical Galerkin factorization

⟨π(s) π(s)^T ⊗ A(s)⟩_M = (Q ⊗ I) A(Λ) (Q ⊗ I)^T

Provides:
- a computable matrix-vector product!
- eigenvalue bounds on A(s)
- preconditioning insights
- a computable residual
Parameterized Matrix Package

PMPACK: a Matlab package for Parameterized Matrix Problems.
https://github.com/paulcon/pmpack

Implements univariate and multivariate Galerkin and pseudo-spectral methods.

- Many demos
- Residual error estimates
- Uncertainty quantification helpers
- Arbitrary polynomial bases (anisotropic)
- Simple interface
- Many parameter types
Where is this going?
Beyond spectral methods!

MapReduce and Surrogate Models

A surrogate model is a function that reproduces the output of a simulation and predicts its output at new parameter values.

Extraction builds the surrogate from the database (s1 -> f1, s2 -> f2, ..., sk -> fk), and interpolation predicts the new samples (sa -> fa, sb -> fb, sc -> fc). The surrogate runs on just one machine; the database and the new samples live on the MapReduce cluster.
David Gleich (Sandia) 5/5/2011 13/18
Where is this going?
Parameterized Lanczos!

A(s) V_k(s) = V_{k+1}(s) T_{k,k+1}    (T is constant!)

The matrix T_{k,k} is the first k terms of the Jacobi matrix for the weight b(s)^T A(s) b(s), where b(s) is the first Lanczos vector.

- uses Chebfun for one-parameter problems
- multivariate methods using Monte Carlo
Summary
Look at problems in uncertainty quantification
as parameterized matrices
Extended the theory of spectral methods to
the parameterized matrix case.
Developed software for spectral methods for
parameterized matrices.
Papers

Constantine, Gleich, Iaccarino. Spectral Methods for Parametrized Matrix Problems. SIMAX, 2010.
Constantine, Gleich, Iaccarino. A Factorization of the Spectral Galerkin System for Parameterized Matrix Equations: Derivation and Applications. SISC, to appear.
Constantine, Gleich. Random Alpha PageRank.
Internet Mathematics, 2010.
Code
https://github.com/paulcon/pmpack