Modeling perceptual similarity and shift invariance in deep networks - NAVER Engineering
Abstract: While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on ImageNet classification have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new dataset of human perceptual similarity judgments. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by large margins on our dataset. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.
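To make the "simple, shallow function" point concrete, here is a minimal sketch of PSNR in Python with NumPy. The function name `psnr`, the 8x8 test image, and the two distortions are illustrative assumptions, not taken from the talk. Both distortions have the same mean squared error, so PSNR scores them identically even though a uniform brightness shift and a checkerboard texture look very different.

```python
import numpy as np

def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    distorted = np.asarray(distorted, dtype=np.float64)
    mse = np.mean((reference - distorted) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Hypothetical 8x8 image and two distortions with equal per-pixel magnitude.
rng = np.random.default_rng(0)
image = rng.uniform(50.0, 200.0, size=(8, 8))
checker = (np.indices((8, 8)).sum(axis=0) % 2) * 2.0 - 1.0  # +1 / -1 pattern
brighter = image + 10.0            # uniform brightness shift
textured = image + 10.0 * checker  # high-frequency checkerboard

print(psnr(image, brighter), psnr(image, textured))
```

PSNR depends only on the per-pixel error energy, which is one reason it can disagree with human judgments.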
Despite their strong transfer performance, deep convolutional representations surprisingly lack a basic low-level property -- shift-invariance, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, simply inserting this module into deep networks degrades performance; as a result, it is seldom used today. We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling and strided-convolution. We observe increased accuracy in ImageNet classification, across several commonly-used architectures, such as ResNet, DenseNet, and MobileNet, indicating effective regularization. Furthermore, we observe better generalization, in terms of stability and robustness to input corruptions. Our results demonstrate that this classical signal processing technique has been undeservingly overlooked in modern deep networks.
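The classical fix mentioned above (low-pass filter, then downsample) can be sketched in one dimension with NumPy. This is only the textbook signal-processing step, not the way the talk integrates it into a network; the function names, the [1, 2, 1]/4 kernel, and the circular padding are assumptions made for the illustration.

```python
import numpy as np

def subsample(x, stride=2):
    """Naive downsampling: keep every `stride`-th sample."""
    return x[::stride]

def blur_then_subsample(x, stride=2):
    """Low-pass filter with a [1, 2, 1]/4 kernel, then downsample."""
    kernel = np.array([1.0, 2.0, 1.0]) / 4.0
    padded = np.pad(x, 1, mode="wrap")  # circular padding keeps the length
    blurred = np.convolve(padded, kernel, mode="valid")
    return blurred[::stride]

# A signal at the highest representable frequency, and a one-sample shift of it.
x = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
x_shifted = np.roll(x, 1)

print(subsample(x), subsample(x_shifted))                      # outputs differ
print(blur_then_subsample(x), blur_then_subsample(x_shifted))  # outputs agree
```

Without the filter, a one-sample shift flips the downsampled output from all zeros to all ones; with the filter, both versions downsample to the same constant signal.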
Depth estimation: do we need to throw old things away? - NAVER Engineering
Talk outline: CNNs for depth estimation based on the human visual system, and CNNs inspired by conventional methods
Case1: Cross-channel stereo matching
Case2: Depth from light field
Case3: Multiview stereo
Conclusion
Yurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAI - Lviv Startup Club
AI & BigData Online Day 2021
Website - https://aiconf.com.ua/
Youtube - https://www.youtube.com/startuplviv
FB - https://www.facebook.com/aiconf
Estimating Human Pose from Occluded Images (ACCV 2009) - Jia-Bin Huang
We address the problem of recovering 3D human pose from single 2D images, in which the pose estimation problem is formulated as a direct nonlinear regression from image observation to 3D joint positions. One key issue that has not been addressed in the literature is how to estimate 3D pose when humans in the scenes are partially or heavily occluded. When occlusions occur, features extracted from image observations (e.g., silhouettes-based shape features, histogram of oriented gradient, etc.) are seriously corrupted, and consequently the regressor (trained on un-occluded images) is unable to estimate pose states correctly. In this paper, we present a method that is capable of handling occlusions using sparse signal representations, in which each test sample is represented as a compact linear combination of training samples. The sparsest solution can then be efficiently obtained by solving a convex optimization problem with certain norms (such as l1-norm). The corrupted test image can be recovered with a sparse linear combination of un-occluded training images which can then be used for estimating human pose correctly (as if no occlusions exist). We also show that the proposed approach implicitly performs relevant feature selection with un-occluded test images. Experimental results on synthetic and real data sets bear out our theory that with sparse representation 3D human pose can be robustly estimated when humans are partially or heavily occluded in the scenes.
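As a rough illustration of representing a test sample as a sparse linear combination of training samples via an l1-regularised convex problem, the sketch below uses iterative soft-thresholding (ISTA). ISTA is a swapped-in generic solver, not the method of the paper; the random matrix `A` standing in for training samples, the weights `w_true`, and the values of `lam` and `n_iter` are all hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v towards zero by t (proximal map of t*||.||_1)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam=0.05, n_iter=500):
    """Minimise 0.5*||A w - y||^2 + lam*||w||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w

def objective(A, y, w, lam=0.05):
    """The l1-regularised least-squares objective that ista() minimises."""
    return 0.5 * np.sum((A @ w - y) ** 2) + lam * np.sum(np.abs(w))

# Hypothetical data: 60 "training samples" (columns) with 30 features each.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 60))
A /= np.linalg.norm(A, axis=0)            # unit-norm columns
w_true = np.zeros(60)
w_true[[5, 17, 42]] = [1.5, -2.0, 1.0]    # the test sample uses three of them
y = A @ w_true

w_hat = ista(A, y)
print(np.flatnonzero(np.abs(w_hat) > 0.1))  # indices carrying most of the weight
```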
“Automatically learning multiple levels of representations of the underlying distribution of the data to be modelled”
Deep learning algorithms have shown superior learning and classification performance in areas such as transfer learning, speech and handwritten character recognition, and face recognition, among others.
(I have referred to many articles and experimental results provided by Stanford University)
Deep Learning Using TensorFlow | TensorFlow Tutorial | AI & Deep Learning Tra... - Edureka!
This Edureka "Deep Learning Using TensorFlow" video will help you understand how to use TensorFlow in Deep Learning. This tutorial discusses Artificial Intelligence, Machine Learning and its limitations, how Deep Learning overcame those limitations, real-life applications of Deep Learning, and how to use TensorFlow for Deep Learning. Below are the topics covered in this tutorial:
1. Why Artificial Intelligence?
2. What Is Artificial Intelligence?
3. Subsets Of Artificial Intelligence
4. What Is Machine Learning?
5. Limitations Of Machine Learning
6. What Is Deep Learning And How It Works?
7. Single Layer Perceptron
8. Limitations Of Single Layer Perceptron
9. Multi Layer Perceptron
"You Can Do It" by Louis Monier (Altavista Co-Founder & CTO) & Gregory Renard (CTO & Artificial Intelligence Lead Architect at Xbrain) for Deep Learning keynote #0 at Holberton School (http://www.meetup.com/Holberton-School/events/228364522/)
If you want to attend similar keynotes for free, check out http://www.meetup.com/Holberton-School/
Deep Neural Networks that talk (Back)… with style - Roelof Pieters
Talk at Nuclai 2016 in Vienna
Can neural networks sing, dance, remix and rhyme? And most importantly, can they talk back? This talk will introduce Deep Neural Nets with textual and auditory understanding and some of the recent breakthroughs made in these fields. It will then show some of the exciting possibilities these technologies hold for "creative" use and explorations of human-machine interaction, where the main theorem is "augmentation, not automation".
http://events.nucl.ai/track/cognitive/#deep-neural-networks-that-talk-back-with-style
Deep Learning for Information Retrieval: Models, Progress, & Opportunities - Matthew Lease
Talk given at the 8th Forum for Information Retrieval Evaluation (FIRE, http://fire.irsi.res.in/fire/2016/), December 10, 2016, and at the Qatar Computing Research Institute (QCRI), December 15, 2016.
Practical and Robust Stenciled Shadow Volumes for Hardware-Accelerated Rendering - Mark Kilgard
Twenty-five years ago, Crow published the shadow volume approach for determining shadowed regions in a scene. A decade ago, Heidmann described a hardware-accelerated stencil buffer-based shadow volume algorithm. However, hardware-accelerated stenciled shadow volume techniques have not been widely adopted by 3D games and applications due in large part to the lack of robustness of described techniques. This situation persists despite widely available hardware support. Specifically what has been lacking is a technique that robustly handles various "hard" situations created by near or far plane clipping of shadow volumes. We describe a robust, artifact-free technique for hardware-accelerated rendering of stenciled shadow volumes. Assuming existing hardware, we resolve the issues otherwise caused by shadow volume near and far plane clipping through a combination of (1) placing the conventional far clip plane “at infinity”, (2) rasterization with infinite shadow volume polygons via homogeneous coordinates, and (3) adopting a zfail stencil-testing scheme. Depth clamping, a new rasterization feature provided by NVIDIA's GeForce3 & GeForce4 Ti GPUs, preserves existing depth precision by not requiring the far plane to be placed at infinity. We also propose two-sided stencil testing to improve the efficiency of rendering stenciled shadow volumes.
March 12, 2002.
This was submitted to the SIGGRAPH 2002 papers committee but was rejected.
These slides summarize the main trends in deep neural networks for video encoding, including single-frame models, spatiotemporal convolutions, long-term sequence modeling with RNNs, and their combination with optical flow.
Automatic Differentiation and SciML in Reality: What can go wrong, and what t... - Chris Rackauckas
How does automatic differentiation work, what happens when you apply it to equation solvers, and how can it go wrong? This talk is all about the details of how scientific machine learning (SciML) works. It goes into detail as to how neural networks are trained in the context of equation solvers, along with the numerical issues that can arise in the differentiation processes.
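A minimal sketch of forward-mode automatic differentiation with dual numbers, applied to a toy equation solver, may help fix ideas. It is written in Python for illustration and is not the SciML implementation; the `Dual` class, the explicit Euler solver, and the test problem du/dt = -p*u are all assumptions of this example.

```python
from dataclasses import dataclass
import math

@dataclass
class Dual:
    """A value together with its derivative with respect to one input."""
    value: float
    deriv: float

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)

def derivative(f, x):
    """Derivative of f at x by seeding the input with derivative 1."""
    return f(Dual(x, 1.0)).deriv

def euler_solve(p, u0=1.0, h=0.1, n_steps=10):
    """Explicit Euler for du/dt = -p*u up to t = h * n_steps."""
    u = u0
    for _ in range(n_steps):
        u = u + h * (-1.0 * p * u)
    return u

d_solver = derivative(euler_solve, 1.0)  # derivative of the discrete solver
d_exact = -1.0 * math.exp(-1.0)          # d/dp of exp(-p*t) at p = 1, t = 1
print(d_solver, d_exact)
```

The two numbers differ: differentiating the solver gives the exact derivative of the discretised solution, which carries the solver's own discretisation error. That gap is one simple instance of the kind of numerical issue the talk refers to.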
https://sciml.ai/
Mathematical Modeling for Practical Problems - Liwei Ren任力偉
Mathematical modeling is an important step in developing many advanced technologies in domains such as network security and data mining. This lecture introduces a process that the speaker has distilled from his past practice of mathematical modeling and algorithmic solutions in the IT industry as an applied mathematician, algorithm specialist, software engineer, and entrepreneur. A practical problem from a DLP system is used as an example of creating math models and providing algorithmic solutions.
Performance of Matching Algorithms for Signal Approximation - iosrjce
IOSR Journal of Electronics and Communication Engineering(IOSR-JECE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of electronics and communication engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in electronics and communication engineering. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
An introduction to Deep Learning (DL) concepts, such as neural networks, back propagation, activation functions, CNNs, RNNs (if time permits), and the CLT/AUT/fixed-point theorems, along with code samples in Java and TensorFlow.
A Compressed Sensing Approach to Image Reconstruction - ijsrd.com
Compressed sensing is a technique for reconstructing a signal from fewer samples than the Shannon-Nyquist sampling theorem would require. It uses far fewer random measurements than were traditionally needed to recover a signal or image. The technique is motivated by the fact that most of the information is carried by a few of the signal's coefficients, so there is little reason to acquire all the data if most of it is then thrown away unused. A number of review articles and research papers have been published in this area, but with practitioners' growing interest in this emerging field it is worth taking a fresh look at the method and its implementations. The main aim of this paper is to review compressive sensing theory and its applications.
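The idea of recovering a sparse signal from a small number of random measurements can be sketched with orthogonal matching pursuit (OMP), one common greedy recovery algorithm; the paper under review may use a different one. The function name `omp`, the Gaussian measurement matrix, and the problem sizes are illustrative assumptions.

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Greedy sparse recovery: approximate y by A @ x with at most n_nonzero atoms."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit of y on the columns selected so far.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

# Hypothetical setup: a length-100 signal with 4 non-zeros, 40 random measurements.
rng = np.random.default_rng(1)
n, m, k = 100, 40, 4
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)  # random measurement matrix
y = A @ x_true                                 # m < n measurements

x_hat = omp(A, y, n_nonzero=k)
print(np.linalg.norm(x_hat - x_true))
```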
Generalizing Scientific Machine Learning and Differentiable Simulation Beyond... - Chris Rackauckas
The combination of scientific models into deep learning structures, commonly referred to as scientific machine learning (SciML), has made great strides in the last few years in incorporating models such as ODEs and PDEs into deep learning through differentiable simulation. However, the vast space of scientific simulation also includes models like jump diffusions, agent-based models, and more. Is SciML constrained to the simple continuous cases or is there a way to generalize to more advanced model forms? This talk will dive into the mathematical aspects of generalizing differentiable simulation to discuss cases like chaotic simulations, differentiating stochastic simulations like particle filters and agent-based models, and solving inverse problems of Bayesian inverse problems (i.e. differentiation of Markov Chain Monte Carlo methods). We will then discuss the evolving numerical stability issues, implementation issues, and other interesting mathematical tidbits that are coming to light as these differentiable programming capabilities are being adopted.
Bio: Dr. Chris Rackauckas is the VP of Modeling and Simulation at JuliaHub, the Director of Scientific Research at Pumas-AI, Co-PI of the Julia Lab at MIT, and the lead developer of the SciML Open Source Software Organization. His work in mechanistic machine learning is credited with the 15,000x acceleration of NASA Launch Services simulations and recently demonstrated a 60x-570x acceleration over Modelica tools in HVAC simulation, earning Chris the US Air Force Artificial Intelligence Accelerator Scientific Excellence Award. See more at https://chrisrackauckas.com/. He is the lead developer of the Pumas project and has received a top presentation award at every ACoP in the last 3 years for improving methods for uncertainty quantification, automated GPU acceleration of nonlinear mixed effects modeling (NLME), and machine learning assisted construction of NLME models with DeepNLME. For these achievements, Chris received the Emerging Scientist award from ISoP.
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor... - Levi Shapiro
Letter from the Congress of the United States regarding Anti-Semitism sent June 3rd to MIT President Sally Kornbluth, MIT Corp Chair, Mark Gorenberg
Dear Dr. Kornbluth and Mr. Gorenberg,
The US House of Representatives is deeply concerned by ongoing and pervasive acts of antisemitic
harassment and intimidation at the Massachusetts Institute of Technology (MIT). Failing to act decisively to ensure a safe learning environment for all students would be a grave dereliction of your responsibilities as President of MIT and Chair of the MIT Corporation.
This Congress will not stand idly by and allow an environment hostile to Jewish students to persist. The House believes that your institution is in violation of Title VI of the Civil Rights Act, and the inability or
unwillingness to rectify this violation through action requires accountability.
Postsecondary education is a unique opportunity for students to learn and have their ideas and beliefs challenged. However, universities receiving hundreds of millions of federal funds annually have denied
students that opportunity and have been hijacked to become venues for the promotion of terrorism, antisemitic harassment and intimidation, unlawful encampments, and in some cases, assaults and riots.
The House of Representatives will not countenance the use of federal funds to indoctrinate students into hateful, antisemitic, anti-American supporters of terrorism. Investigations into campus antisemitism by the Committee on Education and the Workforce and the Committee on Ways and Means have been expanded into a Congress-wide probe across all relevant jurisdictions to address this national crisis. The undersigned Committees will conduct oversight into the use of federal funds at MIT and its learning environment under authorities granted to each Committee.
• The Committee on Education and the Workforce has been investigating your institution since December 7, 2023. The Committee has broad jurisdiction over postsecondary education, including its compliance with Title VI of the Civil Rights Act, campus safety concerns over disruptions to the learning environment, and the awarding of federal student aid under the Higher Education Act.
• The Committee on Oversight and Accountability is investigating the sources of funding and other support flowing to groups espousing pro-Hamas propaganda and engaged in antisemitic harassment and intimidation of students. The Committee on Oversight and Accountability is the principal oversight committee of the US House of Representatives and has broad authority to investigate “any matter” at “any time” under House Rule X.
• The Committee on Ways and Means has been investigating several universities since November 15, 2023, when the Committee held a hearing entitled From Ivory Towers to Dark Corners: Investigating the Nexus Between Antisemitism, Tax-Exempt Universities, and Terror Financing. The Committee followed the hearing with letters to those institutions on January 10, 202
Francesca Gottschalk - How can education support child empowerment.pptx - EduSkills OECD
Francesca Gottschalk from the OECD’s Centre for Educational Research and Innovation presents at the Ask an Expert Webinar: How can education support child empowerment?
Model Attribute Check Company Auto Property - Celine George
In Odoo, the multi-company feature allows you to manage multiple companies within a single Odoo database instance. Each company can have its own configurations while still sharing common resources such as products, customers, and suppliers.
2024.06.01 Introducing a competency framework for language learning materials ... - Sandy Millin
http://sandymillin.wordpress.com/iateflwebinar2024
Published classroom materials form the basis of syllabuses, drive teacher professional development, and have a potentially huge influence on learners, teachers and education systems. All teachers also create their own materials, whether a few sentences on a blackboard, a highly-structured fully-realised online course, or anything in between. Despite this, the knowledge and skills needed to create effective language learning materials are rarely part of teacher training, and are mostly learnt by trial and error.
Knowledge and skills frameworks, generally called competency frameworks, for ELT teachers, trainers and managers have existed for a few years now. However, until I created one for my MA dissertation, there wasn’t one drawing together what we need to know and do to be able to effectively produce language learning materials.
This webinar will introduce you to my framework, highlighting the key competencies I identified from my research. It will also show how anybody involved in language teaching (any language, not just English!), teacher training, managing schools or developing language learning materials can benefit from using the framework.
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In... - Dr. Vinod Kumar Kanvaria
Exploiting Artificial Intelligence for Empowering Researchers and Faculty,
International FDP on Fundamentals of Research in Social Sciences
at Integral University, Lucknow, 06.06.2024
By Dr. Vinod Kumar Kanvaria
A workshop hosted by the South African Journal of Science aimed at postgraduate students and early career researchers with little or no experience in writing and publishing journal articles.
A Strategic Approach: GenAI in Education - Peter Windle
Artificial Intelligence (AI) technologies such as Generative AI, Image Generators and Large Language Models have had a dramatic impact on teaching, learning and assessment over the past 18 months. The most immediate threat AI posed was to Academic Integrity with Higher Education Institutes (HEIs) focusing their efforts on combating the use of GenAI in assessment. Guidelines were developed for staff and students, policies put in place too. Innovative educators have forged paths in the use of Generative AI for teaching, learning and assessments leading to pockets of transformation springing up across HEIs, often with little or no top-down guidance, support or direction.
This Gasta posits a strategic approach to integrating AI into HEIs to prepare staff, students and the curriculum for an evolving world and workplace. We will highlight the advantages of working with these technologies beyond the realm of teaching, learning and assessment by considering prompt engineering skills, industry impact, curriculum changes, and the need for staff upskilling. In contrast, not engaging strategically with Generative AI poses risks, including falling behind peers, missed opportunities and failing to ensure our graduates remain employable. The rapid evolution of AI technologies necessitates a proactive and strategic approach if we are to remain relevant.
How to Make a Field invisible in Odoo 17 - Celine George
In Odoo, fields can be hidden. This is commonly done with the “invisible” attribute in the field definition. This slide shows how to make a field invisible in Odoo 17.
A review of the growth of the Israel Genealogy Research Association Database Collection for the last 12 months. Our collection has now passed the 3 million mark and is still growing. See which archives have contributed the most. See the different types of records we have, and which years have had records added. You can also see what we have for the future.
Acetabularia Information For Class 9 .docx - vaibhavrinwa19
Acetabularia acetabulum is a single-celled green alga that in its vegetative state is morphologically differentiated into a basal rhizoid and an axially elongated stalk, which bears whorls of branching hairs. The single diploid nucleus resides in the rhizoid.
1. Sparse & Redundant Representation Modeling of Images: Theory and Applications. Michael Elad, The Computer Science Department, The Technion – Israel Institute of Technology, Haifa 32000, Israel. Seventh International Conference on Curves and Surfaces, Avignon, France, June 24-30, 2010. This research was supported by the European Community's FP7-FET program SMALL under grant agreement no. 225913.
2. This Talk Gives an Overview On … A decade of tremendous progress in the field of Sparse and Redundant Representations: Numerical Problems, Theory, Applications. Sparse and Redundant Representation Modeling of Signals – Theory and Applications By: Michael Elad
3.
4. Today we will show that: When used in image processing, they lead to state-of-the-art results.
5. Part I: Denoising by Sparse & Redundant Representations
6.
7. Many Considered Directions: Partial differential equations, Statistical estimators, Adaptive filters, Inverse problems & regularization, Wavelets, Example-based techniques, Sparse representations, …
8.
9. Clearly, the wisdom in such an approach is within the choice of the prior – modeling the images of interest.
12. The Evolution of G(x): During the past several decades we have made all sorts of guesses about the prior G(x) for images: … Total-Variation, Wavelet Sparsity, Sparse & Redundant
13.
14.
15.
16. Rich: A general model: the obtained signals are a union of many low-dimensional Gaussians.
17. Familiar: We have been using this model in other contexts for a while now (wavelet, JPEG, …).
18. Sparse & Redundant Rep. Modeling? As p → 0 we get a count of the non-zeros in the vector. Our signal model is thus:
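The statement that the p-th power of the l_p measure tends to a count of non-zeros as p → 0 can be checked numerically. The following is a minimal sketch, assuming NumPy; the function name `lp_power` and the test vector are hypothetical and not taken from the talk.

```python
import numpy as np

def lp_power(v, p):
    """Return sum_k |v_k|^p, i.e. the l_p measure raised to the p-th power."""
    v = np.asarray(v, dtype=float)
    nonzero = v[v != 0]  # zero entries contribute 0 for every p > 0
    return float(np.sum(np.abs(nonzero) ** p))

# Hypothetical vector with exactly three non-zero entries.
v = [0.0, 3.0, 0.0, -0.5, 2.0]

# As p shrinks, every non-zero term |v_k|^p approaches 1,
# so the sum approaches the number of non-zeros.
for p in (2.0, 1.0, 0.5, 0.1, 0.001):
    print(p, lp_power(v, p))
```

For this vector the printed values move from the energy (p = 2) through the sum of magnitudes (p = 1) towards 3, the number of non-zeros.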
19.
20. The vector α is the representation (sparse/redundant) of the desired signal x.
21. The core idea: while few (L out of K) atoms can be merged to form the true signal, the noise cannot be fitted well. Thus, we obtain an effective projection of the noise onto a very low-dimensional space, which yields a denoising effect.
22.
23. Practical Problems: What dictionary D should we use, such that all this leads to effective denoising? Will all this work in applications?
24. To Summarize So Far … Image denoising (and many other problems in image processing) requires a model for the desired image. What do we do? We proposed a model for signals/images based on sparse and redundant representations. Great! No? There are some issues: Theoretical; How to approximate?; What about D? Image Denoising & Beyond Via Learned Dictionaries and Sparse representations By: Michael Elad
25. Part II: Theoretical & Numerical Foundations
26. Let's Start with the Noiseless Problem. Suppose we build a signal by the relation $D\alpha = x$ (with D known). We aim to find the signal's representation: $\hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_0$ s.t. $x = D\alpha$. Why should we necessarily get $\hat{\alpha} = \alpha$? It might happen that eventually $\|\hat{\alpha}\|_0 < \|\alpha\|_0$. This is a question of uniqueness. Sparse and Redundant Signal Representation, and Its Role in Image Processing
27. Matrix “Spark”. Definition*: Given a matrix D, σ = Spark{D} is the smallest number of columns that are linearly dependent [Donoho & E. (‘02)]. Example: Rank = 4, Spark = 3. * In tensor decomposition, Kruskal defined something similar already in 1989.
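The Spark definition can be made concrete with a brute-force search over column subsets. This is only a sketch under stated assumptions: NumPy is used for the rank test, the function name `spark` is hypothetical, and the 4-by-5 matrix below is an illustrative choice with Rank = 4 and Spark = 3 (the slide's own example matrix did not survive extraction). The search is exponential in the number of columns, so it is usable only for tiny matrices.

```python
import itertools
import numpy as np

def spark(D, tol=1e-10):
    """Smallest number of linearly dependent columns of D (brute force)."""
    D = np.asarray(D, dtype=float)
    n_cols = D.shape[1]
    for size in range(1, n_cols + 1):
        for cols in itertools.combinations(range(n_cols), size):
            sub = D[:, list(cols)]
            # A set of `size` columns is dependent iff its rank is below `size`.
            if np.linalg.matrix_rank(sub, tol=tol) < size:
                return size
    # Convention assumed here: if all columns are independent, return n_cols + 1.
    return n_cols + 1

# Illustrative matrix: the last column equals the sum of the first two.
D_example = np.array([
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
], dtype=float)

print(np.linalg.matrix_rank(D_example), spark(D_example))
```

No single column is zero and no two columns are parallel, but columns 1, 2 and 5 are dependent, which is why the rank and the spark differ here.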
28. Uniqueness Rule. Suppose this problem has been solved somehow. Uniqueness: If we found a representation that satisfies $\|\hat{\alpha}\|_0 < \sigma/2$, then necessarily it is unique (the sparsest) [Donoho & E. (‘02)]. This result implies that if the model M generates signals using “sparse enough” α, the solution of the above will find it exactly.
29. Our Goal. This is a combinatorial problem, proven to be NP-Hard! Here is a recipe for solving this problem: (1) Set L=1. (2) Gather all the supports {Si}i of cardinality L; there are (K choose L) such supports. (3) Solve the LS problem for each support. (4) LS error ≤ ε²? If yes, done; if no, set L=L+1 and return to step 2. Assume: K=1000, L=10 (known!), 1 nano-sec per each LS. We shall need ~8e+6 years to solve this problem!
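The "~8e+6 years" estimate follows directly from the number of supports. A short check using only the Python standard library; the variable names are hypothetical, and a year is assumed to be 365.25 days:

```python
import math

K = 1000               # number of atoms in the dictionary
L = 10                 # known cardinality of the support
seconds_per_ls = 1e-9  # 1 nano-second per least-squares problem

num_supports = math.comb(K, L)  # number of candidate supports of size L
total_seconds = num_supports * seconds_per_ls
seconds_per_year = 365.25 * 24 * 3600
years = total_seconds / seconds_per_year

print(f"{num_supports:.3e} supports, about {years:.2e} years")
```

The count of supports is on the order of 10^23, which is what makes exhaustive search hopeless even at one nanosecond per least-squares solve.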
30. Let's Approximate. Greedy methods: build the solution one non-zero element at a time. Relaxation methods: smooth the L0 and use continuous optimization techniques.
41. The Orthogonal MP (OMP) is an improved version that re-evaluates the coefficients by Least-Squares after each round.
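The greedy idea with the least-squares re-evaluation can be sketched as follows. This is a minimal illustration, not the speaker's implementation: it assumes NumPy, a dictionary with unit-norm columns, a fixed number of atoms L as the stopping rule, and hypothetical names (`omp`, `D_demo`, `x_demo`).

```python
import numpy as np

def omp(D, x, L):
    """Orthogonal Matching Pursuit: pick L atoms greedily, refit by LS each round."""
    D = np.asarray(D, dtype=float)
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(L):
        # Greedy step: atom most correlated with the current residual.
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Orthogonal step: re-evaluate all chosen coefficients by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha

# Toy dictionary (assumption): identity atoms plus one extra unit-norm atom.
D_demo = np.hstack([np.eye(4), np.full((4, 1), 0.5)])
x_demo = 2.0 * D_demo[:, 0] - 3.0 * D_demo[:, 2]
alpha_demo = omp(D_demo, x_demo, L=2)
print(alpha_demo)
```

Plain Matching Pursuit would skip the `lstsq` call and only update the coefficient of the newly chosen atom; the refit is what distinguishes the orthogonal variant.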
42.
43. Relaxation Algorithms: Basis Pursuit (a.k.a. LASSO), Dantzig Selector & numerical ways to handle them [1995-today].
45. …
46.
47.
48. The above result corresponds to the worst-case, and as such, it is too pessimistic.
49. Average performance results are available too, showing much better bounds [Donoho (`04)] [Candes et.al. (‘04)] [Tanner et.al. (‘05)] [E. (‘06)] [Tropp et.al. (‘06)] … [Candes et. al. (‘09)].
50.
51. This result is the oracle’s error, multiplied by C·logK.
52. Similar results exist for other pursuit algorithms (Dantzig Selector, Orthogonal Matching Pursuit, CoSaMP, Subspace Pursuit, …).
53. To Summarize So Far … Image denoising (and many other problems in image processing) requires a model for the desired image. What do we do? We proposed a model for signals/images based on sparse and redundant representations. Problems? We have seen that there are approximation methods to find the sparsest solution, and there are theoretical results that guarantee their success. What next? The Dictionary D should be found somehow!
54. Part III: Dictionary Learning: The K-SVD Algorithm
55. What Should D Be? D should be chosen such that it sparsifies the representations. One approach to choose D is from a known set of transforms (Steerable wavelet, Curvelet, Contourlets, Bandlets, Shearlets…). The approach we will take for building D is training it, based on Learning from Image Examples. Our Assumption: Well-behaved images have a sparse representation.
56. Measure of Quality for D (figure labels: D, A, X). Each example is a linear combination of atoms from D. Each example has a sparse representation with no more than L atoms. [Field & Olshausen (‘96)] [Engan et. al. (‘99)] [Lewicki & Sejnowski (‘00)] [Cotter et. al. (‘03)] [Gribonval et. al. (‘04)] [Aharon, E. & Bruckstein (‘04)] [Aharon, E. & Bruckstein (‘05)]
57. K–Means For Clustering. Clustering: An extreme sparse representation. Initialize D; then iterate: Sparse Coding by Nearest Neighbor; Dictionary Update, Column-by-Column, by Mean computation over the relevant examples.
58. The K–SVD Algorithm – General [Aharon, E. & Bruckstein (‘04,‘05)]. Initialize D; then iterate: Sparse Coding using Matching Pursuit; Dictionary Update, Column-by-Column, by SVD computation over the relevant examples.
59. K–SVD: Sparse Coding Stage. D is known! For the jth item we solve the sparse coding problem by a pursuit algorithm.
60. K–SVD: Dictionary Update Stage. We fix all of A and D apart from the kth column, and seek both dk and the kth column in A to better fit the residual. We refer only to the examples that use the column dk. The resulting problem is solved by SVD.
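The dictionary update stage for a single atom can be sketched as a rank-1 approximation of a restricted residual. This is an illustration under assumptions, not the published code: NumPy is used, examples are stored as columns of X, D holds one atom per column, and A holds one row of coefficients per atom (so the coefficients tied to dk form row k of A here; the slide's layout may be transposed). All names are hypothetical.

```python
import numpy as np

def ksvd_update_atom(D, A, X, k):
    """Update atom k and its coefficients by a rank-1 SVD fit of the residual."""
    D = D.copy()
    A = A.copy()
    users = np.flatnonzero(A[k, :])  # examples whose representation uses atom k
    if users.size == 0:
        return D, A  # nothing to fit; leave the atom unchanged
    # Residual of the relevant examples with atom k's contribution removed.
    E = X[:, users] - D @ A[:, users] + np.outer(D[:, k], A[k, users])
    # Best rank-1 approximation of E gives the new atom and its coefficients.
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]
    A[k, users] = s[0] * Vt[0, :]
    return D, A

# Small synthetic demo (assumption): random dictionary and sparse codes.
rng = np.random.default_rng(0)
D0 = rng.standard_normal((8, 12))
D0 /= np.linalg.norm(D0, axis=0)
A0 = rng.standard_normal((12, 50)) * (rng.random((12, 50)) < 0.2)
X0 = D0 @ A0 + 0.1 * rng.standard_normal((8, 50))

err_before = np.linalg.norm(X0 - D0 @ A0)
D1, A1 = ksvd_update_atom(D0, A0, X0, k=0)
err_after = np.linalg.norm(X0 - D1 @ A1)
print(err_before, err_after)
```

Restricting the update to the examples that already use the atom keeps the supports fixed, so the sparsity of the representations is not increased by the update, and the fit error cannot increase.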
61. To Summarize So Far … Image denoising (and many other problems in image processing) requires a model for the desired image. What do we do? We proposed a model for signals/images based on sparse and redundant representations. Problems? We have seen approximation methods that find the sparsest solution, and theoretical results that guarantee their success. We also saw a way to learn D. What next? Will it all work in applications?
62. Part IV: Back to Denoising … and Beyond – Combining it All
65. The solution: Force shift-invariant sparsity - on each patch of size N-by-N (N=8) in the image, including overlaps.
69. Image of size 1000² pixels: ~10⁶ examples to use – more than enough.
70. This works much better!
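The "~10⁶ examples" figure can be reproduced by counting fully overlapping patches. A small sketch in plain Python; the function name is hypothetical, and it assumes patches are taken at every pixel shift that keeps the patch inside the image:

```python
def count_overlapping_patches(height, width, n):
    """Number of n-by-n patches at all shifts that fit inside the image."""
    return (height - n + 1) * (width - n + 1)

num_patches = count_overlapping_patches(1000, 1000, 8)
print(num_patches)  # 986049
```

So a single 1000-by-1000 image yields close to a million 8-by-8 training patches.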
71. K-SVD Image Denoising. Starting from x=y and an initial D, the algorithm alternates: with x and D known, compute αij per patch using the matching pursuit; with x and αij known, compute D to minimize … using SVD (K-SVD), updating one column at a time; with D and αij known, compute x by …, which is a simple averaging of shifted patches. Complexity of this algorithm: O(N²×K×L×Iterations) per pixel. For N=8, L=1, K=256, and 10 iterations, we need 160,000 (!!) operations per pixel.
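The per-pixel cost quoted above is a direct evaluation of N²×K×L×Iterations with the stated values. A quick check in plain Python (variable names are hypothetical):

```python
N = 8            # patch side length, so each patch has N*N pixels
K = 256          # number of atoms in the dictionary
L = 1            # atoms used per patch
iterations = 10  # number of iterations

ops_per_pixel = N * N * K * L * iterations
print(ops_per_pixel)  # 163840
```

The exact product is 163,840, which the slide rounds to 160,000.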
72.
73. In a recent work that extended this algorithm to use joint sparse representation on the patches, the best published denoising performance is obtained [Mairal, Bach, Ponce, Sapiro & Zisserman (‘09)].
74.
75. The solution with the above algorithm is simple – consider 3D patches or 8-by-8 with the 3 color layers, and the dictionary will detect the proper relations. Figure: Original, Noisy (20.43dB), Result (30.75dB).
76. Denoising (Color) [Mairal, E. & Sapiro (‘08)]. Our experiments lead to state-of-the-art denoising results, giving ~1dB better results compared to [Mcauley et. al. (‘06)] which implements a learned MRF model (Field-of-Experts). Figure: Original, Noisy (12.77dB), Result (29.87dB).
77. Video Denoising [Protter & E. (‘09)]. When turning to handle video, one could improve over the previous scheme in three important ways: (i) propagate the dictionary from one frame to another, and thus reduce the number of iterations; (ii) use 3D patches that handle the motion implicitly; and (iii) motion estimation and compensation can and should be avoided [Buades, Coll & Morel (‘06)]. Our experiments lead to state-of-the-art video denoising results, giving ~0.5dB better results on average compared to [Buades, Coll & Morel (‘05)] and comparable to [Rusanovskyy, Dabov, & Egiazarian (‘06)]. Figures: Original, Noisy (σ=25), Denoised (PSNR=27.62); Original, Noisy (σ=15), Denoised (PSNR=29.98).
78.
79. In medicine, CT projections are obtained by X-ray, and it typically requires a high dosage of radiation in order to obtain a good quality reconstruction.
81. Armed with sparse and redundant representation modeling, we can denoise the data and the final reconstruction … enabling CT with lower dosage. Figures: Denoising of the sinogram and post-processing (another denoising stage) of the reconstruction, PSNR=26.06dB; FBP result with low dosage (one fifth), PSNR=22.31dB.
86. If α0 was sparse enough, it will be the solution of the above problem! Thus, computing Dα0 recovers x perfectly.
91. If α0 was sparse enough, it will be the sparsest solution of the new system, thus, computing Dα0 recovers x perfectly.
92. Compressed sensing focuses on conditions for this to happen, guaranteeing such recovery.
93. Inpainting [Mairal, E. & Sapiro (‘08)]. Our experiments lead to state-of-the-art inpainting results. Figures: Original, 80% missing, Result (two examples).
94. Inpainting [Mairal, E. & Sapiro (‘08)]. The same can be done for video, very much like the denoising treatment: (i) 3D patches, (ii) no need to compute the dictionary from scratch for each frame, and (iii) no need for explicit motion estimation. Figure: Original, 80% missing, Result.
95.
96. Generalizing the inpainting scheme to handle demosaicing is tricky because of the possibility to learn the mosaic pattern within the dictionary.
97. In order to avoid “over-fitting”, we handle the demosaicing problem while forcing strong sparsity and applying only a few iterations.
100. By adapting to the image-content (PCA/K-SVD), better results could be obtained.
101. For these techniques to operate well, training dictionaries locally (per patch) using a training set of images is required.
102. In PCA, only the (quantized) coefficients are stored, whereas the K-SVD requires storage of the indices as well.
103. Geometric alignment of the image is very helpful and should be done [Goldenberg, Kimmel, & E. (‘05)].
104. Image Compression. Training set (2500 images): detect main features and warp the images to a common reference (20 parameters). On the training set: divide the image into disjoint 15-by-15 patches; for each, compute mean and dictionary; per each patch find the operating parameters (number of atoms L, quantization Q). On the test image: warp, remove the mean from each patch, sparse code using L atoms, apply Q, and dewarp.
105. Image Compression Results. Results for 820 Bytes per each file (columns: JPEG, JPEG-2000, Local-PCA, K-SVD; the Original is shown for reference). Image 1: 11.99, 10.49, 8.81, 5.56. Image 2: 10.83, 8.92, 7.89, 4.82. Image 3: 10.93, 8.71, 8.61, 5.58.
106. Image Compression Results. Results for 550 Bytes per each file (columns: JPEG, JPEG-2000, Local-PCA, K-SVD). Image 1: 15.81, 13.89, 10.66, 6.60. Image 2: 14.67, 12.41, 9.44, 5.49. Image 3: 15.30, 12.57, 10.27, 6.36.
107. Image Compression Results. Results for 400 Bytes per each file (columns: JPEG, JPEG-2000, Local-PCA, K-SVD). Image 1: ?, 18.62, 12.30, 7.61. Image 2: ?, 16.12, 11.38, 6.31. Image 3: ?, 16.81, 12.54, 7.20.
108. Deblocking the Results [Bryt and E. (`09)]. 550 bytes K-SVD results with and without deblocking: K-SVD (6.60), Deblock (6.24); K-SVD (11.67), Deblock (11.32); K-SVD (6.45), Deblock (6.03); K-SVD (5.49), Deblock (5.27).
109.
110. Image scale-up using bicubic interpolation is far from being satisfactory for this task.
111. Recently, a sparse and redundant representation technique was proposed [Yang, Wright, Huang, and Ma (’08)] for solving this problem, by training coupled dictionaries for the low- and high-res. images.
112. We extended and improved their algorithms and results. The training image: 717×717 pixels, providing a set of 54,289 training patch-pairs. Figures: Given Image; Bicubic interpolation, PSNR=14.68dB; SR Result, PSNR=16.95dB.
113. Super-Resolution – Results (2). Figures: Given image; Scaled-Up (factor 2:1) using the proposed algorithm, PSNR=29.32dB (3.32dB improvement over bicubic).
114. Super-Resolution – Results (2). Figures: The Original, Bicubic Interpolation, SR result.
115. Super-Resolution – Results (2). Figures: The Original, Bicubic Interpolation, SR result.
116. To Summarize So Far … Image denoising (and many other problems in image processing) requires a model for the desired image. What do we do? We proposed a model for signals/images based on sparse and redundant representations. Well, does this work? Yes! We have seen a group of applications where this model is showing very good results: denoising of bw/color stills/video, CT improvement, inpainting, super-resolution, and compression. So, what next? Well … many more things …
117. Part V: Summary and Conclusion
121. … What next?
122. Thank You. All this Work is Made Possible Due to my teachers and mentors, colleagues & friends collaborating with me, and my students: A.M. Bruckstein, D.L. Donoho, G. Sapiro, J.L. Starck, I. Yavneh, M. Zibulevsky, M. Aharon, O. Bryt, J. Mairal, M. Protter, R. Rubinstein, J. Shtok, R. Giryes, Z. Ben-Haim, J. Turek, R. Zeyde.
123. And Also … Thank you so much to the organizers of this event for inviting me to give this talk. Seventh International Conference on Curves and Surfaces, Avignon - FRANCE, June 24-30, 2010.
124. If you are Interested … More on this topic (including the slides, the papers, and Matlab toolboxes) can be found in my webpage: http://www.cs.technion.ac.il/~elad A new book on this topic will be published in ~August.