The document summarizes several papers on visual attention in deep learning. It begins with an overview of recurrent attention models (RAM) from 2014-2015 that used RNNs to sequentially attend to different parts of an image. It then discusses spatial transformer networks (STN) from 2014 that introduced differentiable image transformations. Recent works from 2017-2018 applying attention to tasks like image captioning, inpainting, and fine-grained image recognition are then reviewed.
Mostly paper review of Semantic Image Inpainting with Deep Generative Models, R Yeh et al. CVPR 2017.
Prepared for Lab Seminar at SNU Datamining Center on 20180213.
This lecture reviews methods that allow interpreting the outcomes of a deep convolutional neural network. It presents some of the techniques proposed in the literature.
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both an algorithmic and computational perspectives.
https://telecombcn-dl.github.io/2018-dlai/
https://telecombcn-dl.github.io/2019-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Modeling perceptual similarity and shift invariance in deep networks - NAVER Engineering
Abstract: While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on ImageNet classification have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new dataset of human perceptual similarity judgments. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by large margins on our dataset. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.
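To make the "simple, shallow function" point concrete, here is a minimal sketch of PSNR in plain Python. The tiny 2x2 images and the names `reference`, `shifted`, and `noisy` are made up for illustration and are not taken from the talk. Because PSNR depends only on the mean squared error, a uniform brightness change and a single corrupted pixel can receive exactly the same score.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-sized grayscale images."""
    ref = [p for row in reference for p in row]
    dis = [p for row in distorted for p in row]
    if len(ref) != len(dis):
        raise ValueError("images must have the same number of pixels")
    # PSNR is a function of the mean squared error only.
    mse = sum((a - b) ** 2 for a, b in zip(ref, dis)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical toy images (lists of rows of pixel intensities).
reference = [[10, 20], [30, 40]]
shifted = [[15, 25], [35, 45]]   # every pixel brighter by 5
noisy = [[10, 20], [30, 50]]     # one pixel off by 10

print(psnr(reference, shifted), psnr(reference, noisy))
```

Both distortions have the same mean squared error, so the two printed values coincide even though the changes look quite different.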
Despite their strong transfer performance, deep convolutional representations surprisingly lack a basic low-level property -- shift-invariance, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, simply inserting this module into deep networks degrades performance; as a result, it is seldom used today. We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling and strided-convolution. We observe increased accuracy in ImageNet classification, across several commonly used architectures, such as ResNet, DenseNet, and MobileNet, indicating effective regularization. Furthermore, we observe better generalization, in terms of stability and robustness to input corruptions. Our results demonstrate that this classical signal processing technique has been undeservedly overlooked in modern deep networks.
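The aliasing problem described above can be seen on a one-dimensional toy signal. The sketch below is only an illustration of the "low-pass filter before subsampling" principle, not the layer proposed in the talk. The helper names, the `(0.25, 0.5, 0.25)` kernel, and the circular padding are assumptions chosen to keep the example short.

```python
def subsample(signal, stride=2):
    """Naive downsampling: keep every `stride`-th sample."""
    return signal[::stride]


def blur(signal, kernel=(0.25, 0.5, 0.25)):
    """Low-pass filter with circular padding (assumed here for simplicity)."""
    n = len(signal)
    half = len(kernel) // 2
    return [
        sum(w * signal[(i + j - half) % n] for j, w in enumerate(kernel))
        for i in range(n)
    ]


def blur_then_subsample(signal, stride=2):
    """Anti-aliased downsampling: blur first, then subsample."""
    return subsample(blur(signal), stride)


def shift(signal, k=1):
    """Circularly shift the signal to the right by k samples."""
    k %= len(signal)
    return signal[-k:] + signal[:-k]


# Highest-frequency signal a stride-2 subsampler cannot represent.
signal = [0, 1, 0, 1, 0, 1, 0, 1]

print(subsample(signal), subsample(shift(signal)))
print(blur_then_subsample(signal), blur_then_subsample(shift(signal)))
```

With naive subsampling, a one-sample shift flips the output from all zeros to all ones. With the blur applied first, both versions reduce to the same constant output, which is the stability the abstract refers to.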
http://ixa2.si.ehu.es/deep_learning_seminar/
Deep neural networks have boosted the convergence of multimedia data analytics in a unified framework shared by practitioners in natural language and vision. Image captioning, visual question answering or multimodal translation are some of the first applications of a new and exciting field that exploits the generalization properties of deep neural representations. This talk will provide an overview of how vision and language problems are addressed with deep neural networks, and the exciting challenges being addressed nowadays by the research community.
ICCES 2017 - Crowd Density Estimation Method using Regression Analysis - Ahmed Gad
The oral presentation of the paper titled "Crowd Density Estimation Method using Multiple Feature Categories and Multiple Regression Models".
This paper was accepted for publication and oral presentation in the 12th IEEE International Conference on Computer Engineering and Systems (ICCES 2017) held from 19 to 20 December 2017 in Cairo, Egypt.
The paper proposes a new method to estimate the number of people in crowded scenes using regression analysis. The two challenges in crowd density estimation using regression analysis are perspective distortion and non-linearity. The paper addresses perspective distortion with perspective normalization, which recent works suggest is the best way to deal with that problem.
The second challenge is addressed by creating a new combination of features collected from multiple existing categories, including segmented region, texture, edge, and keypoints. The resulting feature vector has length 164.
Five regression models are used: GPR, RF, RPF, LASSO, and KNN.
Based on the experimental results, the proposed method gives better results than previous works.
----------------------------------
أحمد فوزي جاد Ahmed Fawzy Gad
قسم تكنولوجيا المعلومات Information Technology (IT) Department
كلية الحاسبات والمعلومات Faculty of Computers and Information (FCI)
جامعة المنوفية, مصر Menoufia University, Egypt
Teaching Assistant/Demonstrator
ahmed.fawzy@ci.menofia.edu.eg
---------------------------------
Find me on:
Blog
(Arabic) https://aiage-ar.blogspot.com.eg/
(English) https://aiage.blogspot.com.eg/
YouTube
https://www.youtube.com/AhmedGadFCIT
Google Plus
https://plus.google.com/u/0/+AhmedGadIT
SlideShare
https://www.slideshare.net/AhmedGadFCIT
LinkedIn
https://www.linkedin.com/in/ahmedfgad
reddit
https://www.reddit.com/user/AhmedGadFCIT
ResearchGate
https://www.researchgate.net/profile/Ahmed_Gad13
Academia
https://menofia.academia.edu/Gad
Google Scholar
https://scholar.google.com.eg/citations?user=r07tjocAAAAJ&hl=en
Mendeley
https://www.mendeley.com/profiles/ahmed-gad12
ORCID
https://orcid.org/0000-0003-1978-8574
Stack Overflow
http://stackoverflow.com/users/5426539/ahmed-gad
Twitter
https://twitter.com/ahmedfgad
Facebook
https://www.facebook.com/ahmed.f.gadd
Pinterest
https://www.pinterest.com/ahmedfgad
In this project, we propose methods for semantic segmentation using state-of-the-art deep learning models. Moreover, we want to restrict the segmentation to the specific object relevant to a given application. Instead of concentrating on unnecessary objects, we can focus on the relevant ones and make the segmentation more specialized and efficient for specific purposes. Furthermore, we leverage models that are suitable for face segmentation. The models used in this project are Mask-RCNN and DeepLabv3. The experimental results indicate that the illustrated approach is efficient and robust in the segmentation task compared with previous work in the field. The models reach mean Intersection over Union (mIoU) scores of 74.4 and 86.6. The visual results of the models are shown in the Appendix.
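For readers unfamiliar with the metric, the following is a minimal sketch of how mean Intersection over Union is commonly computed from per-pixel class labels. It is not the project's evaluation code. The flattened label lists `pred` and `target` are hypothetical, and skipping classes that appear in neither list is one of several conventions in use.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in either pred or target."""
    ious = []
    for c in range(num_classes):
        # Pixels where both agree on class c, and pixels where either says c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both (assumed convention)
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical flattened label maps: 0 = background, 1 = face.
target = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]

print(mean_iou(pred, target, num_classes=2))
```

The function returns a fraction between 0 and 1. If the scores reported above are percentages, they would correspond to this value multiplied by 100.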
Supervised Learning of Sparsity-Promoting Regularizers for Denoising - Mike McCann
Prepared for the SIAM Conference on Imaging Science, special session on Advances in Non-Smooth/Non-Convex Optimization for Inverse Problems in Imaging. July 7, 2020
Deep Neural Networks that talk (Back)… with style - Roelof Pieters
Talk at Nuclai 2016 in Vienna
Can neural networks sing, dance, remix and rhyme? And most importantly, can they talk back? This talk will introduce Deep Neural Nets with textual and auditory understanding and some of the recent breakthroughs made in these fields. It will then show some of the exciting possibilities these technologies hold for "creative" use and explorations of human-machine interaction, where the main theorem is "augmentation, not automation".
http://events.nucl.ai/track/cognitive/#deep-neural-networks-that-talk-back-with-style
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
The primary goal of my trip to Seattle was to establish a collaboration with a world-leading group on data integration. But by having chosen Seattle, a hub for technology companies, I also learned about synergies between business and research: Ilya Shmulevich from the Institute for Systems Biology makes use of Amazon's "Random Forest" implementation and Google's 600,000-CPU cluster for cancer genomic association discovery. I also met with experts from the University of Washington and Microsoft Research to learn about technological advancements for tackling Big Data and commoditizing parallelization. Finally, I observed a government-funded research agency invest in solutions geared towards its enterprise structure rather than adopt solutions designed for research institutes without an active computational community. In conclusion: CSIRO has unique properties and skill sets that many collaborators would be interested in benefiting from. In return, such collaborations would propel CSIRO to the forefront of technology, which could be very rewarding, particularly for the analysis of big, unstructured datasets.
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le... - Electronic Arts / DICE
Proceduralism is a powerful language of rules, dependencies and patterns that can generate content indistinguishable from a manually produced one. Yet there are new opportunities that hold a great potential to enhance the existing techniques. In this talk, SEED's Anastasia Opara shares some of the early tests of marrying Proceduralism and Deep Learning and discusses how it can contribute to the current workflows.
You can view a recording of the presentation from 2018's Everything Procedural Conference here:
https://www.youtube.com/watch?v=dpYwLny0P8M
GraphRAG is All You need? LLM & Knowledge Graph - Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Removing Uninteresting Bytes in Software Fuzzing - Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating the uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
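The general idea of dropping bytes that do not affect program behaviour can be sketched in a few lines of Python. This is not DIAR's algorithm, which the description above does not spell out. It is a greedy, one-byte-at-a-time illustration, and `toy_coverage` is a hypothetical stand-in for the instrumentation-based coverage a fuzzer such as AFL would actually measure.

```python
def trim_seed(seed: bytes, coverage) -> bytes:
    """Greedily drop bytes whose removal leaves the coverage signature unchanged."""
    baseline = coverage(seed)
    kept = bytearray(seed)
    i = 0
    while i < len(kept):
        candidate = kept[:i] + kept[i + 1:]
        if coverage(bytes(candidate)) == baseline:
            kept = candidate  # byte i was uninteresting: drop it
        else:
            i += 1            # byte i matters: keep it and move on
    return bytes(kept)


def toy_coverage(data: bytes) -> frozenset:
    """Hypothetical coverage oracle: which 'branches' the input would reach."""
    branches = set()
    if data.startswith(b"<"):
        branches.add("open_tag")
    if b">" in data:
        branches.add("close_tag")
    if b"a" in data:
        branches.add("letter_a")
    return frozenset(branches)


seed = b"<xxaxx>yy"
print(trim_seed(seed, toy_coverage))
```

The trimmed seed reaches the same toy branches as the original while being much shorter, so later mutations are spent only on bytes that influence behaviour. A real implementation would have to bound the number of program executions, since this sketch runs the target once per byte.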
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
UiPath Test Automation using UiPath Test Suite series, part 6 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, as a test automation solution, with Open AI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for technology and for making things work, along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Epistemic Interaction - tuning interfaces to provide information for AI support - Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
2. WHO AM I
▸ Chung Minki
▸ BS, KAIST, IE, 2016
▸ MS, SNU, IE, 2018..?!
▸ Vision Projects
▸ Working on Semantic Image Inpainting
3. WHAT IS VISUAL ATTENTION
▸ Attention is HOT nowadays
▸ http://openaccess.thecvf.com/CVPR2017_search.py
▸ http://search.iclr2018.smerity.com/search/?query=attention
4. WHAT IS VISUAL ATTENTION
▸ You may have heard of
▸ "Neural Machine Translation by Jointly Learning to Align and Translate"
▸ "Show, Attend, and Tell: Neural Image Caption"
Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio, 2015, ICLR. "Neural Machine Translation by Jointly Learning to Align and Translate"
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, 2015, ICML. "Show, Attend, and Tell: Neural Image Caption Generation with Visual Attention"
5. WHAT IS VISUAL ATTENTION
▸ More:
Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, 2015, ICLR. "Multiple Object Recognition With Visual Attention"
Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2015, NIPS. "Spatial Transformer Networks"
Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition"
Siavash Gorji, James J. Clark, 2017, CVPR. "Attentional Push: A Deep Convolutional Network for Augmenting Image Salience with Shared Attention Modeling in Social Scenes"
6. WHAT IS VISUAL ATTENTION
▸ Visual Attention:
▸ Attend to certain parts of an image to solve a task more efficiently
▸ Deep learning is a black-box model → attention adds interpretability
7. TABLE OF CONTENTS
▸ Early Works
▸ Recurrent Attention Model (RAM)
▸ Spatial Transformer Network (STN)
▸ Recent works on visual attention
▸ in ICLR
▸ in CVPR
10. RECURRENT ATTENTION MODEL
▸ Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu, 2014, NIPS. "Recurrent Models of Visual Attention"
▸ Google DeepMind, 563 citations
▸ Motivation: Confronted with a large image, humans process it sequentially, selecting where and what to look at
▸ Tackles a ConvNet limitation: poor scalability with increasing input image size
11. RECURRENT ATTENTION MODEL
▸ Multiple Object Recognition with Visual Attention (DRAM), 2015, ICLR
▸ Architecturally refined version of RAM
▸ RNN structure with multi-resolution crops, called glimpses
▸ Architecture:
12. RECURRENT ATTENTION MODEL
▸ Architecture:
[Figure: DRAM architecture, annotated with: where to see; what to see; provide initial state; locate glimpse; outputs the inputs for rnn(1); for multiple objects]
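The glimpse used by this architecture is a stack of crops at several resolutions around one location. A minimal pure-Python sketch, assuming square grayscale images stored as nested lists, zero padding outside the image, and average pooling for downsampling; the function names (`crop_patch`, `downsample`, `glimpse`) and the sizes are illustrative and not taken from the paper or the linked demo:

```python
def crop_patch(image, cy, cx, size):
    """Crop a size x size patch around (cy, cx); pixels outside the image read as 0."""
    half = size // 2
    h, w = len(image), len(image[0])
    patch = []
    for dy in range(-half, size - half):
        row = []
        for dx in range(-half, size - half):
            y, x = cy + dy, cx + dx
            row.append(image[y][x] if 0 <= y < h and 0 <= x < w else 0.0)
        patch.append(row)
    return patch


def downsample(patch, factor):
    """Average-pool a square patch by an integer factor."""
    n = len(patch) // factor
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            block = [patch[i * factor + a][j * factor + b]
                     for a in range(factor) for b in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def glimpse(image, cy, cx, base_size=4, num_scales=3):
    """Return num_scales patches of base_size x base_size covering growing areas."""
    patches = []
    for k in range(num_scales):
        factor = 2 ** k
        # Each scale doubles the field of view, then is pooled back to base_size.
        patch = crop_patch(image, cy, cx, base_size * factor)
        patches.append(downsample(patch, factor))
    return patches


# Toy 16x16 image whose pixel value encodes its position.
image = [[float(r * 16 + c) for c in range(16)] for r in range(16)]
g = glimpse(image, cy=8, cx=8)
```

Every scale has the same number of pixels, so the cost of one glimpse does not grow with the input image size, which is the scalability argument made for RAM.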
13. RECURRENT ATTENTION MODEL
▸ Demo
▸ Single object classification
https://github.com/kevinzakka/recurrent-visual-attention
14. RECURRENT ATTENTION MODEL
▸ Training:
▸ maximize the lower bound F on the log-likelihood (also extended to the multiple-object case)
15. RECURRENT ATTENTION MODEL
▸ Cont'd:
▸ Gradient of F estimated from sampled glimpse locations (REINFORCE)
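The equations on these two training slides did not survive in the text. A hedged reconstruction, following the notation recalled from Ba et al. (image I, parameters W, glimpse-location sequence l, label y, M sampled location sequences l^m); apart from F, the symbols are assumptions and should be checked against the paper:

```latex
% Lower bound F on the log-likelihood (Jensen's inequality)
\log p(y \mid I, W)
  = \log \sum_{l} p(l \mid I, W)\, p(y \mid l, I, W)
  \;\ge\; \sum_{l} p(l \mid I, W)\, \log p(y \mid l, I, W) = F

% Monte Carlo (REINFORCE) estimate of the gradient, with l^m \sim p(l \mid I, W)
\frac{\partial F}{\partial W}
  \approx \frac{1}{M} \sum_{m=1}^{M}
  \left[
    \frac{\partial \log p(y \mid l^{m}, I, W)}{\partial W}
    + \log p(y \mid l^{m}, I, W)\,
      \frac{\partial \log p(l^{m} \mid I, W)}{\partial W}
  \right]
```

The second term is what makes the non-differentiable choice of glimpse location trainable: the log-likelihood acts as a reward that reweights the gradient of the location policy.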
16. RECURRENT ATTENTION MODEL
▸ Experiments & Results
▸ MNIST, SVHN
17. SPATIAL TRANSFORMER NETWORK
▸ Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, 2015, NIPS. "Spatial Transformer Networks"
▸ Google DeepMind, 624 citations
▸ Motivation: Humans process distorted objects by un-distorting them
▸ A ConvNet is not actually invariant to large transformations (invariance is only realised over a deep hierarchy of max-pooling)
https://kevinzakka.github.io/2017/01/18/stn-part2/
18. SPATIAL TRANSFORMER NETWORK
▸ Architecture:
▸ Three parts: localisation net, sampling grid, sampler
▸ Assume 𝛵𝜃 is a 2D affine transformation A𝜃
[Figure: spatial transformer module, annotated with: regression (localisation net); input feature map H,W,C; output feature map H',W',C]
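The affine case named on this slide can be written out as follows. This is a reconstruction in the notation recalled from the STN paper, where (x^t, y^t) are target (output) grid coordinates and (x^s, y^s) are the source coordinates to sample from; the symbol names are assumptions, since the slide's own equation was not captured in the text:

```latex
\begin{pmatrix} x_i^{s} \\ y_i^{s} \end{pmatrix}
  = T_\theta(G_i)
  = A_\theta \begin{pmatrix} x_i^{t} \\ y_i^{t} \\ 1 \end{pmatrix}
  = \begin{bmatrix}
      \theta_{11} & \theta_{12} & \theta_{13} \\
      \theta_{21} & \theta_{22} & \theta_{23}
    \end{bmatrix}
    \begin{pmatrix} x_i^{t} \\ y_i^{t} \\ 1 \end{pmatrix}
```

The six entries of A_θ are the values regressed by the localisation net.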
19. SPATIAL TRANSFORMER NETWORK
▸ For attention, 𝛵𝜃 is constrained to A𝜃 = [[s, 0, t_x], [0, s, t_y]]
▸ Allowing cropping, translation, isotropic scaling
▸ In the case of a bilinear sampling kernel, each output pixel is a distance-weighted average of the nearest source pixels
▸ Differentiable and modular
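The grid generator and bilinear sampler can be sketched in a few lines. This is a pure-Python illustration under stated assumptions: single-channel images as nested lists, coordinates normalised to [-1, 1], and hypothetical helper names (`attention_theta`, `affine_grid`, `bilinear_sample`); it is not the TensorLayer or reference implementation:

```python
def attention_theta(s, tx, ty):
    """Attention-style transform: isotropic scale s plus translation (tx, ty)."""
    return [[s, 0.0, tx], [0.0, s, ty]]


def affine_grid(theta, out_h, out_w):
    """For each output pixel (normalised to [-1, 1]) compute source coordinates."""
    grid = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            yt = -1.0 + 2.0 * i / (out_h - 1) if out_h > 1 else 0.0
            xt = -1.0 + 2.0 * j / (out_w - 1) if out_w > 1 else 0.0
            xs = theta[0][0] * xt + theta[0][1] * yt + theta[0][2]
            ys = theta[1][0] * xt + theta[1][1] * yt + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(image, grid):
    """V_i = sum_n sum_m U[n][m] * max(0, 1-|x-m|) * max(0, 1-|y-n|)."""
    h, w = len(image), len(image[0])
    out = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            # Convert normalised source coordinates back to pixel units.
            px = (xs + 1.0) * (w - 1) / 2.0
            py = (ys + 1.0) * (h - 1) / 2.0
            value = 0.0
            # Written as the full double sum for clarity; at most 4 terms are non-zero.
            for n in range(h):
                wy = max(0.0, 1.0 - abs(py - n))
                if wy == 0.0:
                    continue
                for m in range(w):
                    wx = max(0.0, 1.0 - abs(px - m))
                    value += image[n][m] * wx * wy
            out_row.append(value)
        out.append(out_row)
    return out


image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
same = bilinear_sample(image, affine_grid(identity, 5, 5))
# s = 0.5 zooms into the central region of the 5x5 image.
crop = bilinear_sample(image, affine_grid(attention_theta(0.5, 0.0, 0.0), 3, 3))
```

Because the kernel weights are piecewise-linear in the source coordinates, gradients can flow both to the input feature map and, through the grid, back to the transformation parameters.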
20. SPATIAL TRANSFORMER NETWORK
▸ Experiments and Results
▸ MNIST
▸ SVHN
21. SPATIAL TRANSFORMER NETWORK
▸ Experiments and Results
▸ Fine-grained classification (CUB-200-2011 bird classification dataset)
22. SPATIAL TRANSFORMER NETWORK
▸ Already implemented in Tensorlayer
23. RECURRENT ATTENTIONAL NETWORKS FOR SALIENCY DETECTION
▸ Jason Kuen, Zhenhua Wang, Gang Wang, 2016, CVPR. "Recurrent Attentional Networks for Saliency Detection"
▸ RAM (glimpse system) + STN (differentiability) for saliency detection
24. RECURRENT ATTENTIONAL NETWORKS FOR SALIENCY DETECTION
▸ Recurrent Attentional Convolutional-Deconvolutional Network (RACDNN)
▸ Architecture
25. RECURRENT ATTENTIONAL NETWORKS FOR SALIENCY DETECTION
▸ Experiments & Results
27. GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION
▸ Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang, 2018, CVPR. "Generative Image Inpainting with Contextual Attention"
▸ Adobe Research
28. GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION
▸ Architecture
▸ Two-stage (coarse to fine)
▸ Global and local W-GANs
▸ Spatially discounted reconstruction loss (𝑙1): pixel weight 𝛾^𝑙, with 𝑙 the distance to the nearest known pixel
[Figure: two-stage inpainting architecture, annotated with: use W-GAN; attention]
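The spatially discounted loss can be illustrated with a small weight-mask computation. This is a sketch under assumptions: the distance l is taken as the Chebyshev distance to the nearest known pixel, known pixels get weight 0 (the loss is evaluated on the hole only), the result is averaged over all pixels, and the names `spatial_discount_mask` and `discounted_l1` are hypothetical; the paper's own implementation may differ on each of these points:

```python
def spatial_discount_mask(mask, gamma=0.99):
    """mask[y][x] == 1 marks a missing pixel; returns weights gamma ** l for them."""
    h, w = len(mask), len(mask[0])
    known = [(y, x) for y in range(h) for x in range(w) if mask[y][x] == 0]
    weights = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] == 1:
                # Assumed metric: Chebyshev distance to the nearest known pixel.
                l = min(max(abs(y - ky), abs(x - kx)) for ky, kx in known)
                weights[y][x] = gamma ** l
    return weights


def discounted_l1(pred, target, weights):
    """Mean over all pixels of weight * |pred - target|."""
    h, w = len(pred), len(pred[0])
    total = sum(weights[y][x] * abs(pred[y][x] - target[y][x])
                for y in range(h) for x in range(w))
    return total / (h * w)


# 5x5 image with a 3x3 hole in the centre; gamma=0.5 makes the decay easy to see.
mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
weights = spatial_discount_mask(mask, gamma=0.5)
pred = [[1.0] * 5 for _ in range(5)]
target = [[0.0] * 5 for _ in range(5)]
loss = discounted_l1(pred, target, weights)
```

Pixels deep inside the hole are less constrained by the surrounding context, so they contribute less to the reconstruction loss than pixels near the hole boundary.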
29. GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION
▸ Attention
▸ Calculate the cosine similarity between foreground patches f_{x,y} and background patches b_{x',y'}
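The matching step can be sketched on plain feature vectors. This is a simplified illustration, assuming each patch is flattened to a vector, a scaled softmax over background patches, and reconstruction as an attention-weighted sum of background vectors; the function names and the scale `lam=10.0` are assumptions, and the paper realises the same idea with convolution and deconvolution over feature maps:

```python
import math


def cosine_similarity(f, b):
    """Cosine similarity between a foreground vector f and a background vector b."""
    dot = sum(fi * bi for fi, bi in zip(f, b))
    norm_f = math.sqrt(sum(fi * fi for fi in f))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return dot / (norm_f * norm_b + 1e-8)


def contextual_attention(foreground, background, lam=10.0):
    """For each foreground vector return (attention weights, reconstructed vector)."""
    results = []
    for f in foreground:
        scores = [lam * cosine_similarity(f, b) for b in background]
        # Numerically stable softmax over background patches.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        recon = [sum(wt * b[k] for wt, b in zip(weights, background))
                 for k in range(len(f))]
        results.append((weights, recon))
    return results


fg = [[1.0, 0.0]]
bg = [[2.0, 0.0], [0.0, 3.0]]
weights, recon = contextual_attention(fg, bg)[0]
```

In this toy case the foreground vector points in the same direction as the first background vector, so nearly all attention goes to it and the reconstruction is close to that background patch.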
30. GENERATIVE IMAGE INPAINTING WITH CONTEXTUAL ATTENTION
▸ Experiments & Results
31. LEARN TO PAY ATTENTION
▸ Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr, 2018, ICLR. "Learn to Pay Attention"
▸ Very simple
32. LEARN TO PAY ATTENTION
▸ Architecture
[Figure: architecture, annotated with: attention; compatibility function (dot product)]
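The dot-product compatibility noted on the figure can be sketched as follows. Assumptions: the local features l_i and the global feature g have the same dimension, the scores are normalised with a softmax, and the attended descriptor is the weighted sum of local features; the function name `attend` and the toy values are illustrative:

```python
import math


def attend(local_features, g):
    """Return attention weights a_i and the attended descriptor g_a = sum_i a_i * l_i."""
    # Compatibility score c_i = <l_i, g> (dot product).
    scores = [sum(li * gi for li, gi in zip(l, g)) for l in local_features]
    top = max(scores)
    exps = [math.exp(c - top) for c in scores]
    total = sum(exps)
    a = [e / total for e in exps]
    g_a = [sum(ai * l[k] for ai, l in zip(a, local_features))
           for k in range(len(g))]
    return a, g_a


# Two spatial locations; the first one aligns with the global feature.
local_features = [[1.0, 0.0], [0.0, 1.0]]
g = [5.0, 0.0]
a, g_a = attend(local_features, g)
```

The attention map `a` has one weight per spatial location, which is what gets visualised for interpretability and reused for the weakly supervised segmentation experiments.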
33. LEARN TO PAY ATTENTION
▸ Experiments & Results
▸ Image classification and fine-grained recognition
34. LEARN TO PAY ATTENTION
▸ Experiments & Results
▸ Weakly supervised semantic segmentation
35. LOOK CLOSER TO SEE BETTER
▸ Jianlong Fu, Heliang Zheng, Tao Mei, 2017, CVPR. "Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition"
▸ Fine-grained image recognition:
▸ Discriminative region localization + fine-grained feature learning
36. LOOK CLOSER TO SEE BETTER
▸ Recurrent Attention Convolutional Neural Network (RA-CNN)
▸ Multi-scale networks: classification sub-network, attention proposal sub-network (APN)
▸ Finer-scale network (coarse to fine)
▸ Intra-scale softmax loss for classification, inter-scale pairwise ranking loss for APN
37. LOOK CLOSER TO SEE BETTER
▸ RA-CNN architecture:
[Figure: RA-CNN architecture, annotated with: bilinear interpolation to amplify]
38. LOOK CLOSER TO SEE BETTER
▸ Training:
▸ Multi-task loss: intra-scale classification loss plus inter-scale pairwise ranking loss, which forces the finer scale to be more confident than the coarser one
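The multi-task loss can be sketched for a single example. Assumptions: each scale outputs class logits, the ranking term compares the true-class probability at adjacent scales with a margin, and both terms are simply summed; the function names and the default `margin=0.05` are illustrative choices, and the paper's alternating training schedule is not reproduced here:

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_ranking_loss(p_coarse, p_fine, margin=0.05):
    """Zero only when the finer scale beats the coarser one by at least the margin."""
    return max(0.0, p_coarse - p_fine + margin)


def ra_cnn_loss(logits_per_scale, target, margin=0.05):
    """Sum of per-scale cross-entropy and ranking losses between adjacent scales."""
    probs = [softmax(logits)[target] for logits in logits_per_scale]
    classification = sum(-math.log(p) for p in probs)
    ranking = sum(pairwise_ranking_loss(probs[s], probs[s + 1], margin)
                  for s in range(len(probs) - 1))
    return classification + ranking


# Two scales with uninformative logits: the ranking term equals the margin.
total = ra_cnn_loss([[0.0, 0.0], [0.0, 0.0]], target=0)
```

The ranking term is what drives the attention proposal: a crop is only "good" if classifying the zoomed-in region yields a more confident correct prediction than the coarser view did.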
39. LOOK CLOSER TO SEE BETTER
▸ Experiments & Results
▸ CUB-200-2011 Bird Dataset
40. LOOK CLOSER TO SEE BETTER
▸ Experiments & Results
▸ Stanford Dogs, Stanford Cars
41. SUMMARY
▸ Attention for efficiency, better performance, interpretability
▸ Many types of Attention:
▸ RAM
▸ STN
▸ RAM+STN
▸ Others