Deep neural networks continue to advance the state of the art in image recognition tasks with various methods. However, applications of these methods to multimodal tasks remain limited. We present Multimodal Residual Networks (MRN) for multimodal residual learning of visual question answering, which extends the idea of deep residual learning. Unlike deep residual learning, MRN effectively learns the joint representation from vision and language information. The main idea is to use element-wise multiplication for the joint residual mappings, exploiting the residual learning of attentional models in recent studies. We explore various alternative models suggested by the multimodal setting. We achieve state-of-the-art results on the Visual QA dataset for both Open-Ended and Multiple-Choice tasks. Moreover, we introduce a novel method to visualize the attention effect of the joint representations for each learning block using the back-propagation algorithm, even though the visual features are collapsed without spatial information.
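The joint residual mapping described above can be sketched in a few lines. This is a hypothetical minimal version assuming single-layer tanh projections for both modalities (the paper's blocks may stack more layers and use different dimensions):

```python
import math

def dense(x, W, b, act=math.tanh):
    # One fully connected layer: act(W x + b), with W as a list of rows.
    return [act(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W, b)]

def mrn_block(q, v, Wq, bq, Wv, bv):
    """One multimodal residual learning block (sketch).
    Joint residual F(q, v) = tanh(Wq q) * tanh(Wv v), element-wise,
    added to the question shortcut: H(q, v) = q + F(q, v)."""
    f = [a * b for a, b in zip(dense(q, Wq, bq), dense(v, Wv, bv))]
    return [qi + fi for qi, fi in zip(q, f)]
```

With zero weights the joint residual vanishes and the block reduces to the identity shortcut, which is the residual-learning property the abstract refers to.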
Joint contrastive learning with infinite possibilities – taeseon ryu
Contrastive learning is a machine learning technique that learns features without any labels, from whether two images are similar or dissimilar. It differs somewhat from conventional supervised learning: supervised learning incurs labeling cost and, being task-specific, can have weaker generalizability. Contrastive learning proceeds without labels, so there is no labeling cost and generalizability can be better. This paper proposes Joint Contrastive Learning to make contrastive learning more useful. https://youtu.be/0NLq-ikBP1I
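As a rough illustration of the label-free setup described above, here is a minimal InfoNCE-style contrastive loss in plain Python. This is a generic sketch for intuition only, not the Joint Contrastive Learning objective proposed in the paper:

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Contrastive loss sketch: pull the positive pair together and push
    negatives apart, with no class labels involved.
    Returns -log softmax probability of the positive pair."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    logits = [cos(anchor, positive) / temperature] + \
             [cos(anchor, n) / temperature for n in negatives]
    m = max(logits)                                   # stabilize the log-sum-exp
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]
```

The loss is small when the anchor is close to its positive and far from the negatives, which is exactly the similar/dissimilar signal the description mentions.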
Nelly Litvak – Asymptotic behaviour of ranking algorithms in directed random ... – Yandex
There is vast empirical research on the behaviour of ranking algorithms, e.g. Google PageRank, in scale-free networks. In this talk, we address this problem by analytical probabilistic methods. In particular, it is well-known that PageRank in scale-free networks follows a power law with the same exponent as in-degree. Recent probabilistic analysis has provided an explanation for this phenomenon by obtaining a natural approximation for PageRank based on stochastic fixed-point equations. For these equations, explicit solutions can be constructed on weighted branching trees, and their tail behavior can be described in great detail.
In this talk we present a model for generating directed random graphs with prescribed degree distributions where we can prove that the PageRank of a randomly chosen node does indeed converge to the solution of the corresponding fixed-point equation as the number of nodes in the graph grows to infinity. The proof of this result is based on classical random graph coupling techniques combined with the now extensive literature on the behavior of branching recursions on trees.
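For intuition about the algorithm whose asymptotics the talk analyzes, a plain PageRank power iteration on a small directed graph might look like this. It is a standard textbook sketch, unrelated to the talk's probabilistic machinery:

```python
def pagerank(links, damping=0.85, iters=50):
    """Power-iteration PageRank on a directed graph given as
    {node: [out-neighbours]}; dangling mass is spread uniformly."""
    nodes = list(links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        new = {u: (1 - damping) / n for u in nodes}
        for u, outs in links.items():
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    new[v] += share
            else:  # dangling node: distribute its rank to everyone
                for v in nodes:
                    new[v] += damping * rank[u] / n
        rank = new
    return rank
```

In a scale-free graph, running this and plotting the rank distribution exhibits the power-law tail with the in-degree exponent that the talk explains analytically.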
How can we apply machine learning techniques on graphs to obtain predictions in a variety of domains? Know more from Sami Abu-El-Haija, an AI Scientist with experience from both industry (Google Research) and academia (University of Southern California).
Overview of the course. Introduction to image sciences, image processing and computer vision. Basics of machine learning, terminologies, paradigms. No-free-lunch theorem. Supervised versus unsupervised learning. Clustering and K-Means. Classification and regression. Linear least squares and polynomial curve fitting. Model complexity and overfitting. Curse of dimensionality. Dimensionality reduction and principal component analysis. Image representation, semantic gap, image features, and classical computer vision pipelines.
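As a small worked example of one topic in this syllabus, linear least squares for a line fit has a closed form; a minimal sketch:

```python
def linear_least_squares(xs, ys):
    """Closed-form least-squares fit of a line y = a*x + b:
    a = cov(x, y) / var(x), b = mean(y) - a * mean(x)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return a, b
```

Replacing the line with a high-degree polynomial on few points is the standard route to the overfitting discussion in the same lecture.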
Localization and classification. Overfeat: class-agnostic versus class-specific localization, fully convolutional neural networks, greedy merge strategy. Multi-object detection. Region proposal and selective search. R-CNN, Fast R-CNN, Faster R-CNN and YOLO. Image segmentation. Semantic segmentation and transposed convolutions. Instance segmentation and Mask R-CNN. Image captioning. Recurrent Neural Networks (RNNs). Language generation. Long Short-Term Memory (LSTM). DeepImageSent, Show and Tell, and Show, Attend and Tell algorithms.
Binary classification and linear separators. Perceptron, ADALINE, artificial neurons. Artificial neural networks (ANNs), activation functions, and the universal approximation theorem. Linear versus non-linear classification problems. Typical tasks, architectures and loss functions. Gradient descent and back-propagation. Support Vector Machines (SVMs), soft margins and the kernel trick. Connections between ANNs and SVMs.
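To make one item in this list concrete, the classic perceptron rule (update only on mistakes) can be sketched as:

```python
def perceptron_train(samples, labels, epochs=20, lr=1.0):
    """Rosenblatt perceptron for a linearly separable problem.
    labels are +1/-1; returns weights w and bias b, with
    prediction sign(w . x + b)."""
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:  # update only when the sample is misclassified
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b
```

On separable data (e.g. the AND problem) this converges to a linear separator, which sets up the lecture's contrast with non-linearly separable tasks like XOR.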
We study the communication cost of computing functions when inputs are distributed among k processors, each of which is located at one vertex of a network/graph called a terminal. Every other node of the network also has a processor, with no input. The communication is point-to-point, and the cost is the total number of bits exchanged by the protocol, in the worst case, over all edges. Our results show the effect of the topology of the network on the total communication cost. We prove tight bounds for simple functions like Element-Distinctness (ED), which depend on the 1-median of the graph. On the other hand, we show that for a large class of natural functions like Set-Disjointness the communication cost is essentially n times the cost of the optimal Steiner tree connecting the terminals. Further, we show that for natural composed functions like ED∘XOR and XOR∘ED, the naive protocols suggested by their definitions are optimal for general networks. Interestingly, the bounds for these functions depend on more involved topological parameters that are a combination of Steiner tree and 1-median costs. To obtain our results, we use tools like metric embeddings and linear programming, whose use in the context of communication complexity is novel as far as we know. (Based on joint works with Jaikumar Radhakrishnan and Atri Rudra)
Deep learning @ University of Oradea - part I (16 Jan. 2018) – Vlad Ovidiu Mihalca
Deep Learning series of presentations at University of Oradea, Faculty of Managerial and Technological Engineering, Mechatronics department.
English and Romanian language series held in parallel for Erasmus foreign students and Engineering Doctoral School students, teachers as well as anyone interested within the university.
This presentation was the first in the English language series, covering a tiny part of the theoretical aspects of Deep Learning. It will be followed by presentations and discussion regarding frameworks for use in products featuring Deep Learning, as well as current state of the art in Deep Learning research and applications in Robotics and Computer/Machine Vision.
C. Guyon, T. Bouwmans, E. Zahzah, “Foreground Detection via Robust Low Rank Matrix Decomposition including Spatio-Temporal Constraint”, International Workshop on Background Model Challenges, ACCV 2012, Daejeon, Korea, November 2012.
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY – ijcsit
In recent years, frameworks that employ Generative Adversarial Networks (GANs) have achieved impressive results in many fields, especially those related to image generation, owing both to their ability to create highly realistic, sharp images and to their ability to train on huge data sets. However, successfully training GANs is a notoriously difficult task when high-resolution images are required. In this article, we discuss five applicable and fascinating areas of image synthesis based on state-of-the-art GAN techniques: Text-to-Image Synthesis, Image-to-Image Translation, Face Manipulation, 3D Image Synthesis, and DeepMasterPrints. We provide a detailed review of current GAN-based image generation models with their advantages and disadvantages. The results of the publications in each section show that GAN-based algorithms are growing fast, and their constant improvement, whether in the same field or in others, will solve complicated image generation tasks in the future.
The slides aim to give a brief introduction to neural networks and their architectures. They cover logistic regression, shallow neural networks and deep neural networks. The slides were presented at Deep Learning IndabaX Sudan.
Lecture by Xavier Giro-i-Nieto (UPC) at the Master in Computer Vision Barcelona (March 30, 2016).
http://pagines.uab.cat/mcv/
This lecture provides an overview of computer vision analysis of images at a global scale using deep learning techniques. The session is structured in two blocks: a first one addressing end-to-end learning, and a second one focusing on applications that use off-the-shelf features.
Please submit your feedback as comments on the GDrive source slides:
https://docs.google.com/presentation/d/1ms9Fczkep__9pMCjxtVr41OINMklcHWc74kwANj7KKI/edit?usp=sharing
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both algorithmic and computational perspectives.
"Subclass deep neural networks: re-enabling neglected classes in deep network training for multimedia classification", by N. Gkalelis, V. Mezaris. Proceedings of the 26th Int. Conf. on Multimedia Modeling (MMM2020), Daejeon, Korea, Jan. 2020.
https://telecombcn-dl.github.io/idl-2020/
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine – Soma Boubou
Object recognition from RGB-D sensors has recently emerged as a prominent and challenging research topic. Current systems often require large amounts of time to train the models and to classify new data. We propose an effective and fast object recognition approach for 3D data acquired from depth sensors such as the Structure or Kinect sensors.
Our contribution in this work is a novel, fast and effective approach for real-time object recognition from 3D depth data:
- First, we extract simple but effective frame-level features, which we call differential frames, from the raw depth data.
- Second, we build a recognition system based on an Extreme Learning Machine classifier with Local Receptive Fields (ELM-LRF).
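A hypothetical sketch of the frame-differencing idea behind these differential frames follows; the paper's exact feature definition may differ, and depth maps are flattened to lists here for brevity:

```python
def differential_frames(depth_frames):
    """Frame-level features from raw depth data: the element-wise
    difference between each pair of consecutive depth frames.
    depth_frames is a list of equal-length flattened depth maps."""
    diffs = []
    for prev, cur in zip(depth_frames, depth_frames[1:]):
        diffs.append([c - p for c, p in zip(cur, prev)])
    return diffs
```

These per-frame difference maps would then be fed to the ELM-LRF classifier in the second step.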
Performance Evaluation of Object Tracking Technique Based on Position Vectors – CSCJournals
In this paper, a novel algorithm for moving-object tracking based on position vectors is proposed. The position vector of an object in the first frame of a video is extracted from a selected region of interest. From this position vector, the object's motion is represented in nine different directions, and nine position vectors are extracted, one per direction. Using these position vectors, the next frame is cropped into nine blocks. We perform block matching of the first frame against the nine blocks of the next frame in a simple feature space given by the discrete wavelet transform and the dual-tree complex wavelet transform. The best-matched block is taken as the tracked object, and its position vector serves as the reference location for the next frame. We describe the algorithm and its performance evaluation in detail and run simulation experiments of object tracking with different feature vectors, verifying the efficiency of the tracking algorithm.
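The block-matching step can be sketched as follows. For brevity, this hypothetical version compares raw pixel values by sum of squared differences rather than DWT/DT-CWT features as in the paper:

```python
def best_matching_block(template, frame, positions):
    """Compare the object template (2D list) against candidate blocks
    of the next frame, one per candidate position (row, col), and
    return the position with the smallest sum of squared differences."""
    h, w = len(template), len(template[0])

    def ssd(pos):
        r0, c0 = pos
        return sum((template[r][c] - frame[r0 + r][c0 + c]) ** 2
                   for r in range(h) for c in range(w))

    return min(positions, key=ssd)
```

In the paper's setting, `positions` would be the nine direction-derived candidate blocks, and the winning position becomes the reference location for the next frame.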
http://imatge-upc.github.io/telecombcn-2016-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
To describe the dynamics taking place in networks that structurally change over time, we propose an approach that searches for attributes whose value changes impact the topology of the graph. In several applications, the variations of a group of attributes are often followed by structural changes in the graph that they may be assumed to trigger. We formalize the triggering-pattern discovery problem as a method jointly rooted in sequence mining and graph analysis. We apply our approach to three real-world dynamic graphs of different natures - a co-authoring network, an airline network, and a social bookmarking system - assessing the relevance of the triggering-pattern mining approach.
Hello, this is the Deep Learning Paper Reading Group! Today's paper is VoxelNet, essential reading for anyone working in, or hoping to work in, 3D.
Slides: https://www.slideshare.net/taeseonryu/mcsemultimodal-contrastive-learning-of-sentence-embeddings
Hello! This is the Deep Learning Paper Reading Group.
Today we would like to introduce a breakthrough in object detection from 3D point clouds, a central problem in applications such as autonomous driving, household robots, and augmented/virtual reality. To that end, we will look at a new 3D detection network called VoxelNet.
1. Limitations of existing methods
Much prior work has focused on hand-crafted feature representations, for example bird's-eye-view projections. These methods, however, struggle to connect the LiDAR point cloud effectively to a region proposal network (RPN).
2. VoxelNet's approach
VoxelNet removes the need for manual feature engineering on 3D point clouds, unifying feature extraction and bounding-box prediction into a single-stage, end-to-end trainable deep network. It divides the point cloud into equally spaced 3D voxels and transforms the group of points within each voxel into a unified feature representation through a newly introduced voxel feature encoding (VFE) layer.
3. Learning effective geometric representations
In this way the point cloud is encoded as a descriptive volumetric representation, which is connected to an RPN to produce detections. VoxelNet thereby learns discriminative representations of objects with diverse geometric structures.
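The voxel-partitioning step described above can be sketched as follows; this shows the grouping only, with the VFE layers and RPN omitted:

```python
def voxelize(points, voxel_size):
    """Partition a 3D point cloud into equally spaced voxels and group
    the points that fall into each one. points is a list of (x, y, z)
    tuples; keys are integer voxel grid coordinates."""
    voxels = {}
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        voxels.setdefault(key, []).append((x, y, z))
    return voxels
```

Each per-voxel point group is what the VFE layer would then encode into a unified feature vector.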
4. Performance evaluation
Experiments on the KITTI car detection benchmark show that VoxelNet outperforms existing LiDAR-based 3D detection methods by a large margin. It also shows promising results for LiDAR-only pedestrian and cyclist detection.
The introduction of VoxelNet is a substantial step forward for object detection in 3D point clouds and is expected to shape future work in this area.
Many thanks to Jeongwon Heo (image processing) for preparing today's detailed review!
https://youtu.be/yCgsCyoJoMg
Similar to Multimodal Residual Networks for Visual QA
State of ICS and IoT Cyber Threat Landscape Report 2024 preview – Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities, spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Pushing the limits of ePRTC: 100ns holdover for 100 days – Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of their features, but many features provide convenience and capability at the expense of security. This best-practices guide outlines steps users can take to better protect personal devices and information.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... – SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Communications Mining Series - Zero to Hero - Session 1 – DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... – Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
A tale of scale & speed: How the US Navy is enabling software delivery from l... – sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
PHP Frameworks: I want to break free (IPC Berlin 2024) – Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk aims to encourage a more independent approach to using PHP frameworks, moving towards more flexible and future-proof PHP development.
Epistemic Interaction - tuning interfaces to provide information for AI support – Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
1. Multimodal Residual Networks
for Visual QA
Jin-Hwa Kim, Sang-Woo Lee, Dong-Hyun Kwak,
Min-Oh Heo, Jeonghee Kim, Jung-Woo Ha,
Byoung-Tak Zhang
9 June 2016
Biointelligence Lab.
Seoul National University @jnhwkim
3. Visual QA Challenge
■ What is VQA?
• VQA is a new dataset containing open-ended questions about images.
• These questions require an understanding of vision, language and
commonsense knowledge to answer.
[Agrawal et al., 2015]
5. Stacked Attention Networks
[Figure (a): Stacked Attention Network for Image QA — a CNN/LSTM encodes the question "What are sitting in the basket on a bicycle?", which queries feature vectors of different parts of the image through two stacked attention layers before a softmax predicts the answer "dogs".]
[Figure (b): Visualization of the learned multiple attention layers. The stacked attention network first focuses on all referred concepts, e.g., bicycle, basket, and objects in the basket (dogs), in the first attention layer, then further narrows down the focus in the second layer and finds the answer "dog".]
[Yang et al., 2015]
6. Stacked Attention Networks
■ Attentional Parameters
• For the linear combination of visual features
• Shortcut for the question vector
■ Representative Bottleneck
• The question contributes to the joint representation only weakly, through the coefficients p,
• which may cause a "bottleneck"
[Diagram: two stacked layers; in each layer, the question vector vQ and visual features vI produce attention weights pI through a hidden state hA, and the attended visual vector is added back to the query via a shortcut ⊕.]
u^k = \tilde{v}_I^k + u^{k-1},   \tilde{v}_I^k = \sum_i p_i^k v_i
[Yang et al., 2015]
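The SAN update above, u^k = Σ_i p_i^k v_i + u^{k-1}, can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the attention scores are taken as given inputs, and `san_refine` is a hypothetical name.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def san_refine(u_prev, region_feats, scores):
    """One SAN attention layer (sketch): attend over image regions,
    then add the attended visual vector to the previous query u^{k-1}."""
    p = softmax(scores)                      # attention coefficients p_i^k
    dim = len(region_feats[0])
    v_tilde = [sum(p[i] * region_feats[i][d] for i in range(len(p)))
               for d in range(dim)]          # ~v_I^k = sum_i p_i^k v_i
    return [v_tilde[d] + u_prev[d] for d in range(dim)]  # u^k = ~v_I^k + u^{k-1}

# Two 2-d region features with equal attention scores.
u1 = san_refine([0.0, 0.0], [[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
# Equal scores -> p = [0.5, 0.5], so u1 == [2.0, 3.0]
```

Note how the question enters the joint only additively here; the coefficients p are the sole multiplicative interaction, which is the "bottleneck" the slide points at.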
7. Multimodal Learning for VQA
■ A Strong Baseline by Lu et al., 2015
• A simple method: element-wise multiplication after linear-tanh embeddings
• Outperforms some recent works, DPPnet (Noh et al., 2015) and D-NMN (Andreas et al., 2016)
[Diagram: the question vector vQ and image vector vI each pass through a linear-tanh embedding; their element-wise product ◉ feeds a softmax classifier.]
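The baseline's joint feature can be sketched in a few lines. This is a simplified illustration under stated assumptions: the linear embedding weights are omitted (`q_emb` and `v_emb` stand in for the already-projected vectors), and `joint_baseline` is a hypothetical name.

```python
import math

def tanh_vec(xs):
    return [math.tanh(x) for x in xs]

def joint_baseline(q_emb, v_emb):
    """Lu et al.-style joint feature (sketch): element-wise product of
    tanh-embedded question and image vectors. q_emb and v_emb stand
    for the linearly projected inputs W_q q and W_v v."""
    return [tq * tv for tq, tv in zip(tanh_vec(q_emb), tanh_vec(v_emb))]

h = joint_baseline([0.0, 1.0], [1.0, 0.0])
# tanh(0) = 0, so a component vanishes wherever either modality is zero
```

The multiplicative interaction lets every question dimension gate every visual dimension, which is the property MRN carries over into its residual function.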
8. Multimodal Residual Networks
■ Residual Networks for Multimodal Inputs
• A shortcut mapping of the question vector, as in SAN
• Element-wise multiplication for the joint residual function
[Diagram: the question Q ("What kind of animals are these?") is encoded by a word embedding and an RNN; the image V by a CNN. MRN combines them with element-wise multiplication and question shortcuts, and a softmax predicts the answer "sheep".]
• word2vec (Mikolov et al., 2013)
• skip-thought vectors (Kiros et al., 2015)
• ResNet (He et al., 2016)
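One MRN learning block — a masked question embedding multiplied element-wise with a two-layer visual embedding, plus a question shortcut — can be sketched as follows. This is a toy illustration, not the released implementation: the weight shapes are simplified, the nonlinearity is assumed to be tanh, and `mrn_block` is a hypothetical name.

```python
import math

def matvec(W, x):
    # Dense matrix-vector product over nested lists.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def act(xs):
    # Nonlinearity sigma; tanh is assumed here.
    return [math.tanh(x) for x in xs]

def mrn_block(q, v, Wq, W1, W2, Wshort):
    """One MRN learning block (sketch):
       F(q, v) = act(Wq q) * act(W2 act(W1 v))   (element-wise product)
       H(q, v) = Wshort q + F(q, v)              (question shortcut)"""
    masked_q = act(matvec(Wq, q))                    # attention-like mask
    embedded_v = act(matvec(W2, act(matvec(W1, v))))  # two-layer visual path
    F = [a * b for a, b in zip(masked_q, embedded_v)]
    shortcut = matvec(Wshort, q)
    return [s + f for s, f in zip(shortcut, F)]

# Identity weights for a 2-d toy example; blocks would be stacked L times.
I2 = [[1.0, 0.0], [0.0, 1.0]]
u = mrn_block([0.5, -0.5], [1.0, 2.0], I2, I2, I2, I2)
```

Stacking such blocks keeps the question flowing through shortcuts while each block adds a multiplicative joint residual, in contrast to SAN's purely additive attention shortcut.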
10. Exploring Alternative Models
[Figure: five alternative block designs (a)-(e), each combining the question Q and image V through stacks of Linear and Tanh layers, joining the two paths by element-wise multiplication ⊙, and adding a shortcut ⊕ to produce Hl. The variants differ in the depth of the visual embedding and in the question shortcut: identity if l = 1 and a learned mapping otherwise, or a linear mapping if l = 1 and none otherwise.]
Table 1: The results of alternative models (a)-(e) on the test-dev.

       Open-Ended
       All    Y/N    Num.   Other
(a)    60.17  81.83  38.32  46.61
(b)    60.53  82.53  38.34  46.78
(c)    60.19  81.91  37.87  46.70
(d)    59.69  81.67  37.23  46.00
(e)    60.20  81.98  38.25  46.57

Table 2: The effect of the visual features and # of target answers on the test-dev results. Vgg for VGG-19, and Res for ResNet-152 features described in Section 4.

         Open-Ended
         All    Y/N    Num.   Other
Vgg, 1k  60.53  82.53  38.34  46.78
Vgg, 2k  60.79  82.13  38.87  47.52
Vgg, 3k  60.68  82.40  38.69  47.10
Res, 1k  61.45  82.36  38.40  48.81
Res, 2k  61.68  82.28  38.82  49.25
Res, 3k  61.47  82.28  39.09  48.76
11. Results on VQA test-standard
Table 3: The VQA test-standard results. Some accuracies [30, 1] are reported with one fewer digit of precision than the others, so they are zero-filled to match.

               Open-Ended                    Multiple-Choice
               All    Y/N    Num.   Other    All    Y/N    Num.   Other
DPPnet [21]    57.36  80.28  36.92  42.24    62.69  80.35  38.79  52.79
D-NMN [1]      58.00  -      -      -        -      -      -      -
Deep Q+I [11]  58.16  80.56  36.53  43.73    63.09  80.59  37.70  53.64
SAN [30]       58.90  -      -      -        -      -      -      -
ACK [27]       59.44  81.07  37.12  45.83    -      -      -      -
FDA [9]        59.54  81.34  35.67  46.10    64.18  81.25  38.30  55.20
DMN+ [28]      60.36  80.43  36.82  48.33    -      -      -      -
MRN            61.84  82.39  38.23  49.41    66.33  82.41  39.57  58.40
Human [2]      83.30  95.77  83.39  72.67    -      -      -      -
5.1 Visualization
In Equation 3, the left term σ(W_q q) can be seen as a masking (attention) vector to select a part of visual information. We assume that the difference between the right term V = σ(W_2 σ(W_1 v)) and the masked vector F(q, v) indicates an attention effect caused by the masking vector. Then, the attention effect L_att = ½ ‖V − F‖² is visualized on the image by calculating the gradient of L_att with respect to a given image I.
12. Visualization
■ Attentive Effect
• The difference between the right term V = σ(W_2 σ(W_1 v)) and the masked vector F(q, v), caused by the masking vector σ(W_q q)
■ Visualization of Input Gradient
• Then, the attention effect is visualized on the image by calculating the gradient of L_att with respect to a given image I, while treating F as a constant

L_att = ½ ‖V − F‖²,   ∂L_att/∂I = (∂V/∂I)ᵀ (V − F)
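The input-gradient computation can be checked on a toy case where V is a linear map of the (flattened) image I and F is held constant. This is a minimal sketch under those stated assumptions, not the paper's back-propagation code; `latt` and `grad_latt_wrt_I` are hypothetical names.

```python
# Toy sketch of the attention-effect loss and its input gradient.
# V is modeled as a linear map of the flattened image I; F is the
# masked joint feature, treated as a constant as in the slide.

def latt(V, F):
    # L_att = 1/2 * ||V - F||^2
    return 0.5 * sum((v - f) ** 2 for v, f in zip(V, F))

def grad_latt_wrt_I(Wv, V, F):
    """dL/dI = (dV/dI)^T (V - F) for the linear model V = Wv I,
    i.e. the chain rule with F treated as a constant."""
    diff = [v - f for v, f in zip(V, F)]
    n_in = len(Wv[0])
    return [sum(Wv[r][c] * diff[r] for r in range(len(Wv)))
            for c in range(n_in)]

Wv = [[1.0, 0.0], [0.0, 2.0]]
I = [1.0, 1.0]
V = [sum(w * i for w, i in zip(row, I)) for row in Wv]  # V = Wv I = [1.0, 2.0]
F = [0.0, 0.0]
g = grad_latt_wrt_I(Wv, V, F)
# g = Wv^T (V - F) = [1.0, 4.0]
```

In the paper the real V comes from the network's visual path, so the gradient is obtained by back-propagating (V − F) through it down to the image, and the resulting saliency map is what the following example slides show.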
14. Visualization Examples
(a) What kind of animals are these ? sheep
(b) What animal is the picture ? elephant
(c) What is this animal ? zebra
(d) What game is this person playing ? tennis
(e) How many cats are here ? 2
(f) What color is the bird ? yellow
(g) What sport is this ? surfing
(h) Is the horse jumping ? yes
16. Acknowledgments
This work was supported by Naver Corp.
and partly by the Korea government (IITP-R0126-16-1072-
SW.StarLab, KEIT-10044009-HRI.MESSI, KEIT-10060086-RISF,
ADD-UD130070ID-BMRR).
17. References
• Antol, S., Agrawal, A., Lu, J., Mitchell, M., Batra, D., Zitnick, C. L., & Parikh, D. (2015). VQA: Visual Question Answering. arXiv:1505.00468.
• Noh, H., Seo, P. H., & Han, B. (2015). Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction. arXiv:1511.05756.
• Yang, Z., He, X., Gao, J., Deng, L., & Smola, A. (2015). Stacked Attention Networks for Image Question Answering. arXiv:1511.02274.
• He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep Residual Learning for Image Recognition. arXiv:1512.03385.
• Srivastava, R. K., Greff, K., & Schmidhuber, J. (2015). Training Very Deep Networks. arXiv:1507.06228.
• Kim, J.-H., Kim, J., Ha, J.-W., & Zhang, B.-T. (2016). TrimZero: A Torch Recurrent Module for Efficient Natural Language Processing. In Proceedings of KIIS Spring Conference (Vol. 26, pp. 165–166).
• Léonard, N., Waghmare, S., Wang, Y., & Kim, J.-H. (2015). rnn: Recurrent Library for Torch. arXiv:1511.07889.
• Xiong, C., Merity, S., & Socher, R. (2016). Dynamic Memory Networks for Visual and Textual Question Answering. arXiv:1603.01417.
• Gal, Y. (2015). A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. arXiv:1512.05287.
• Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. ICLR 2013.
• Kiros, R., Zhu, Y., Salakhutdinov, R., Zemel, R. S., Torralba, A., Urtasun, R., & Fidler, S. (2015). Skip-Thought Vectors. In Advances in Neural Information Processing Systems 28 (pp. 3294–3302).
20. More Examples
(a) Does the man have good posture ? no (b) Did he fall down ? yes
(c) Are there two cats in the picture ? no (d) What color are the bears ? brown
(e) What are many of the people carrying ? umbrellas (f) What color is the dog ? black
(g) Are these animals tall ? yes (h) What animal is that ? sheep
(i) Are all the cows the same color ? no (j) What is the reflection of in the mirror ? dog
(k) What are the giraffe in the foreground doing ? eating
(l) What animal is standing in the water other than birds ? bear
21. Comparative Analysis
(a1) What is the animal on the left ? giraffe
(a2) Can you see trees ? yes
(b1) What is the lady riding ? motorcycle
(b2) Is she riding the motorcycle on the street ? no
22. Failure Examples (MRN's answer, then the ground-truth answer)
(a) What animals are these ? bears ducks
(b) What are these animals ? cows goats
(c) What animals are visible ? sheep horses
(d) How many animals are depicted ? 2 1
(e) What flavor donut is this ? chocolate strawberry
(f) What is the man doing ? playing tennis frisbee
(g) What color are the giraffes eyelashes ? brown black
(h) What food is the bear trying to eat ? banana papaya
(i) What kind of animal is used to herd these animals ? sheep dog
(j) What species of tree are in the background ? pine palm
(k) Are there any birds on the photo ? no yes
(l) Why is the hydrant smiling ? happy someone drew on it
23. TrimZero in Torch rnn
■ MaskZero - a naive approach
• Variable-length sequences are zero-padded and a 0/1 mask is applied to the RNN states, with a state recovery at every step, so padded timesteps are still computed.
[Diagram: a zero-padded word batch ("what is ...", "is ...", "how was your ...") with its 0/1 mask; the language model LM runs over every column, including the padding, recovering masked states s1, s2, s3 at each step.]
■ TrimZero - training time reduced by 37.5%
• The zero-padded entries are trimmed before each step, so the RNN computes only on real tokens.
[Kim et al., 2016]
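The core idea — skip padded rows at each timestep instead of computing and then masking them — can be sketched per timestep in plain Python. This is an illustrative simplification of the Torch module, not its actual code; `trimzero_step` and the toy RNN step are hypothetical.

```python
def trimzero_step(batch_col, states, rnn_step):
    """Apply rnn_step only to the non-padding tokens of one timestep
    (sketch of the TrimZero idea: trim zeros rather than mask them).
    batch_col: one token per sequence (0 = padding);
    states: per-sequence recurrent state, updated in place."""
    active = [i for i, tok in enumerate(batch_col) if tok != 0]
    for i in active:            # padded rows simply keep their old state
        states[i] = rnn_step(batch_col[i], states[i])
    return states, len(active)

# Toy RNN step: running sum of token values.
step = lambda tok, s: s + tok
states = [0, 0, 0]
states, n_active = trimzero_step([0, 5, 7], states, step)
# Only rows 1 and 2 are processed: states == [0, 5, 7], n_active == 2
```

The saving comes from never invoking the recurrent computation on padding, which matters most when sequences in a batch have very different lengths.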
24. Note on TrimZero
■ GPU Computation
• The efficiency of TrimZero is degraded for CUDA computing;
• however, it mainly depends on batch size, RNN size, and the number of zeros in the inputs.
• Empirically, natural language sentences (mean length around 7-8, max length 26) with batch size = 200 and RNN size = 2400 (skip-thought vectors) gain a decent computational advantage (+37.5%) for CUDA computing.
■ Citation
• Jin-Hwa Kim, Jeonghee Kim, Jung-Woo Ha and Byoung-Tak Zhang,
(2016). TrimZero: A Torch Recurrent Module for Efficient Natural
Language Processing. In Proceedings of KIIS Spring Conference (Vol. 26,
pp. 165–166).