This document provides an overview of deep learning and its applications in anomaly detection and software vulnerability detection. It discusses key deep learning architectures like feedforward networks, recurrent neural networks, and convolutional neural networks. It also covers unsupervised learning techniques such as word embedding, autoencoders, RBMs, and GANs. For anomaly detection, it describes approaches for multichannel anomaly detection, detecting unusual mixed-data co-occurrences, and modeling object lifetimes. It concludes by discussing applications in detecting malicious URLs, unusual source code, and software vulnerabilities.
This document provides an overview of deep learning 1.0 and discusses potential directions for deep learning 2.0. It summarizes limitations of deep learning 1.0 such as lack of reasoning abilities and discusses how incorporating memory and reasoning capabilities could help address these limitations. The document outlines several approaches being explored for neural memory and reasoning, including memory networks, neural Turing machines, and self-attentive associative memories. It argues that memory and reasoning will be important for developing more human-like artificial general intelligence.
Deep learning and applications in non-cognitive domains II, Deakin University
This document provides an overview of applying deep learning techniques to non-cognitive domains, with a focus on healthcare, software engineering, and anomaly detection. It introduces popular deep learning frameworks like Theano and TensorFlow and discusses best practices for building models. For healthcare, examples are given on using recurrent neural networks (RNNs) with electronic medical record (EMR) data and physiological time-series data from intensive care units. Challenges in software engineering like long-term temporal dependencies are discussed. Overall, the document outlines techniques for structured and unstructured data across different non-cognitive domains.
Deep learning and applications in non-cognitive domains III, Deakin University
This document summarizes a presentation on unsupervised learning and advanced topics in deep learning. It discusses word embeddings, autoencoders, restricted Boltzmann machines, variational autoencoders, generative adversarial networks, graph neural networks, attention mechanisms, and end-to-end memory networks. It emphasizes representing complex domain structures like relations and graphs, and developing memory and attention capabilities in neural networks. The presentation concludes by discussing how to position research in these emerging areas.
Deep learning and applications in non-cognitive domains I, Deakin University
This document outlines an agenda for a presentation on deep learning and its applications in non-cognitive domains. The presentation is divided into three parts: an introduction to deep learning theory, applying deep learning to non-cognitive domains in practice, and advanced topics. The introduction covers neural network architectures like feedforward, recurrent, and convolutional networks. It also discusses techniques for improving training like rectified linear units and skip connections. The practice section will provide hands-on examples in domains like healthcare and software engineering. The advanced topics section will discuss unsupervised learning, structured outputs, and how to position research in deep learning.
Deep Learning has taken the digital world by storm. As a general-purpose technology, it is now present in all walks of life. Although fundamental developments in methodology have slowed in the past few years, applications are flourishing, with major breakthroughs in Computer Vision, NLP and Biomedical Sciences. The primary successes can be attributed to the availability of large labelled data, powerful GPU servers and programming frameworks, and advances in neural architecture engineering. This combination enables rapid construction of large, efficient neural networks that scale to the real world. But the fundamental questions of unsupervised learning, deep reasoning, and rapid contextual adaptation remain unsolved. We shall call what we currently have Deep Learning 1.0, and the next possible breakthroughs Deep Learning 2.0.
This is part 1 of the Tutorial delivered at IEEE SSCI 2020, Canberra, December 1st (Virtual).
Describing the latest research in visual reasoning, in particular visual question answering. Covering both images and videos. A dual-process-theory approach. Relational memory.
This is the talk given at the Faculty of Information Technology, Monash University on 19/08/2020. It covers our recent research on the topics of learning to reason, including dual-process theory, visual reasoning and neural memories.
Introducing research in the area of machine reasoning at our Applied AI Institute, Deakin University, Australia. Covering visual & social reasoning, neural Turing machines, and System 2.
The current deep learning revolution has brought unprecedented changes to how we live, learn, interact with the digital and physical worlds, run business and conduct sciences. These are made possible thanks to the relative ease of construction of massive neural networks that are flexible to train and scale up to the real world. But the flexibility is hitting its limits due to the excessive demand for labelled data, the narrowness of the tasks, the failure to generalize beyond surface statistics to novel combinations, and the lack of the key mental faculty of deliberate reasoning. In this talk, I will present a multi-year research program to push deep learning to overcome these limitations. We aim to build dynamic neural networks that can train themselves with little labelled data, compress on-the-fly in response to resource constraints, and respond to arbitrary queries about a context. The networks are equipped with the capability to make use of external knowledge, and operate at the high level of objects and relations. The long-term goal is to build persistent digital companions that co-live with us and other AI entities, understand our needs and intentions, and share our human values and norms. They will be capable of having natural conversations, remembering lifelong events, and learning in an open-ended fashion.
This document discusses research on improving neural Turing machines through the use of program memory.
It introduces the neural universal Turing machine (NUTM), which augments neural Turing machines with a neural stored-program memory (NSM) to store programs. The NSM allows NUTMs to sequence tasks, do continual learning, and answer questions - addressing limitations of current neural Turing machines. Further research opportunities are outlined, including using memory for graphs and relational structures, memory-supported reasoning, and developing full cognitive architectures.
Deep learning for biomedical discovery and data mining I, Deakin University
The document provides an agenda and slides for a deep learning tutorial focused on biomedical applications. The agenda covers topics such as classic deep learning architectures, genomics applications, healthcare applications, and improving data efficiency. The slides discuss challenges in applying deep learning to biomedicine like small datasets and the complexity of diseases. They also highlight opportunities like recent advances in deep learning techniques and biomedicine being a major source of new problems.
TL;DR: This tutorial was delivered at KDD 2021. Here we review recent developments to extend the capacity of neural networks to “learning to reason” from data, where the task is to determine if the data entails a conclusion.
The rise of big data and big compute has brought modern neural networks to many walks of digital life, thanks to the relative ease of construction of large models that scale to the real world. Current successes of Transformers and self-supervised pretraining on massive data have led some to believe that deep neural networks will be able to do almost everything whenever we have data and computational resources. However, this might not be the case. While neural networks are fast to exploit surface statistics, they fail miserably to generalize to novel combinations. Current neural networks do not perform deliberate reasoning – the capacity to deliberately deduce new knowledge out of the contextualized data. This tutorial reviews recent developments to extend the capacity of neural networks to “learning to reason” from data, where the task is to determine if the data entails a conclusion. This capacity opens up new ways to generate insights from data through arbitrary querying using natural languages without the need to predefine a narrow set of tasks.
Full day lectures @International University, HCM City, Vietnam, May 2019. Review of AI in 2019; outlook into the future; empirical research in AI; introduction to AI research at Deakin University
A discussion of the nature of AI/ML as an empirical science. Covering concepts in the field, how to position ourselves, how to plan for research, what are empirical methods in AI/ML, and how to build up a theory of AI.
This document provides an overview of a tutorial on machine learning and reasoning for drug discovery.
The tutorial covers several topics: molecular representation and property prediction, including fingerprints, string representations, graph representations, and self-supervised learning; protein representation and protein-drug binding; molecular optimization and generation; and knowledge graph reasoning and drug synthesis.
The introduction discusses the drug discovery pipeline and how machine learning can help with various tasks such as molecular property prediction, target identification, and reaction planning. Neural networks are well-suited for drug discovery due to their expressiveness, learnability, generalizability, and ability to handle large amounts of data.
The document discusses advanced techniques of computational intelligence for biomedical image analysis. It provides an overview of computational intelligence, which involves adaptive mechanisms like artificial neural networks, evolutionary computation, fuzzy systems, and swarm intelligence. These techniques exhibit an ability to learn or adapt to new environments. The document also discusses deep learning techniques like convolutional neural networks and recurrent neural networks that are widely used for tasks like image classification.
This document summarizes Melanie Swan's presentation on deep learning. It begins by defining key deep learning concepts and techniques, including neural networks, supervised vs. unsupervised learning, and convolutional neural networks. It then explains how deep learning works by using multiple processing layers to extract higher-level features from data and make predictions. Deep learning has various applications like image recognition and speech recognition. The presentation concludes by discussing how deep learning is inspired by concepts from physics and statistical mechanics.
This document provides an overview of deep learning, including definitions of AI, machine learning, and deep learning. It discusses neural network models like artificial neural networks, convolutional neural networks, and recurrent neural networks. The document explains key concepts in deep learning like activation functions, pooling techniques, and the inception model. It provides steps for fitting a deep learning model, including loading data, defining the model architecture, adding layers and functions, compiling, and fitting the model. Examples and visualizations are included to demonstrate how neural networks work.
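To make those fitting steps concrete, here is a minimal sketch in Keras; the synthetic data, layer sizes, and hyperparameters are illustrative assumptions rather than the document's own example.

```python
# Minimal sketch of the model-fitting steps listed above, using Keras.
# The synthetic data, layer sizes, and hyperparameters are illustrative
# assumptions, not taken from the document.
import numpy as np
from tensorflow import keras

# 1. Load data (here: a synthetic binary-classification set).
X = np.random.rand(1000, 20).astype("float32")
y = (X.sum(axis=1) > 10).astype("float32")

# 2. Define the model architecture and add layers with activation functions.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# 3. Compile: choose optimizer, loss, and metrics.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# 4. Fit the model to the data.
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```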
1. The document discusses model interpretation and techniques for interpreting machine learning models, especially deep neural networks.
2. It describes what model interpretation is, its importance and benefits, and provides examples of interpretability algorithms like dimensionality reduction, manifold learning, and visualization techniques.
3. The document aims to help make machine learning models more transparent and understandable to humans in order to build trust and improve model evaluation, debugging and feature engineering.
Part of the ongoing effort with Skater for enabling better Model Interpretation for Deep Neural Network models presented at the AI Conference.
https://conferences.oreilly.com/artificial-intelligence/ai-ny/public/schedule/detail/65118
Human in the loop: Bayesian Rules Enabling Explainable AI, Pramit Choudhary
The document provides an overview of a presentation on enabling explainable artificial intelligence through Bayesian rule lists. Some key points:
- The presentation will cover challenges with model opacity, defining interpretability, and how Bayesian rule lists can be used to build naturally interpretable models through rule extraction.
- Bayesian rule lists work well for tabular datasets and generate human-understandable "if-then-else" rules. They aim to optimize over pre-mined frequent patterns to construct an ordered set of conditional statements (a minimal sketch of such a rule list follows this list).
- There is often a tension between model performance and interpretability. Bayesian rule lists can achieve accuracy comparable to more opaque models like random forests on benchmark datasets while maintaining interpretability.
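As a concrete illustration, the sketch below shows the kind of ordered if-then-else rule list such a model produces for tabular data; the conditions and probabilities are invented for illustration, not mined from any real dataset.

```python
# Illustrative sketch of an ordered "if-then-else" rule list of the kind a
# Bayesian rule list produces. Rules and probabilities are invented.
rule_list = [
    # (condition over a feature dict, P(positive class) if the condition fires)
    (lambda r: r["age"] < 30 and r["income"] > 50_000, 0.85),
    (lambda r: r["num_defaults"] > 0,                  0.10),
]
DEFAULT_PROB = 0.50  # the trailing "else" clause

def predict_proba(record):
    """Return the probability attached to the first rule that fires."""
    for condition, prob in rule_list:
        if condition(record):
            return prob
    return DEFAULT_PROB

print(predict_proba({"age": 25, "income": 60_000, "num_defaults": 0}))  # 0.85
```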
The document is a glossary of deep learning terms from A-Z published by Re-Work, defining important concepts like artificial neural networks, convolutional neural networks, deep learning, embeddings, feedforward networks, generative adversarial networks, and more; each term includes a short definition and links to external sources for further explanation. The glossary aims to explain the key concepts underlying recent advances in deep learning that have enabled applications such as driverless cars, healthcare, and fashion recommendations.
The report covers the diverse field of neural networks, compiled from various sources into a compact yet detailed form, and follows a formal report-writing structure.
The following topics are discussed in this presentation: What is Soft Computing?
What is Hard Computing?
What are Fuzzy Logic Models?
What are Neural Networks (NN)?
What are Genetic Algorithms or Evolutionary Programming?
What is probabilistic reasoning?
Difference between fuzziness and probability
AI and Soft Computing
Future of Soft Computing
This document outlines the syllabus for an MTCSCS302 course on Soft Computing taught by Dr. Sandeep Kumar Poonia. The course covers topics including neural networks, fuzzy logic, probabilistic reasoning, and genetic algorithms. It is divided into five units: (1) neural networks, (2) fuzzy logic, (3) fuzzy arithmetic and logic, (4) neuro-fuzzy systems and applications of fuzzy logic, and (5) genetic algorithms and their applications. The goal of the course is to provide students with knowledge of soft computing fundamentals and approaches for solving complex real-world problems.
Soft computing is an approach to engineering that is inspired by nature. It includes techniques like fuzzy logic, probabilistic reasoning, evolutionary computation, neural networks, and machine learning. These techniques are useful for problems that are too complex or undefined for conventional analytical or hard computing techniques. Soft computing provides approximate solutions and can handle imprecise data. It has applications in areas like robotics, artificial intelligence, and machine translation.
INTELLIGENT MALWARE DETECTION USING EXTREME LEARNING MACHINE, IRJET Journal
This document proposes using an extreme learning machine model called ELM Net to detect malware in real-time using a combination of static, dynamic, and image processing approaches on big data. Current malware detection methods using static and dynamic analysis are time-consuming and ineffective at detecting unknown malware. The proposed approach uses deep learning architectures like convolutional neural networks to automatically extract features without requiring domain expertise. This scalable model aims to enable more effective visual detection of malware, including zero-day variants, for real-world deployments.
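For context on the core algorithm, here is a minimal sketch of an extreme learning machine classifier: hidden weights are random and fixed, and only the output weights are solved analytically by least squares. The feature dimensions and toy data are assumptions, not the paper's setup.

```python
# Minimal extreme learning machine (ELM) sketch. Hidden-layer weights are
# random and never trained; output weights come from one least-squares solve.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 32))                    # e.g. static/dynamic malware features
y = (X[:, 0] > 0.5).astype(float)            # toy binary labels

n_hidden = 100
W = rng.standard_normal((32, n_hidden))      # random, fixed input weights
b = rng.standard_normal(n_hidden)

H = np.tanh(X @ W + b)                       # random hidden-layer features
beta, *_ = np.linalg.lstsq(H, y, rcond=None) # solve output weights in one shot

y_hat = (np.tanh(X @ W + b) @ beta > 0.5).astype(float)
print("train accuracy:", (y_hat == y).mean())
```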
This document provides an overview of Mahdi Hosseini Moghaddam's background and work applying machine learning and cognitive computing for intrusion detection. It discusses his education in computer science and engineering and awards. It then outlines the goals of the presentation to discuss real-world applications of machine learning rather than scientific details. The document proceeds to discuss problems with current intrusion detection systems, introduce concepts in machine learning and cognitive computing, and describe Mahdi's methodology and architecture for a hardware-based machine learning system using a cognitive processor to enable fast intrusion detection.
This document presents a study on sign language recognition using computer vision techniques. It aims to develop a system that can identify characters and numbers in Indian Sign Language (ISL) using convolutional neural networks. ISL uses both hands to communicate unlike American Sign Language which uses a single hand. The system creates a dataset of ISL gestures and trains a CNN model on it. It then tests the ability of the trained model to accurately predict numbers from 1 to 10 and letters from A to Z when presented with new sign language inputs. The model achieves over 90% accuracy on test data, providing an effective way to translate ISL signs and bridge communication between deaf/mute and non-signing individuals.
Deep hypersphere embedding for real-time face recognition, TELKOMNIKA JOURNAL
With the advancement of the human-computer interaction capabilities of robots, computer vision surveillance systems for security have had a large impact on the research industry by helping digitalize certain security processes. Recognizing a face in computer vision involves identifying and classifying which faces belong to the same person by comparing face embedding vectors. In an organization with a large and diverse labelled dataset trained over a large number of epochs, training difficulties often arise from incompatibilities between different versions of face embeddings, leading to poor face recognition accuracy. In this paper, we design and implement a robotic vision security surveillance system incorporating a hybrid combination of MTCNN for face detection and FaceNet as the unified embedding for face recognition and clustering.
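To illustrate the embedding comparison the abstract describes, here is a minimal sketch; the 128-dimensional embedding size and the distance threshold are illustrative assumptions in the spirit of FaceNet-style pipelines.

```python
# Two faces are declared the same person when the distance between their
# embedding vectors falls below a threshold. Sizes and threshold are assumed.
import numpy as np

def same_person(emb_a, emb_b, threshold=1.0):
    """Compare two L2-normalised face embeddings by Euclidean distance."""
    return np.linalg.norm(emb_a - emb_b) < threshold

a = np.random.rand(128); a /= np.linalg.norm(a)
b = np.random.rand(128); b /= np.linalg.norm(b)
print(same_person(a, b))
```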
The document presents a project presentation on innovation in train defect resolution. It discusses developing an AI-assisted railway fault detection system using sensors, IoT, machine learning and computer vision to continuously monitor railway infrastructure for anomalies and potential faults. The system would detect issues like track cracks and obstacles to improve safety, send automated alerts to maintenance teams, and provide real-time monitoring interfaces. The presentation covers the proposed methodology using CNNs, existing problems, hardware and software requirements, work done in obstacle detection and simulation, expected benefits like reduced accidents and costs, and applications in safety and operations.
The document discusses using recurrent neural networks to detect Android malware. It proposes developing a deep learning model using LSTM or GRU networks to efficiently detect malware files. The existing approaches have limitations in detecting new malware. The proposed system would use recurrent networks to model sequential Android app data and detect malware, including new emerging types.
Deep Learning Hardware: Past, Present, & Future, Rouyun Pan
Yann LeCun gave a presentation on deep learning hardware, past, present, and future. Some key points:
- Early neural networks in the 1960s-1980s were limited by hardware and algorithms. The development of backpropagation and faster floating point hardware enabled modern deep learning.
- Convolutional neural networks achieved breakthroughs in vision tasks in the 1980s-1990s but progress slowed due to limited hardware and data.
- GPUs and large datasets like ImageNet accelerated deep learning research starting in 2012, enabling very deep convolutional networks for computer vision.
- Recent work applies deep learning to new domains like natural language processing, reinforcement learning, and graph networks.
- Future challenges include memory-augmented architectures.
Big Data Intelligence: from Correlation Discovery to Causal Reasoning, Wanjin Yu
The document discusses using sequence-to-sequence learning models for tasks like machine translation, question answering, and image captioning. It describes how recurrent neural networks like LSTMs can be used in seq2seq models to incorporate memory. Finally, it proposes that seq2seq models can be enhanced by incorporating external memory structures like knowledge bases to enable capabilities like causal reasoning for question answering.
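A minimal seq2seq sketch in PyTorch illustrating the encoder-decoder pattern described above: an LSTM encoder compresses the source sequence into a state that conditions an LSTM decoder. Vocabulary and layer sizes are illustrative assumptions.

```python
# Minimal seq2seq sketch: LSTM encoder state conditions an LSTM decoder.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab=1000, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, src, tgt):
        _, state = self.encoder(self.embed(src))           # encode source into (h, c)
        dec_out, _ = self.decoder(self.embed(tgt), state)  # decode conditioned on it
        return self.out(dec_out)                           # per-step vocabulary logits

model = Seq2Seq()
src = torch.randint(0, 1000, (2, 7))   # batch of 2 source sequences
tgt = torch.randint(0, 1000, (2, 5))   # teacher-forced target inputs
print(model(src, tgt).shape)           # torch.Size([2, 5, 1000])
```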
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif..., CSCJournals
In today's era of digitization and fast internet, many videos are uploaded to websites, so a mechanism is required to access them accurately and efficiently. Semantic concept detection achieves this task and is used in many applications like multimedia annotation, video summarization, indexing, and retrieval. Video retrieval based on semantic concepts is an efficient and challenging research area. Semantic concept detection bridges the semantic gap between low-level features extracted from key-frames or shots of a video and their high-level interpretation as semantics: it automatically assigns labels to video from a predefined vocabulary, a task treated as a supervised machine learning problem. The support vector machine (SVM) emerged as the default classifier choice for this task, but recently deep convolutional neural networks (CNNs) have shown exceptional performance in this area; CNNs, however, require large datasets for training. In this paper, we present a framework for semantic concept detection using a hybrid model of SVM and CNN. Global features like color moments, HSV histograms, wavelet transforms, grey-level co-occurrence matrices, and edge orientation histograms are selected as low-level features extracted from the annotated ground-truth video dataset of TRECVID. In a second pipeline, deep features are extracted using a pretrained CNN. The dataset is partitioned into three segments to deal with data imbalance. Two classifiers are trained separately on all segments, and a fusion of their scores detects the concepts in the test dataset. System performance is evaluated using Mean Average Precision on the multi-label dataset, and the proposed hybrid SVM-CNN framework is comparable to existing approaches.
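To make the score-fusion step concrete, here is a minimal sketch; the equal weighting is an assumption, as the paper's exact fusion rule is not given in this summary.

```python
# Late score fusion: per-concept scores from an SVM (global features) and a
# CNN (deep features) are combined, here by a weighted average (assumed rule).
import numpy as np

def fuse_scores(svm_scores, cnn_scores, w=0.5):
    """Combine two [n_samples, n_concepts] score matrices."""
    return w * svm_scores + (1 - w) * cnn_scores

svm_scores = np.random.rand(4, 10)   # scores for 4 shots, 10 concepts
cnn_scores = np.random.rand(4, 10)
fused = fuse_scores(svm_scores, cnn_scores)
detected = fused > 0.5               # concept present where the fused score is high
```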
Designing of Smart Rescue Robotics System, IRJET Journal
1. The document describes the design of a smart rescue robotics system to assist in disaster environments.
2. The rescue robot is equipped with sensors like CO2 and temperature sensors to detect living victims and map the area.
3. A learning-based semi-autonomous controller allows the robot to continuously learn and improve its performance in unknown disaster scenes while searching for victims.
FACE EXPRESSION RECOGNITION USING CONVOLUTION NEURAL NETWORK (CNN) MODELS, ijgca
This paper proposes the design of a Facial Expression Recognition (FER) system based on deep convolutional neural networks using three models. A simple solution for facial expression recognition that combines algorithms for face detection, feature extraction, and classification is discussed. The proposed method uses CNN models with an SVM classifier and evaluates three of them: the AlexNet, VGG-16, and ResNet models. Experiments are carried out on the Extended Cohn-Kanade (CK+) dataset to determine the recognition accuracy of the proposed FER system, and the accuracy of the AlexNet model is compared with the VGG-16 and ResNet models. The results show that the AlexNet model achieved the best accuracy (88.2%) compared to the other models.
MDEC Data Matters Series: machine learning and Deep Learning, A Primer, Poo Kuan Hoong
The document provides an overview of machine learning and deep learning. It discusses the history and development of neural networks, including deep belief networks, convolutional neural networks, and recurrent neural networks. Applications of deep learning in areas like computer vision, natural language processing, and robotics are also covered. Finally, popular platforms, frameworks and libraries for developing deep learning models are presented, along with examples of pre-trained models that are available.
DEEPFAKE DETECTION TECHNIQUES: A REVIEW, vivatechijri
Noteworthy advancements in the field of deep learning have led to the rise of highly realistic AI-generated fake videos, commonly known as Deepfakes. They refer to manipulated videos generated by sophisticated AI that yields forged videos and audio that appear original. Although this technology has numerous beneficial applications, there are also significant concerns about its misuse, so there is a need for a system that detects and mitigates the negative impact of these AI-generated videos on society. Videos shared through social media are of low quality, which makes their detection difficult. Many researchers have analyzed Deepfake detection using Machine Learning, Support Vector Machine, and Deep Learning based techniques such as Convolutional Neural Networks with or without LSTM. This paper analyses the various techniques used by several researchers to detect Deepfake videos.
This document discusses publishing and consuming linked sensor data. It provides motivation for representing sensor data as linked data by discussing challenges in accessing heterogeneous sensor data from different sources. It then outlines some of the key ingredients needed for linked sensor data, including ontologies to model sensor metadata and observations, guidelines for generating identifiers, and query processing engines for accessing the data. Examples of existing linked sensor data sources are also provided.
PREPROCESSING CHALLENGES FOR REAL WORLD AFFECT RECOGNITION, CSEIJJournal
Real world human affect recognition requires immediate attention, as it is a significant aspect of human-computer interaction. Audio-visual modalities can make a significant contribution by providing rich contextual information. Preprocessing is an important step in which the relevant information is extracted, and it has a crucial impact on prominent feature extraction and further processing. The main aim is to highlight the challenges in preprocessing real world data. The research focuses on experimental testing and comparative analysis of preprocessing using OpenCV, Single Shot MultiBox Detector (SSD), DLib, Multi-Task Cascaded Convolutional Neural Networks (MTCNN), and RetinaFace detectors. The comparative analysis shows that MTCNN and RetinaFace give better performance on real world data. The performance of facial affect recognition using a pre-trained CNN model is analysed with the lab-controlled dataset CK+ and the representative wild dataset AFEW. This comparative analysis demonstrates the impact of preprocessing issues on the feature engineering framework in real world affect recognition.
Feature Extraction and Analysis of Natural Language Processing for Deep Learn..., Sharmila Sathish
This document discusses using deep learning techniques for multi-modal feature extraction. It proposes a multi-modal neural network with independent sub-networks for each data mode. It also discusses using a bi-directional GRU network for English word segmentation to effectively solve long-distance dependency issues while reducing training and prediction time compared to bi-directional LSTM. Experimental results showed the proposed multi-modal fusion model can effectively extract low-dimensional fused features from original high-dimensional multi-modal data.
The document discusses RFID (radio-frequency identification) technology and its applications. It describes what RFID is, how RFID tags work, and examples of RFID being used for identification of objects, tracking objects in supply chains, access control, contactless payment, and inferring human activities through interactions with tagged objects. The document also provides an example of using an RFID reader and tags in a Java program to detect tagged objects.
This document summarizes a project on developing bioinspired algorithms for peer-to-peer (P2P) networks. The objectives were to improve scalability and fault tolerance of P2P networks using bioinspired algorithms. The project achieved its objectives through developing an evolvable agent architecture for self-organized connectivity in P2P networks and studying mechanisms for fault tolerance. It released its implementations under an open source license and had success disseminating results through publications, conferences and online networks.
Data compression, data security, and machine learning, Chris Huang
This document discusses using data compression techniques to improve machine learning models. It proposes using model compression or reduction methods to simplify deep neural network (DNN) models in order to run them on mobile devices with similar accuracy. One approach described is removing small weight connections, retraining, then using codebooks and Huffman coding to compress models by 20-49x. The document also discusses using lossless compression prior to machine learning to reduce data volume and speed up execution. Overall, the document explores how compression techniques can help make machine learning models more efficient.
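A minimal sketch of the prune-then-quantize recipe summarised above; the pruning threshold and the 16-entry codebook are illustrative assumptions (Huffman coding of the codebook indices would follow as a further step, and retraining after pruning is omitted).

```python
# Prune small weights, then quantise the survivors to a small codebook.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))          # a dense layer's weight matrix

# 1. Prune: zero out connections with small magnitude (retrain in practice).
mask = np.abs(W) > 0.5
W_pruned = W * mask

# 2. Codebook quantisation: a crude 16-level uniform codebook for illustration.
nonzero = W_pruned[mask]
codebook = np.linspace(nonzero.min(), nonzero.max(), 16)
indices = np.abs(nonzero[:, None] - codebook[None, :]).argmin(axis=1)
W_quantised = np.zeros_like(W)
W_quantised[mask] = codebook[indices]        # each surviving weight -> 4-bit index

print(f"kept {mask.mean():.0%} of weights, 16-entry codebook")
```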
Similar to Deep learning for detecting anomalies and software vulnerabilities (20)
Artificial intelligence in the post-deep learning era, Deakin University
Deep learning has recently reached the heights that pioneers in the field had aspired to, serving as the driving force behind recent breakthroughs in AI, which have arguably surpassed the Turing test. At present, the spotlight is on scaling Transformers and diffusion models on Internet-scale data. In this talk, I will provide an overview of the fundamental principles of deep learning, its powers, and limitations, and explore the new era of post-deep learning. This new era encompasses novel objectives, dynamic architectures, abstract reasoning, neurosymbolic hybrid systems, and LLM-based agent systems.
Deep learning has recently reached the height the pioneers wished for, serving as the driving force behind recent breakthroughs in AI, which have arguably surpassed the Turing test. In this tutorial, we will provide an overview of the fundamental principles of deep learning and explore the latest advances in the field, including Foundation Models. We will also examine the powers and limitations of deep learning, exploring how reasoning may emerge from carefully crafted neural networks and massively pre-trained models.
AI for automated materials discovery via learning to represent, predict, gene..., Deakin University
A brief overview of how our AI can help automate the materials discovery process, covering a wide range of problems, from drug design to crystal plasticity.
Deep learning, enabled by powerful compute and fuelled by massive data, has delivered unprecedented data analytics capabilities. However, major limitations remain. Chief among them is that deep neural networks tend to exploit the surface statistics in the data, creating short-cuts from the input to the output without any deep understanding of the data. As a result, these networks fail miserably to generalize to novel combinations: they perform shallow pattern matching but not deliberate reasoning – the capacity to deliberately deduce new knowledge out of the contextualized data. Second, machine learning is often trained to do just one task at a time, making it impossible to re-define tasks on the fly as needed in a complex operating environment. This talk presents our recent developments to extend the capacity of neural networks to remove these limitations. Our main focus is on learning to reason from data, that is, learning to determine if the data entails a conclusion. This capacity opens up new ways to generate insights from data through arbitrary querying using natural languages without the need to predefine a narrow set of tasks.
Generative AI represents a pivotal moment in computing history, opening up new opportunities for scientific discoveries. By harnessing extensive and diverse datasets, we can construct new general-purpose Foundation Models that can be fine-tuned for specific prediction and exploration tasks. This talk introduces our research program, which focuses on leveraging the power of Generative AI for materials discovery. Generative AI facilitates rapid exploration of vast materials design spaces, enabling the identification of new compounds and combinations. However, this field also presents significant challenges, such as effectively representing crystals in a compact manner and striking the right balance between utilizing known structural regions and venturing into unexplored territories. Our research delves into the development of a new kind of generative model specifically designed to search for diverse molecular/crystal regions that yield high returns, as defined by domain experts. In addition, our toolset includes Large Language Models that have been fine-tuned using materials literature and scientific knowledge. These models possess the ability to comprehend extensive volumes of materials literature, encompassing molecular string representations, mathematical equations in LaTeX, and codebases. We explore the open challenges, including effectively representing deep domain knowledge and implementing efficient querying techniques to address materials discovery problems.
This document discusses generative AI and its potential impacts. It provides an overview of generative AI capabilities like one model for all tasks, emergent behaviors, and in-context learning. Applications discussed include materials discovery, process monitoring, and battery modeling. The document outlines a vision for 2030 where generative AI becomes more general purpose and powerful, enabling new industries and economic growth while also raising risks around concentration of power, misuse, and safe and ethical development.
AI has played a limited role in the COVID-19 pandemic so far, scoring a B- according to one expert. It has helped in some areas like early warning, image-based diagnosis, and optimizing clinical trials. However, it could not demonstrate great impact in regions with complex healthcare systems and high inertia. Going forward, AI may accelerate tasks like forecasting medical resource needs, optimizing logistics, and assisting vaccine and drug discovery for future pandemics if developed with proper objectives, less reliance on historical data, and alignment with human values.
- The document discusses various approaches for applying machine learning and artificial intelligence to drug discovery.
- It describes how molecules and proteins can be represented as graphs, fingerprints, or sequences to be used as input for models (a fingerprint sketch follows this list).
- Different tasks in drug discovery like target binding prediction, generative design of new molecules, and drug repurposing are framed as questions that AI models can aim to answer.
- Techniques discussed include graph neural networks, reinforcement learning, and conditional generation using techniques like translation models.
- Several recent works applying these approaches for tasks like predicting drug-target interactions and generating synthesizable molecules are referenced.
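To ground the fingerprint representation mentioned in the list above, here is a minimal sketch using RDKit (assumed to be installed) to turn a SMILES string into a Morgan fingerprint suitable as model input.

```python
# SMILES string -> Morgan fingerprint bit vector, using RDKit.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")   # aspirin
fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)
x = np.array(fp)          # 2048-d binary feature vector for an ML model
print(x.sum(), "bits set")
```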
The document discusses using deep learning models to analyze episodic healthcare data and make predictions. It proposes:
1) Viewing healthcare processes as executable computer programs with hidden "grammars" that can be learned from observational data.
2) Modeling health dynamics as a system of state transitions where treatments shift illness states, and historical events' importance is person-specific.
3) Training models by minimizing prediction loss to forecast outcomes like readmission, mortality, and disease progression based on patients' diseases, treatments, and visits over time (see the sketch after this list).
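A minimal sketch of point (3): an LSTM reads a patient's visit sequence and is trained by minimising a prediction loss on an outcome such as readmission. The one-code-per-visit encoding and all sizes are simplifying assumptions, not the paper's architecture.

```python
# LSTM over visit sequences, trained by minimising prediction loss.
import torch
import torch.nn as nn

class ReadmissionModel(nn.Module):
    def __init__(self, n_codes=500, emb=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(n_codes, emb)   # one code per visit, for simplicity
        self.rnn = nn.LSTM(emb, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, visits):                    # visits: [batch, n_visits] code ids
        v = self.embed(visits)                    # [batch, n_visits, emb]
        _, (h, _) = self.rnn(v)
        return self.head(h[-1]).squeeze(-1)       # one readmission logit per patient

model = ReadmissionModel()
visits = torch.randint(0, 500, (8, 12))           # 8 patients, 12 visits each
labels = torch.randint(0, 2, (8,)).float()
loss = nn.functional.binary_cross_entropy_with_logits(model(visits), labels)
loss.backward()                                   # train by minimising prediction loss
```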
Deep learning for biomedical discovery and data mining II, Deakin University
(1) The document discusses deep learning techniques for analyzing biomedical data from electronic medical records (EMRs).
(2) It describes models like DeepPatient that use autoencoders to learn representations of patient records that can predict diseases (a sketch of the idea follows this list).
(3) Other models like Deepr and DeepCare use convolutional and recurrent neural networks to model temporal patterns in EMRs and predict future health risks and care trajectories.
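A minimal sketch of the DeepPatient-style autoencoder idea in point (2): compress a raw patient record vector into a low-dimensional representation that downstream disease classifiers can use. Feature sizes and data are illustrative assumptions.

```python
# Autoencoder: learn a compact patient representation by reconstruction.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(1000, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 1000))

records = torch.rand(16, 1000)          # 16 patients, 1000 raw EMR features
codes = encoder(records)                # 32-d "deep patient" representations
recon = decoder(codes)
loss = nn.functional.mse_loss(recon, records)   # train to reconstruct the input
loss.backward()
```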
This document provides an overview of recent advances in applying artificial intelligence and machine learning techniques to matters and materials. It discusses several key ideas and approaches, including:
- Using graph neural networks and message passing algorithms to model molecules as graphs and predict molecular properties (a message-passing sketch follows below).
- Generative models like variational autoencoders and generative adversarial networks to represent molecules in a continuous latent space and generate new molecular structures.
- Reinforcement learning approaches for predicting chemical reactions and planning chemical syntheses.
- Directed generation of molecular graphs using graph variational autoencoders to overcome limitations of string-based representations.
The document outlines many promising directions for using deep learning to tackle important problems in chemistry, materials science
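A minimal sketch of one message-passing step on a molecular graph, as in the first bullet above: each atom updates its feature vector from the sum of its neighbours' messages. The adjacency matrix, feature sizes, and weights are illustrative.

```python
# One message-passing layer on a toy 3-atom molecular graph.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],                  # adjacency of a 3-atom toy molecule
              [1, 0, 1],
              [0, 1, 0]])
H = rng.random((3, 8))                    # per-atom feature vectors
W_msg, W_self = rng.random((8, 8)), rng.random((8, 8))

messages = A @ (H @ W_msg)                # sum of transformed neighbour features
H_next = np.tanh(H @ W_self + messages)   # updated atom representations

graph_vector = H_next.sum(axis=0)         # readout: pool atoms to predict a property
```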
This document discusses representation learning on graphs. It begins by explaining why graph representation learning is important as graphs are pervasive in many scientific disciplines. It then discusses various techniques for graph representation learning including graph embedding methods like DeepWalk and Node2Vec, message passing neural networks, and graph generation methods like variational autoencoders. The document concludes by discussing challenges in graph reasoning to deduce knowledge from graphs in response to queries.
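To make the DeepWalk idea mentioned above concrete, here is a minimal sketch that samples truncated random walks to serve as "sentences" for a word2vec-style skip-gram model; the toy graph and walk parameters are assumptions.

```python
# DeepWalk-style random walks: each walk is a token sequence for skip-gram.
import random

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}   # adjacency lists

def random_walk(start, length=5):
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

walks = [random_walk(node) for node in graph for _ in range(10)]
# Feeding `walks` to a skip-gram trainer (e.g. gensim's Word2Vec) would
# yield an embedding vector per node.
print(walks[0])
```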
The document discusses how deep learning can be applied to genomics. It outlines several genomic problems that deep learning may be able to help with, such as gene-disease mapping, binding site identification, and sequence generation. It then provides examples of existing deep learning applications for related tasks like predicting gene expression and identifying binding sites. Overall, the document argues that deep learning is a promising approach for many genomics problems by leveraging its ability to learn from large amounts of data and discover complex patterns.
End-to-end pipeline agility - Berlin Buzzwords 2024, Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
Analysis insight about a Flyball dog competition team's performanceroli9797
Insight of my analysis about a Flyball dog competition team's last year performance. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeWalaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
The Building Blocks of QuestDB, a Time Series Databasejavier ramirez
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps just as another data type. However, when performing real-time analytics, timestamps should be first class citizens and we need rich time semantics to get the most out of our data. We also need to deal with ever growing datasets while keeping performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone over the past two years to deal with late and unordered data, non-blocking writes, read-replicas, or faster batch ingestion.
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
Learn SQL from basic queries to Advance queriesmanishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataKiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Challenges of Nation Building-1.pptx with more important
Deep learning for detecting anomalies and software vulnerabilities
1. DEEP LEARNING FOR DETECTING ANOMALIES AND SOFTWARE VULNERABILITIES
Trần Thế Truyền, Deakin University
Hanoi, Jan 17th 2017
@truyenoz | prada-research.net/~truyen | truyen.tran@deakin.edu.au
letdataspeak.blogspot.com | goo.gl/3jJ1O0
(Image source: rdn consulting)
5. OUR APPROACH TO SECURITY: (DEEP) MACHINE LEARNING
Usual detection of attacks is based on profiling and human skill
But attacking tactics change over time, creating zero-day attacks
Systems are now very complex, and no human can cover them all
→ It is best to use machines that learn continuously and automatically.
→ Humans can provide feedback for the machine to correct itself.
→ Deep learning is on the rise.
For now: it is best for human and machine to co-operate.
7. AGENDA
Part I: Introduction to deep learning
A brief history
Top 3 architectures
Unsupervised learning
Part II: Anomaly detection
Part III: Software vulnerabilities
9. DEEP LEARNING IS SUPER HOT
[Chart: search-trend curves for "deep learning" + data and "deep learning" + intelligence, as of Dec 2016]
10. DEEP LEARNING IS NEURAL NETS, BUT …
[Figure: multilayer perceptron (1986) vs. deep network (2016); image: http://blog.refu.co/wp-content/uploads/2009/05/mlp.png]
11. TWO LEARNING PARADIGMS
Supervised learning (mostly machine): A → B. Will be quickly solved for "easy" problems (Andrew Ng).
Unsupervised learning (mostly human)
12. KEY IN MACHINE LEARNING: FEATURE ENGINEERING
In typical machine learning projects, 80-90% of the effort goes into feature engineering
With the right feature representation, not much else is needed: simple linear methods often work well
Text: BOW, n-grams, POS, topics, stemming, tf-idf, etc.
Software: tokens, LOC, API calls, #loops, developer reputation, team complexity, report readability, discussion length, etc.
Try it yourself on Kaggle.com!
A minimal code sketch of such text feature engineering follows.
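As a concrete illustration of the text features above, here is a minimal scikit-learn sketch; the toy corpus, the n-gram range, and the classifier choice are illustrative assumptions, not prescriptions from the slides.

# Classic pipeline: hand-engineered text features (bag-of-words / n-grams
# with tf-idf weighting) feeding a simple linear classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["free money click here", "meeting at noon", "win a prize now"]
labels = [1, 0, 1]  # 1 = spam-like, 0 = benign (toy labels)

vectorizer = TfidfVectorizer(ngram_range=(1, 2))  # unigrams + bigrams
X = vectorizer.fit_transform(docs)                # sparse tf-idf matrix

clf = LogisticRegression().fit(X, labels)         # simple linear method
print(clf.predict(vectorizer.transform(["claim your free prize"])))

With the right tf-idf representation, a plain logistic regression is often a strong baseline, which is exactly the point of this slide.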
20. DEEP AUTOENCODER – SELF-RECONSTRUCTION OF DATA
[Diagram: raw data → encoder → representation → decoder → reconstruction; the auto-encoder acts as a feature detector]
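A minimal sketch of a deep autoencoder, assuming PyTorch (the slides do not prescribe a framework, and the layer sizes here are illustrative):

import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, n_in=784, n_hidden=32):
        super().__init__()
        # Encoder: raw data -> compact representation
        self.encoder = nn.Sequential(
            nn.Linear(n_in, 128), nn.ReLU(),
            nn.Linear(128, n_hidden), nn.ReLU(),
        )
        # Decoder: representation -> reconstruction of the input
        self.decoder = nn.Sequential(
            nn.Linear(n_hidden, 128), nn.ReLU(),
            nn.Linear(128, n_in),
        )

    def forward(self, x):
        z = self.encoder(x)     # representation (the "feature detector")
        return self.decoder(z)  # reconstruction

model = AutoEncoder()
x = torch.randn(16, 784)                     # a dummy mini-batch
loss = nn.functional.mse_loss(model(x), x)   # self-reconstruction loss

Training minimises the reconstruction loss; the learned representation z plays the role of the feature detector in the diagram.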
21. GENERATIVE MODELS
Many applications:
• Text to speech
• Simulate data that are hard to obtain/share in real life (e.g., healthcare)
• Generate meaningful sentences conditioned on some input (foreign language, image, video)
• Semi-supervised learning
• Planning
22. A FAMILY: RBM → DBN → DBM
Restricted Boltzmann Machine (~1994, 2001) → Deep Belief Net (2006) → Deep Boltzmann Machine (2009)
[Diagram: the three energy-based architectures]
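The "energy" label in the diagram refers to the RBM's energy function; in its standard textbook form (not spelled out on the slide):

E(\mathbf{v}, \mathbf{h}) = -\mathbf{a}^\top \mathbf{v} - \mathbf{b}^\top \mathbf{h} - \mathbf{v}^\top W \mathbf{h},
\qquad
P(\mathbf{v}, \mathbf{h}) = \frac{e^{-E(\mathbf{v}, \mathbf{h})}}{Z}

where v are the visible units, h the hidden units, W the weight matrix, a and b the biases, and Z the partition function.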
23. GAN: GENERATIVE ADVERSARIAL NETS (GOODFELLOW ET AL, 2014)
Yann LeCun: GAN is one of the best ideas of the past 10 years!
Instead of modelling the entire distribution of the data, a GAN learns to map ANY random distribution into the region of the data, so that no discriminator can distinguish sampled data from real data.
[Diagram: any random distribution in any space → a neural net (generator) that maps z → x → a binary discriminator, usually a neural classifier]
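Written out, the adversarial game from the cited paper is the minimax objective

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

where the generator G maps noise z to data space and the discriminator D outputs the probability that its input is real.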
24. GAN: GENERATED SAMPLES
The best-quality pictures generated thus far!
[Figure: real vs. generated samples; source: http://kvfrans.com/generative-adversial-networks-explained/]
25. DEEP LEARNING IN COGNITIVE DOMAINS
Where humans can recognise, act or answer accurately within seconds
[Images: media.npr.org, cdn.cultofmac.com, blogs.technet.microsoft.com]
26. DEEP LEARNING IN NON-COGNITIVE DOMAINS
Where humans need extensive training to do well
Domains that demand transparency & interpretability:
… healthcare
… security
… genetics, foods, water …
[Images: dbta.com, TEKsystems]
28. ANOMALY DETECTION USING UNSUPERVISED LEARNING
This work is partially supported by the Telstra-Deakin Centre of Excellence in Big Data and Machine Learning
[Image: dbta.com]
29. AGENDA
Part I: Introduction to deep learning
Part II: Anomaly detection
Multichannel
Unusual mixed-data co-occurrence
Object lifetime model
Part III: Software vulnerabilities
30. BUT – we cannot define anomaly a priori
Strategy: learn normality; anything that does not fit is abnormal
31. PROJECT: DISCOVERY IN TELSTRA SECURITY OPERATIONS
We use smart people and smart tools to discover unknown malicious or risky behaviour, to inform and protect Telstra and its customers.
[Diagram: discovery workflow]
32. SOFTWARE: SNAPSHOT
[Screenshots: (A) anomaly detection systems; (B) the main screen showing the residual signal and the threshold for anomaly detection; (C) top anomalies; (D) event details for the selected anomaly]
33. MULTICHANNEL FRAMEWORK
Detect common anomalous events that happen across multiple information channels
1. Cross-channel Autoencoder (CC-AE)
[Diagram – general framework: channels 1…N each feed their own anomaly detection, producing anomalies 1…N; these are combined in a further anomaly-detection stage that outputs the common cross-channel anomalies]
35. METHOD: CROSS-CHANNEL AUTOENCODER
1. Single-channel anomaly detection
   For each channel, model the data with an autoencoder
   Determine the anomalies by analysing the reconstruction errors
2. Augmenting the reconstruction errors
   Augment the reconstruction errors across channels
   Model the reconstruction errors with an autoencoder
3. Cross-channel anomaly detection
   Determine the cross-channel anomalies by analysing the reconstruction errors
(A code sketch of these three steps follows.)
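A minimal sketch of this three-step pipeline, assuming PyTorch; the architectures, the dummy data, and the 3-sigma threshold are our illustrative assumptions, and the training loops are omitted for brevity.

import torch
import torch.nn as nn

def make_ae(n_in, n_hidden):
    # A one-hidden-layer autoencoder: encoder followed by decoder.
    return nn.Sequential(nn.Linear(n_in, n_hidden), nn.ReLU(),
                         nn.Linear(n_hidden, n_in))

def reconstruction_errors(ae, X):
    # Per-sample mean squared reconstruction error for one channel.
    with torch.no_grad():
        return ((ae(X) - X) ** 2).mean(dim=1)

# Step 1: one autoencoder per channel (training omitted).
channels = [torch.randn(100, 50), torch.randn(100, 30)]   # dummy data
aes = [make_ae(X.shape[1], 8) for X in channels]
errs = [reconstruction_errors(ae, X) for ae, X in zip(aes, channels)]

# Step 2: augment (stack) the per-channel errors and model them
# with a second autoencoder.
E = torch.stack(errs, dim=1)        # shape: (samples, channels)
cc_ae = make_ae(E.shape[1], 2)
cc_err = reconstruction_errors(cc_ae, E)

# Step 3: flag cross-channel anomalies by thresholding the second-stage
# reconstruction error (the 3-sigma rule here is illustrative only).
anomalies = (cc_err > cc_err.mean() + 3 * cc_err.std()).nonzero()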
36. RESULTS 1 – NEWS DATA
A channel is defined to be the stream of articles about a specific topic published by a news agency, e.g. economy-related articles from BBC
3 news agencies: BBC, Reuters, and CNN
9 predefined topics, including politics, sports, health, entertainment, world-news, technology, and Asian news
Free-form text data
Feature extraction: bag-of-words representation
Anomaly injection: breastfeeding articles
38. RESULTS 2 – DEAKIN SQUID DATA
Squid is a caching and forwarding web proxy.
Each server is defined to be a channel
7 channels with inbound and outbound network data
Sample datapoint (fields numbered 1-10 on the slide):
1469032590.233 19 10.132.169.158 TCP_HIT/200 4771 GET http://Europe.cnn.com/EUROPE/potd/2001/04/17/tz.pulltizer.ap.jp - NONE/- image/jpeg
Bag-of-words representation
Anomaly injection: URLs from URLBlackList.com
40. AGENDA (recap)
47. AGENDA (recap)
48. OBJECT LIFETIME MODELS
Objects with a life: users, devices, accounts
Detect unusual behaviour at a given time given the object's history
I.e., (low) conditional probability of the next event/action/observation given the history
Two properties:
Irregular timing due to internal activities
Intervention by external agents
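In our notation (the slide states this only in words), the anomaly score of the next event given an object's history is its negative conditional log-likelihood:

\text{score}(e_{t+1}) = -\log P(e_{t+1} \mid e_1, \ldots, e_t)

with e_{t+1} flagged as unusual when the score exceeds a chosen threshold.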
49. DEEPEVOLVE: A MODEL OF AN EVOLVING BODY
States are a dynamic memory process → an LSTM moderated by time and intervention
Discrete observations → vector embedding
Time and previous intervention → "forgetting" of old states
Current intervention → controlling the current states
55. TRADITIONAL METHOD: FEATURE ENGINEERING + CLASSIFIER
Protocols
Domains, countries
IP-based analysis
Lexical analysis: n-grams, special characters
Query analysis: handling shortening & dynamically-generated queries
Blacklists
56. NEW METHOD: LEARNABLE CONVOLUTION AS FEATURE DETECTOR
Learnable kernels act as feature detectors, often many per layer
[Diagrams: colah.github.io/posts/2015-09-NN-Types-FP/, andreykurenkov.com]
57. END-TO-END MODEL OF MALICIOUS URLS
Pipeline (as in the diagram):
1. Embedding of characters into char vectors (may be one-hot)
2. Convolution – motif detection
3. Max-pooling into a record vector
4. Prediction of Safe/Unsafe with an FFN
[Diagram: the characters "h t t p : / / w w w . s" flowing through stages 1-4]
Train on 900K malicious URLs and 1,000K good URLs
Accuracy: 96%
No feature engineering!
(A code sketch of this pipeline follows.)
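A sketch of the four-stage character-level model, assuming PyTorch; the vocabulary size, embedding width, number of kernels, and kernel width are illustrative assumptions, not the configuration behind the reported 96% accuracy.

import torch
import torch.nn as nn

class UrlCNN(nn.Module):
    def __init__(self, vocab=128, emb=16, kernels=64, width=5):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)        # 1) char vectors
        self.conv = nn.Conv1d(emb, kernels, width)   # 2) motif detection
        self.fc = nn.Linear(kernels, 2)              # 4) Safe/Unsafe

    def forward(self, chars):                        # chars: (batch, len)
        x = self.embed(chars).transpose(1, 2)        # (batch, emb, len)
        x = torch.relu(self.conv(x))
        x = x.max(dim=2).values                      # 3) max-pool over time
        return self.fc(x)

model = UrlCNN()
url = "http://www.s"
chars = torch.tensor([[min(ord(c), 127) for c in url]])  # crude char ids
logits = model(chars)                                     # Safe/Unsafe scores

The whole model is trained end-to-end from raw characters, which is what "no feature engineering" means here.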
58. AGENDA
Part I: Introduction to deep learning
Part II: Anomaly detection
Part III: Software vulnerabilities
Malicious URL detection
Unusual source code
Code vulnerabilities
60. MOTIVATIONS
Software is eating the world.
IoT development is exploding; software security is an extremely critical issue
Vulnerable source files: 0.3-5%, depending on code review policy & quality of code
General approach: machine learning instead of human manual effort and programming heuristics
Many software metrics have been found: bugs, code complexity, churn rate, developer network activity metrics, fault history metrics, …
Question: can the machine learn all of these by itself?
61. APPROACH: CODE MODELING
Open source code is massive
Bad coding is often a sign of bugs and security holes
Malicious code may look different from safe code
Ideas:
A code model assigns a probability to a piece of code
Given the code context, if the per-token conditional probability of a code piece is low compared to the rest → unusual code → more likely to contain defects or security vulnerabilities
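In symbols (our formulation of the idea above, not a formula from the slide): score a code fragment w_1 … w_k in context c by its average per-token log-probability under the code model,

s(w_{1:k}) = \frac{1}{k} \sum_{i=1}^{k} \log P(w_i \mid c, w_{1:i-1})

and flag fragments with unusually low scores as candidates for defects or vulnerabilities.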
62. A DEEP LANGUAGE MODEL FOR SOFTWARE CODE (DAM ET AL, FSE'16 SE+NL)
A good language model for source code would capture the long-term dependencies
The model can be used for various prediction tasks, e.g. defect prediction, code duplication, bug localization, etc.
(Slide by Hoa Khanh Dam)
63. CHARACTERISTICS OF SOFTWARE CODE
Repetitiveness
  E.g. for (int i = 0; i < n; i++)
Localness
  E.g. for (int size may appear more often than for (int i in some source files
Rich and explicit structural information
  E.g. nested loops, inheritance hierarchies
Long-term dependencies
  try and catch (in Java), or file open and close, do not immediately follow each other
(Slide by Hoa Khanh Dam)
64. A LANGUAGE MODEL FOR SOFTWARE CODE
Given a code sequence s = <w1, …, wk>, a language model estimates the probability distribution P(s), factorised by the chain rule as

P(s) = \prod_{i=1}^{k} P(w_i \mid w_1, \ldots, w_{i-1})

(Slide by Hoa Khanh Dam)
65. TRADITIONAL MODEL: N-GRAMS
Truncates the history length to n-1 words (usually 2 to 5 in practice)
Useful and intuitive in making use of repetitive sequential patterns in code
But the context is limited to a few code elements; not sufficient for complex SE prediction tasks
As we read a piece of code, we understand each code token based on our understanding of previous code tokens, i.e. the information persists
(Slide by Hoa Khanh Dam)
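The truncation described above is the standard n-gram approximation of the chain-rule factorisation from the previous slide:

P(w_i \mid w_1, \ldots, w_{i-1}) \approx P(w_i \mid w_{i-n+1}, \ldots, w_{i-1})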
66. NEW METHOD: LONG SHORT-TERM MEMORY (LSTM)
[Diagram: LSTM cell showing the input, the input gate i_t, the forget gate f_t, and the memory cell c_t]
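For reference, the symbols in the diagram correspond to the standard LSTM update equations, given here in textbook form since the slide shows only the cell diagram:

\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}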
67. CODE LANGUAGE MODEL
Previous work has applied RNNs to model software code (White et al, MSR 2015)
RNNs, however, do not capture the long-term dependencies in code
(Slide by Hoa Khanh Dam)
68. EXPERIMENTS
Built a dataset of 10 Java projects: Ant, Batik, Cassandra, Eclipse-E4, Log4J, Lucene, Maven2, Maven3, Xalan-J, and Xerces
Comments and blank lines removed; each source code file is tokenized to produce a sequence of code tokens
Integers, real numbers, exponential notation, and hexadecimal numbers replaced with a <num> token, and constant strings replaced with a <str> token
Less "popular" tokens replaced with <unk>
Code corpus of 6,103,191 code tokens, with a vocabulary of 81,213 unique tokens
(Slide by Hoa Khanh Dam; a rough code sketch of this preprocessing follows.)
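A rough Python sketch of the preprocessing described above; the regular expressions are our approximation, not the authors' exact tokenizer.

import re

def preprocess(code: str) -> list[str]:
    code = re.sub(r'/\*.*?\*/', ' ', code, flags=re.S)        # block comments
    code = re.sub(r'//[^\n]*', ' ', code)                     # line comments
    code = re.sub(r'"(\\.|[^"\\])*"', ' <str> ', code)        # constant strings
    code = re.sub(r'\b0[xX][0-9a-fA-F]+\b', ' <num> ', code)  # hex numbers
    code = re.sub(r'\b\d+(\.\d+)?([eE][+-]?\d+)?\b', ' <num> ', code)
    # Keep the <num>/<str> sentinels whole; otherwise split into words
    # and single punctuation tokens.
    return re.findall(r'<num>|<str>|\w+|[^\s\w]', code)

print(preprocess('int x = 0xFF; String s = "hi"; // init'))
# -> ['int', 'x', '=', '<num>', ';', 'String', 's', '=', '<str>', ';']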
69. EXPERIMENTS (CONT.)
Both RNN and LSTM improve with more training data (whose size grows with sequence length)
LSTM consistently performs better than RNN: 4.7% to 27.2% improvement when varying sequence length, and 10.7% to 37.9% when varying embedding size
(Slide by Hoa Khanh Dam)
70. AGENDA (recap)
71. METHOD-1: LD-RNN FOR SEQUENCE CLASSIFICATION (CHOETKIERTIKUL ET AL, WORK IN PROGRESS)
LD = Long Deep
LSTM for document representation
Highway net with tied parameters for the vulnerability score
[Diagram: words W1…W6 (example input: "Standardize XD logging to align with …") are embedded and fed to an LSTM producing states h1…h6, which are pooled into a document representation; a recurrent highway net then regresses the vulnerability score]
72. METHOD-2: DEEP SEQUENTIAL MULTI-INSTANCE LEARNING
Code file as a bag
Methods as instances
Data are sequential
[Diagram: a file's headers and methods 1…N are read in sequence to produce a file-level vulnerability level]
73. COLUMN BUNDLE FOR N-TO-1 MAPPING (PHAM ET AL, WORK IN PROGRESS)
[Diagram: multiple inputs (Function A, Function B, …) are mapped through a shared column representation to a single output]
75. THE GOOD OLD, THE POPULAR, ON THE RISE, AND THE BLACK SHEEP
The good old:
• Representation learning (RBM, DBN, DBM, DDAE)
• Ensemble
The popular:
• Back-propagation
• Adaptive stochastic gradient
• Dropouts & batch-norm
• Rectifier linear transforms & skip-connections
• Highway nets, LSTM & CNN
On the rise:
• Differentiable Turing machines
• Memory, attention & reasoning
• Reinforcement learning & planning
• Lifelong learning
The black sheep:
• Group theory (Lie algebra, renormalisation group, spin-glass)
76. WHY DEEP LEARNING WORKS: PRINCIPLES
Expressiveness
Can represent the complexity of the world → feedforward nets are universal function approximators
Can compute anything computable → recurrent nets are Turing-complete
Learnability
Has a mechanism to learn from training signals → neural nets are highly trainable
Generalizability
Works on unseen data → deep net systems work in the wild (self-driving cars, Google Translate/Voice, AlphaGo)
77. WHEN DEEP LEARNING WORKS
Lots of data (e.g., millions of examples)
Strong, clean training signals (e.g., when humans can provide correct labels, as in cognitive domains)
Andrew Ng of Baidu: when humans do well within sub-second
Data structures are well-defined (e.g., image, speech, NLP, video)
Data is compositional (luckily, most data are like this)
The more primitive (raw) the data, the more benefit from using deep learning
78. BONUS: HOW TO POSITION
"[…] the dynamics of the game will evolve. In the long run, the right way of playing football is to position yourself intelligently and to wait for the ball to come to you. You'll need to run up and down a bit, either to respond to how the play is evolving or to get out of the way of the scrum when it looks like it might flatten you." (Neil Lawrence, 7/2015, now with Amazon)
http://inverseprobability.com/2015/07/12/Thoughts-on-ICML-2015/
79. THE ROOM IS WIDE OPEN
Architecture engineering
Non-cognitive apps
Unsupervised learning
Graphs
Learning while preserving privacy
Modelling of domain invariance
Better data efficiency
Multimodality
Learning under adversarial stress
Better optimization
Going Bayesian
http://smerity.com/articles/2016/architectures_are_the_new_feature_engineering.html