The document discusses using sequence-to-sequence learning models for tasks like machine translation, question answering, and image captioning. It describes how recurrent neural networks like LSTMs can be used in seq2seq models to incorporate memory. Finally, it proposes that seq2seq models can be enhanced by incorporating external memory structures like knowledge bases to enable capabilities like causal reasoning for question answering.
Explores the type of structure learned by Convolutional Neural Networks, the applications where they're most valuable and a number of appropriate mental models for understanding deep learning.
Convolutional Neural Networks: Popular Architectures - ananth
In this presentation we look at some of the popular architectures that have been successfully used for a variety of applications. Starting from AlexNet and VGG, which showed that deep learning architectures can deliver unprecedented accuracy for image classification and localization tasks, we review more recent architectures such as ResNet, GoogLeNet (Inception) and the more recent SENet that have won ImageNet competitions.
This presentation covers convolutional neural network (CNN) design. First, the main building blocks of CNNs are introduced. Then we systematically investigate the impact of a range of recent advances in CNN architectures and learning methods on the object categorization (ILSVRC) problem. In the evaluation, the influence of the following architectural choices is tested: non-linearity (ReLU, ELU, maxout, compatibility with batch normalization), pooling variants (stochastic, max, average, mixed), network width, classifier design (convolution, fully-connected, SPP), image pre-processing, and of learning parameters: learning rate, batch size, cleanliness of the data, etc.
This is a review of paper #243 from PR12, the TensorFlow Korea paper reading group.
The paper is Designing Network Design Spaces from Facebook AI Research, better known as RegNet.
When designing a CNN, are bottleneck layers really a good idea? Do more layers always yield higher performance? When the width and height of the activation map are halved (stride 2 or pooling), the number of channels is doubled, but is that the best choice? Might it be better to have no bottleneck layer at all, is there a magic number of layers that gives the best performance, and when the activation map is halved, would tripling the channels instead of doubling them work better?
Rather than designing one good neural network, this paper is about designing a good design space: a space in which good neural networks live and from which techniques such as AutoML can find them. It proposes narrowing an almost unconstrained design space into a good one through a human-in-the-loop process. In the video below you can see which design space gave birth to RegNet, which outperforms EfficientNet, and whether any of the design choices we took for granted along the way turn out to be wrong.
Video: https://youtu.be/bnbKQRae_u4
Paper: https://arxiv.org/abs/2003.13678
Scene classification using Convolutional Neural Networks - Jayani Withanawasam - WithTheBest
Scene classification is performed using Convolutional Neural Networks (CNNs). We seek to redefine computer vision as an AI problem, understand the importance of scene classification as well as its challenges, and the difference between traditional machine learning and deep learning. Additionally, we discuss CNNs, using Caffe to implement CNNs, and important resources for improving further.
CNNs
Jayani Withanawasam
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Advanced Deep Learning - Wanjin Yu
ICME2019 Tutorial: Intelligent Image Enhancement and Restoration - From Prior Driven Model to Advanced Deep Learning Part 3: prior embedding deep super resolution
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Large Convolutional Network models have recently demonstrated impressive classification performance on the ImageNet benchmark (Krizhevsky et al., 2012). However there is no clear understanding of why they perform so well, or how they might be improved. In this paper we address both issues. We introduce a novel visualization technique that gives insight into the function of intermediate feature layers and the operation of the classifier. Used in a diagnostic role, these visualizations allow us to find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark. We also perform an ablation study to discover the performance contribution from different model layers. We show our ImageNet model generalizes well to other datasets: when the softmax classifier is retrained, it convincingly beats the current state-of-the-art results on the Caltech-101 and Caltech-256 datasets.
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
AI&BigData Lab 2016. Alexander Baev: Transfer learning - why, how and where - GeeksLab Odessa
4.6.16 AI&BigData Lab
Upcoming events: goo.gl/I2gJ4H
We discuss one of the basic practical techniques for training neural networks: pre-training, fine-tuning and transfer learning - when to apply them, which models to use, where to get them and how to adapt them.
It is roughly 30 years since AI was not only a topic for science-fiction writers but also a major research field surrounded by huge hopes and investments. The over-inflated expectations, however, ended in a crash followed by a period of absent funding and interest - the so-called AI winter. The last 3 years changed everything, again. Deep learning, a machine learning technique inspired by the human brain, crushed one benchmark after another, and tech companies like Google, Facebook and Microsoft started to invest billions in AI research. “The pace of progress in artificial general intelligence is incredible fast” (Elon Musk – CEO Tesla & SpaceX), leading to an AI that “would be either the best or the worst thing ever to happen to humanity” (Stephen Hawking – Physicist).
What sparked this new hype? How is deep learning different from previous approaches? Are the advancing AI technologies really a threat to humanity? Let’s look behind the curtain and unravel the reality. This talk will explore why Sundar Pichai (CEO Google) recently announced that “machine learning is a core transformative way by which Google is rethinking everything they are doing” and explain why "Deep Learning is probably one of the most exciting things that is happening in the computer industry” (Jen-Hsun Huang – CEO NVIDIA).
Either a new AI “winter is coming” (Ned Stark – House Stark) or this new wave of innovation might turn out to be the “last invention humans ever need to make” (Nick Bostrom – AI philosopher). Or maybe it’s just another great technology helping humans to achieve more.
Deep Learning - The Past, Present and Future of Artificial Intelligence - Lukas Masuch
In the last couple of years, deep learning techniques have transformed the world of artificial intelligence. One by one, the abilities and techniques that humans once imagined were uniquely our own have begun to fall to the onslaught of ever more powerful machines. Deep neural networks are now better than humans at tasks such as face recognition and object recognition. They’ve mastered the ancient game of Go and thrashed the best human players. “The pace of progress in artificial general intelligence is incredible fast” (Elon Musk – CEO Tesla & SpaceX) leading to an AI that “would be either the best or the worst thing ever to happen to humanity” (Stephen Hawking – Physicist).
What sparked this new hype? How is Deep Learning different from previous approaches? Let’s look behind the curtain and unravel the reality. This talk will introduce the core concept of deep learning, explore why Sundar Pichai (CEO Google) recently announced that “machine learning is a core transformative way by which Google is rethinking everything they are doing” and explain why “deep learning is probably one of the most exciting things that is happening in the computer industry“ (Jen-Hsun Huang – CEO NVIDIA).
Deep learning algorithms have drawn the attention of researchers working in the fields of computer vision, speech recognition, malware detection, pattern recognition and natural language processing. In this paper, we present an overview of deep learning techniques such as the convolutional neural network, deep belief network, autoencoder, restricted Boltzmann machine and recurrent neural network. We then survey current work applying deep learning algorithms to malware detection, give suggestions for future research with full justification, and present an experimental analysis demonstrating the importance of deep learning techniques.
The field of Artificial Intelligence (AI) has been revitalized in this decade, primarily due to the large-scale application of Deep Learning (DL) and other Machine Learning (ML) algorithms. This has been most evident in applications like computer vision, natural language processing, and game bots. However, extraordinary successes within a short period of time have also had the unintended consequence of causing a sharp difference of opinion in research and industrial communities regarding the capabilities and limitations of deep learning. A few questions you might have heard being asked (or asked yourself) include:
a. We don’t know how Deep Neural Networks make decisions, so can we trust them?
b. Can Deep Learning deal with highly non-linear continuous systems with millions of variables?
c. Can Deep Learning solve the Artificial General Intelligence problem?
The goal of this seminar is to provide a 1,000-foot view of Deep Learning and hopefully answer the questions above. The seminar will touch upon the evolution, current state of the art, and peculiarities of Deep Learning, and share thoughts on using Deep Learning as a tool for developing power system solutions.
This is an introduction to deep learning presented to Plymouth University students. The introduction explains how a neural network works; the practical section shows how to use TensorFlow for building simple models; and the case studies show how to use deep learning in real-world applications.
Image Captioning Generator using Deep Machine Learning - ijtsrd
Technology's scope has evolved into one of the most powerful tools for human development in a variety of fields. AI and machine learning have become among the most powerful tools for completing tasks quickly and accurately without human intervention. This project demonstrates how deep machine learning can be used to create a caption or a sentence for a given picture. This can be used by visually impaired persons, by automobiles for self-identification, and in various applications that need quick and easy verification. In this model, a Convolutional Neural Network (CNN) encodes the image and a Long Short-Term Memory (LSTM) network composes the corresponding meaningful sentence. The Flickr8k and Flickr30k datasets were used for training. Sreejith S P | Vijayakumar A "Image Captioning Generator using Deep Machine Learning" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-4, June 2021, URL: https://www.ijtsrd.com/papers/ijtsrd42344.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/42344/image-captioning-generator-using-deep-machine-learning/sreejith-s-p
Gives an introduction to deep learning: what you can achieve with it, its relationship with machine learning, and the technical basics of how it works. Introduces the LSTM and how it can be used for text classification, with the results obtained and practical recommendations.
Recurrent Neural Networks (RNNs) represent the reference class of deep learning models for learning from sequential data. Despite their widespread success, a major downside of RNNs and the commonly derived ‘gating’ variants (LSTM, GRU) is the high cost of the training algorithms involved. In this context, an increasingly popular alternative is the Reservoir Computing (RC) approach, which limits the training algorithm to operating only on a restricted set of (output) parameters. RC is appealing for several reasons, including its amenability to implementation on low-power edge devices, enabling adaptation and personalization in IoT and cyber-physical systems applications.
This webinar will introduce Reservoir Computing from scratch, covering all the fundamental design topics as well as good practices. It is targeted at both researchers and practitioners interested in setting up quickly trained deep learning models for sequential data.
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
1. Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Fei Wu, College of Computer Science, Zhejiang University
http://mypage.zju.edu.cn/wufei/
July 2019
2. Table of Contents
1. Correlation discovery via seq2seq learning
2. Seq2seq learning via memory model
3. From correlation discovery to causal reasoning
4. Conclusion
3. Big data intelligence: from big data to knowledge
Big data is valuable, but knowledge is more powerful.
The hierarchy: Data → Information → Knowledge → Wisdom.
1. Collect, transmit and aggregate data from various types of sources.
2. Concentrate the data to improve the density of data value.
3. Analyze the data in depth and find the knowledge.
4. From big data to knowledge and decision
From data to knowledge to decision.
5. From big data to knowledge and decision
Yueting Zhuang, Fei Wu, Chun Chen, Yunhe Pan, Challenges and Opportunities: From Big Data to Knowledge in AI 2.0, Frontiers of Information Technology & Electronic Engineering, 2017, 18(1):3-14.
7. Correlation learning in a shallow manner
Latent correlation between modalities: visual features (e.g., 500-D vectors) and acoustical features (e.g., 400-D vectors) are projected into a space where their shared structure is revealed. [Figure: example visual and acoustical feature vectors omitted.]
CCA (Canonical Correlation Analysis) and its extensions: kernel CCA, sparse CCA, sparse structure CCA; 2D-CCA, local 2D-CCA, sparse 2D-CCA, 3D-CCA.
Hotelling, H., Relations Between Two Sets of Variates, Biometrika 28 (3-4): 321-377, 1936.
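The classical CCA objective referenced on this slide can be sketched numerically. Below is a minimal NumPy implementation, assuming the linear (shallow) formulation rather than any of the listed extensions; the toy "visual" and "acoustic" views sharing one latent factor are synthetic:

```python
import numpy as np

def canonical_correlations(X, Y):
    """Classical CCA (Hotelling, 1936): the singular values of the
    whitened cross-covariance are the canonical correlations."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    def inv_sqrt(S, eps=1e-8):
        # inverse matrix square root via eigendecomposition
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w + eps)) @ V.T

    Sxx = X.T @ X / n
    Syy = Y.T @ Y / n
    Sxy = X.T @ Y / n
    return np.linalg.svd(inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy), compute_uv=False)

rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))                      # shared latent factor
X = np.hstack([z, rng.normal(size=(500, 2))])      # toy "visual" view
Y = np.hstack([z + 0.1 * rng.normal(size=(500, 1)),
               rng.normal(size=(500, 2))])         # toy "acoustic" view
corrs = canonical_correlations(X, Y)               # first value near 1
```

Because the two views share one latent factor, the leading canonical correlation is close to 1 while the rest stay near 0.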
9. The shallow model: the representation in terms of a bag of latent topics
First, Latent Dirichlet Allocation (LDA) is conducted to perform uni-modal topic modeling for images and texts respectively.
10. The shallow model: multi-modal mutual topic reinforce modeling
Second, cross-modal topics (reflecting the same semantic information) are encouraged and therefore reinforced.
Yanfei Wang, Fei Wu, Jun Song, Xi Li, Yueting Zhuang, Multi-modal Mutual Topic Reinforce Modeling for Cross-media Retrieval, ACM Multimedia (full paper), 2014.
Yueting Zhuang, Haidong Gao, Fei Wu, Siliang Tang, Yin Zhang, Zhongfei Zhang, Probabilistic Word Selection via Topic Modeling, IEEE Transactions on Knowledge and Data Engineering (accepted).
11. In order to find relevant correlations between multi-modal data, we should bridge both the semantic gap and the heterogeneity gap: relevant multi-modal data (audio, video, webpages) are mapped into a common representation space across modalities.
12. Multimodal embedding via deep learning
Learn deep representations (embeddings) of multi-modal data via deep models (i.e., CNNs or RNNs) respectively, then devise a decent loss layer to fine-tune the multi-modal deep models (back-propagating errors) so that they generate a more discriminative common space.
Cross-Media Hashing with Neural Networks, ACM Multimedia 2014.
Learning Multimodal Neural Network with Ranking Examples, ACM Multimedia 2014.
13. Compositional multimodal embedding via deep learning
Given a pair of an image and a sentence:
Isolated semantics embedding: visual objects or textual entities are bi-directionally grounded.
Compositional semantics embedding: phrases or neighboring visual objects are bi-directionally grounded.
Deep Compositional Cross-modal Learning to Rank via Local-Global Alignment, ACM Multimedia 2015 (full paper).
Multi-modal Deep Embedding via Hierarchical Grounded Compositional Semantics, IEEE Transactions on Circuits and Systems for Video Technology, 28(1):76-89, 2018.
14. This preserves not only the relevance between an image and a text, but also the relevance between visual objects and textual words.
18. The architecture of seq2seq learning: an encoder reads the input sequence (v1 … v4) and a decoder emits the output sequence (w1 … w5). This gives correlation learning via a seq2seq architecture.
19. Seq2seq learning: machine translation
The traditional pipeline: part-of-speech tagging, parsing, and semantic analysis.
Example: "Jordan likes playing basketball" is tagged as Jordan/NNP likes/VBZ playing/VBG basketball/NN, parsed into a syntax tree (S → NP VP, with the VP covering likes/VBZ, playing/VBG and basketball/NN), labeled with semantic roles (AD, V, A1), and finally translated: 乔丹 喜欢 打篮球.
20. Seq2seq learning: machine translation
An encoder reads "Jordan likes playing basketball" token by token, and a decoder generates 乔丹 喜欢 打篮球. This is data-driven learning from large bilingual corpora (the aligned source-target sentences).
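The encoder-decoder flow above can be sketched in plain NumPy. The weights below are random and untrained (a real translator learns them from a bilingual corpus) and the token ids are hypothetical; the point is only the information flow: the encoder compresses the source sentence into one state vector, from which the decoder greedily emits target tokens:

```python
import numpy as np

rng = np.random.default_rng(1)
H, E, V_src, V_tgt = 16, 8, 10, 12      # hidden, embedding, vocab sizes (toy)

# Randomly initialized parameters; training on aligned sentence pairs
# would fit these. Token ids are stand-ins for real vocabularies.
Emb_s = rng.normal(0, 0.1, (V_src, E))
Emb_t = rng.normal(0, 0.1, (V_tgt, E))
W, U = rng.normal(0, 0.1, (H, H)), rng.normal(0, 0.1, (H, E))
W_d, U_d = rng.normal(0, 0.1, (H, H)), rng.normal(0, 0.1, (H, E))
V_out = rng.normal(0, 0.1, (V_tgt, H))
BOS = 0                                  # begin-of-sentence token id

def encode(src_ids):
    h = np.zeros(H)
    for t in src_ids:                    # read the source token by token
        h = np.tanh(W @ h + U @ Emb_s[t])
    return h                             # fixed-size sentence summary

def decode(h, steps=3):
    out, prev = [], BOS
    for _ in range(steps):               # emit target tokens one by one
        h = np.tanh(W_d @ h + U_d @ Emb_t[prev])
        prev = int(np.argmax(V_out @ h)) # greedy choice of next token
        out.append(prev)
    return out

translation = decode(encode([3, 1, 4, 1]))  # hypothetical source token ids
```

With trained weights, `decode` would emit the target-language sentence; here it just demonstrates the two-stage architecture.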
21. Seq2seq learning: visual Q&A
A convolutional neural network encodes the image; given the question "what is the man doing?", the encoder-decoder produces the answer "Riding a bike".
23. Seq2seq learning: video action classification
An encoder reads the video frames and a decoder labels each frame, e.g., NO ACTION, pitching, pitching, pitching, NO ACTION.
24. Seq2seq learning: putting it together
One input, one output: image classification.
One input, many outputs: image captioning.
Many inputs, one output: sentiment analysis.
25. Seq2seq learning: putting it together
Many inputs, many outputs: machine translation; video storyline.
26. The basic unit of deep learning: cell/neuron
Biological inspiration: modeling one neuron.
a_j: activation value of unit j
w_{j,i}: weight on the link from unit j to unit i
in_i = Σ_j w_{j,i} a_j: weighted sum of the inputs to unit i
g: activation function
a_i = g(in_i): activation value of unit i
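The unit's computation is just a weighted sum passed through an activation function. A minimal sketch in NumPy (the specific weights and activations here are illustrative):

```python
import numpy as np

def unit_activation(a, w, g=np.tanh):
    """a_i = g(in_i), where in_i = sum_j w_{j,i} * a_j."""
    in_i = np.dot(w, a)   # weighted sum of the inputs to unit i
    return g(in_i)        # activation value of unit i

# two upstream activations feeding one unit
a_i = unit_activation(np.array([1.0, 0.5]), np.array([0.2, -0.4]))
```

Here the weighted sum is 0.2·1.0 + (−0.4)·0.5 = 0, so the unit's activation is tanh(0) = 0.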
27. Feed-forward neural network
Neural networks are modeled as collections of neurons connected in an acyclic graph (layer-wise organization). The most common layer type is the fully-connected layer, in which neurons between two adjacent layers are fully pairwise connected, but neurons within a single layer share no connections.
28. A Multi-Layer Perceptron (MLP) is by nature a feed-forward directed acyclic network.
An MLP consists of multiple layers and can map input data to output data via a set of nonlinear activation functions. An MLP is trained with a supervised learning technique called backpropagation.
Mapping: Input → Output
non-linear, end-to-end, differentiable, sequential
Feed-forward Neural Network
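A forward pass through such a stack of fully connected layers can be sketched as follows. This is an illustrative, untrained toy (layer sizes, tanh activations and all names are our own):

```python
import math, random

random.seed(1)

def layer(x, W, b):
    """One fully connected layer: y = tanh(W . x + b)."""
    return [math.tanh(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def mlp(x, params):
    """Feed-forward pass: apply the layers sequentially (acyclic, end-to-end)."""
    for W, b in params:
        x = layer(x, W, b)
    return x

def rand_layer(n_out, n_in):
    W = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return W, b

params = [rand_layer(5, 3), rand_layer(2, 5)]  # maps 3 inputs -> 5 hidden -> 2 outputs
y = mlp([0.2, -0.4, 0.9], params)
```

Because every operation is differentiable, gradients for backpropagation can flow through the whole composed mapping.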
29. Recurrent Neural Network: deals with a sequence of vectors x by applying a recurrence formula at every time step:
h_t = f_{W,U,V}(h_{t-1}, x_t)
where h_t is the new state, h_{t-1} is the old state, x_t is the input at time t, and f is a function with parameters W, U and V; the predicted output at each step is computed from the state.
Notice: the same function and the same set of parameters are shared at every time step.
Recurrent Neural Network
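The recurrence above, with W, U and V shared across all time steps, can be sketched directly (random, untrained weights; the sizes and names are illustrative):

```python
import math, random

random.seed(2)
H, X = 3, 2  # hidden-state and input sizes (illustrative)

W = [[random.uniform(-0.5, 0.5) for _ in range(H)] for _ in range(H)]  # on h_{t-1}
U = [[random.uniform(-0.5, 0.5) for _ in range(X)] for _ in range(H)]  # on x_t
V = [[random.uniform(-0.5, 0.5) for _ in range(H)]]                    # output map

def matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def rnn(xs):
    """h_t = tanh(W . h_{t-1} + U . x_t); y_t = V . h_t.
    The same W, U, V are reused at every time step."""
    h = [0.0] * H
    ys = []
    for x in xs:
        pre = [a + b for a, b in zip(matvec(W, h), matvec(U, x))]
        h = [math.tanh(p) for p in pre]
        ys.append(matvec(V, h))
    return ys, h

ys, h_final = rnn([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # one output per step
```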
30. Recurrent Neural Network: an RNN has recurrent connections (connections to previous time steps of the same layer).
RNNs are powerful but can get extremely complicated. Computations derived from earlier inputs are fed back into the network, which gives an RNN a kind of memory.
Standard RNNs suffer from both exploding and vanishing gradients due to their iterative nature.
Mapping: sequence input (x0…xt) → embedding vector (ht)
Recurrent Neural Network
31. The LSTM is an RNN devised to deal with the exploding and vanishing gradient problems of standard RNNs.
An LSTM hidden layer consists of a set of recurrently connected blocks, known as memory cells.
Each memory cell is connected to three multiplicative units: the input, output and forget gates.
The input to the cell is multiplied by the activation of the input gate, the output to the net is multiplied by the output gate, and the previous cell value is multiplied by the forget gate.
Sepp Hochreiter & Jürgen Schmidhuber, Long short-term memory, Neural Computation, Vol. 9(8), pp. 1735-1780, MIT Press, 1997
Long Short-Term Memory (LSTM) Model
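The three gates can be sketched for a single cell step. This is an illustrative toy, not the full Hochreiter-Schmidhuber formulation (biases and peephole connections are omitted, and all names are our own):

```python
import math, random

random.seed(3)
H, X = 2, 2  # hidden/cell size and input size (illustrative)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rand_mat(r, c):
    return [[random.uniform(-0.5, 0.5) for _ in range(c)] for _ in range(r)]

# one weight matrix per gate, acting on the concatenated [h_prev; x] vector
Wi, Wf, Wo, Wc = (rand_mat(H, H + X) for _ in range(4))

def matvec(M, v):
    return [sum(m * u for m, u in zip(row, v)) for row in M]

def lstm_step(h_prev, c_prev, x):
    z = h_prev + x                                   # [h_{t-1}; x_t]
    i = [sigmoid(v) for v in matvec(Wi, z)]          # input gate
    f = [sigmoid(v) for v in matvec(Wf, z)]          # forget gate
    o = [sigmoid(v) for v in matvec(Wo, z)]          # output gate
    c_tilde = [math.tanh(v) for v in matvec(Wc, z)]  # candidate cell value
    # forget gate scales the previous cell value; input gate scales the new input
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_tilde)]
    # output gate scales what the cell exposes to the rest of the net
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

h, c = [0.0] * H, [0.0] * H
for x in [[1.0, 0.0], [0.0, 1.0]]:
    h, c = lstm_step(h, c, x)
```

The additive update of `c` (rather than repeated multiplication by a weight matrix) is what lets gradients survive over long spans.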
33. Table of Content
1. Correlation discovery via seq2seq learning
2. Seq2seq learning via memory model
3. From correlation discovery to causal reasoning
4. Conclusion
34. "The behavior of the computer at any moment is determined by the symbols which he is observing, and his 'state of mind' at that moment." – Alan Turing
Three kinds of memory in our brain:
Sensory memory (multi-modal perception): duration < 5 sec
Working memory (intuition, inspiration and reasoning): duration < 30 sec
Long-term memory (priors and knowledge): duration 1 sec to lifelong
Baddeley, A., Working Memory, Science, 1992, 255(5044):556-559
35. From Turing Machine to Neural Turing Machine
A.M.Turing, On Computable Numbers with an Application to the Entscheidungsproblem,
Proceedings of the London Mathematical Society, Ser. 2, Vol. 42, 1937
Alex Graves, et al., Hybrid computing using a neural network with dynamic external memory, Nature 538, 471-476, 2016
Alex Graves, et al., Neural Turing Machines
Jason Weston, et al., Memory Networks, arXiv:1410.3916
From Turing Machine to Neural Turing Machine
37. seq2seq Q-A via knowledge memory
[Figure: a memory network reads a document and a question and produces an answer]
Temporal Interaction and Causation Influence in Community-based Question Answering, IEEE Transactions on
Knowledge and Data Engineering, 29(10):2304-2317,2017
Temporality-enhanced knowledge memory network for factoid question answering, Frontiers of Information
Technology & Electronic Engineering,19(1):104-115,2018
38. seq2seq Q-A via knowledge memory
40. Reasoning, Attention, Memory (RAM)
Shallow models               Deep models
Language model               Neural language model
Bayesian Learning            Bayesian deep learning
Turing Machine               Neural Turing Machine
Reinforcement Learning       Deep Reinforcement Learning
X                            Deep or Neural + X
Mission: how to effectively utilize data, priors and knowledge in a data-driven learning manner
41. Table of Content
1. Correlation discovery via seq2seq learning
2. Knowledge Discovery: from data to knowledge
3. From correlation discovery to causal reasoning
4. Conclusion
42. The rooster's crow and the rising of the sun (the rooster does not cause the sun to rise)
Correlation Does Not Mean Causation
43. Causal Inference: Simpson's Paradox
UC Berkeley gender bias
The admission figures for the fall of 1973 at UC Berkeley showed that men applying were more likely than women to be admitted.
Peter J. Bickel, Eugene A. Hammel, J. W. O'Connell, Sex bias in graduate admissions: Data from Berkeley, Science, 187(4175):398-404, 1975
When examining the individual departments, it appeared that six out of 85 departments were significantly biased against men, whereas only four were significantly biased against women. In fact, the pooled and corrected data showed a "small but statistically significant bias in favor of women." [Table: data from the six largest departments]
Women tended to apply to competitive departments with low rates of admission even among qualified applicants (such as the English Department), whereas men tended to apply to less competitive departments with high rates of admission among the qualified applicants (such as engineering and chemistry).
44. Simpson's reversal of inequalities: b/a < d/c and b′/a′ < d′/c′, yet (b + b′)/(a + a′) > (d + d′)/(c + c′)
Results of a study into a new drug, with gender being taken into account:
              Drug                            No Drug
Men           81 of 87 recovered (93%)        234 of 270 recovered (87%)
Women         192 of 263 recovered (73%)      55 of 80 recovered (69%)
Combined      273 of 350 recovered (78%)      289 of 350 recovered (83%)
In male patients, drug takers had a better recovery rate than those who went without the drug (93% vs 87%). In female patients, again, those who took the drug had a better recovery rate than nontakers (73% vs 69%). However, in the combined population, those who did not take the drug had a better recovery rate than those who did (83% vs 78%).
The data seem to say that if we know the patient's gender (male or female) we can prescribe the drug, but if the gender is unknown we should not! Obviously, that conclusion is ridiculous. If the drug helps men and women, it must help anyone; our lack of knowledge of the patient's gender cannot make the drug harmful.
Causal Inference: Simpson's Paradox
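The reversal in the drug study is easy to verify numerically; the figures below are exactly those in the table:

```python
# Recovery data: (recovered, total) per gender and treatment group
drug    = {"men": (81, 87),   "women": (192, 263)}
no_drug = {"men": (234, 270), "women": (55, 80)}

def rate(rec, tot):
    return rec / tot

# Within each gender, the drug group recovers at a higher rate...
for sex in ("men", "women"):
    assert rate(*drug[sex]) > rate(*no_drug[sex])

# ...yet in the pooled data the inequality reverses (78% vs 83%).
def pool(d):
    return (sum(r for r, _ in d.values()), sum(t for _, t in d.values()))

print(round(rate(*pool(drug)), 2), round(rate(*pool(no_drug)), 2))  # 0.78 0.83
```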
45. Simpson's reversal of inequalities: b/a < d/c and b′/a′ < d′/c′, yet (b + b′)/(a + a′) > (d + d′)/(c + c′)
Results of a study into a new drug, with gender being taken into account
In order to decide whether the drug will harm or help a patient, we first have to understand the story behind the data—the causal
mechanism that led to, or generated, the results we see.
For instance, suppose we knew an additional fact: Estrogen has a negative effect on recovery, so women are less likely to
recover than men, regardless of the drug. In addition, as we can see from the data, women are significantly more likely to take
the drug than men are. So, the reason the drug appears to be harmful overall is that, if we select a drug user at random, that
person is more likely to be a woman and hence less likely to recover than a random person who does not take the drug.
Put differently, being a woman is a common cause of both drug taking and failure to recover. Therefore, to assess the
effectiveness, we need to compare subjects of the same gender, thereby ensuring that any difference in recovery rates between
those who take the drug and those who do not is not ascribable to estrogen. This means we should consult the segregated data,
which shows us unequivocally that the drug is helpful. This matches our intuition, which tells us that the segregated data is
“more specific,” hence more informative, than the unsegregated data.
              Drug                            No Drug
Men           81 of 87 recovered (93%)        234 of 270 recovered (87%)
Women         192 of 263 recovered (73%)      55 of 80 recovered (69%)
Combined      273 of 350 recovered (78%)      289 of 350 recovered (83%)
Causal Inference: Simpson’s Paradox
46. Difference between Statistical Learning and Causal Inference
Traditional statistical inference paradigm:
Data → Inference → Joint Distribution P → Q(P) (aspects of P)
e.g., infer whether customers who bought product A would also buy product B (modeling the joint distribution of A and B): Q = P(B | A)
"The object of statistical methods is the reduction of data" (Fisher 1922).
Judea Pearl, Causality: Models, Reasoning, and Inference (second edition), Cambridge University Press, 2009
Judea Pearl, Madelyn Glymour, Nicholas P. Jewell, Causal Inference in Statistics: A Primer, John Wiley & Sons, 2016
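The product-purchase example can be made concrete: estimating Q = P(B | A) from a purchase log is just counting within the subpopulation that bought A. The log below is hypothetical and randomly generated, purely for illustration:

```python
import random

random.seed(4)

# hypothetical purchase log (probabilities are illustrative, not real data)
log = []
for _ in range(10000):
    a = random.random() < 0.5                  # did this customer buy product A?
    b = random.random() < (0.6 if a else 0.2)  # B is more likely among A-buyers
    log.append((a, b))

p_B = sum(b for _, b in log) / len(log)
bought_A = [(a, b) for a, b in log if a]
p_B_given_A = sum(b for _, b in bought_A) / len(bought_A)
print(p_B_given_A > p_B)  # True: seeing A raises our belief in B
```

Note this is purely associational: it says what buying A tells us about B, not what *making* customers buy A would do to B, which is the causal question of the next slides.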
47. From statistical inference to causal inference
Data → Inference → Joint Distribution P → (change) → Joint Distribution P′ → Q(P′) (aspects of P′)
How does P change to P′? We need a new oracle.
e.g., estimate P′(sales) if we double the price; estimate P′(cancer) if we ban smoking.
Difference between Statistical Learning and Causal Inference
48. Causal Inference hierarchy
Observational questions: what if we see A? (what is?) P(y|A)
Action questions: what if we do A? (what if?) P(y|do(A))
Counterfactual questions: what if we had done things differently? (why?) P(y′|A)
Observations: with what probability is B true if we see A.
Actions: B will be true if we do A.
Counterfactuals: B would be different if A were true.
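Applied to the Simpson's-paradox drug table from the earlier slides, the action-level question P(recovery | do(drug)) can be answered with the adjustment formula P(y|do(x)) = Σ_z P(y|x, z)·P(z), taking gender as the confounder z. The sketch below assumes gender is the only confounder, which is exactly the story told on those slides:

```python
# (recovered, total) per gender, from the Simpson's-paradox drug table
drug    = {"men": (81, 87),   "women": (192, 263)}
no_drug = {"men": (234, 270), "women": (55, 80)}

# P(z): each gender's share of the whole 700-patient population
n_total = sum(t for _, t in drug.values()) + sum(t for _, t in no_drug.values())
p_z = {z: (drug[z][1] + no_drug[z][1]) / n_total for z in ("men", "women")}

def p_do(table):
    """Adjustment formula: P(y|do(x)) = sum_z P(y|x,z) * P(z).
    Each gender's recovery rate is weighted by the gender's population share,
    not by who happened to choose the treatment."""
    return sum((table[z][0] / table[z][1]) * p_z[z] for z in p_z)

print(round(p_do(drug), 3), round(p_do(no_drug), 3))  # 0.833 0.779: the drug helps
```

Unlike the naive pooled comparison (78% vs 83%), the interventional estimate agrees with the per-gender data: taking the drug raises the recovery probability.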
49. Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution, Judea Pearl (2018)
Level 1. Association, P(y|x). Typical activity: seeing. Typical questions: What is? How would seeing X change my belief in Y? Examples: What does a symptom tell me about a disease? What does a survey tell us about the election results?
Level 2. Intervention, P(y|do(x), z). Typical activity: doing, intervening. Typical questions: What if? What if I do X? Examples: What if I take aspirin, will my headache be cured? What if we ban cigarettes?
Level 3. Counterfactuals, P(y_x|x′, y′). Typical activity: imagining, retrospection. Typical questions: Why? Was it X that caused Y? What if I had acted differently? Examples: Was it the aspirin that stopped my headache? Would Kennedy be alive had Oswald not shot him? What if I had not been smoking the past 2 years?
From correlation to causation
50. Table of Content
1. Correlation discovery via seq2seq learning
2. Knowledge Discovery
3. From correlation discovery to causal reasoning
4. Conclusion
51. China's National 15-Year AI Plan
This is the first top-level artificial intelligence plan: to develop a new generation of artificial intelligence (AI) in China over the next 15 years, with ambitious goals set up to 2030.
53. Yunhe Pan, Heading toward Artificial Intelligence 2.0, Engineering, 2016, 409-413
New Generation Artificial Intelligence: AI 2.0
Yunhe Pan, the leader of the new-generation AI plan:
President of Zhejiang University (1995-2006)
Deputy Vice President of the Chinese Academy of Engineering (2006-2015)
"…AI faces important adjustments, and scientific foundations are confronted with new breakthroughs, as AI enters a new stage: AI 2.0…"
54. The driving forces to new generation AI
In general, there are five features in the new generation of AI compared with the existing AI:
1. From rule-based logic inference to data-driven knowledge learning
[Figure: a semantic network over animals (zebra, cat, whale, Animal, Mammal) linked by "is a", "has" and "lives in" relations, e.g., a mammal has vertebra and viviparity, a cat is furry and has paws and teeth, a zebra has stripes; inference over the network yields new knowledge]
From knowledge (e.g., logic) to new knowledge
55. Machine learning extracts:
• Concepts
• Entities
• Attributes
• Relations
From big data (e.g., documents and images) to knowledge
The driving forces to new generation AI
56. 2. From single-media data processing to cross-media learning and reasoning
How to utilize data with different modalities from different sources to understand our real world becomes a great challenge.
[Figure: data sources: images, video surveillance, webpages, social media, micro messages/short texts, other sensors, …]
The driving forces to new generation AI
57. Bridging the gap between low-level features and high-level semantics
[Figure: relevant multi-modal data (audio, video, webpage) mapped into a common representation space across modalities]
The driving forces to new generation AI
Human Brain          Machine Intelligence
self-learning        learning by examples
adaptation           routine
common sense         none
intuition            logic
…                    …
The driving forces to new generation AI
3. From machine intelligence to human-machine hybrid-augmented intelligence
59. How to introduce human cognitive capabilities or human-like
cognitive models (i.e., intuition or transfer learning) into AI systems?
The driving forces to new generation AI
60. 4. From individual intelligence to collective intelligence
(crowdsourcing intelligence)
Collective intelligence can combine the strengths of humans and
computers to accomplish tasks that neither can do alone.
The driving forces to new generation AI
61. Crowd Intelligence
• Wikipedia
• City Transportation
• Online recommendation
• Smart cities
The driving forces to new generation AI
62. 5. From robots to autonomous unmanned systems
Autonomous unmanned systems are man-made systems capable of carrying out operations or management by means of advanced technologies without human intervention.
The driving forces to new generation AI
63. The main architecture of new generation AI
[Figure: fundamental AI research (AI chips, AI super-computing systems, AI software and hardware, AI cloud platforms) underpins big-data intelligence, collective intelligence, cross-media intelligence, human-machine hybrid intelligence and unmanned-systems intelligence, serving an intelligent economy and a smart society]
64. Conclusion
Traditional AI                                   Artificial General Intelligence
Focus on having knowledge and skills             Focus on acquiring knowledge and skills
Action acquisition via programming               Ability acquisition via learning
Domain-specific ability via rule-based           General ability via abstraction (intuition)
and exemplar-based methods                       and context (common sense)
Learning by data and rules                       Learning to learn
65. The Next Generation AI
integrates data-driven machine learning approaches (bottom-up) with knowledge-guided methods (top-down);
employs data with different modalities (e.g., visual, auditory and natural language) to perform cross-media learning and inference;
moves from the pursuit of an intelligent machine to hybrid-augmented intelligence (i.e., high-level man-machine collaboration and fusion);
a more explainable, robust, open and general AI.
66. The convergence of AI learning models
Learning from rules: logic inference (formal methods)
Learning from data: discovery of hidden patterns (statistical methods)
Learning from experience: exploration and exploitation (control theory)
Fusion models?
67. The convergence of AI, Neuroscience and domain-specific applications
AI, Neuroscience, Domain-specific application
How can they borrow strength from, and collaborate with, each other?