Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Ishiguro, Preferred Networks
This presentation explains basic ideas of graph neural networks (GNNs) and their common applications. Primary target audiences are students, engineers and researchers who are new to GNNs but interested in using GNNs for their projects. This is a modified version of the course material for a special lecture on Data Science at Nara Institute of Science and Technology (NAIST), given by Preferred Networks researcher Katsuhiko Ishiguro, PhD.
How does ChatGPT work: an Information Retrieval perspective - Sease
In this talk, we will explore the underlying mechanisms of ChatGPT, a large-scale language model developed by OpenAI, from the perspective of Information Retrieval (IR). We will delve into the process of training the model using massive amounts of data, the techniques used to optimize the model’s performance, and how the IR concepts such as tokenization, vectorization, and ranking are used in generating responses. We will also discuss how ChatGPT handles contextual understanding and how it leverages the power of transfer learning to generate high-quality and relevant responses. Software engineers will gain insights into how a modern conversational AI system like ChatGPT works, providing a better understanding of its strengths and limitations, and how to best integrate it into their software applications.
This abstract has been fully written by ChatGPT with the simple prompt in input <Write an abstract for a talk called “How does ChatGPT work? An Information Retrieval perspective”, the audience is software engineers>.
PyBay23: Understanding LangChain Agents and Tools with Twilio (or with SMS) - Elizabeth (Lizzie) Siegle
With LangChain, developers “chain” together different LLM components to create more advanced use cases around LLMs. Agents use LLMs to decide what actions should be taken. This talk introduces LangChain and shows what you can do with Agents, Tools, and communication APIs!
Talk given at PyBay 2023 in San Francisco, CA on Sunday, October 8, 2023
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2022/08/how-transformers-are-changing-the-direction-of-deep-learning-architectures-a-presentation-from-synopsys/
Tom Michiels, System Architect for DesignWare ARC Processors at Synopsys, presents the “How Transformers are Changing the Direction of Deep Learning Architectures” tutorial at the May 2022 Embedded Vision Summit.
The neural network architectures used in embedded real-time applications are evolving quickly. Transformers are a leading deep learning approach for natural language processing and other time-dependent, series data applications. Now, transformer-based deep learning network architectures are also being applied to vision applications with state-of-the-art results compared to CNN-based solutions.
In this presentation, Michiels introduces transformers and contrasts them with the CNNs commonly used for vision tasks today. He examines the key features of transformer model architectures, shows performance comparisons between transformers and CNNs, and concludes with insights on why Synopsys thinks transformers are an important approach for future visual perception tasks.
The transformer is an established architecture in natural language processing, built around a self-attention framework within a deep learning approach.
This presentation was delivered under the mentorship of Mr. Mukunthan Tharmakulasingam (University of Surrey, UK), as a part of the ScholarX program from Sustainable Education Foundation.
GPT-2: Language Models are Unsupervised Multitask Learners - Young Seok Kim
Review of paper
Language Models are Unsupervised Multitask Learners
(GPT-2)
by Alec Radford et al.
Paper link: https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
YouTube presentation: https://youtu.be/f5zULULWUwM
(Slides are written in English, but the presentation is done in Korean)
And then there were ... Large Language Models - Leon Dohmen
It is not often, even in the ICT world, that one witnesses a revolution. The rise of the personal computer, the rise of mobile telephony and, of course, the rise of the Internet are some of those revolutions. So what is ChatGPT really? Is ChatGPT also such a revolution? And like any revolution, does ChatGPT have its winners and losers? And who are they? How do we ensure that ChatGPT contributes to a positive impulse for "Smart Humanity"?
During keynotes on April 3 and 13, 2023, Piek Vossen explained the impact of Large Language Models like ChatGPT.
Prof. Piek Th.J.M. Vossen, PhD, is Full Professor of Computational Lexicology at the Faculty of Humanities, Department of Language, Literature and Communication (LCC) at VU Amsterdam:
What is ChatGPT? What technology and thought processes underlie it? What are its consequences? What choices are being made? In the presentation, Piek elaborates on the basic principles behind Large Language Models, how they serve as a basis for deep learning models that are fine-tuned for specific tasks, and the specific variant, GPT, that underlies ChatGPT. The talk covers what ChatGPT can and cannot do, what it is good for, and what the risks are.
Word embedding, vector space model, language modelling, neural language model, Word2Vec, GloVe, fastText, ELMo, BERT, DistilBERT, RoBERTa, SBERT, Transformer, attention
An introduction to the Transformers architecture and BERT - Suman Debnath
The transformer is one of the most popular state-of-the-art (SOTA) deep learning architectures, used mostly for natural language processing (NLP) tasks. Since its advent, it has replaced RNNs and LSTMs for many tasks. The transformer was a major breakthrough in NLP and paved the way for revolutionary new architectures such as BERT.
Zaikun Xu from the Università della Svizzera Italiana presented this deck at the 2016 Switzerland HPC Conference.
“In the past decade, deep learning, as a life-changing technology, has achieved huge success on various tasks, including image recognition, speech recognition, machine translation, etc. Pioneered by several research groups, led by Geoffrey Hinton (U Toronto), Yoshua Bengio (U Montreal), Yann LeCun (NYU), and Juergen Schmidhuber (IDSIA, Switzerland), deep learning is a renaissance of neural networks in the big data era.
A neural network is a learning algorithm consisting of an input layer, hidden layers, and an output layer, where each circle represents a neuron and each arrow connection carries a weight. A neural network learns from the difference between the output layer's output and the ground truth, by calculating the gradients of this discrepancy w.r.t. the weights and adjusting the weights accordingly. Ideally, it finds weights that map input X to target y with as low an error as possible.”
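The training loop the quote describes (forward pass, discrepancy with the ground truth, gradients w.r.t. the weights, weight update) can be sketched with a toy one-hidden-layer network in NumPy. This is a minimal illustration of the idea, not code from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = x1 + x2 from random inputs.
X = rng.normal(size=(200, 2))
y = X.sum(axis=1, keepdims=True)

# One hidden layer; each weight corresponds to an "arrow" in the diagram.
W1 = rng.normal(scale=0.5, size=(2, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros(1)

lr = 0.05
for step in range(2000):
    # Forward pass: input layer -> hidden layer -> output layer.
    h = np.tanh(X @ W1 + b1)
    out = h @ W2 + b2

    # Discrepancy between the network's output and the ground truth.
    err = out - y
    loss = (err ** 2).mean()

    # Backward pass: gradients of the loss w.r.t. each weight.
    g_out = 2 * err / len(X)
    g_W2 = h.T @ g_out
    g_b2 = g_out.sum(axis=0)
    g_h = g_out @ W2.T
    g_pre = g_h * (1 - h ** 2)          # tanh'(z) = 1 - tanh(z)^2
    g_W1 = X.T @ g_pre
    g_b1 = g_pre.sum(axis=0)

    # Adjust each weight against its gradient.
    W1 -= lr * g_W1; b1 -= lr * g_b1
    W2 -= lr * g_W2; b2 -= lr * g_b2

print(f"final loss: {loss:.4f}")
```

The loss shrinks toward zero as the weights come to map X to y, which is exactly the "adjust the weights to reduce the discrepancy" loop described above.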
Watch the video presentation: http://insidehpc.com/2016/03/deep-learning/
See more talks in the Swiss Conference Video Gallery: http://insidehpc.com/2016-swiss-hpc-conference/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review - taeseon ryu
Our 105th paper review.
Today's paper is Meta-Transfer Learning for Zero-Shot Super-Resolution, presented at CVPR 2020!
As the title suggests, it introduces meta-transfer learning for zero-shot super-resolution, which converts low-resolution images into high-resolution ones without training data. The method is based on finding general initial parameters suited to internal learning, so that a single gradient update already yields optimal performance.
A detailed review of the paper was contributed by 김선옥 of the image processing team!
https://youtu.be/lEqbXLrUlW4
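The single-gradient-update idea can be illustrated with a deliberately tiny, MAML-style sketch (not the paper's actual model): a good initialization over a family of tasks lets one gradient step fit a new task much better than the initialization alone. The 1-D task family and all names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def task_loss(theta, x, y):
    # Squared error of the 1-D linear model theta * x.
    return ((theta * x - y) ** 2).mean()

# A family of 1-D tasks y = a * x; real meta-training would optimize
# the initialization for fast adaptation. Here we fake the meta-learned
# init as the mean slope of the family, as a stand-in.
task_slopes = rng.uniform(0.5, 2.0, size=100)
theta_init = task_slopes.mean()

# A new, unseen task (in ZSSR this is the single test image, with
# LR/HR pairs generated internally by downscaling it).
a_new = 1.8
x = rng.normal(size=32)
y = a_new * x

# One gradient update, mirroring the paper's single-step adaptation.
lr = 0.3
grad = (2 * (theta_init * x - y) * x).mean()
theta_adapted = theta_init - lr * grad

before = task_loss(theta_init, x, y)
after = task_loss(theta_adapted, x, y)
print(f"loss before: {before:.3f}, after one step: {after:.3f}")
```

The point of the meta-learned initialization is precisely that this one inner-loop step suffices, instead of the many updates a random initialization would need.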
Developing an AI Chatbot with Python and TensorFlow and Applying It in Practice - Susang Kim
Introduction
Overview of AI chatbots
Chatbot ecosystem
Closed vs open domain
Rule-based vs AI
Chat IF flow and story slots
Preparing data for AI-based training
How to obtain data / word representations for training
Structuring the data / data augmentation (intent, NER)
Applying AI to natural language processing
Intent (Char-CNN) / QnA (Seq2Seq)
Named entity recognition (Bi-LSTM CRF) / ontology (graph DB)
Architecture for a chatbot service
Chatbot architecture
NLP architecture
Web service architecture
Bot builder / chatbot API
Test code for the chatbot
Problems encountered in practice and tips for solving them
Ensemble and voting / trigger / synonyms (N-gram)
Tone generator / parallel processing / response speed
Wrap-up
[Example code]
Text augmentation / slot bot / QA bot / graph DB / response generator
@PyCon Korea 2014
Thanks to NLTK, natural language processing in Python has become convenient, unless you want to analyze Korean. Can Korean be analyzed in Python? And how can we analyze documents that mix Korean, English, Chinese, and other scripts?
This talk covers basic concepts of natural language processing, then introduces NLP libraries such as NLTK along with KoNLPy, which is being developed for Korean analysis. It also shares a few tricks that are useful when analyzing Korean with Python.
http://konlpy.readthedocs.org
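Before handing text to per-language analyzers (e.g. KoNLPy for Korean, NLTK for English), a mixed-script document can first be split into script runs. A minimal pure-Python sketch of that pre-processing step, not code from the talk or from KoNLPy itself:

```python
import unicodedata
from itertools import groupby

def script_of(ch: str) -> str:
    # Classify a character by its Unicode name: Hangul, CJK, Latin, other.
    if ch.isspace():
        return "space"
    name = unicodedata.name(ch, "")
    if name.startswith("HANGUL"):
        return "hangul"
    if name.startswith("CJK"):
        return "cjk"
    if "LATIN" in name:
        return "latin"
    return "other"

def split_by_script(text: str):
    # Group consecutive same-script characters into runs; each run can
    # then be routed to the appropriate analyzer.
    return [("".join(chars), script)
            for script, chars in groupby(text, key=script_of)
            if script != "space"]

print(split_by_script("한국어 English 中文"))
```

Each run then goes to whichever tokenizer or POS tagger handles that script, which is the multi-script analysis problem the talk raises.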
6. Motivation
Goal of the paper
Use unlabeled text with unsupervised learning to solve a variety of specific tasks
“Our setup does not require these target tasks to be in the same domain as the unlabeled corpus.”
Why does this matter?
1) Domain-specific data is scarce
2) Collecting data takes a lot of time and money
3) Learning a universal representation
helps performance even when supervised learning is possible
ex) word embeddings
7. Challenges
1. Which optimization objective is effective?
- LM, Translation, Discourse Coherence
2. No consensus on how to transfer
- Modify the architecture to be task-specific (ELMo)
- Use task-aware inputs → minimize architecture changes
- Use an intricate learning scheme (ULMFiT)
- Add auxiliary learning objectives
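The "task-aware input" route, the one GPT takes, can be sketched as simple input transformations: structured tasks are serialized into a single token sequence with special delimiter tokens, so the pre-trained architecture itself stays unchanged. The token strings below are illustrative placeholders, not the model's actual vocabulary entries.

```python
# Illustrative special tokens: start, delimiter, and extraction markers.
START, DELIM, EXTRACT = "<s>", "$", "<e>"

def entailment_input(premise: str, hypothesis: str) -> str:
    # Entailment: premise and hypothesis joined by a delimiter token.
    return f"{START} {premise} {DELIM} {hypothesis} {EXTRACT}"

def similarity_inputs(a: str, b: str) -> list:
    # Similarity has no inherent ordering, so both orderings are
    # produced and their representations combined downstream.
    return [f"{START} {a} {DELIM} {b} {EXTRACT}",
            f"{START} {b} {DELIM} {a} {EXTRACT}"]

print(entailment_input("A man is sleeping.", "A person is asleep."))
```

Because every task becomes "just a token sequence", the same pre-trained transformer plus a small output head covers all of them, which is why this route minimizes architecture changes.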
15. Performance
At the time of publication, SOTA on 9 of 12 datasets
○ Ahead on NLI; presumably it also did well on other tasks, though the reviewer did not dig into them.
○ On RTE, a very small dataset (2,490 examples), the BiLSTM wins.
○ Handles long-range contexts effectively.
○ Large gains on the Corpus of Linguistic Acceptability (grammaticality judgments, checking linguistic bias).
○ Learns well on datasets from small (5.7K) to large (550K).
17. Zero-shot Behavior (w/o fine-tuning)
For sentiment analysis, the token "very" is appended to every example, and the LM's output is restricted to the words "positive" and "negative".
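That trick can be sketched as follows, with `lm_logprob` as a toy stand-in for a real language model's next-token log-probability (the word-counting heuristic inside it is purely illustrative):

```python
def lm_logprob(context: str, next_word: str) -> float:
    # Toy stand-in: a real LM would return log P(next_word | context).
    cheerful = sum(w in context.lower() for w in ("great", "love", "fun"))
    gloomy = sum(w in context.lower() for w in ("bad", "boring", "awful"))
    return cheerful - gloomy if next_word == "positive" else gloomy - cheerful

def zero_shot_sentiment(review: str) -> str:
    # Append "very" and compare the LM's scores for just two candidate
    # continuations; no fine-tuning or labels are involved.
    context = review + " very"
    scores = {w: lm_logprob(context, w) for w in ("positive", "negative")}
    return max(scores, key=scores.get)

print(zero_shot_sentiment("This movie was great, I love it"))
```

Restricting the output vocabulary to the two label words is what turns an unconditioned language model into a zero-shot classifier.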
Why is LM pre-training of a transformer effective? The authors' hypothesis: 1) the underlying generative model learns to perform many of the tasks we evaluate on in order to improve its language modeling capability, and 2) the more structured attentional memory of the transformer assists in transfer compared to LSTMs.
18. Ablation Studies
1) The auxiliary objective helps more on larger datasets.
2) Better than an LSTM.
3) Without pre-training, performance drops sharply.
Personal take: the Transformer had been out for a while, but apparently it was not easy to use well.
19. FYI
● OpenAI, worried that GPT-2 could be misused, released the model only in a limited form
● Was that an appropriate measure? (Or does it just instill fear in the public?)