Thank you for your interest in the Matsuo Laboratory.
Through this presentation, I would like to give you a brief introduction to our lab.
Matsuo Lab belongs to the Graduate School of Engineering at The University of Tokyo
and specializes in Artificial Intelligence (AI) and Web Engineering research.
In our lab, we have over 50 staff members, including 10 researchers, who are engaged in fundamental research,
and 40 engineering and business experts, who plan AI lectures, manage entrepreneurship classes, and handle collaborative research with private companies.
We also have 40 students assigned to the lab.
Our representative, Professor Matsuo, also serves as an outside director and technical advisor to companies.
He also serves as a government commissioner and association board member, contributing widely to industry, government, and academia.
Recently, he was appointed Chairman of the AI strategy roundtable for the Japanese government.
To realize our vision, we are engaged in four activities: fundamental research, education, implementation, and incubation.
We aim to nurture the technological seeds born from Fundamental Research and engage in joint research with companies.
The knowledge gained from these activities is used to develop human resources, such as student entrepreneurs.
University ventures and graduates who have learned about technology will become leaders in promoting digital transformation in the industrial world and contribute to Japanese society as a whole.
If the results are returned to academia, the next generation of technology and human resources will be created.
We hope that many pioneers will be born in this spiral of innovation.
Next, I will describe each of our activities.
In fundamental research, we conduct research and development focusing on deep learning and its applications.
Through fundamental research, we aim to understand human intelligence from an engineering perspective.
To achieve this, we are working from two perspectives: first, on algorithms for deep learning, and second, on applications of deep learning.
One of the technologies we believe is important for achieving intelligence is the world model.
A world model is a technology in which a deep learning model is trained so that AI can fill in gaps in information and predict the future based on an understanding of physical laws and the passage of time.
Learning this model would allow AI to think like a human; in other words, it would be able to predict the future from the current state and to imagine the parts of an object that cannot be seen.
For example, humans are able to predict what happens when a glass falls. We know that the glass will break, but robots and AI cannot make that prediction on the spot. To overcome this limitation, a world model gives AI and robots the ability to predict and respond to causal dynamics in the physical world.
The world model has attracted attention from institutes and companies around the world, including GAFA, and various studies have recently been conducted.
As an example, research by Google can predict how the external world will look from a new perspective based on partial observations of the external world.
Furthermore, in the research below, after the world model of the game environment is learned from video, the agent's behavior can be learned entirely within that world model.
The advantage of using a world model is that, unlike in the real world, learning can be repeated as many times as the researcher wants. In a real environment, things can break if you move them around too many times, but learning in the world model does not have this problem. These world model studies are expected to be applied to robotics and other fields.
Here I present one of our foundational studies of the world model. Existing world models did not take into account what objects exist in the environment and did not learn a separate representation for each object.
However, humans understand what each object is and can predict what will happen when that object moves. In other words, we can learn object-centric representations based on environmental information.
The object-centric world model is an area of research that attempts to achieve this in world models. In this approach, a representation corresponding to each object is acquired by learning.
The goal is to infer the appropriate representation from the image and predict the original image from it, even without annotations about the objects in the image.
We proposed separating object representations into time-dependent dynamic representations and time-independent global representations.
Dynamic representations correspond to elements such as object positions, and global representations correspond to elements such as object colors. For example, in the figure, there are two sequences with two objects each, and the positions of the objects change over time. Our proposed world model is capable of swapping only the colors of objects between the two sequences by replacing only the global representations. This result demonstrates that we can obtain global representations in object-centric world models.
We also found that this separation of representations improves the performance of future predictions by properly capturing the interactions between objects.
We are also involved in various activities to promote the world model in our research community.
For example, we hold a session on the World Model every year at JSAI, the largest AI conference in Japan.
This year, we held a workshop on the World Model at IROS, one of the top-tier robotics conferences in the world.
The world model is also applied in automated driving technology to improve its capabilities.
For example, suppose there are two obstacles in front of a car: a bicycle and a pole.
The current processing pipeline in automated driving can recognize the presence of the bicycle but not the pole.
By using a world model, the self-driving car can predict that the bicycle will move along a path that avoids the pole.
We are aiming to apply this world model to automated driving technology and other fields.
The Matsuo Lab also has a team working on the application of deep learning algorithms to robotics.
We have a robotics contest team called "TRAIL," which currently consists of first-year undergraduates.
The team is aiming to improve the accuracy of its household robots by training them to clean up a room or pick up and deliver a specified item from a shelf.
The team won first place in the Japan RoboCup and third place in the World RoboCup held in France in the summer of 2023.
In this presentation, we will show you a video of the robot working on tasks.
Playback normally until 02:00
Explanation "This is how the robot actually opens the drawers based on the technology I just described."
After opening the first drawer
Explanation "The video is long, so I'm going to skip ahead."
Skip the video to 02:59.
02:59 "The robot recognizes an object on the floor and grabs it with its arm,
and stores it in a designated place for each type of object."
Robot grabs an apple.
Explanation: "The robot recognizes and grabs the apple."
Robot grabs detergent.
Explanation: "The robot recognized and grabbed the detergent and put it in a different basket from the apple because it is detergent."
Robot grabs banana.
Explanation: "The banana is also a fruit, so the robot recognized it and put it on the same tray as the apple."
Robot grabbing other things
Explanation: "There are other types of tasks in the robotics competition as well."
Stop the video.
Explanation "As you can see in the video, the robot is still moving slowly.
Matsuo Lab is also aiming to realize more complex and smooth robot activities using the world model technology I just introduced."
We are also conducting research on large language models (LLMs).
I know many of you use ChatGPT, so I assume you are familiar with prompt engineering. Our work presented here is a pioneering study of prompt engineering.
Our research member, Takeshi Kojima, found that the prompt “Let’s think step by step” improves the answers generated by ChatGPT, especially on logical problems.
For example, with the standard prompt shown on the left, ChatGPT gave a wrong answer. With the proposed prompt, on the other hand, ChatGPT was able to think the problem through and gave the correct answer.
While we conduct research on how to utilize existing services and technologies, we also aim to develop our own.
In the summer of 2023, we released a large language model called WEBLAB-10B, which has 10 billion parameters.
At the time of its release, it was the most accurate open LLM in Japan.
Our research papers have been accepted at many top-tier international conferences, and the number is increasing every year.
In parallel, the number of researchers in Matsuo Laboratory is also increasing every year. We are also building new technologies that are completely different from traditional deep learning.
This is a list of recently accepted papers.
A wide range of research, from basic research to applied research, has been accepted at international conferences. We will continue our research toward the realization of intelligence.
Another important function of Matsuo laboratory is providing lectures.
Matsuo Lab offers more than 15 educational programs under four themes: Web Engineering, Data Science, Deep Learning, and Entrepreneurship Development.
Most of the courses can be taken not only by students of the University of Tokyo but also by students of other universities and high schools.
The number of students has been increasing since the lectures began.
The current number of students taking our open DL classes is over 5700 annually, and the total number of attendees is expected to exceed 10,000 by the end of this year.
We also provide an internship program for students who want to gain hands-on experience after taking lectures. As junior engineers, students join collaborative research projects.
The collaborative research projects are conducted with clients in various industries, such as the automotive, chemical, construction, and medical industries.
One of the collaborative projects in 2023 is a collaboration with medical institutions on early detection of Alzheimer's disease by detecting minute hemorrhages in MRI images of the brain.
We are also working with a chemical plant on a project for early detection of abnormalities in the plant and investigation of their causes.
We also have collaborative research projects with companies that are not mentioned here.
Students who participate in such internships will acquire business sense and team development skills, which are fundamental to starting their own companies.
We offer various projects to nurture and support such entrepreneurs.
One example of the entrepreneurship programs we offer is Kigyo quest.
Kigyo means "startup" in Japanese, and the program nurtures entrepreneurs through e-learning classes and on-the-job training projects.
This program is designed to help participants become AI startup entrepreneurs after the completion of this quest.
So far, graduates of Matsuo Lab have launched 19 startups, and two of them have been listed on the stock market.