Kusk Object Dataset: Recording Access
to Objects in Food Preparation
Atsushi Hashimoto, Masaaki Iiyama, Shinsuke Mori,
Michihiko Minoh
Kyoto University
http://kusk.mm.media.kyoto-u.ac.jp/en/
Computer Vision (CV) meets
Natural Language Processing (NLP)
• CV-NLP collaboration is an active field.
– Supported by mature machine learning technology.
– Cooking Media can be a good practice field!
• Long text (Recipe) and organized activity (Cooking)
[Figure: CV and NLP between the real world and its descriptions. Labels: Real World; Vision (CV): video observation/instruction, Recog.; Machine-Readable Description (BN/DNN); Language (NLP): Recog./Parse, Retrieve, Text Generation; Human-Friendly Description.]
Grand goal: Comparing Recipe and Human
Actions
• From a viewpoint of computer engineering…
– Recipe: A kind of script language
– Human Actions: An execution of the script by a human
• Potential Applications
– Automatic Cooking, Online recipe navigation
– Cooking Record for Healthcare, Recipe Generation
Pascal Sentence Dataset
http://vision.cs.uiuc.edu/pascal-sentences/
Cyrus Rashtchian, Peter Young, Micah Hodosh, and Julia Hockenmaier. “Collecting
Image Annotations Using Amazon's Mechanical Turk”.
In Proc. of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with
Amazon's Mechanical Turk.
• One jet lands at an airport while another
takes off next to it.
• Two airplanes parked in an airport.
• Two jets taxi past each other.
• Two parked jet airplanes facing opposite
directions.
• two passenger planes on a grassy plain
Pascal Sentence Dataset: captions and images
- Images are obtained from the Pascal Dataset
- Captions are annotated via Amazon Mechanical Turk
CV/NLP Datasets in CEA fields
• NLP
– Cooking Ontology (CEA2014, Japanese)
– Cookpad/Rakuten Recipe (2015, Japanese)
• CV
– TUM Kitchen Data Set (2009)
– CMU Multi-Modal Activity Database (2009)
– Actions for Cooking Eggs Dataset (2012)
– MPII Cooking Activities Dataset (2012)
– 50 Salads dataset (2013)
– The Breakfast Actions Dataset (2014)
• CV x NLP
– Yummly API
– Flow Graph Corpus (2014) × KUSK Dataset (CEA2014)
KUSK Dataset x Flow Graph Corpus
KUSK Dataset (Hashimoto,CEA2014) Flow Graph Corpus (Mori, 2014)
Water Flow Sensors
Eye Tracker
Touch Display
Electric Consumption Sensors
Load Sensing Tables
20 recipes, which are shared with flow-graph corpus
60 observations by 33 subjects.
The list of 20 recipes
CookPad ID KUSK ID Title of Recipe (original title is in Japanese)
00121196 2014RC01 Chicken and Chinese cabbage starchy soup
00180223 2014RC02 Tomato soup - Japanese style
00196551 2014RC03 Omelets
00162433 2014RC04 Mother’s chicken salad
00201826 2014RC05 Batter-less Fried croquette
00200883 2014RC06 Beef and mushrooms - Korean style
00176550 2014RC07 Saute of Shiitake and Shimeji Mushrooms
00202059 2014RC08 Potato salad with fresh potatoes
00171343 2014RC09 Celery leaves soup
00148537 2014RC10 Cooked Tomato with Chicken and Soy beans
00185809 2014RC11 Fried broccoli with chicken
00196431 2014RC12 Spicy cooked beans with chicken
00157755 2014RC13 Black sesame-crusted fried chicken
00192913 2014RC14 Zestily flavored fried eggplants
00195151 2014RC15 Meat miso wrap
00187900 2014RC16 Simmered Chinese cabbage
00155229 2014RC17 Chinese style open tofu omelet
00193642 2014RC18 Aglio e olio peperoncino
00182653 2014RC19 Radish cake
00168029 2014RC20 Noshidori
* a certain complexity
* common ingredients
KUSK Object Dataset (expansion from CEA2014)
• Provide object recognition results in KUSK Dataset
Videos
– A baseline for CV research
– Real image processing results as an input for NLP
• Resources: grabbed/released objects
– object class name, timestamp, region (rectangle)
– Informative for predicting the forthcoming cooking process (*)
• Statistics
– 4391 unique images
– 133 categories in total (each recipe has a different category set)
* A. Hashimoto et al, “Intention-Sensing Recipe Guidance via User Accessing to Objects,”
International Journal of Human-Computer Interaction, 2016
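The resource description above (object class name, timestamp, rectangular region, for grabbed/released objects) can be pictured as one record per access event. The following is only a minimal sketch: the field names, the time unit, the rectangle layout, and the helper `objects_in_hand` are hypothetical and are not the dataset's actual file format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AccessEvent:
    """One grab/release event. All field names are hypothetical."""
    object_name: str                  # object class name, e.g. "knife"
    action: str                       # "taken" (grabbed) or "put" (released)
    timestamp: float                  # assumed: seconds from the start of the video
    rect: Tuple[int, int, int, int]   # assumed layout: (x, y, width, height) in pixels


def objects_in_hand(events: List[AccessEvent], at_time: float) -> List[str]:
    """Return names of objects taken but not yet put back by `at_time`."""
    held: List[str] = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.timestamp > at_time:
            break
        if ev.action == "taken":
            held.append(ev.object_name)
        elif ev.action == "put" and ev.object_name in held:
            held.remove(ev.object_name)
    return held


# Toy events, invented for illustration only.
events = [
    AccessEvent("knife", "taken", 12.0, (100, 80, 40, 120)),
    AccessEvent("cabbage", "taken", 15.5, (300, 200, 150, 140)),
    AccessEvent("knife", "put", 42.0, (110, 85, 40, 120)),
]
print(objects_in_hand(events, 20.0))
```

A history in this form is what makes the access log usable for predicting the next cooking step: the set of objects currently in hand is a simple function of the events seen so far.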
Obtained images (a select review)
Ingredients: Cauliflowers, Garlics, Tofu, Enoki dake mushrooms, Cabbages, Pasta
Utensils: Chop Sticks, Bowls, Colander, Chop. Board, Knife
Seasonings: soup stock powder, ketchup, Pepper
Backgrounds: Stem of food, Dish detergent, Sponge, Corner Trash Bag
Semi-automated Annotation (1/2)
• 3 manual tasks for annotation
1. Correct errors in object region extraction by a
method from our previous research (Hashimoto, 2012)
2. List the object names appearing in each recipe
3. Add names (from 2.) to each region (from 1.)
Treatment of orthographic variants in the 2nd task:
> Cooking Ontology (Nanba, CEA2014)
– We manually treated items that are not listed in the ontology.
Semi-automated Annotation (2/2)
• Workers: students who do not major in informatics
– # of workers: more than 20 students
– term: two months at maximum for each worker
– selection: cooked more than once a week in the last
half year
• Interface: GUI working on Google Chrome
– Most workers are used to operating the browser.
– double-check (reject if two annotators answered
differently)
• A rejected annotation is meta-reviewed by another worker.
– Check and advice by the authors if necessary.
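The double-check rule above can be sketched as follows. This is an illustrative assumption about the workflow, not the authors' annotation tool: the function names and the `meta_review` callback, which stands in for the third worker, are hypothetical.

```python
from typing import Callable, Optional


def double_check(label_a: str, label_b: str) -> Optional[str]:
    """Accept the label when two annotators agree; None signals rejection."""
    return label_a if label_a == label_b else None


def resolve(label_a: str, label_b: str,
            meta_review: Callable[[str, str], str]) -> str:
    """Apply the double-check, then fall back to a meta-review by another worker."""
    accepted = double_check(label_a, label_b)
    if accepted is not None:
        return accepted
    # Rejected: a different worker (hypothetical callback) decides.
    return meta_review(label_a, label_b)


# Toy meta-reviewer that always prefers the first answer (illustration only).
prefer_first = lambda a, b: a
print(resolve("bowl", "bowl", prefer_first))
print(resolve("bowl", "colander", prefer_first))
```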
Object Feature and Recognition Result
• Feature: Output of the last layer of ResNet (*)
• ResNet: the best CNN model in 2015 competitions
• Not fine-tuned
(ResNet training does not run in public CNN libraries)
• Classifier: Linear SVM (trained for each recipe)
• Assumption: the recipe is known, and therefore so is its set of objects.
(*) Kaiming He et al., “Deep Residual Learning for Image Recognition”,
arXiv preprint arXiv:1512.03385, 2015
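To illustrate the "recipe is known" assumption, the sketch below scores a feature vector only against the categories of the given recipe and ranks them by a linear decision value. Training is omitted. The one-vs-rest layout, the three-dimensional features, the weights, and the category set listed for 2014RC03 are toy assumptions; in the slides the features are ResNet outputs and the classifier is a trained linear SVM.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: recipe ID -> category -> (weights, bias) of a linear scorer.
LinearModel = Tuple[Sequence[float], float]


def linear_score(model: LinearModel, feature: Sequence[float]) -> float:
    """Decision value w . x + b of one linear scorer."""
    weights, bias = model
    return sum(w * x for w, x in zip(weights, feature)) + bias


def rank_categories(recipe_models: Dict[str, Dict[str, LinearModel]],
                    recipe_id: str, feature: Sequence[float]) -> List[str]:
    """Rank only the categories of the known recipe, best score first."""
    models = recipe_models[recipe_id]
    return sorted(models, key=lambda c: linear_score(models[c], feature),
                  reverse=True)


# Toy values, invented for illustration only.
recipe_models = {
    "2014RC03": {
        "egg": ([1.0, 0.0, -0.5], 0.0),
        "bowl": ([0.0, 1.0, 0.0], -0.2),
        "chopsticks": ([-0.5, 0.0, 1.0], 0.1),
    },
}
feature = [0.1, 0.9, 0.2]
print(rank_categories(recipe_models, "2014RC03", feature))
```

Keeping a separate model table per recipe means a region is never confused with objects that cannot appear in that recipe, which is the practical effect of the assumption.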
Object Recognition Accuracy
[Bar chart: recognition accuracy (0% to 100%) for each recipe, 2014RC01 to 2014RC20, and in total]
Evaluation by CMC curve
[Plot: CMC curves. x-axis: Rank (1 to 14); y-axis: Acc.; series: All Cat., Ingredients, Seasonings, Utensils]
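For readers unfamiliar with the CMC (cumulative match characteristic) curve, a minimal sketch of how one is computed from ranked predictions follows. The function name and the toy rankings are assumptions for illustration; they are not results from the dataset.

```python
from typing import List, Sequence


def cmc_curve(ranked_predictions: Sequence[Sequence[str]],
              true_labels: Sequence[str], max_rank: int) -> List[float]:
    """Fraction of samples whose true label appears within the top k, for k = 1..max_rank."""
    n = len(true_labels)
    if n == 0:
        raise ValueError("at least one sample is required")
    hits = [0] * max_rank
    for ranking, truth in zip(ranked_predictions, true_labels):
        top = list(ranking[:max_rank])
        if truth in top:
            hits[top.index(truth)] += 1   # rank at which the true label is found
    curve, total = [], 0
    for h in hits:
        total += h                        # accumulate matches up to this rank
        curve.append(total / n)
    return curve


# Toy rankings (best guess first), invented for illustration only.
rankings = [
    ["bowl", "knife", "egg"],
    ["knife", "bowl", "egg"],
    ["egg", "knife", "bowl"],
    ["knife", "egg", "bowl"],
]
truths = ["bowl", "bowl", "bowl", "egg"]
print(cmc_curve(rankings, truths, 3))  # [0.25, 0.75, 1.0]
```

The value at rank 1 is the ordinary accuracy; later ranks show how often the correct category is at least among the top candidates.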
Discussion
• Difficulty in food recognition
– Variations: wrapped? cut? and others (eggs change
appearance extremely)
• Relatively easy to recognize utensils and
seasonings:
– Every kitchen has limited variations.
(an environment-adaptive system is promising)
• Possibility of an R-CNN approach
– To deal with failures in object region extraction.
Conclusion
• KUSK Dataset x Flow Graph Corpus
– We hope it becomes a base dataset for CV x NLP research
– problem: texts (and dishes) are Japanese.
• A dataset from Yummly is available for English speakers.
• KUSK Object Dataset ⊂ KUSK Dataset
– History of user accessing objects in cooking
• Contains important information to predict forthcoming process.
• Organized by object name, put/taken label, timestamp, and rect.
• Features from ResNet and Recognition Results by Linear SVM
Future works
• Collaborative research with NLP team in Kyoto Univ.
– CV2NLP: Vision-assisted NLP, Recipe Text Generation
– NLP2CV: Scenario-guided CV + PR
To get KUSK Object Dataset, please do not
hesitate to contact us.
Mail: a_hasimoto@mm.media.kyoto-u.ac.jp
Twitter: @a_hasimoto or Facebook, Researchgate…
Original KUSK Dataset and old version of KUSK Object Dataset:
http://kusk.mm.media.kyoto-u.ac.jp/

Kusk Object Dataset: Recording Access to Objects in Food Preparation

  • 1. Kusk Object Dataset: Recording Access to Objects in Food Preparation Atsushi Hashimoto, Masaaki Iiyama, Shinsuke Mori, Michihiko Minoh Kyoto University http://kusk.mm.media.kyoto-u.ac.jp/en/
  • 2. Computer Vision (CV) meets Natural Language Processing (NLP) • CV-NLP collaboration is an active field. – Supported by Matured Machine Learning Tech. – Cooking Media can be a good practice field! • Long text (Recipe) and organized activity (Cooking) Video observation /instruction Machine- Readable Description (BN/DNN) Recog. Text Generation CV NLP Recog./ParseRetrieve Human- Friendly Description Vision Language Real WorldReal World
  • 3. Grand goal: Comparing Recipe and Human Actions • From a viewpoint of computer engineering… – Recipe: A kind of script language – Human Actions: An execution of the script by human • Potential Applications – Automatic Cooking, Online recipe navigation – Cooking Record for Healthcare , Recipe Generation
  • 4. Pascal Sentence Dataset http://vision.cs.uiuc.edu/pascal-sentences/ Cyrus Rashtchian, Peter Young, Micah Hodosh, and Julia Hockenmaier. “Collecting Image Annotations Using Amazon's Mechanical Turk”. In Proc. of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk. • One jet lands at an airport while another takes off next to it. • Two airplanes parked in an airport. • Two jets taxi past each other. • Two parked jet airplanes facing opposite directions. • two passenger planes on a grassy plain Pascal Sentence Dataset: captions and images - Images are obtained from Pascal Dataset Captions are annotated by Amazon Mechanical Turk
  • 5. CV/NLP Datasets in CEA fields • NLP – Cooking Ontology (CEA2014, Japanese) – Cookpad/Rakuten Recipe (2015, Japanese) • CV – TUM Kitchen Data Set (2009) – CMU Multi-Modal Activity Database (2009) – Actions for Cooking Eggs Dataset (2012) – MPII Cooking Activities Dataset (2012) – 50 Salads dataset (2013) – The Breakfast Actions Dataset (2014) • CV x NLP – Yummly API – Flow Graph Corpus (2014) × KUSK Dataset (CEA2014)
  • 6. KUSK Dataset x Flow Graph Corpus • KUSK Dataset (Hashimoto, CEA2014) • Flow Graph Corpus (Mori, 2014) [Figure labels: water flow sensors, eye tracker, touch display, electric consumption sensors, load sensing tables] • 20 recipes, which are shared with the Flow Graph Corpus • 60 observations by 33 subjects
  • 7. The list of 20 recipes (original titles are in Japanese)
      CookPad ID  KUSK ID   Title of recipe
      00121196    2014RC01  Chicken and Chinese cabbage starchy soup
      00180223    2014RC02  Tomato soup - Japanese style
      00196551    2014RC03  Omelets
      00162433    2014RC04  Mother’s chicken salad
      00201826    2014RC05  Batter-less Fried croquette
      00200883    2014RC06  Beef and mushrooms - Korean style
      00176550    2014RC07  Saute of Shiitake and Shimeji Mushrooms
      00202059    2014RC08  Potato salad with fresh potatoes
      00171343    2014RC09  Celery leaves soup
      00148537    2014RC10  Cooked Tomato with Chicken and Soy beans
      00185809    2014RC11  Fried broccoli with chicken
      00196431    2014RC12  Spicy cooked beans with chicken
      00157755    2014RC13  Black sesame-crusted fried chicken
      00192913    2014RC14  Zestily flavored fried eggplants
      00195151    2014RC15  Meat miso wrap
      00187900    2014RC16  Simmered Chinese cabbage
      00155229    2014RC17  Chinese style open tofu omelet
      00193642    2014RC18  Aglio e olio peperoncino
      00182653    2014RC19  Radish cake
      00168029    2014RC20  Noshidori
      * a certain complexity
      * common ingredients
  • 8. KUSK Object Dataset (expansion from CEA2014) • Provides object recognition results for the KUSK Dataset videos – A baseline for CV research – Real image processing results as an input for NLP • Resources: grabbed/released objects – object class name, timestamp, region (rectangle) – Informative for predicting the forthcoming cooking process (*) • Statistics – 4391 unique images – 133 categories in total (each recipe has a different category set) *) A. Hashimoto et al., “Intention-Sensing Recipe Guidance via User Accessing to Objects,” International Journal of Human-Computer Interaction, 2016
  • 9. Obtained images (a selection) • Ingredients: cauliflowers, garlic, tofu, enokitake mushrooms, cabbages, pasta • Utensils: chopsticks, bowls, colander, chopping board, knife • Seasonings: soup stock powder, ketchup, pepper • Backgrounds: stem of food, dish detergent, sponge, corner trash bag
  • 10. Semi-automated Annotation (1/2) • 3 manual tasks for annotation 1. Correcting errors in object region extraction produced by a method from our previous research (Hashimoto, 2012) 2. Listing the object names that appear in each recipe 3. Adding names (from 2.) to each region (from 1.) • Treatment of orthographic variants in the 2nd task: Cooking Ontology (Nanba, CEA2014) – We manually treated items that are not listed in the ontology.
  • 11. Semi-automated Annotation (2/2) • Workers: students not majoring in informatics – # of workers: more than 20 students – term: two months at maximum for each worker – selection: has cooked more than once a week over the last half year • Interface: GUI working on Google Chrome – Most workers are used to operating the browser. – double-check (reject if two annotators answered differently) • a rejected annotation is meta-reviewed by another worker. – Check and advice by the authors if necessary.
  • 12. Object Feature and Recognition Result • Feature: output of the last layer of ResNet (*) • ResNet: the best CNN model in 2015 competitions • Not fine-tuned (ResNet training does not run in public CNN libraries) • Classifier: linear SVM (trained for each recipe) • Assumption: the recipe is known, and thereby the objects too. *) Kaiming He et al., “Deep Residual Learning for Image Recognition,” arXiv preprint arXiv:1512.03385, 2015
  • 14. Evaluation by CMC curve [Plot: accuracy (y-axis) versus rank 1 to 14 (x-axis), with one curve each for all categories, ingredients, seasonings, and utensils]
  • 15. Discussion • Difficulty in food recognition – Variations: wrapped? cut? and others (eggs change appearance extremely) • Utensils and seasonings are relatively easy to recognize: – Every kitchen has limited variations. (an environment-adaptive system is promising) • Possibility of an R-CNN approach – To deal with failures in object region extraction.
  • 16. Conclusion • KUSK Dataset x Flow Graph Corpus – We hope it becomes a base dataset for CV x NLP research – problem: texts (and dishes) are Japanese. • A dataset from Yummly is available for English speakers. • KUSK Object Dataset ⊂ KUSK Dataset – History of user access to objects in cooking • Contains important information for predicting the forthcoming process. • Organized by object name, put/taken label, timestamp, and rectangle. • Features from ResNet and recognition results by linear SVM
  • 17. Future work • Collaborative research with the NLP team at Kyoto Univ. – CV2NLP: Vision-assisted NLP, Recipe Text Generation – NLP2CV: Scenario-guided CV + PR • Original KUSK Dataset and old version of KUSK Object Dataset: http://kusk.mm.media.kyoto-u.ac.jp/ • To get the KUSK Object Dataset, please do not hesitate to contact us. Mail: a_hasimoto@mm.media.kyoto-u.ac.jp Twitter: @a_hasimoto or Facebook, Researchgate…
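The CMC (cumulative match characteristic) evaluation on slide 14 reports, for each rank k, the fraction of test images whose correct category appears among the top k candidates. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name `cmc_curve`, the input layout (one ranked label list per image, for example obtained by sorting per-recipe classifier scores), and the toy labels are all assumptions made for illustration.

```python
def cmc_curve(ranked_predictions, true_labels, max_rank):
    """Return [acc@1, acc@2, ..., acc@max_rank] (hypothetical helper).

    ranked_predictions: one list per image, labels ordered from most
                        to least likely (assumed to come from sorting
                        classifier scores).
    true_labels:        the correct label for each image.
    """
    if len(ranked_predictions) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")

    # hits[k] counts images whose correct label sits exactly at rank k+1.
    hits = [0] * max_rank
    for ranking, truth in zip(ranked_predictions, true_labels):
        if truth in ranking[:max_rank]:
            hits[ranking.index(truth)] += 1

    # Accumulate so that entry k is the share of images matched within
    # the top k+1 candidates.
    curve = []
    cumulative = 0
    for count in hits:
        cumulative += count
        curve.append(cumulative / len(true_labels))
    return curve


# Toy example with three made-up categories and four images.
rankings = [
    ["knife", "bowl", "tofu"],   # correct label at rank 1
    ["bowl", "knife", "tofu"],   # correct label at rank 2
    ["tofu", "bowl", "knife"],   # correct label at rank 3
    ["bowl", "tofu", "knife"],   # correct label at rank 2
]
truths = ["knife", "knife", "knife", "tofu"]
print(cmc_curve(rankings, truths, max_rank=3))  # [0.25, 0.75, 1.0]
```

Because the slides assume the recipe is known, each ranking would contain only the categories of that recipe, so the curve reaches 1.0 once k equals the size of the recipe's category set.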