3D vision-based dietary inspection for the central kitchen automation (csandit)
This paper proposes an intelligent and automatic dietary inspection system that can be
applied to central kitchen automation. Diets specifically designed for patients must provide
personalized requirements such as low sodium intake or certain necessary foods. Hence, the
proposed system can benefit the inspection process, which is often performed manually. In the
proposed system, the meal box is first detected and located automatically with a vision-based
method, and then all the food ingredients are identified using color and LBP-HF texture
features. Secondly, the quantity of each food ingredient is estimated using image depth
information. The experimental results show that the dietary inspection accuracy can approach
80%, the dietary inspection time can reach 1200 ms, and the food quantity accuracy is about
90%. The proposed system is expected to increase meal supply capacity by over 50% and to help
hospital dieticians save time in the diet inspection process.
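The LBP-HF texture cue mentioned above combines a local binary pattern histogram with a discrete Fourier transform, whose magnitude is invariant to circular shifts and hence gives the feature its rotation robustness. The sketch below is a simplified NumPy illustration of that idea, not the paper's implementation: it uses a plain 8-neighbor LBP and takes the DFT magnitude of the raw 256-bin histogram, whereas the published LBP-HF applies the DFT across rotations of uniform patterns. The toy image is also an assumption.

```python
import numpy as np

def lbp_codes(img):
    """8-neighbor local binary pattern code for each interior pixel."""
    c = img[1:-1, 1:-1]
    neighbors = [img[0:-2, 0:-2], img[0:-2, 1:-1], img[0:-2, 2:],
                 img[1:-1, 2:],   img[2:, 2:],     img[2:, 1:-1],
                 img[2:, 0:-2],   img[1:-1, 0:-2]]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, n in enumerate(neighbors):
        code |= ((n >= c) << bit).astype(np.uint8)  # one bit per neighbor comparison
    return code

def lbp_hf(img, bins=256):
    """Normalized LBP histogram followed by DFT magnitude (LBP-HF-style feature)."""
    hist, _ = np.histogram(lbp_codes(img), bins=bins, range=(0, bins))
    hist = hist / max(hist.sum(), 1)   # normalize to a probability distribution
    return np.abs(np.fft.fft(hist))    # magnitude spectrum is shift-invariant

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32))
feat = lbp_hf(img)
print(feat.shape)  # (256,)
```

In the real system such a texture vector would be concatenated with color features before classification.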
Automatic meal inspection system using LBP-HF feature for central kitchens (sipij)
This paper proposes an intelligent and automatic meal inspection system that can be applied to central
kitchen automation. Diets specifically designed for patients must provide personalized requirements such as low sodium intake or certain necessary foods. Hence,
the proposed system can benefit the inspection process, which is often performed manually. In the proposed
system, the meal box is first detected and located automatically with a vision-based method, and
then all the food ingredients are identified using color and LBP-HF texture features. Secondly,
the quantity of each food ingredient is estimated using image depth information. The experimental results show that the meal inspection accuracy can approach 80%, the meal inspection time can reach 1200 ms, and the food quantity accuracy is about 90%. The proposed system is expected to increase meal supply capacity by over 50% and to help hospital dieticians save time in the diet inspection process.
This slide introduces the research topic of HCI Lab, Gachon University, Korea (Professor Ahyoung Choi)
For more information, please visit our research website.
https://sites.google.com/view/hcilab/home
Two degree of freedom PID based inferential control of continuous bioreactor ...ISA Interchange
This article presents the development of an inferential control scheme based on an Adaptive Linear Neural Network (ADALINE) soft sensor for the control of a fermentation process. The ethanol concentration of the bioreactor is estimated from the temperature profile of the process using the soft sensor. The prediction accuracy of ADALINE is enhanced by retraining it with the immediate past measurements. The ADALINE and the retrained ADALINE are used along with PID and 2-DOF-PID controllers, leading to the APID, A2PID, RAPID, and RA2PID inferential controllers. Further, the parameters of the 2-DOF-PID are optimized using the Non-dominated Sorting Genetic Algorithm-II and used with the retrained ADALINE soft sensor, which leads to the RAN2PID inferential controller. Simulation results demonstrate that the performance of the proposed RAN2PID controller is better than that of the other designed controllers in terms of qualitative and quantitative performance indices.
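The retrained-ADALINE soft sensor described above reduces to a single linear unit trained with the least-mean-squares (LMS, Widrow-Hoff) rule and then refit on the most recent measurements. The sketch below illustrates only that idea; the learning rate, window size, and the synthetic temperature-to-concentration relation are illustrative assumptions, not the article's model.

```python
import numpy as np

def adaline_fit(X, y, w=None, b=0.0, lr=0.05, epochs=50):
    """Train a single linear unit with the LMS (Widrow-Hoff) rule.

    Passing a previous (w, b) back in "retrains" the unit on newer data,
    mirroring the retrained-ADALINE idea.
    """
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            err = y_i - (w @ x_i + b)  # prediction error on one sample
            w = w + lr * err * x_i     # LMS weight update
            b = b + lr * err
    return w, b

# Synthetic soft-sensor data: concentration as a linear function of temperature.
rng = np.random.default_rng(1)
temp = rng.uniform(0.0, 1.0, size=(200, 1))
conc = 2.0 * temp[:, 0] + 0.5

w, b = adaline_fit(temp, conc)                    # initial training
w, b = adaline_fit(temp[-50:], conc[-50:], w, b)  # retrain on recent measurements
print(round(float(w[0]), 2), round(float(b), 2))
```

The learned weights recover the underlying linear relation; in the article's scheme this estimate then feeds the PID/2-DOF-PID loop.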
Fiducial Marker Tracking Using Machine Vision with Saurabh Ghanekar and Kazut...Databricks
Advanced machine vision is increasingly being used to investigate, diagnose, and identify potential remedies and their progressions for complex health issues. In this study, a behavioral neuroscientist at the University of Chicago and his colleagues have collaborated with Kavi Global to characterize 3D feeding behavior and its potential changes caused by neurological conditions such as ALS, Parkinson’s disease, and stroke, or oral environmental changes such as tooth extraction and dental implants.
Videos of rodents feeding on kibble are recorded by a high-speed biplanar videofluoroscopy technique (XROMM). Their feeding behavior is then analyzed by tracking radio-opaque fiducial markers implanted in their head region. The marker tracking process, until now, was manual and tedious, and was not designed to process massive amounts of longitudinal data. This session will highlight a near-automated, deep learning-based solution for detecting and tracking fiducial markers in the videos, resulting in a more efficient and robust process, with a 300+ times reduction in data processing time compared to a manual use of the existing software.
Our approach involved the following steps: (i) marker detection: deep learning algorithms were used to identify the pixels corresponding to markers within each frame; (ii) marker tracking: Kalman filtering along with the Hungarian algorithm was used to track markers across frames; (iii) 2D-to-3D conversion: sequence matching of the videos recorded by both cameras, and triangulation of marker locations in 2D track coordinates to generate 3D marker locations. The features extracted from the videos are used to characterize behaviorally relevant kinematics such as rhythmic chewing or swallowing. The solution involved the use of the TensorFlow Python APIs and Spark.
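Step (ii) above, frame-to-frame data association, is commonly implemented by predicting each track's next position (the Kalman prediction step) and then solving a minimum-cost assignment between predictions and new detections with the Hungarian algorithm. A minimal sketch under simplified assumptions (a constant-velocity prediction standing in for a full Kalman filter, Euclidean distance as cost), not the talk's actual pipeline:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, velocities, detections):
    """Match predicted track positions to detections via the Hungarian algorithm."""
    predicted = tracks + velocities  # constant-velocity "predict" step
    # Pairwise Euclidean distances: rows = tracks, cols = detections.
    cost = np.linalg.norm(predicted[:, None, :] - detections[None, :, :], axis=2)
    track_idx, det_idx = linear_sum_assignment(cost)  # min-cost assignment
    return list(zip(track_idx, det_idx))

tracks = np.array([[0.0, 0.0], [10.0, 10.0]])
velocities = np.array([[1.0, 0.0], [0.0, 1.0]])
detections = np.array([[10.1, 11.0], [0.9, 0.1]])  # detections arrive in shuffled order
matches = associate(tracks, velocities, detections)
print(matches)  # track 0 pairs with detection 1, track 1 with detection 0
```

A full tracker would also gate implausible matches (large costs) and update each Kalman state with its matched detection.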
In this paper, we propose an easy approach to the identification and classification of high-calorie snacks for dietary assessment using machine learning. As an object detection technique, we use a point-feature matching algorithm to identify the object of interest in a cluttered scene. After detecting the object, a Bag of Features (BoF) model is created by extracting Speeded-Up Robust Features (SURF). This BoF model is used to recognize and classify snack items of different categories. We used three types of snack images, namely ice cream, chips, and chocolate, for experimental purposes. According to the experimental results, our proposed algorithm is able to detect and classify different types of snacks with around 85% accuracy.
Towards automated phenotypic cell profiling with high-content imagingOla Spjuth
Presentation by Ola Spjuth (Uppsala University and Scaleout) at the Chemical Biology Seminar Series, February 6th, at Karolinska Institutet and Science for Life Laboratory, Stockholm, Sweden.
ABSTRACT
Phenotypic profiling of cells with high-content imaging is emerging as an important methodology with high predictive power. The true power of these methods comes when they are integrated into automated, robotized systems that can be run continuously and are not restricted to batch analysis. One of the main challenges then becomes how to manage and continuously analyze the large amounts of data produced. In this talk I will present our efforts to establish an automated lab for cell profiling of drugs using multiplexed fluorescence imaging (Cell Painting). I will describe our computational and lab infrastructure as well as the systems, tools, and methods we are developing to sustain continuous profiling of cells and continuous AI modeling. A key objective in the group is improving screening and toxicity assessment, but also exploring predictions of mechanisms and pathways. The long-term goal is to build a closed-loop system where results from analyses are used by an AI system to design the next round of experiments and iteratively improve the confidence in predictions. Research website: https://pharmb.io
Kusk Object Dataset: Recording Access to Objects in Food Preparation
1. Kusk Object Dataset: Recording Access to Objects in Food Preparation
Atsushi Hashimoto, Masaaki Iiyama, Shinsuke Mori,
Michihiko Minoh
Kyoto University
http://kusk.mm.media.kyoto-u.ac.jp/en/
2. Computer Vision (CV) meets Natural Language Processing (NLP)
• CV-NLP collaboration is an active field.
– Supported by matured machine learning tech.
– Cooking media can be a good practice field!
• Long text (recipe) and organized activity (cooking)
[Diagram: Real World → video observation → CV recognition (BN/DNN) → machine-readable description → NLP text generation → human-friendly description; retrieval and recognition/parsing connect the Vision and Language sides back to the Real World]
3. Grand goal: Comparing Recipes and Human Actions
• From a viewpoint of computer engineering…
– Recipe: a kind of script language
– Human actions: an execution of the script by a human
• Potential applications
– Automatic cooking, online recipe navigation
– Cooking records for healthcare, recipe generation
4. Pascal Sentence Dataset
http://vision.cs.uiuc.edu/pascal-sentences/
Cyrus Rashtchian, Peter Young, Micah Hodosh, and Julia Hockenmaier. “Collecting Image Annotations Using Amazon's Mechanical Turk”. In Proc. of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk.
• One jet lands at an airport while another
takes off next to it.
• Two airplanes parked in an airport.
• Two jets taxi past each other.
• Two parked jet airplanes facing opposite
directions.
• two passenger planes on a grassy plain
Pascal Sentence Dataset: captions and images
- Images are obtained from the Pascal Dataset
- Captions are annotated by Amazon Mechanical Turk workers
5. CV/NLP Datasets in CEA fields
• NLP
– Cooking Ontology (CEA2014, Japanese)
– Cookpad/Rakuten Recipe (2015, Japanese)
• CV
– TUM Kitchen Data Set (2009)
– CMU Multi-Modal Activity Database (2009)
– Actions for Cooking Eggs Dataset (2012)
– MPII Cooking Activities Dataset (2012)
– 50 Salads dataset (2013)
– The Breakfast Actions Dataset (2014)
• CV x NLP
– Yummly API
– Flow Graph Corpus (2014) × KUSK Dataset (CEA2014)
6. KUSK Dataset x Flow Graph Corpus
KUSK Dataset (Hashimoto, CEA2014) Flow Graph Corpus (Mori, 2014)
Water Flow Sensors
Eye Tracker
Touch Display
Electric Consumption Sensors
Load Sensing Tables
20 recipes, which are shared with the Flow Graph Corpus
60 observations by 33 subjects.
7. The list of 20 recipes
CookPad ID KUSK ID Title of Recipe (original titles are in Japanese)
00121196 2014RC01 Chicken and Chinese cabbage starchy soup
00180223 2014RC02 Tomato soup - Japanese style
00196551 2014RC03 Omelets
00162433 2014RC04 Mother’s chicken salad
00201826 2014RC05 Batter-less Fried croquette
00200883 2014RC06 Beef and mushrooms - Korean style
00176550 2014RC07 Saute of Shiitake and Shimeji Mushrooms
00202059 2014RC08 Potato salad with fresh potatoes
00171343 2014RC09 Celery leaves soup
00148537 2014RC10 Cooked Tomato with Chicken and Soy beans
00185809 2014RC11 Fried broccoli with chicken
00196431 2014RC12 Spicy cooked beans with chicken
00157755 2014RC13 Black sesame-crusted fried chicken
00192913 2014RC14 Zestily flavored fried eggplants
00195151 2014RC15 Meat miso wrap
00187900 2014RC16 Simmered Chinese cabbage
00155229 2014RC17 Chinese style open tofu omelet
00193642 2014RC18 Aglio e olio peperoncino
00182653 2014RC19 Radish cake
00168029 2014RC20 Noshidori
* Recipe selection criteria: a certain complexity; common ingredients
8. KUSK Object Dataset (expansion from CEA2014)
• Provides object recognition results for the KUSK Dataset videos
– A baseline for CV research
– Real image-processing results as an input for NLP
• Resources: grabbed/released objects
– object class name, timestamp, region (rectangle)
– Informative for predicting the forthcoming cooking process(*
• Statistics
– 4391 unique images
– Total 133 categories (Each recipe has different cat. set.)
* A. Hashimoto et al, “Intention-Sensing Recipe Guidance via User Accessing to Objects,”
International Journal of Human-Computer Interaction, 2016
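Each resource entry above (object class name, timestamp, rectangular region, plus the put/taken label mentioned on the conclusion slide) can be modeled as a simple record. The field names and types below are a hypothetical schema for illustration, not the dataset's actual file format.

```python
from dataclasses import dataclass

@dataclass
class AccessEvent:
    """One grabbed/released object observation (hypothetical schema)."""
    object_name: str   # object class name, e.g. "knife"
    action: str        # "put" or "taken"
    timestamp_ms: int  # time within the recording
    rect: tuple        # (x, y, width, height) region in pixels

ev = AccessEvent("knife", "taken", 12_340, (410, 220, 64, 180))
print(ev.object_name, ev.action)
```

Sequences of such events, ordered by timestamp, are what make the dataset informative for predicting the forthcoming cooking step.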
9. Obtained images (a select review)
[Image grid grouped by category. Ingredients: cauliflowers, garlic, tofu, enokidake mushrooms, cabbages, pasta. Utensils: chopsticks, bowls, colander, chopping board, knife. Seasonings: soup stock powder, ketchup, pepper. Backgrounds: stem of food, dish detergent, sponge, corner, trash bag]
10. Semi-automated Annotation (1/2)
• 3 manual tasks for annotation
1. Correcting errors in object-region extraction by a method from our previous research (Hashimoto, 2012)
2. Listing object names appearing in each recipe
3. Adding names (from 2.) to each region (from 1.)
Treatment of orthographic variants in the 2nd task:
> Cooking Ontology (Nanba, CEA2014)
– We manually treated items that are not listed in the ontology.
11. Semi-automated Annotation (2/2)
• Workers: students who do not major in informatics
– # of workers: more than 20 students
– term: two months at maximum for each worker
– selection: cooked more than once a week in the last half year
• Interface: GUI running in Google Chrome
– Most workers are used to operating the browser.
– double-check (reject if two annotators answered differently)
• rejected annotations are meta-reviewed by another worker.
– Checks and advice by the authors if necessary.
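The double-check rule above (accept an annotation only when two independent annotators agree, otherwise escalate it to a meta-reviewer) can be sketched as a small adjudication function. The function and label names are illustrative, not the authors' actual tooling.

```python
def adjudicate(label_a, label_b, meta_review):
    """Accept matching annotations; escalate disagreements to a meta-reviewer."""
    if label_a == label_b:
        return label_a  # double-checked: both annotators agree
    # Rejected: another worker meta-reviews the conflicting answers.
    return meta_review(label_a, label_b)

# Meta-reviewer stand-in that always sides with the first annotator.
decided = adjudicate("knife", "chopping board", lambda a, b: a)
agreed = adjudicate("bowl", "bowl", lambda a, b: a)
print(agreed, decided)
```

This kind of rule keeps per-item cost low while catching the disagreements that actually need expert attention.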
12. Object Feature and Recognition Result
• Feature: output of the last layer of ResNet(*
• ResNet: the best CNN model in the 2015 competitions
• Not fine-tuned
(ResNet training does not run in public CNN libraries)
• Classifier: linear SVM (trained for each recipe)
• Assumption: the recipe is known, and thereby the objects too.
*) Kaiming He et al., “Deep Residual Learning for Image Recognition”, arXiv preprint arXiv:1512.03385, 2015
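The recognition pipeline above, fixed deep features fed to a per-recipe linear SVM, can be sketched as follows. Random vectors stand in for the 2048-dim last-layer ResNet features (extracting real ones would require a pretrained network), and the two object classes and their separation are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Stand-ins for last-layer ResNet features of objects seen in one recipe.
# Two object classes from this recipe's category set, e.g. "knife" vs "bowl".
mean_a, mean_b = np.zeros(2048), np.full(2048, 0.5)
X = np.vstack([rng.normal(mean_a, 1.0, (40, 2048)),
               rng.normal(mean_b, 1.0, (40, 2048))])
y = np.array([0] * 40 + [1] * 40)

# One linear SVM per recipe, since each recipe has its own category set.
clf = LinearSVC(C=1.0).fit(X, y)
print(clf.score(X, y))
```

Training a separate SVM per recipe matches the slide's assumption that the recipe, and hence the candidate object set, is known in advance.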
15. Discussion
• Difficulty in food recognition
– Variations: wrapped? cut? and others (eggs change appearance extremely)
• Relatively easy to recognize utensils and seasonings:
– Every kitchen has limited variations.
(an environment-adaptive system is promising)
• Possibility of an R-CNN approach
– To deal with failures in object-region extraction.
16. Conclusion
• KUSK Dataset x Flow Graph Corpus
– hoped to be a base dataset for CV x NLP research
– problem: texts (and dishes) are Japanese.
• A dataset from Yummly is available for English speakers.
• KUSK Object Dataset ⊂ KUSK Dataset
– History of users accessing objects while cooking
• Contains important information for predicting the forthcoming process.
• Organized by object name, put/taken label, timestamp, and rectangle.
• Features from ResNet and recognition results by linear SVM
17. Future works
Mail: a_hasimoto@mm.media.kyoto-u.ac.jp
Twitter: @a_hasimoto (or Facebook, ResearchGate…)
Original KUSK Dataset and an old version of the KUSK Object Dataset:
http://kusk.mm.media.kyoto-u.ac.jp/
• Collaborative research with the NLP team at Kyoto Univ.
– CV2NLP: vision-assisted NLP, recipe text generation
– NLP2CV: scenario-guided CV + PR
To get the KUSK Object Dataset, please do not hesitate to contact us.