Jacobs socs-2013



Computer Vision with Humans in the Loop
David Jacobs (University of Maryland, College Park)

Introduction

  • BIOTRACKER combines computer vision, state-of-the-art mobile phone technologies, and the internet.
  • It encourages science enthusiasts to gather biological data.
  • It helps scientists identify new species.

Several projects fall under the large umbrella called BIOTRACKER:

  • Clustering images with a human in the loop
  • Subclustering: summarizing large image databases
  • Odd Leaf Out: a computer game to identify labeling errors
  • And many others!

Active Image Clustering (Biswas and Jacobs)

  • Goal: improve clustering performance while minimizing total human effort.
  • Cluster images with pairwise constraints (must-link and can’t-link) from humans.
  • Main contribution: find the best image pair out of O(N^2) possible image pairs.
  • Look at the effect of each image pair on the overall clustering.
  • Choose the pair for which the expected change in clustering is maximum.

Experimental Results

  • Clustering performance is evaluated using the Relative Jaccard’s Coefficient w.r.t. ground truth.
  • We use two different domains (leaves and faces): a leaf dataset (a subset of the database collected for Leafsnap) and a face dataset (a subset of the Pubfig dataset).

[Figure panels: (a) Leaf − 1042, (b) Face − 500]

Active Subclustering (Biswas and Jacobs)

[Figure labels: Active Subclustering, Passive Subclustering, Different Final Subclustering Output]

  • Clustering large datasets is hard, even with a human in the loop.
  • Cluster only a subset of the data; this is useful in many applications.

Odd Leaf Out (Hansen et al.)

  • Odd Leaf Out is an online game.
  • The game helps refine large image databases for computer vision research.
  • It is fun for players and provides useful information for vision researchers and biology enthusiasts.

Research Questions:

  • How do we build a game that is interesting, simple, and useful?
  • How can we motivate users to continue to play when we are dealing with imperfect data that will sometimes provide two “correct” answers?
  • How do we choose the game elements (in Odd Leaf Out, a set of six images)?
  • How can data provided by novice users be employed to enhance the work of experts?

Game Design

Selection of Image Sets: We choose five images from one species and one from a different species. We can create a set using each leaf in our database as a seed leaf (say this is Li1 and it is in species S). The other five leaves are chosen in the following way:

  • Li2: the leaf in S that is least similar to the seed leaf
  • Lj: a leaf from a species other than S; the difficulty of the set depends on the distance between Lj and Li1
  • Li3: a distinct randomly chosen leaf from S
  • Li4: a distinct randomly chosen leaf from S
  • Li5: a distinct randomly chosen leaf from S

Different versions of the game: We have four versions of the game: Three Lives, Contestation, Multiple Guesses, and Skip.

Database: For all our experiments, we use the leaf dataset collected as part of a project called Leafsnap. Leafsnap is an iPhone application developed by researchers at the University of Maryland, Columbia University, and the Smithsonian Institution. The iPhone application is now available in the Apple App Store.

What Do We Get From This Game?

  • Identify errors in the dataset.
  • Discover whether color helps humans identify leaves (caution: leaf color changes over the year).
  • Get feedback on how enjoyable or difficult the game is, which we will use to improve the game.

[Figure: the game interface]

Example Cases

We give two sample scenarios that can happen if labels are wrong; in reality we see many other scenarios.

  • When the Odd leaf is wrongly labeled, it can be the same as the other five leaves. Players pick all the leaves with equal probability.
  • When one of the non-Odd leaves is wrongly labeled, there are two different-looking leaves. Players pick the Odd leaf and the wrongly labeled leaf with equal probability.

About Biotracker

People in Biotracker: David Jacobs, Jennifer Preece, Derek Hansen, Dana Rotman, Anne Bowser, Carol Boston, Yurong He, Arijit Biswas, Jen Hammond, Cynthia Parr and many others!
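The image-set selection described under Game Design can be sketched in Python as below. This is only an illustration under stated assumptions, not the project's actual code: the function name `build_set`, the `species` list, and the pairwise distance matrix `dist` are hypothetical, and because the poster does not say how the odd leaf Lj is picked, the sketch draws it uniformly at random from the other species and reports its distance to the seed as a difficulty signal.

```python
import random


def build_set(seed, species, dist, rng=None):
    """Build one six-image Odd Leaf Out set around a seed leaf.

    seed    -- index of the seed leaf (Li1)
    species -- species[i] is the species label of leaf i (hypothetical input)
    dist    -- dist[i][j] is a dissimilarity between leaves i and j (hypothetical input)
    Returns (leaves, odd, odd_distance).
    """
    rng = rng or random.Random()
    s = species[seed]
    same = [i for i in range(len(species)) if species[i] == s and i != seed]
    other = [i for i in range(len(species)) if species[i] != s]
    if len(same) < 4 or not other:
        raise ValueError("need 5 leaves in the seed species and 1 outside it")

    # Li2: the leaf in S least similar to the seed (largest distance).
    li2 = max(same, key=lambda i: dist[seed][i])

    # Li3, Li4, Li5: distinct random leaves from S, excluding Li1 and Li2.
    rest = rng.sample([i for i in same if i != li2], 3)

    # Lj: ASSUMPTION, drawn uniformly from the other species.
    odd = rng.choice(other)

    leaves = [seed, li2, *rest, odd]
    rng.shuffle(leaves)  # hide the odd leaf's position from the player
    return leaves, odd, dist[seed][odd]
```

A smaller `odd_distance` means the odd leaf looks more like the seed, so the set should be harder; a game could bucket sets by this value to control difficulty.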
Publications from Biotracker:

  • Arijit Biswas, David Jacobs. Active Image Clustering: Seeking Constraints from Humans to Complement Algorithms. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
  • Derek Hansen, David Jacobs, Darcy Lewis, Arijit Biswas, Jennifer Preece, Dana Rotman, Eric Stevens. Odd Leaf Out: Improving visual recognition with games. Proceedings of the IEEE International Conference on Social Computing, Boston, MA, 2011.
  • Ahn J., Hammock J., Parr C., Preece J., Shneiderman B., Schulz K., Hansen D., Rotman D., He Y. Visually Exploring Social Participation in Encyclopedia of Life. ASE International Conference on Social Informatics, 2012.
  • Rotman, D., Preece, J., Hammock, J., Procita, K., Hansen, D., Parr, C.S., Lewis, D., Jacobs, D. Dynamic changes in motivation in collaborative ecological citizen science projects. CSCW, 2012.
  • Rotman, D., Procita, K., Hansen, D., Sims Parr, C., Preece, J. Supporting content curation communities: The case of the Encyclopedia of Life. J. Am. Soc. Inf. Sci., 2012.
  • Neeraj Kumar, Peter N. Belhumeur, Arijit Biswas, David Jacobs, W. John Kress, Ida Lopez, Joao V. B. Soares. Leafsnap: A Computer Vision System for Automatic Plant Species Identification. European Conference on Computer Vision (ECCV), 2012.

Conclusion

  • Improved image clustering with humans in the loop
  • Clustering a subset of a dataset
  • Finding labeling errors in large image databases
  • Many other works are in progress!

Acknowledgement: This work was supported by NSF grant #0968546.

University of Maryland, College Park
email: arijit@cs.umd.edu
WWW: http://biotrackers.net/