Your SlideShare is downloading. ×
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Data-driven modeling: Lecture 10

4,364

Published on

Published in: Education, Technology
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
4,364
On Slideshare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
0
Comments
0
Likes
2
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Data-driven modeling APAM E4990 Jake Hofman Columbia University April 9, 2012Jake Hofman (Columbia University) Data-driven modeling April 9, 2012 1 / 11
  • 2. Clustering images Clustering is an unsupervised learning task by which we look for structure in the data, grouping similar examples together e.g., find groups of similar pixels within a single imageJake Hofman (Columbia University) Data-driven modeling April 9, 2012 2 / 11
  • 3. Clustering images Clustering is an unsupervised learning task by which we look for structure in the data, grouping similar examples together e.g., find groups of similar images across a collection of imagesJake Hofman (Columbia University) Data-driven modeling April 9, 2012 2 / 11
  • 4. K-means clustering K-means: represent each cluster by the average of its points Learn by iteratively updating cluster means and point assigmentsJake Hofman (Columbia University) Data-driven modeling April 9, 2012 3 / 11
  • 5. K-means clustering K-means: Choose number of clusters Initialize cluster centers While not converged: Assign each point to closest cluster Update cluster centersJake Hofman (Columbia University) Data-driven modeling April 9, 2012 4 / 11
  • 6. K-means clustering K-means: Choose number of clusters Initialize cluster centers While not converged: Assign each point to closest cluster Update cluster centersJake Hofman (Columbia University) Data-driven modeling April 9, 2012 4 / 11
  • 7. K-means clustering K-means: Choose number of clusters Initialize cluster centers While not converged: Assign each point to closest cluster Update cluster centersJake Hofman (Columbia University) Data-driven modeling April 9, 2012 4 / 11
  • 8. K-means clustering K-means: Choose number of clusters Initialize cluster centers While not converged: Assign each point to closest cluster Update cluster centersJake Hofman (Columbia University) Data-driven modeling April 9, 2012 4 / 11
  • 9. K-means clustering K-means: Choose number of clusters Initialize cluster centers While not converged: Assign each point to closest cluster Update cluster centersJake Hofman (Columbia University) Data-driven modeling April 9, 2012 4 / 11
  • 10. Clustering pixels Find groups of similar pixels within a single image (e.g. “the bright red circles”) Represent each pixel as a separate example with its (R,G,B) value as a 3-d feature vectorJake Hofman (Columbia University) Data-driven modeling April 9, 2012 5 / 11
  • 11. Images as arrays Color images ↔ 3-d arrays of M × N × 3 RGB pixel intensities import matplotlib . image as mpimg I = mpimg . imread ( chairs . jpg )Jake Hofman (Columbia University) Data-driven modeling April 9, 2012 6 / 11
  • 12. Images as arrays Color images ↔ 3-d arrays of M × N × 3 RGB pixel intensities import matplotlib . image as mpimg I = mpimg . imread ( chairs . jpg )Jake Hofman (Columbia University) Data-driven modeling April 9, 2012 6 / 11
  • 13. Clustering pixels Group pixels within candy.jpg into 7 clusters ./ cluster_pixels . py candy . jpg 7Jake Hofman (Columbia University) Data-driven modeling April 9, 2012 7 / 11
  • 14. Clustering images Find groups of similar images within a collection of images (e.g. “warm” photos) Represent each image with a binned RGB intensity histogramJake Hofman (Columbia University) Data-driven modeling April 9, 2012 8 / 11
  • 15. Intensity histograms Disregard all spatial information, simply count pixels by intensities (e.g. lots of pixels with bright green and dark blue)Jake Hofman (Columbia University) Data-driven modeling April 9, 2012 9 / 11
  • 16. Intensity histograms How many bins for pixel intensities? Too many bins gives a noisy, overly complex representation of the data, while using too few bins results in an overly simple oneJake Hofman (Columbia University) Data-driven modeling April 9, 2012 10 / 11
  • 17. Clustering images Group ’vivid’ images into 3 clusters ./ cluster_flickr . py flickr_vivid 7 10Jake Hofman (Columbia University) Data-driven modeling April 9, 2012 11 / 11

×