Photo Quality Improvement
Marc Abramowitz
March 9, 2004
Or how Y! Personals came to be full of crop
Photo Quality Improvement
What is it?
Why do it?
Overview of changes
Cropping tool (a.k.a.: The Cropinator)
Cropping existing photos
Uploading to YMDB
Automatic face detection
Frontend changes
Demo
What worked
What worked (less well)
Future
What is it?
More than just cropping – photo quality improvement
Rotation
Brightness
Cropping photos to be a headshot
Also cropping to a specific range of aspect ratio
These are applied by reviewers (not the user) to
the user’s primary photo
Aspect ratio
Aspect ratio = ratio of width:height
E.g.: 4 x 6, 5 x 7, 8 x 10
Photos uploaded by our users come in a wide
range of aspect ratios
Cropped photos are constrained to be between
80:96 and 90:96, with 90:96 being somewhat
preferred.
Asked for by Gooey – helps them with page
layouts
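As a rough illustration of the constraint above, the following Python sketch checks whether a crop's width:height ratio falls between 80:96 and 90:96. The function names are hypothetical and are not taken from the cropping tool, which is PHP-based.

```python
from fractions import Fraction

# Bounds from the slide: width:height between 80:96 and 90:96.
MIN_RATIO = Fraction(80, 96)
MAX_RATIO = Fraction(90, 96)
PREFERRED_RATIO = MAX_RATIO  # 90:96 is "somewhat preferred"


def aspect_ratio(width: int, height: int) -> Fraction:
    """Exact width:height ratio of a crop, in pixels."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return Fraction(width, height)


def is_allowed(width: int, height: int) -> bool:
    """True if the crop's ratio lies inside the allowed range (inclusive)."""
    return MIN_RATIO <= aspect_ratio(width, height) <= MAX_RATIO


def preferred_width(height: int) -> int:
    """Width, rounded to the nearest pixel, that gives the preferred ratio."""
    return round(height * PREFERRED_RATIO)
```

Using exact fractions avoids floating-point surprises at the two boundary ratios. None of the common print sizes listed above (4 x 6, 5 x 7, 8 x 10) falls inside the range, so uncropped photos will usually need cropping.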
Why do this?
Makes profiles look more attractive in search
results => more detail views => more subscriptions
Makes search results look more consistent, more
professional
Helps searchers see the faces of their potential
dates
Constrained aspect ratio helps with page layout
Smaller image sizes => faster page loads, esp. for
dial-up users
High-res photos
Start with highest resolution photo available to
get minimal quality loss after transformations
Y! Photos historically did not keep the fileid for
high-res photos. Thus, early on:
Ensured photo upload server passes this back to us
Extended UDM to store this info in the LDB photos keys
Ran script to get high-res fileid for a large number of older photos
Problem with photos whose resolution is too
high…
10200 x 16800 pixels x 32 bpp = about 654 MB uncompressed
Can crash server
Solution is to limit resolution
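The arithmetic behind the 654 MB figure, and one possible way to cap resolution, can be sketched in Python as follows. The pixel budget passed to `scale_to_limit` is a hypothetical parameter; the slide does not say what limit was actually chosen or how it was enforced.

```python
import math

MIB = 1024 * 1024


def uncompressed_bytes(width: int, height: int, bits_per_pixel: int = 32) -> int:
    """Memory needed to hold the fully decoded image."""
    return width * height * bits_per_pixel // 8


def scale_to_limit(width: int, height: int, max_pixels: int) -> tuple:
    """Shrink dimensions to fit a pixel budget, keeping the aspect ratio.

    max_pixels is an assumed budget, not a value from the original project.
    """
    pixels = width * height
    if pixels <= max_pixels:
        return width, height
    scale = math.sqrt(max_pixels / pixels)
    # Truncating keeps the result at or under the budget.
    return max(1, int(width * scale)), max(1, int(height * scale))


size = uncompressed_bytes(10200, 16800)
print(size, "bytes =", round(size / MIB), "MiB")
```

The slide's 654 MB corresponds to counting in units of 1024 x 1024 bytes; in decimal megabytes the same image is about 685 MB.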
Overview of changes
This project touched a wide range of subsystems
UDM
• Store fileid for high-res photos
• Transparently return cropped photos to old code; new
enums for specifying uncropped
Search
• New “photoc” column in search db
• Consumer copies cropped photos to search db
UMT queue consumer – new logic
UMT – cropping tool, queue manager, crop gallery, new stats
pages, integration with other pages, performance
optimizations, backing up photos to Netapp
Edit photos pages – new flows, redesigns, features
Profile detail – more photos -> all photos (n), etc…
Mailbox – 7 new messages to cover various events
Cropping tool
A Web-based component of UMT that allows reviewers to
crop, rotate, and brighten
Also allows reviewer to demote or reject photos
Reviewer can pick a more suitable photo for cropping while
accepting user’s submission as secondary or rejecting
outright
Reviewer can also say that no photos are croppable.
A challenging UI to present
Reviewer drags with the mouse to define a cropping region
Shows previews of what the thumbnail and screensize versions
will look like
Visual cues help you determine whether the resolution is sufficient
Automatic face detection makes the job quicker and easier
Cropping tool – details
PHP-based
Makes extensive use of PHP GD functions
E.g.: ImageJPEG, ImageCopy, ImageRotate, etc…
Several useful image functions in
Yahoo/Personals/umt/image.inc
Uses JavaScript and CSS tricks to do rich
interaction on the client-side – looks almost more
like an application than a web page
C++ code on server handles automatic face
detection and YMDB access
Cropping tool – details
UMT queue consumer reads LDB transactions
and inserts photos into primary photo queues
Cropping tool reads primary photos from primary
photo queues
Always operate on highest res photo available –
use PHP cURL extension to grab via HTTP from
YMDB.
Writes results to photoedit table in MySQL db
Crop coords, rotation angle, brightness factor
A cron job periodically reads the photoedit table
and uploads the processed photos to YMDB
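To make the stored values concrete, here is a small Python sketch of how a record holding crop coords, a rotation angle, and a brightness factor might be applied to an image. It works on a plain grid of grayscale values so that it runs without an imaging library. The field names, the rotate-then-crop-then-brighten order, and the restriction to 90-degree steps are all assumptions; the real tool used PHP GD functions and its photoedit schema is not shown in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhotoEdit:
    """Hypothetical shape of one photoedit row."""
    left: int
    top: int
    right: int             # exclusive
    bottom: int            # exclusive
    rotation_degrees: int  # clockwise; multiples of 90 only in this sketch
    brightness: float      # 1.0 leaves pixel values unchanged


def rotate90_cw(pixels):
    """Rotate a row-major pixel grid 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


def apply_edit(pixels, edit: PhotoEdit):
    """Rotate, then crop, then brighten (assumed order)."""
    if edit.rotation_degrees % 90 != 0:
        raise ValueError("this sketch only handles 90-degree steps")
    for _ in range((edit.rotation_degrees // 90) % 4):
        pixels = rotate90_cw(pixels)
    cropped = [row[edit.left:edit.right] for row in pixels[edit.top:edit.bottom]]
    # Clamp to the 0..255 range so brightening cannot overflow a channel.
    return [[min(255, max(0, round(v * edit.brightness))) for v in row]
            for row in cropped]
```

Storing only the edit parameters, rather than the edited pixels, means the transformation can be re-run later against the highest-resolution original.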
Cropping existing photos
Product goal: Make a big splash by having most photos
already cropped on day of launch
Cropping tool had to be ready way before launch
Strategy: Begin cropping incoming photos and then go
back and crop primary photo for every searchable ad
780,000 photos to crop in the U.S. alone
An individual at 6 crops/min
= 360/hr = 2,880 per 8-hour day
Would take a mere 270 days (just U.S.)
Takeaways
Need lots of folks working on this => ACS
Tool needs to be easy and fast
Automatic face detection – a big win
Need a tool that lets reviewers manage the queuing of existing
photos; add to queue on demand so as not to hurt SL for new
photos => Crop queue manager
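The back-of-the-envelope estimate above can be reproduced with a few lines of Python. The sketch also shows how the duration shrinks as reviewers are added. The helper names and the 14-day target in the usage note are illustrative assumptions, and the model ignores any speed-up from automatic face detection.

```python
import math


def days_to_crop(total_photos: int, crops_per_minute: float,
                 hours_per_day: float = 8, reviewers: int = 1) -> int:
    """Whole working days needed, rounded up."""
    per_day = crops_per_minute * 60 * hours_per_day * reviewers
    return math.ceil(total_photos / per_day)


def reviewers_needed(total_photos: int, crops_per_minute: float,
                     target_days: int, hours_per_day: float = 8) -> int:
    """Smallest number of reviewers that finishes within target_days."""
    per_reviewer_day = crops_per_minute * 60 * hours_per_day
    return math.ceil(total_photos / (per_reviewer_day * target_days))


print(days_to_crop(780_000, 6))                # one reviewer
print(days_to_crop(780_000, 6, reviewers=30))  # thirty reviewers
print(reviewers_needed(780_000, 6, target_days=14))
```

Rounding up gives 271 days for a single reviewer, in line with the slide's figure of roughly 270. Thirty reviewers working in parallel would bring that down to about ten working days under the same assumptions.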
Uploading to YMDB
Currently performed by a cron job which reads from the
photoedit table in the MySQL db.
Why separate? Bought us flexibility to deploy tool early
Future – integrate into cropping tool if feasible
Wrote a new PHP extension: yapache_libphp_ymdb
Uses the YMDB API to allow uploading photos to YMDB
Accomplished by streaming the file to a TCPSocket
Returns a fileid which needs to be stored in the LDB to
generate URLs later
Cron job:
Reads photo and edits (crops, rotates, etc…) from db
Uses PHP GD functions to create a temp file with the
transformed photo
Uses yapache_libphp_ymdb to upload temp file to YMDB
Uses cURL to test that photo can be accessed
Updates LDB with meta-data of new cropped photo
May change photo_main if necessary
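The sequence of cron job steps above might be organized roughly as in the following Python sketch. Every name here is hypothetical: the four callables stand in for the GD transformation, the yapache_libphp_ymdb upload, the cURL access check, and the LDB update, none of whose real interfaces appear in the slides. The retry behavior is also an assumption.

```python
from typing import Callable, Iterable, List


def run_upload_pass(
    pending: Iterable[dict],
    transform: Callable[[dict], str],     # edit row -> path of temp file
    upload: Callable[[str], str],         # temp file path -> new fileid
    verify: Callable[[str], bool],        # fileid -> photo is reachable?
    record: Callable[[dict, str], None],  # store fileid and metadata
) -> List[int]:
    """Process each pending edit once; return ids that should be retried."""
    failed = []
    for edit in pending:
        try:
            temp_path = transform(edit)
            file_id = upload(temp_path)
            # Only record the new photo once it is confirmed accessible.
            if not verify(file_id):
                raise RuntimeError("uploaded photo could not be fetched")
            record(edit, file_id)
        except Exception:
            # Leave the row for the next pass rather than aborting the run.
            failed.append(edit["id"])
    return failed
```

Verifying before recording means a failed upload never leaves the LDB pointing at a photo that cannot be served.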
Automatic face detection
Since we had to crop several hundred thousand
existing photos, automation was a big win
We run an algorithm (server-side) to detect a
bounding box containing the face
The cropping tool presents the image with a box
drawn for the suggested crop
The reviewer can accept as-is, tweak the crop, or
start fresh
Estimated accuracy – 60% or more depending on
how tweaky you are
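One way to turn a detected face bounding box into a suggested headshot crop is sketched below in Python. The margin around the face, the choice to center the crop on the face, and the function name are assumptions for illustration. The slides do not describe how the tool derived its suggested crop from the detector's output.

```python
def suggest_crop(face, image_size, ratio=(90, 96), margin=0.5):
    """Suggest a crop box around a detected face.

    face is (x, y, w, h) in pixels and image_size is (width, height).
    Returns (left, top, right, bottom) with right and bottom exclusive.
    margin is the assumed extra space on each side, as a fraction of the
    face height, to leave room for hair and chin.
    """
    fx, fy, fw, fh = face
    img_w, img_h = image_size
    ratio_w, ratio_h = ratio

    # Size the crop from the face height, then derive width from the ratio.
    crop_h = fh * (1 + 2 * margin)
    crop_w = crop_h * ratio_w / ratio_h

    # Shrink uniformly if the crop would not fit inside the image.
    scale = min(1.0, img_w / crop_w, img_h / crop_h)
    w, h = round(crop_w * scale), round(crop_h * scale)

    # Center on the face, then slide the box back inside the image.
    cx, cy = fx + fw / 2, fy + fh / 2
    left = round(min(max(cx - w / 2, 0), img_w - w))
    top = round(min(max(cy - h / 2, 0), img_h - h))
    return left, top, left + w, top + h
```

For a 200 x 200 face at (400, 300) in a 1000 x 1000 image, this yields a 375 x 400 box, which matches the preferred 90:96 ratio. After rounding to whole pixels the ratio is only approximate in general, so the reviewer's ability to tweak the box remains useful.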
Automatic face detection
Evaluated a multitude of free and commercial libraries.
Many approaches – neural networks, Support Vector
Machines, Eigenfaces, Haar
Settled on OpenCV from Intel
My FreeBSD build of Intel’s OpenCV library, yinstable
Open source library for computer vision
http://www.intel.com/research/mrl/research/opencv/
Haar-based face detection algorithm
Wrote a PHP extension: yapache_libphp_facedetect
A simple API that wraps OpenCV, makes it understandable and
accessible from PHP
Uses default configuration parameters that were tested by me
to work well on our photos
Frontend changes
On add/edit photo screens, more emphasis
placed on distinction between primary and
secondary photos
We show the user the original and the crop
7 new mailbox messages explain the numerous
combinations of events that can happen
Uploaded photos are automatically submitted –
no extra step that can be forgotten
Even queued photos can be deleted
What worked
Tool was very popular – “addictive”
Tool performance was great – after optimizations and
getting through some UMT problems
OpenCV – performed surprisingly well
Queue manager great for managing queuing of existing
photos and gauging progress
Decoupling tool from YMDB upload (via MySQL) allowed
fast deployment of the tool so reviewers could begin ASAP;
let me develop YMDB code later
Database also allowed easy collection of stats – e.g.:
Number of crops/day over time
Best and worst croppers
Number of people active on system
What worked (less well)
UI – deceptively complex, went through revisions
Periods of tool sluggishness – UMT never had so many
concurrent users, disk space problems => optimizations,
redundancy
30+ reviewers at times
Difficult to constantly keep an eye on performance and
stability
Spent lots of time profiling, playing detective, repairing
corrupted databases
Only developer QA on cropping tool
Photos getting stuck in queue, miscellaneous error
messages => Marc getting pages, emails and IMs from
ACS.
Decoupling tool from YMDB upload (via MySQL) allows
database to become out of sync with reality, introduces
delay that is undesirable for QA.
Stats were never in product spec but folks want them
anyway
Future
Integrate YMDB upload into tool – simpler and
less prone to error, hopefully not too much
impact on performance
Make sure that rotations and brightness
adjustments are applied to original as well as
crop
Fix some confusing aspects of photos frontend
that surfaced in testing