Joseph P. Robinson, Ryan Birke, Timothy Gillis, Justin Xia, Yue Wu, Yun Fu
We present a large-scale dataset for visual kin-based problems, i.e., kinship verification and family recognition: the Families in the Wild (FIW) dataset. Motivated by the lack of a single, unified image dataset for kinship tasks, our goal is to provide the research community with a dataset large enough in scope to inherently support multiple vision tasks and to attract broad involvement. With a small team and an efficient labelling tool designed to streamline the marking of complex hierarchical relationships, attributes, and local label information in family photos, we collected and labelled the largest set of family images to date. We experimentally compare our dataset to existing kinship image datasets and demonstrate the practical value of FIW. We also show that a pre-trained CNN model used as an off-the-shelf feature extractor outperforms traditionally used features, and that performance improves further when the CNN is fine-tuned on FIW data. Finally, we measure human performance and show that it does not match that of machine vision algorithms.
Abstract
We run 5-fold cross-validation for kinship verification: 90% of each family's images are used to fine-tune the model and the rest for validation. We remove the last fully-connected layer, which was originally trained to identify 2,622 people, and use the triplet loss [11] as the loss function. The network was frozen except for the last fully-connected layer, whose output served as the features for kinship verification. Fine-tuning was done through the well-known Caffe [10] framework. The learning rate was initially set to 10^-5 and decreased by a factor of 10 every 700 iterations; the model was fine-tuned for about 1,400 iterations with a batch size of 128 images. All other settings were the same as the original VGG-Face. Training was conducted on a single GTX Titan X using about 10 GB of GPU memory.
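The triplet objective above can be sketched in a few lines. This is an illustrative NumPy version of the loss of Schroff et al. [11] (the actual fine-tuning was done in Caffe), using hypothetical 2-D embeddings in place of real face features:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss [11]: the anchor should be closer (in squared Euclidean
    distance) to a face of the same identity (positive) than to a face of a
    different identity (negative), by at least `margin`."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)  # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)  # anchor-negative distance
    return np.maximum(0.0, d_pos - d_neg + margin)

# Hypothetical L2-normalized embeddings: positive near the anchor, negative far.
a = np.array([1.0, 0.0])
p = np.array([0.9, np.sqrt(1 - 0.81)])
n = np.array([0.0, 1.0])
print(triplet_loss(a, p, n))  # satisfied triplet -> loss 0.0
```

Swapping the positive and negative roles makes the triplet violated, so the loss becomes positive; gradients from such violated triplets are what drive the fine-tuning.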
Deep	CNN:	Fine	Tuning
• Built the largest kinship database to date, along with the labels, baseline results, and evaluation protocols needed to foster and track future progress.
• Found that pre-trained CNNs yield the best features for our unconstrained dataset.
• Showed that algorithms outperform humans on the verification task.
• Obtained top scores for both kinship verification and family recognition by fine-tuning a CNN on FIW data.
• Develop a project page to go live upon publication in a peer-reviewed paper.
• Generate additional baseline results for tasks new to visual kinship (e.g., fine-grained classification and search & retrieval).
• Use the data to explore natural inheritance from a visual perspective.
Conclusions/Future	Work
1. Ahonen, T., Hadid, A., Pietikainen, M.: Face description with local binary patterns: Application to face recognition. PAMI (2006).
2. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. IJCV (2004).
3. Parkhi, O.M., Vedaldi, A., Zisserman, A.: Deep face recognition. In: BMVC (2015).
4. Girshick, R.B., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. CoRR (2013).
5. Xia, S., Shao, M., Luo, J., Fu, Y.: Understanding kin relationships in a photo. IEEE Transactions on Multimedia 14(4) (2012).
6. Fang, R., Tang, K.D., Snavely, N., Chen, T.: Towards computational models of kinship verification. ICIP (2010).
7. Qin, X., Chen, S.: Tri-subject kinship verification: Understanding the core of a family.
8. Xia, S., Shao, M., Fu, Y.: Kinship verification through transfer learning. IJCAI (2011).
9. Guo, Y., Dibeklioglu, H., van der Maaten, L.: Graph-based kin recognition. ICPR (2014).
10. Jia, Y., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: Convolutional architecture for fast feature embedding. arXiv:1408.5093 (2014).
11. Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: A unified embedding for face recognition and clustering. CVPR (2015).
References
Table 3 Features and parameters used throughout this work.
Kinship	Verification	on	FIW.
• Extracted the features listed in Table 3 and used a cosine distance metric.
• Benchmarked results for 2 tasks, kinship verification and family recognition.
  o Included various feature types, both raw and transformed with different metric learning and dimensionality reduction schemes.
  o Repeated experiments on several pre-existing kinship datasets to further examine differences (i.e., FIW vs. its predecessors).
• Kin relation types and sample sizes are listed in Table 1 and displayed in Fig 3.
  o Positive and negative samples are split 50-50 (e.g., F-S was made up of 35k pairs: 17.5k positive and 17.5k negative samples).
• Cross-validated over 5 folds [see Table 4].
  o F-S, as an example, was split into 3.5k positive & 3.5k negative pairs per fold.
  o Kinship was determined from cosine similarity scores on the 4 test folds for each pair: a threshold decided the kin relation of each pair, and the average performance over the 5 folds is reported.
• The fine-tuned VGG-Face deep CNN model achieved top scores.
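The verification step above can be sketched as follows, assuming pre-extracted feature vectors; the function names and toy 2-D vectors are illustrative stand-ins for the real features:

```python
import numpy as np

def cosine_similarity(x, y):
    """Cosine similarity between two feature vectors (e.g., fc7 features)."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def best_threshold(scores, labels, candidates):
    """Pick the threshold maximizing accuracy on the training folds; pairs
    scoring at or above it are then declared kin on the held-out fold."""
    accuracy = lambda t: np.mean([(s >= t) == l for s, l in zip(scores, labels)])
    return max(candidates, key=accuracy)

# Toy example: one matching pair and one non-matching pair.
scores = [cosine_similarity(np.array([1.0, 0.1]), np.array([1.0, 0.2])),
          cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 1.0]))]
t = best_threshold(scores, [True, False], candidates=[0.0, 0.5, 0.9])
print([s >= t for s in scores])  # first pair accepted as kin, second rejected
```

The poster reports the average of this thresholded accuracy over the 5 folds.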
Family	Recognition.
• VGG-Face + one-vs-rest SVMs were used as the baseline.
• 2 different experimental settings were used:
  1. Families with more than 20 images were selected, yielding 399 families with 11,158 images in total. 80% of each family's images were randomly assigned to training and the rest to testing.
  2. For the second setting, we selected the families with more than 5 members, then chose the 5 members with the most images in each family. This yielded 316 families with 7,772 images. We then split these images into 5 folds; each fold trains on one family member and tests on all other members.
• The fine-tuned VGG-Face deep CNN again achieved the top score [see Table 5].
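The second setting's fold construction can be sketched like this. The `images` mapping of (family, member) to image lists is a hypothetical structure (the poster does not specify the storage format), and the poster uses 5 folds:

```python
from collections import defaultdict

def member_folds(images, n_folds=5):
    """Sketch of setting 2: per family, keep the n_folds members with the most
    images; fold k trains on the k-th such member of every family and tests on
    that family's remaining members."""
    by_family = defaultdict(list)
    for (fam, member), imgs in images.items():
        by_family[fam].append((member, imgs))
    folds = []
    for k in range(n_folds):
        train, test = [], []
        for fam, members in by_family.items():
            if len(members) < n_folds:  # family must have enough members
                continue
            # members with the most images first
            top = sorted(members, key=lambda m: len(m[1]), reverse=True)[:n_folds]
            for i, (member, imgs) in enumerate(top):
                (train if i == k else test).extend((fam, img) for img in imgs)
        folds.append((train, test))
    return folds
```

Because each fold's training set contains exactly one member per family, the classifier must recognize a family from relatives it has never seen, which is what makes this setting harder than the 80/20 split.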
Human	Kinship	Verification.
• Measured human performance on kinship verification using 200 face pairs from the 11 pairwise types supported in FIW.
• Human performers scored an overall average of 56.6%, which the fine-tuned CNN outscored by approximately 15% [see Fig 4].
Experiment
Fig	3	 Sample	pairs	for	the	11	kinship	relations	provided	by	the	FIW	database.
Feature Description
SIFT [2] • Resized images to 64x64, with the block size set to 16x16 and a stride of 8, making for 49 blocks per image. Feature dimension: 16x8x49 = 6,272D.
LBP [1] • Resized images to 64x64; divided each image into 16x16 non-overlapping blocks (i.e., 16 blocks per image); extracted LBP features with radius = 2 mapped to 8 neighbors. Each block was binned into a 256D histogram, yielding a feature vector of dimension 256x16 = 4,096D.
VGG-Face CNN Descriptors [3] • Very deep architecture with small (3x3) convolutional kernels & a convolutional stride of 1 pixel; trained on ~2.6M images of 2,622 celebrity faces. The pre-trained CNN was used as a feature extractor: face images of size 224x224 were fed through the network, with the output taken from the 2nd-to-last fully-connected layer (i.e., the fc7 layer), yielding 4,096D vectors.
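The block-histogram construction behind the LBP row can be sketched in NumPy. This is a simplified version (radius 1, 8 neighbors; the poster uses radius 2) but it reproduces the 16 blocks x 256 bins = 4,096D layout of Table 3:

```python
import numpy as np

def lbp_block_histograms(img, block=16, bins=256):
    """Block-wise LBP histograms: each pixel's 8 neighbors each contribute one
    bit of an 8-bit code; per-block 256-bin histograms are concatenated.
    A 64x64 image with 16x16 blocks gives 16 x 256 = 4,096D."""
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    center = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= center).astype(np.uint8) << bit
    codes = np.pad(codes, 1, mode="edge")  # restore 64x64 so blocks tile evenly
    feats = []
    for y in range(0, h, block):
        for x in range(0, w, block):
            hist, _ = np.histogram(codes[y:y + block, x:x + block],
                                   bins=bins, range=(0, bins))
            feats.append(hist)
    return np.concatenate(feats)

feat = lbp_block_histograms(np.random.randint(0, 256, (64, 64), dtype=np.uint8))
print(feat.shape)  # (4096,)
```

The SIFT dimensionality in Table 3 follows the same pattern: 49 blocks, each described by a 16x8 = 128D descriptor, for 6,272D in total.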
               F-D  F-S  M-D  M-S SIBS  B-B  S-S GF-GD GF-GS GM-GD GM-GS  Avg.
HOG            56.2 56.9 56.8 55.7 59.3 50.3 57.8  62.4  58.9  59.5  57.7  57.4
HOG PCA        56.1 56.5 56.4 55.3 58.7 50.3 57.4  59.3  66.9  60.4  56.9  57.7
LBP            55.0 55.2 55.4 56.0 57.1 57.0 55.9  59.0  56.0  55.8  60.3  56.6
LBP PCA        55.0 55.3 55.4 55.9 57.1 56.8 55.8  58.5  59.1  55.6  60.1  56.8
VGG-Face       64.3 63.3 66.4 64.2 73.2 71.4 70.6  66.1  61.1  64.9  60.4  66.0
VGG-Face PCA   64.4 63.4 66.2 64.0 73.2 71.5 70.8  64.4  68.6  66.2  63.5  66.9
Fine-Tuned     67.8 66.6 66.7 68.2 72.3 70.8 70.3  69.5  68.3  69.5  63.5  68.5
Fine-Tuned PCA 69.4 68.2 68.4 69.4 74.4 73.0 72.5  72.9  72.3  72.4  68.3  71.0
Families in the Wild (FIW): A Large-Scale Kinship Recognition Database
[Fig 3 legend: Father-Daughter, Father-Son, Mother-Daughter, Mother-Son, Siblings, Brother-Brother, Sister-Sister, Grandfather-Granddaughter, Grandfather-Grandson, Grandmother-Granddaughter, Grandmother-Grandson]
Table 4 Verification scores (%) for the 5-fold experiment on FIW. Note that there was no family overlap between folds. Top accuracies resulted from fine-tuning the VGG-Face model on FIW data, replacing the topmost layer with a triplet loss.
Table 1 No. pairs in FIW and other kinship image collections.

Pair-Type        KFW-II  Sibling Face  Group Face  Family 101  FIW (Ours)
Brother-Brother      --          232          40          --         86k
Sister-Sister        --          211          32          --         86k
Siblings             --          277          53          --         75k
Father-Daughter     250           --          69         147         45k
Father-Son          250           --          69         213         43k
Mother-Daughter     250           --          62         148         44k
Mother-Son          250           --          70         184         37k
GF-GD                --           --          --          --         410
GF-GS                --           --          --          --         350
GM-GD                --           --          --          --         550
GM-GS                --           --          --          --         750
Total             1,000          720         395         607       >418k
Dataset            No. Fam.  No. People  No. Faces  Age Varies  Full Fam.  Highlights
CornellKin [5]          150         300        300          No         No  Parent-child pairs.
UB KinFace-I [8]         90         180        270         Yes         No  Parent-child pairs; parents' 139 images at various ages.
UB KinFace-II [8]       200         400        600         Yes         No  Parent-child pairs; parents' 139 images at various ages.
KFW-I [6]                --       1,066         1k          No         No  Parent-child pairs.
KFW-II [6]               --       2,000         2k          No         No  Parent-child pairs.
TSKinFace [9]           787       2,589         --         Yes        Yes  Two parent-child pairs for tri-subject verification.
Family101 [7]           101         607        14k         Yes        Yes  Family structured; variations in age and ethnicity.
FIW (Ours)               1k       10.6k        27k         Yes        Yes  A corpus of 1k family trees providing both depth and breadth, with multi-task evaluation offerings.

Table 2 Comparison of FIW with related datasets.
Fold VGG-Face VGG-Face	(fine-tuned)
1 9.6 10.9
2 14.5 14.8
3 11.6 12.5
4 12.7 14.8
5 13.1 13.5
Avg. 12.3 13.3
Table 5 Family recognition results for the 5-fold experiment. The top score results from the fine-tuned CNN.
Fig 1 Work-flow of the labelling tool used to build FIW (a). Each selected face is surrounded by a resizable bounding box (b). If the family member has already been added to the dataset, their name is specified; otherwise, 'new' adds a member (c): their name and gender are given (d), then their relationships to others (e).
Advancing visual kinship recognition technology is key for many real-world applications (e.g., kinship verification, automatic photo library management, genealogical analysis, and social media). Although the machine vision and multimedia research communities have made some efforts since 2010, the lack of sufficiently large datasets has restricted substantial progress. To address this, we built the largest visual kinship dataset to date, Families in the Wild (FIW) [see Tables 1 & 2]. Annotations (i.e., ground-truth labels) were generated with our efficient software tool KAT-SMILE [see Fig 1], which was developed to annotate the complex hierarchical nature of 1,000 different family trees and to generate a rich set of labels capable of supporting various evaluation types (i.e., multi-task purposes). Several comprehensive experiments were designed, conducted, and are presented here.
Introduction
Graduate Category: Engineering and Technology | Degree Level: PhD | Abstract ID# 1271
Fig 4 Box-and-whisker chart depicting the scores of 20 human observers performing kinship verification on the FIW dataset. The magenta cross (+) marks the average score of the fine-tuned CNN model, which outperforms the top human scorer in every category.
Fig 2 Visualizing the structure of FIW. Family trees span up to 5 generations in depth, with up to 10 sets of parents in a single tree. As shown, the Spielberg family (i.e., family ID 703 of 1,000) is made up of 3 parents with 10 children in total. Each column (right) contains samples of an individual family member; note there are multiple samples of each person at various ages.
[Figure panels (a)-(e); Spielberg Family Tree (incl. Steven, Kate); Spielberg Family: Face Samples, ordered Younger <- AGE -> Older]
[Fig 4 chart: Kinship Verification: Human Observers; score axis 0.2-0.8; categories F-D, F-S, M-D, M-S, SIB, GF-GD, GF-GS, GM-GD, GM-GS, Avg.]