Turning your Iterators up to 11
New	features	for	complex	Iterator	implementations
Ivan	Bella
Turning your Iterators up to 11 | 10/16/2017
Lifecycle
• init(source,	options,	env)
• seek(range,	cfs,	includeCfs)
• while	hasTop()
• getTopKey(),	getTopValue()
• next()
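The call sequence above can be sketched as a small driver loop. This is a minimal sketch under stated assumptions: `SimpleIterator`, `PassThrough`, and `scan` are hypothetical names over String keys, not Accumulo's actual iterator interface, and the options, environment, and column family arguments from the slide are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class LifecycleSketch {
    // Hypothetical, simplified iterator contract over String keys and values.
    interface SimpleIterator {
        void init(NavigableMap<String, String> source);
        void seek(String start, String end); // range is [start, end)
        boolean hasTop();
        String getTopKey();
        String getTopValue();
        void next();
    }

    // Passes the entries of its source through unchanged.
    static final class PassThrough implements SimpleIterator {
        private NavigableMap<String, String> source;
        private Iterator<Map.Entry<String, String>> cursor;
        private Map.Entry<String, String> top;

        public void init(NavigableMap<String, String> source) { this.source = source; }

        public void seek(String start, String end) {
            cursor = source.subMap(start, true, end, false).entrySet().iterator();
            next(); // position on the first entry in range, if any
        }

        public boolean hasTop() { return top != null; }
        public String getTopKey() { return top.getKey(); }
        public String getTopValue() { return top.getValue(); }
        public void next() { top = cursor.hasNext() ? cursor.next() : null; }
    }

    // Drives the iterator in the order the slide lists the calls.
    static List<String> scan(SimpleIterator it, NavigableMap<String, String> data,
                             String start, String end) {
        List<String> out = new ArrayList<>();
        it.init(data);
        it.seek(start, end);
        while (it.hasTop()) {
            out.add(it.getTopKey() + "=" + it.getTopValue());
            it.next();
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> data = new TreeMap<>();
        data.put("a", "1");
        data.put("b", "2");
        data.put("c", "3");
        System.out.println(scan(new PassThrough(), data, "a", "c")); // prints [a=1, b=2]
    }
}
```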
Returned	Keys
• Keys	must	be	returned	in	sorted	order
• Keys	must	be	within	the	range	seeked
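Both rules are easy to check in a unit test. The following is a minimal sketch, assuming String keys and a half-open range `[start, end)`; `firstViolation` is a hypothetical helper, not part of any library.

```java
import java.util.Arrays;
import java.util.List;

class ReturnedKeysCheck {
    // Returns null when the keys satisfy both rules, otherwise a description
    // of the first violation. The range is [start, end).
    static String firstViolation(List<String> keys, String start, String end) {
        String previous = null;
        for (String key : keys) {
            if (key.compareTo(start) < 0 || key.compareTo(end) >= 0) {
                return "out of range: " + key;
            }
            if (previous != null && previous.compareTo(key) > 0) {
                return "out of order: " + key + " after " + previous;
            }
            previous = key;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstViolation(Arrays.asList("a", "b", "c"), "a", "d")); // prints null
        System.out.println(firstViolation(Arrays.asList("b", "a"), "a", "d"));
        System.out.println(firstViolation(Arrays.asList("a", "e"), "a", "d"));
    }
}
```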
Stateful
• You	can	keep	state	in	an	iterator
• If the tablet moves, the state is lost
– For example, on a rebalance or when the tserver process is stopped
– Can use external caches for persistent state
• Preserve	state	in	deepCopy
• There	is	no	close	method	(yet)
Re-seek	after	Teardown
• Iterators	may	be	“torn	down”
• seek(new	Range(lastKeyReturned,	false,	…),	…)
• How	can	you	tell	the	difference	from	the	initial	seek?
– Use start range inclusive flag
– Keep track of last key returned and compare
– Form of last key returned might be used, but this could be fragile
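The start-inclusive check can be sketched as follows. This is a minimal sketch with a hypothetical, simplified `Range` over String keys (not Accumulo's `Range` class), and it assumes that an initial seek always has an inclusive start key, which will not hold for every client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class ReseekSketch {
    // Hypothetical, simplified range: start key, start-inclusive flag, exclusive end key.
    static final class Range {
        final String start;
        final boolean startInclusive;
        final String end;

        Range(String start, boolean startInclusive, String end) {
            this.start = start;
            this.startInclusive = startInclusive;
            this.end = end;
        }
    }

    // The range used to resume after a teardown: just past the last key returned.
    static Range resumeAfter(String lastKeyReturned, Range original) {
        return new Range(lastKeyReturned, false, original.end);
    }

    // Heuristic from the slide: an exclusive start key suggests a re-seek.
    // Assumption: initial seeks always use an inclusive start key.
    static boolean looksLikeReseek(Range range) {
        return !range.startInclusive;
    }

    // Returns at most max keys from the range, like one batch of a scan.
    static List<String> scanBatch(NavigableMap<String, String> data, Range r, int max) {
        List<String> keys = new ArrayList<>();
        for (String key : data.subMap(r.start, r.startInclusive, r.end, false).keySet()) {
            if (keys.size() == max) {
                break;
            }
            keys.add(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> data = new TreeMap<>();
        for (String k : new String[] {"a", "b", "c", "d"}) {
            data.put(k, "v");
        }
        Range original = new Range("a", true, "z");
        List<String> first = scanBatch(data, original, 2);          // [a, b]
        Range resumed = resumeAfter(first.get(first.size() - 1), original);
        List<String> second = scanBatch(data, resumed, 10);         // [c, d]
        System.out.println(first + " " + second + " reseek=" + looksLikeReseek(resumed));
    }
}
```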
Multi-threading	an	iterator
• Can	be	used	to	execute	multiple	parts	of	the	range	
concurrently,	or	to	pre-fetch	results
• Use	static	thread	pools
• Remember	returned	keys	must	be	sorted
• source.deepCopy	per	thread
deepCopy
• Creates	a	copy	of	an	iterator	that	can	be	used	independently
• Make	sure	you
– Keep track of copies
– Reuse copies if possible
– One copy per thread
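One way to track and reuse copies is a small pool. This is a minimal sketch: `SourcePool` is a hypothetical class, and the `Supplier` stands in for a call to `deepCopy` on the real source.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

class SourcePoolSketch {
    // Hands out copies of a source, reusing returned copies before making new ones.
    static final class SourcePool<S> {
        private final Supplier<S> copier;              // stands in for source.deepCopy(...)
        private final Deque<S> idle = new ArrayDeque<>();
        private int created = 0;

        SourcePool(Supplier<S> copier) {
            this.copier = copier;
        }

        // Each thread checks out its own copy; a copy is never shared while checked out.
        synchronized S checkOut() {
            S source = idle.poll();
            if (source == null) {
                source = copier.get();
                created++;
            }
            return source;
        }

        synchronized void checkIn(S source) {
            idle.push(source);
        }

        synchronized int copiesCreated() {
            return created;
        }
    }

    public static void main(String[] args) {
        SourcePool<Object> pool = new SourcePool<>(Object::new);
        Object a = pool.checkOut();
        pool.checkIn(a);
        Object b = pool.checkOut();   // reuses the copy that was checked in
        Object c = pool.checkOut();   // pool is empty, so a second copy is made
        System.out.println((a == b) + " " + (b != c) + " " + pool.copiesCreated()); // prints true true 2
    }
}
```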
Pre-fetch Example
[Diagram labels: Iterator; Range…; Transform (×3); Thread Pool; Source Pool; Source (×2); deepCopy; Sorted Set; K’,V’]
1: The seek call splits range into parts
2: The transforms run within a thread pool
3: The sorted set should be backed by HDFS when buffer overflows
4: The seek does not complete until the final sorted set is ready
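The four steps above can be sketched with JDK classes only. This is a minimal sketch under stated assumptions: keys are Strings, the transform simply upper-cases the value, copying a `TreeMap` stands in for `deepCopy`, and the HDFS-backed overflow from step 3 is not modeled. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PrefetchSketch {
    // Static, capped pool shared by every iterator instance; daemon threads let the JVM exit.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(3, r -> {
        Thread t = new Thread(r, "prefetch");
        t.setDaemon(true);
        return t;
    });

    // Step 1: split [start, end) at the split points (assumed sorted and inside the range).
    static List<String[]> split(String start, String end, List<String> splitPoints) {
        List<String[]> parts = new ArrayList<>();
        String from = start;
        for (String point : splitPoints) {
            parts.add(new String[] {from, point});
            from = point;
        }
        parts.add(new String[] {from, end});
        return parts;
    }

    // Steps 2 to 4: transform each part in the pool, collect into a sorted map,
    // and return only when every part has finished.
    static NavigableMap<String, String> seek(NavigableMap<String, String> source,
            String start, String end, List<String> splitPoints) {
        NavigableMap<String, String> sorted = new ConcurrentSkipListMap<>();
        List<Future<?>> pending = new ArrayList<>();
        for (String[] part : split(start, end, splitPoints)) {
            // Each task reads its own copy; this stands in for source.deepCopy(...).
            NavigableMap<String, String> copy = new TreeMap<>(source);
            pending.add(POOL.submit(() ->
                    copy.subMap(part[0], true, part[1], false)
                        .forEach((k, v) -> sorted.put(k, v.toUpperCase()))));
        }
        for (Future<?> f : pending) {
            try {
                f.get(); // block until this part is done
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> source = new TreeMap<>();
        source.put("a", "x");
        source.put("b", "y");
        source.put("c", "z");
        source.put("d", "w");
        System.out.println(seek(source, "a", "e", Arrays.asList("c"))); // prints {a=X, b=Y, c=Z, d=W}
    }
}
```

Because the results land in a sorted map before the seek returns, the keys come back in sorted order no matter which thread finishes first.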
Pipeline Example
[Diagram labels: Iterator; Pipeline; Work Queue; Thread Pool; Source Pool; Source (×2); deepCopy; K,V in; K’,V’ out]
1: On each call to the Pipeline, the work queue is filled
2: The work queue is FIFO, and the size must be capped
3: The thread pool size must be capped
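A capped FIFO queue in front of a capped thread pool might look like the following. This is a minimal sketch: `Pipeline`, `MAX_THREADS`, and `MAX_QUEUED` are hypothetical names, the per-key work is a trivial function, and the source pool is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class PipelineSketch {
    static final int MAX_THREADS = 2; // cap on the thread pool
    static final int MAX_QUEUED = 4;  // cap on the work queue

    private static final ExecutorService POOL = Executors.newFixedThreadPool(MAX_THREADS, r -> {
        Thread t = new Thread(r, "pipeline");
        t.setDaemon(true);
        return t;
    });

    static final class Pipeline<K, R> {
        private final Iterator<K> input;        // sorted keys from the source
        private final Function<K, R> evaluate;  // per-key work
        private final Deque<Future<R>> queue = new ArrayDeque<>(); // FIFO keeps input order

        Pipeline(Iterator<K> input, Function<K, R> evaluate) {
            this.input = input;
            this.evaluate = evaluate;
        }

        // Top the queue up to its cap on every call.
        private void fill() {
            while (queue.size() < MAX_QUEUED && input.hasNext()) {
                K key = input.next();
                Callable<R> task = () -> evaluate.apply(key);
                queue.addLast(POOL.submit(task));
            }
        }

        boolean hasNext() {
            fill();
            return !queue.isEmpty();
        }

        // Waits for the oldest queued item, so results come out in input order.
        R next() {
            fill();
            try {
                return queue.removeFirst().get();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }
    }

    static List<String> run(List<Integer> docIds) {
        Pipeline<Integer, String> pipeline = new Pipeline<>(docIds.iterator(), id -> "doc" + id);
        List<String> out = new ArrayList<>();
        while (pipeline.hasNext()) {
            out.add(pipeline.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(1, 2, 3, 4, 5, 6)));
    }
}
```

Taking results from the head of the queue, rather than whichever task finishes first, is what keeps the returned keys sorted.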
Long	running	scans
• If a seek or next call can take a really long time and starve other scans within the same thread pool, try implementing the YieldingKeyValueIterator interface
– enableYielding(YieldCallback)
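Yielding Lifecycle
[Sequence diagram between the Tablet Server, the Iterator, and the Yield Callback]
• Tablet Server → Iterator: init
• Tablet Server → Iterator: enableYielding(callback)
• Tablet Server → Iterator: seek or a following next
• Iterator → Yield Callback: yield(k)
• Tablet Server → Yield Callback: hasYielded
• Tablet Server → Yield Callback: getPositionAndReset
• Tablet Server → Iterator: seek(new Range(k, false, …), …)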
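The yielding lifecycle can be sketched end to end with stand-in classes. This is a minimal sketch, not Accumulo code: `Callback`, `FilterIterator`, and the per-call `BUDGET` are hypothetical, `yieldAt` mirrors the `yield(k)` call on the slide, and the `scan` method plays the tablet server's role.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class YieldingSketch {
    // Hypothetical stand-in for the yield callback.
    static final class Callback {
        private String position;

        void yieldAt(String key) { position = key; }
        boolean hasYielded() { return position != null; }

        String getPositionAndReset() {
            String p = position;
            position = null;
            return p;
        }
    }

    // Returns only keys ending in "!"; yields after examining BUDGET keys without a match.
    static final class FilterIterator {
        static final int BUDGET = 3;
        private final NavigableSet<String> source;
        private Callback callback;
        private Iterator<String> cursor;
        private String top;

        FilterIterator(NavigableSet<String> source) { this.source = source; }

        void enableYielding(Callback callback) { this.callback = callback; }

        void seek(String start, boolean inclusive) {
            cursor = source.tailSet(start, inclusive).iterator();
            next();
        }

        void next() {
            top = null; // hasTop() stays false if we yield
            String last = null;
            int examined = 0;
            while (cursor.hasNext()) {
                if (callback != null && examined == BUDGET) {
                    callback.yieldAt(last); // give up the thread at the last key examined
                    return;
                }
                last = cursor.next();
                examined++;
                if (last.endsWith("!")) {
                    top = last;
                    return;
                }
            }
        }

        boolean hasTop() { return top != null; }
        String getTopKey() { return top; }
    }

    // Plays the tablet server: checks for a yield after each seek or next and re-seeks.
    static List<String> scan(NavigableSet<String> data) {
        List<String> out = new ArrayList<>();
        Callback callback = new Callback();
        FilterIterator it = new FilterIterator(data);
        it.enableYielding(callback);
        it.seek("", true);
        while (true) {
            if (callback.hasYielded()) {
                // Resume just past the yielded key, as in seek(new Range(k, false, …), …).
                it.seek(callback.getPositionAndReset(), false);
            } else if (it.hasTop()) {
                out.add(it.getTopKey());
                it.next();
            } else {
                return out;
            }
        }
    }

    public static void main(String[] args) {
        NavigableSet<String> data = new TreeSet<>(
                Arrays.asList("a", "b", "c", "d!", "e", "f", "g", "h", "i!"));
        System.out.println(scan(data)); // prints [d!, i!]
    }
}
```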
Our	Use	Case
• Boolean	queries	against	a	document	store
• Documents	are	composed	of	field,	value	tuples
• Iterator	takes	a	boolean	query	with	a	range,	and	returns	
documents	that	match	the	query
Data Structure
[Diagram: a Global Index alongside a Document Table split into shards (Shard_1, …); each shard holds a Field Index and Documents]
Global Index: fv -> fn -> shard_x
Field Index: fn,fv -> doc_n
Documents: Doc_n -> fn, fv
Data	Structure
• Global	Index
• Maps	values	to	field	names	and	then	to	shards
• Field	Index
– Within a shard, maps field name, value pairs to doc ids
• Documents: indexed by shard id + doc id
– Within a shard, maps doc id to field name, value tuples
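The three mappings can be modeled with nested sorted maps to show how a single field, value term is resolved. This is a minimal in-memory sketch with hypothetical names; the real structures are table rows, not Java maps, and the `fieldName=fieldValue` string encoding is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class IndexSketch {
    // Global index: field value -> field name -> shard ids.
    final Map<String, Map<String, Set<String>>> globalIndex = new TreeMap<>();
    // Field index, per shard: shard id -> "fieldName=fieldValue" -> doc ids.
    final Map<String, Map<String, Set<String>>> fieldIndex = new TreeMap<>();
    // Documents, per shard: shard id -> doc id -> "fieldName=fieldValue" tuples.
    final Map<String, Map<String, List<String>>> documents = new TreeMap<>();

    void add(String shard, String docId, String field, String value) {
        String tuple = field + "=" + value;
        globalIndex.computeIfAbsent(value, k -> new TreeMap<>())
                   .computeIfAbsent(field, k -> new TreeSet<>()).add(shard);
        fieldIndex.computeIfAbsent(shard, k -> new TreeMap<>())
                  .computeIfAbsent(tuple, k -> new TreeSet<>()).add(docId);
        documents.computeIfAbsent(shard, k -> new TreeMap<>())
                 .computeIfAbsent(docId, k -> new ArrayList<>()).add(tuple);
    }

    // Resolves field=value to "shard/docId" hits: global index first to find the shards,
    // then each shard's field index to find the doc ids.
    List<String> lookup(String field, String value) {
        List<String> hits = new ArrayList<>();
        Map<String, Set<String>> byField =
                globalIndex.getOrDefault(value, Collections.emptyMap());
        Set<String> shards = byField.getOrDefault(field, Collections.emptySet());
        for (String shard : shards) {
            Set<String> docIds = fieldIndex.get(shard)
                    .getOrDefault(field + "=" + value, Collections.emptySet());
            for (String docId : docIds) {
                hits.add(shard + "/" + docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        IndexSketch index = new IndexSketch();
        index.add("shard_1", "doc_1", "color", "red");
        index.add("shard_1", "doc_2", "color", "blue");
        index.add("shard_2", "doc_3", "color", "red");
        System.out.println(index.lookup("color", "red")); // prints [shard_1/doc_1, shard_2/doc_3]
    }
}
```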
Query Iterator
[Diagram: the Iterator takes ID, Query, Shard Range; a Field Index stage (may pre-fetch) passes Doc ids to an Evaluation Pipeline (Query, Range, Source); each stage returns K,V or yields K]
Metrics
• Emit real-time metrics from the tservers
– See Timely
• What to count
– deepCopies
– next and seek calls
– Specific iterators
• All of our metrics are tagged with the query id
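Counting calls per query id needs very little machinery. This is a minimal sketch with hypothetical names; it only keeps in-process counters and does not model sending them to Timely or any other metrics store.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class MetricsSketch {
    // Counter name -> count; the name carries the query id as a tag.
    private static final Map<String, LongAdder> COUNTERS = new ConcurrentHashMap<>();

    private static String name(String queryId, String event) {
        return event + "{queryId=" + queryId + "}";
    }

    // Call from seek, next, and deepCopy in the iterators you want to watch.
    static void count(String queryId, String event) {
        COUNTERS.computeIfAbsent(name(queryId, event), k -> new LongAdder()).increment();
    }

    static long value(String queryId, String event) {
        LongAdder adder = COUNTERS.get(name(queryId, event));
        return adder == null ? 0 : adder.sum();
    }

    public static void main(String[] args) {
        count("q-42", "seek");
        count("q-42", "next");
        count("q-42", "next");
        count("q-42", "deepCopy");
        System.out.println(value("q-42", "next") + " next calls for q-42"); // prints 2 next calls for q-42
    }
}
```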
Debugging	iterators
• Unit	Tests:
– Iterator test framework
– Minicluster
• Scan options
– Include ids with your scans that you can track
• Logging
Debugging	iterators
• Metrics
– Include ids
• Use table context on a cloned table
– Allows testing a separate deployment at scale
• Attaching to tservers
– Do not stop the process when you hit a breakpoint
Future
• Prioritization	of	scans
– Set at scan time
– Change priority dynamically
• Invoking a close method
– Automatically once hasTop is false
– Invoked externally when client has left
