Lessons Learned
OPTIMIZING ACCUMULO AS A BACKEND FOR USER APPLICATIONS
Common Web Architecture
Problems With the Web Architecture
Volume
Response Time
Why Accumulo
Ability to Scale
Visibility Built In
Flexible Storage Model
Accumulo’s Shortcomings
Indices are not built in
No SLA for response time
Storage technique can greatly affect scan response time
Large datasets can take minutes to fully return
Solutions to Specific Issues
Cache query results
Increase speed of scans
Data sampling
Returning partial results/pagination
Caching Results
Problem
◦ No SLA for response time
◦ Many users issue the same queries
Benefit
◦ Results of common queries can be stored
◦ Reduce the number of times Accumulo is scanned
Requirement
◦ Mirror Accumulo’s sort order
◦ Implement visibility
Solution
◦ Redis
Caching Results: Redis
Open source
In-memory data structure store
Supports multiple data structures
Built-in replication
Caching Results: Sorted Set
Redis offers several data structures, but the goal is to replicate Accumulo’s behavior easily
The best choice is a sorted set, though not for the most obvious reason
Default behavior is to sort based on each element’s score
◦ Secondary behavior: elements with the same score are sorted lexicographically
◦ We can exploit this by giving every element the same score, which mimics Accumulo’s sort order in Redis
Caching Results: Adding Elements
redis> ZADD accumulo 1 apple
(integer) 1
redis> ZADD accumulo 1 banana
(integer) 1
redis> ZADD accumulo 1 beet
(integer) 1
redis> ZADD accumulo 1 avocado
(integer) 1
Caching Results: Retrieving Elements
redis> ZRANGEBYLEX accumulo - +
1) "apple"
2) "avocado"
3) "banana"
4) "beet"
redis> ZRANGEBYLEX accumulo [avocado (beet
1) "avocado"
2) "banana"
Caching Results: Reverse Order
redis> ZREVRANGEBYLEX accumulo + -
1) "beet"
2) "banana"
3) "avocado"
4) "apple"
redis> ZREVRANGEBYLEX accumulo (beet [avocado
1) "banana"
2) "avocado"
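The range queries above can be modeled in plain Python to make the bound syntax concrete. This is a minimal sketch, not Redis code: the function names are hypothetical, it assumes every member shares one score, and it assumes ASCII members (Redis compares raw bytes, Python compares code points; the two agree for ASCII).

```python
import bisect


def zrangebylex(members, lo, hi):
    """Model ZRANGEBYLEX for a sorted set whose members all share one score.

    Bounds follow the Redis syntax: "-" and "+" are the open ends,
    "[x" is inclusive and "(x" is exclusive.
    """
    ordered = sorted(set(members))  # equal scores fall back to lexicographic order

    if lo == "-":
        start = 0
    elif lo.startswith("["):
        start = bisect.bisect_left(ordered, lo[1:])   # include lo itself
    else:
        start = bisect.bisect_right(ordered, lo[1:])  # skip lo itself

    if hi == "+":
        stop = len(ordered)
    elif hi.startswith("["):
        stop = bisect.bisect_right(ordered, hi[1:])   # include hi itself
    else:
        stop = bisect.bisect_left(ordered, hi[1:])    # skip hi itself

    return ordered[start:stop]


def zrevrangebylex(members, hi, lo):
    """Model ZREVRANGEBYLEX: same range, upper bound first, reversed output."""
    return zrangebylex(members, lo, hi)[::-1]
```

Calling `zrangebylex(["apple", "banana", "beet", "avocado"], "[avocado", "(beet")` selects the same two members as the slide's second query.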
Caching Results: Enforcing Visibility
redis> ZADD accumulo:visibility 1 pear
(integer) 1
redis> ZADD accumulo:visibility 1 plum
(integer) 1
redis> KEYS *
1) "accumulo:visibility"
2) "accumulo"
Encode the visibility label in the key!
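One way this key scheme might be used from a client is sketched below. The helper names are hypothetical, and the sketch assumes each cached entry carries at most one visibility label and that unlabeled entries live under the bare table key; full visibility expressions are not handled.

```python
def cache_key(table, visibility=""):
    """Build the Redis key for one visibility label, e.g. 'accumulo:visibility'."""
    return f"{table}:{visibility}" if visibility else table


def readable_keys(table, authorizations):
    """Keys a user may read: the unlabeled key plus one key per authorization held.

    Assumption: single-label visibilities only (no '&' or '|' expressions).
    """
    return [cache_key(table)] + [cache_key(table, a) for a in sorted(authorizations)]
```

A client would then query only the keys returned by `readable_keys` for the current user, so entries stored under labels the user does not hold are never read.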
Increase Speed of Scans
Problem
◦ Trying to return too much data
◦ Storage is not optimal
Benefit
◦ Lowering response time for Accumulo access
Requirement
◦ Minimal effort to reduce scan time
Solutions
◦ Combine and compress results
◦ Store results more efficiently
Increase Speed of Scans: Combine and Compress
RowID    CF:CQ              Vis  Value
apple    fruit:produce      []   10
avocado  fruit:produce      []   4
banana   fruit:produce      []   11
beet     vegetable:produce  []   3
Results will be returned in order: 10, 4, 11, 3
Using an iterator to combine the results and compress them will result in less network traffic
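The combine-and-compress idea can be illustrated outside Accumulo. In a real deployment this work would sit in a server-side iterator; the Python below is only a sketch of the data transformation, and the function names, the JSON encoding, and the use of zlib are assumptions rather than anything the slides specify.

```python
import json
import zlib

# (row ID, value) pairs in the order a scan would return them.
ROWS = [("apple", 10), ("avocado", 4), ("banana", 11), ("beet", 3)]


def combine_and_compress(entries):
    """Fold many small entries into one JSON document, then compress it."""
    combined = json.dumps(dict(entries), separators=(",", ":")).encode("utf-8")
    return zlib.compress(combined)


def decompress(payload):
    """Client-side inverse: recover the row-to-value mapping."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))
```

The client receives one payload instead of four separate entries. Compression only pays off on realistically sized results; a four-row example like this one is too small to shrink.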
Increase Speed of Scans: Store More Efficiently
RowID      CF:CQ               Vis  Value
fruit      department:produce  []   {"apple":10, "avocado":4, "banana":11}
vegetable  department:produce  []   {"beet":3}
Similar to the previous solution, except there is no need for an iterator
Greatly reduces the number of next() and seek() calls
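A sketch of how the one-entry-per-item layout could be repacked into the one-row-per-group layout shown above, done before the data is written. The function name and the tuple shape are hypothetical; the grouping key is the column family from the earlier table.

```python
import json
from collections import defaultdict

# (item, group, count) triples taken from the one-entry-per-item layout.
ENTRIES = [
    ("apple", "fruit", 10),
    ("avocado", "fruit", 4),
    ("banana", "fruit", 11),
    ("beet", "vegetable", 3),
]


def pack_by_group(entries):
    """Return {row ID: JSON value}, with one row per group instead of per item."""
    grouped = defaultdict(dict)
    for item, group, count in entries:
        grouped[group][item] = count
    return {group: json.dumps(items) for group, items in grouped.items()}
```

Four entries become two rows, so a scan over the same data touches fewer keys.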
Data Sampling
Problem
◦ Data being retrieved is too large
◦ Users may only want to see a representative sample
Benefit
◦ Quickly return results
Requirement
◦ Know how much data to return
Solution
◦ Reservoir sampling
Data Sampling: Reservoir Sampling
Randomly chooses a sample of k items from a list S containing n items, where n is either a very large or unknown number
A sample algorithm
◦ Keep the first k items in memory
◦ For the i-th item, where i > k:
  ◦ with probability k/i, keep the new item (discard an old one, selecting which to replace at random, each with chance 1/k)
  ◦ with probability 1 − k/i, keep the old items (ignore the new one)
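The steps above can be sketched in Python as follows. The function name and the optional `rng` parameter are illustrative choices; the logic draws one random index per item, which covers both the k/i acceptance test and the uniform choice of which old item to replace.

```python
import random


def reservoir_sample(items, k, rng=random):
    """Return k items chosen uniformly at random from an iterable of unknown length."""
    reservoir = []
    for i, item in enumerate(items, start=1):
        if i <= k:
            # Keep the first k items in memory.
            reservoir.append(item)
        else:
            # j is uniform over [0, i); it falls below k with probability k/i.
            j = rng.randrange(i)
            if j < k:
                # Given j < k, each slot is hit with chance 1/k.
                reservoir[j] = item
    return reservoir
```

Because the input is consumed one item at a time, this works on a scan whose total size is not known in advance, and memory use stays at k items. If the input holds fewer than k items, all of them are returned.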
Partial Results/Pagination
Problem
◦ Data being retrieved is too large
Benefit
◦ Quickly return results
◦ Cache results
◦ Show fraction of entire result
Requirement
◦ Know how much data to return
Solution
◦ Multithreaded client caching results in Redis
Partial Results: Multithreaded Client
Immediately return the first n results
In the background, load the remaining results into the cache
As users want more data, load more data from cache
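A minimal sketch of this pattern using Python threads. The function name is hypothetical, a plain list stands in for the Redis cache, and any iterable stands in for the Accumulo scan; a real client would write to Redis and handle scan errors.

```python
import threading
from itertools import islice


def fetch_first_then_cache(scan, n, cache):
    """Return the first n results now; load the rest into `cache` in the background.

    `scan` is any iterable of results and `cache` is a list standing in for
    the Redis cache. The returned thread can be joined to wait for the load.
    """
    results = iter(scan)
    first = list(islice(results, n))  # returned to the user immediately

    def load_remaining():
        cache.extend(results)         # drains whatever the scan still holds

    worker = threading.Thread(target=load_remaining, daemon=True)
    worker.start()
    return first, worker
```

The caller can render `first` right away and serve later pages from the cache once the worker has filled it.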
Pagination: Redis Cache
redis> ZRANGE accumulo 0 1
1) "apple"
2) "avocado"
redis> ZRANGE accumulo 2 4
1) "banana"
2) "beet"
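The index arithmetic behind these calls can be written as a small helper. The function name and the zero-based page numbering are assumptions; the only Redis fact relied on is that both ZRANGE indices are inclusive.

```python
def page_bounds(page, page_size):
    """Map a zero-based page number to inclusive ZRANGE start and stop indices."""
    start = page * page_size
    stop = start + page_size - 1  # ZRANGE includes the stop index
    return start, stop
```

With a page size of 2, page 0 maps to `0 1` and page 1 maps to `2 3`. The second call above uses a stop index of 4, which returns the same two elements here only because the set has four members and Redis clamps out-of-range indices.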
Known Gaps
Cache
◦ Redis is single-threaded
◦ Cache invalidation when the underlying data changes
◦ Support for full visibility expressions
Pagination
◦ Starting from the middle of a set
◦ Knowing all the pages within a dataset before the entire set is read
Questions
Contact Info
Email: zachary.radtka@gmail.com
Twitter: @zachradtka
