Comparing Accumulo, Cassandra, and HBase
We’ll	look	at
• Architecture
• Data	Model
• Security
• Query	Support
• Gotchas
• Popularity
Accumulo
Apache	Accumulo is	a	computer	software	project	
that	developed	a	sorted,	distributed	key-value	store	
based	on	the	BigTable technology	from	Google.
Known to store on the order of 100 trillion (10^14)
entries in a single table. A single instance can run on
thousands of machines, sometimes over multiple
instances of HDFS.
http://www.pdl.cmu.edu/SDI/2013/slides/big_graph_nsa_rd_2013_56002v1.pdf
Architecture
[Diagram: service layer and storage layer]
New	Tablet	Servers	join	the	cluster	immediately,	and	can	
begin	serving	requests.	Data	is	replicated	underneath	
asynchronously
Architecture
Fans of the CAP theorem will recognize that
Accumulo (like BigTable) is a CP system.
Recovery	from	single	node	failure	means	some	
amount	of	unavailability	while	the	WAL	is	replayed	
for	some	number	of	tablets.
Each	key	is	managed	by	exactly	one	server,	ensuring	
high	consistency.
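The "exactly one server per key" property can be sketched as a lookup over sorted split points. This is an illustration only: the split points, server names, and function names below are made up, and a real cluster resolves tablets through its metadata table rather than an in-memory list.

```python
import bisect

# Hypothetical split points. Each tablet covers rows up to and including
# its end row: (-inf, "g"], ("g", "n"], ("n", "t"], ("t", +inf).
SPLITS = ["g", "n", "t"]

# Hypothetical assignment: one tablet server per tablet.
ASSIGNMENT = ["ts1", "ts2", "ts1", "ts3"]


def tablet_index(row: str) -> int:
    """Return the index of the single tablet whose range contains `row`."""
    return bisect.bisect_left(SPLITS, row)


def server_for(row: str) -> str:
    """Every row maps to exactly one tablet, hence exactly one server."""
    return ASSIGNMENT[tablet_index(row)]
```

Because the ranges never overlap, two servers never accept writes for the same key. The cost is that the keys of a failed server are unavailable until its tablets are reassigned.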
Architecture
Accumulo has	a	metadata	table	that	can	split	and	
grow	to	make	it	possible	to	keep	track	of	a	huge	
number	of	user	tablets.
Data	Model
Adds	a	new	component,	‘Visibility’,	to	the	original	BigTable
model
Same	as	BigTable’s but	with	the	addition	of	a	Column	Visibility
Data	Model:	Column	Visibility
• “1st class”	component	in	data	model
• Security	filtering	is	implemented	in	a	system	
iterator	that	can’t	be	turned	off
• Column	Visibilities	are	stored	as	human	readable	
strings	– no	mapping	that	could	introduce	
confusion
Data	Model:	Locality	Groups
• Column	families	can	be	created	dynamically
• Column	families	are	put	into	the	default	locality	
group	until	assigned	otherwise
• Column	Families	can	be	reassigned	to	locality	
groups	as	needed
• Provides	columnar	scanning	optimization
• Locality groups are simply sections within RFiles
• no	additional	overhead	on	HDFS	NameNode
• no	restrictions	on	column	family	names
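A rough sketch of how locality groups give a columnar scanning optimization. The class and method names are hypothetical, and groups are assigned before writing for simplicity; it models only the idea that a scan reads just the file sections holding the requested column families.

```python
from collections import defaultdict

DEFAULT_GROUP = "default"


class LocalityGroups:
    """Toy model: one list of entries per locality group (a 'section')."""

    def __init__(self):
        self.family_to_group = {}           # column family -> group name
        self.sections = defaultdict(list)   # group name -> entries

    def assign(self, group, families):
        """Move the given column families into a named locality group."""
        for family in families:
            self.family_to_group[family] = group

    def write(self, row, family, qualifier, value):
        """Unassigned families land in the default locality group."""
        group = self.family_to_group.get(family, DEFAULT_GROUP)
        self.sections[group].append((row, family, qualifier, value))

    def scan(self, families):
        """Read only the sections that can contain the requested families."""
        groups = sorted({self.family_to_group.get(f, DEFAULT_GROUP)
                         for f in families})
        hits = [e for g in groups for e in self.sections[g]
                if e[1] in families]
        return groups, sorted(hits)
```

A scan over a small family such as `title` then never touches the section that stores large `body` values.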
Data	Model:	Locality	Groups
Security
• On	by	default
• User,	group/role,	access	control
• “Cell-level”	security	via	Column	Visibilities
• Fail-safe
• Security label operators include & (and) and | (or)
A common problem for new users is writing data they can’t
even see.
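A minimal sketch of evaluating a visibility label against a user's authorizations. This is not Accumulo's implementation: the function name and token grammar are assumptions, and the sketch evaluates left to right, so mixed & and | should be parenthesized explicitly.

```python
import re

# Assumed token grammar: label names, the two operators, and parentheses.
TOKEN = re.compile(r"[A-Za-z0-9_.:\-]+|[&|()]")


def visible(expression: str, auths: set) -> bool:
    """Return True if `auths` satisfies a label such as 'admin&(audit|ops)'."""
    if expression == "":
        return True  # assumption: an unlabeled entry is visible to everyone
    tokens = TOKEN.findall(expression)
    pos = 0

    def parse_expr() -> bool:
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] in "&|":
            op = tokens[pos]
            pos += 1
            rhs = parse_term()
            result = (result and rhs) if op == "&" else (result or rhs)
        return result

    def parse_term() -> bool:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = parse_expr()
            pos += 1  # skip the closing ")"
            return inner
        return tok in auths  # a bare label: the user must hold it

    return parse_expr()
```

The "data they can't even see" problem falls out directly: an entry written with the label `secret` is invisible to a writer whose authorization set does not contain `secret`.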
Query	Support
• No	secondary	indexing	capabilities	built	in
• Several	secondary	indexing	patterns	are	well	
supported
• Support	for	scanning	long	rows
• BatchScanner for	fetching	results
• Iterators	enable	storing	index	entries	with	records
• No	official	query	language
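One common secondary indexing pattern is a separate index table whose sorted keys start with the indexed value and end with the record id. The sketch below uses in-memory structures in place of real tables; all names are hypothetical and values are assumed to be strings.

```python
import bisect

records = {}   # main table: record id -> {field: value}
index = []     # index table: sorted list of (field, value, record id)


def put(record_id, fields):
    """Write the record and one index entry per field."""
    records[record_id] = fields
    for field, value in fields.items():
        bisect.insort(index, (field, value, record_id))


def lookup(field, value):
    """Scan the index for matching ids, then fetch the records in a batch."""
    start = bisect.bisect_left(index, (field, value, ""))
    ids = []
    for f, v, record_id in index[start:]:
        if (f, v) != (field, value):
            break  # sorted order: past the last matching index entry
        ids.append(record_id)
    return [records[record_id] for record_id in ids]
```

The second step, fetching many unrelated record ids at once, is the kind of access a BatchScanner is meant for.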
Some	Gotchas
• Usually	requires	tuning	beyond	what	distros	provide
• Not	balancing	clients	and	tablet	servers
• Having	many	small	tables	vs	few	large	tables
• Fewer free online resources (blog posts,
tutorials, forums, etc.) than the alternatives
• Larger	individual	servers	mean	that	server	failure	can	
result	in	a	large	amount	of	data	needing	to	be	
replicated.	Accumulo only	needs	to	process	recent	
write-ahead	log	entries,	however,	before	everything	is	
back	online.
Popularity
• 4th most	popular	‘wide	column	store’,	behind	
Cassandra,	HBase,	and	Microsoft	Azure	Cosmos	DB
• About	a	16th as	popular	as	HBase
• 60th most	popular	DB	overall,	higher	than:
• Cloudant
• MemSQL
• Apache	Drill
• Oracle	NoSQL
• Amazon	Simple	DB
• LevelDB
• VoltDB
• Google	Cloud	BigTable
• MapR-DB
https://db-engines.com/en/system/Accumulo
Recent	Improvements	in	1.7
• Client	Authentication	with	Kerberos
• Data-Center	Replication
• User-Initiated	Compaction	Strategies
• API	Clarification
• Faster	Startup	via	Configurable	Threadpool Size	for	Assignments
• Group-Commit	Threshold	as	a	Factor	of	Data	Size
• Balancing	Groups	of	Tablets
• User-specified	Durability
• Hadoop	Metrics2	Support
• Distributed Tracing with HTrace
• Per-Table	Volume	Chooser
• Table	and	namespace	custom	properties
Recent	Improvements	in	1.8
• Speed	up	WAL	roll	overs
• User level API for RFile
• Suspend	Tablet	assignment	for	rolling	restarts
• Run	multiple	Tablet	Servers	on	one	node
• Rate	limiting	Major	Compactions
• Table	Sampling
Speed	up	WAL	Rollovers
HBase
“Use	Apache	HBase™	when	you	need	random,	real-
time	read/write	access	to	your	Big	Data.	This	
project's	goal	is	the	hosting	of	very	large	tables	--
billions	of	rows	X	millions	of	columns	-- atop	clusters	
of	commodity	hardware.”
Billions × millions = quadrillions (10^15) of potential cells
Billions,	you	say?
Architecture
Basically	the	same	as	Accumulo’s.
HBase now stores the location of hbase:meta in ZooKeeper
rather than in a root table. The hbase:meta table doesn’t split.
Architecture
Facebook and others didn’t like the recovery time
associated with reading recent write-ahead log
entries, and so in 2015 (0.98) read replication was
introduced.
Hortonworks says this improves availability from 99.9% to 99.99%.[1]
Facebook claims its HydraBase design achieves 99.999% availability.[2]
1.	https://hortonworks.com/blog/apache-hbase-high-availability-next-level/
2.	https://code.facebook.com/posts/321111638043166/hydrabase-the-evolution-of-hbase-facebook/
Architecture
“With	read	replicas	enabled,	the	HMaster distributes	
read-only	copies	of	regions	(replicas)	to	different	
RegionServers in	the	cluster.	One	RegionServer
services	the	default	or primary replica,	which	is	the	
only	replica	which	can	service	write	requests.	If	the	
RegionServer servicing	the	primary	replica	is	down,	
writes	will	fail.”
Stale	reads	are	also	now	possible	and	unavoidable
https://www.cloudera.com/documentation/enterprise/5-4-x/topics/admin_hbase_read_replicas.html
Architecture
“...	make	sure	to	account	for	their	increased	heap	
memory	requirements.	Although	no	additional	
copies	of	HFile data	are	created,	read-only	replicas	
regions	have	the	same	memory	footprint	as	normal	
regions	and	need	to	be	considered	when	calculating	
the	amount	of	increased	heap	memory	required.”
https://www.cloudera.com/documentation/enterprise/5-4-x/topics/admin_hbase_read_replicas.html
Data	Model
Same	as	BigTable’s.
But	HBase does	not	implement	BigTable’s Locality	
Group	feature.	Each	column	family	is	a	directory	in	
HDFS,	effectively	separating	storage	similarly	to	a	
locality	group.
Column families must be declared via DDL
statements before use, and their names must be printable.
Data	Model:	Column	Families
A blog post from 2009 recommends fewer than 100 column
families:
"While	the	number	of	rows	and	columns	is	
theoretically	unbound	the	number	of	column	
families	is	not.	This	is	a	design	trade-off	but	does	not	
impose	too	much	restrictions	if	the	tables	and	key	
are	designed	accordingly.”
http://www.larsgeorge.com/2009/11/hbase-vs-bigtable-comparison.html
Data	Model:	Column	Families
“In	Cloud	Bigtable,	unlike	in	HBase,	you	can	use	up	to	
~100	column	families	while	maintaining	excellent	
performance.”
https://cloud.google.com/bigtable/docs/schema-design
Data	Model:	Column	Families
"HBase currently	does	not	do	well	with	anything	
above	two	or	three	column	families	so	keep	the	
number	of	column	families	in	your	schema	low."
http://hbase.apache.org/book.html#table_schema_rules_of_thumb
Data	Model:	Column	Families
"Try	to	make	do	with	one	column	family	if	you	can	in	
your	schemas."
http://hbase.apache.org/book.html#table_schema_rules_of_thumb
Data	Model:	Value	Sizes
"storing	10-50MB	objects	in	HBase would	probably	
be	too	much	to	ask”[1]
“Aim	to	have	cells	no	larger	than	10	MB,	or	50	MB	if	
you	use MOB”[2]
1. http://hbase.apache.org/0.94/book/supported.datatypes.html
2. http://hbase.apache.org/book.html#table_schema_rules_of_thumb
Security
• Typical	group	/	role	based	access	control
• Cell-level	control	was	added	in	2014,	using	the	co-
processor	mechanism
Security
• To use cell-level security:
• Ensure HBase is configured to use HFile version 3 storage
• VisibilityController must be added to the list of co-processors
• Set up the Hadoop Group Mapping mechanism
• By default, visibility labels are lost on replication
• ! (not) is included as an operator, making it more
important to ensure that clients can’t drop user
authorization tokens, to avoid elevation of privilege
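Why a "not" operator makes dropped tokens dangerous can be shown with a tiny model. The function below is hypothetical and is not HBase's evaluator; it handles only labels of the form `required&...&!forbidden`.

```python
def can_read(required: set, forbidden: set, auths: set) -> bool:
    """Model a label like 'staff&!contractor' against a user's tokens.

    `required` tokens must all be present; `forbidden` tokens (the ones
    behind a '!') must all be absent.
    """
    return required <= auths and not (forbidden & auths)


# A user holding both tokens is denied by 'staff&!contractor' ...
full_tokens = {"staff", "contractor"}
# ... but the same user gains access by presenting fewer tokens.
reduced_tokens = {"staff"}
```

With only & and |, presenting fewer tokens can never reveal more data. With !, it can, so the server must control which tokens a client presents.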
Query	Support
Despite	several	attempts,	it	appears	secondary	
indexing	is	handled	most	often	outside	of	HBase,	
using	Solr
https://www.cloudera.com/documentation/enterprise/5-6-x/topics/search.html
Gotchas
Covered	above
Popularity
• 2nd most	popular	‘wide	columnar	store’
• About	half	as	popular	as	Cassandra
• Still	growing	in	popularity,	but	growth	has	slowed
https://db-engines.com/en/system/HBase
Cassandra
“The	Apache	Cassandra	database	is	the	right	choice	
when	you	need	scalability	and	high	availability	
without	compromising	performance. Linear	
scalability	and	proven	fault-tolerance	on	commodity	
hardware	or	cloud	infrastructure	make	it	the	perfect	
platform	for	mission-critical	data.	Cassandra's	
support	for	replicating	across	multiple	datacenters	is	
best-in-class,	providing	lower	latency	for	your	users	
and	the	peace	of	mind	of	knowing	that	you	can	
survive	regional	outages”
Architecture
• Tries	to	combine	parts	of	BigTable and	Amazon’s	
Dynamo
• Designed	to	span	data	centers,	allows	users	to	
choose	between	CP	and	AP
• Every node is the same: no masters, no ZooKeeper,
storage is coupled with service
• Each server still uses a memtable, SSTable files on
disk, compaction, sorting, etc.
• Uses the ‘gossip’ peer-to-peer protocol
Architecture:	Consistent	Hashing
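A minimal sketch of a consistent-hashing ring with one token per node. MD5 is used here only as a convenient stand-in hash, and the class and method names are made up for illustration.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key or node name to a position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class Ring:
    def __init__(self, nodes):
        # Sorted (token, node) pairs; a node owns the range ending at its token.
        self.tokens = sorted((token(n), n) for n in nodes)

    def _start(self, key: str) -> int:
        """Index of the first node at or after the key's token, wrapping."""
        i = bisect.bisect_left(self.tokens, (token(key), ""))
        return i % len(self.tokens)

    def owner(self, key: str) -> str:
        return self.tokens[self._start(key)][1]

    def replicas(self, key: str, rf: int):
        """Walk clockwise from the owner to pick `rf` distinct nodes."""
        start = self._start(key)
        count = min(rf, len(self.tokens))
        return [self.tokens[(start + i) % len(self.tokens)][1]
                for i in range(count)]
```

When a node joins, the only keys that change owner are those that move to the new node; every other key stays where it was.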
Data	Model
• Same	as	BigTable,	but	interaction	with	the	data	
model	directly	has	been	completely	eclipsed	by	a	
more	table-like	abstraction	as	part	of	CQL
• Table	schemas	have	to	be	declared	up	front
• Now	features	‘partition	keys’,	‘clustering	columns’,	
and	other,	regular	columns
Data	Model
• ‘Partition	keys’	determine	to	which	partition	and	
server	a	row	belongs
• ‘Clustering	columns’	determine	how	other	columns	
are	grouped	within	a	partition
Data	Model
Videos by User (K = partition key, C = clustering column)
user_id             K
uploaded_timestamp  C
video_id            C
email
first_name
last_name
title
description
Data Model
Column              Role
user_id             partition key
uploaded_timestamp  clustering column
video_id            clustering column
email               column
first_name          column
last_name           column
title               column
description         column

Row                                          Col          Value
hash(user_id)::uploaded_timestamp::video_id  email        …
hash(user_id)::uploaded_timestamp::video_id  first_name   …
hash(user_id)::uploaded_timestamp::video_id  last_name    …
hash(user_id)::uploaded_timestamp::video_id  title        …
hash(user_id)::uploaded_timestamp::video_id  description  …
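The mapping from a declared table row to the underlying key-value entries can be sketched as below. The hash function, the `::` separator, and the sample values are illustrative assumptions, not Cassandra's storage format.

```python
import hashlib

REGULAR_COLUMNS = ["email", "first_name", "last_name", "title", "description"]


def partition_hash(user_id: str) -> str:
    """Stand-in for the partitioner: a short, stable hash of the partition key."""
    return hashlib.md5(user_id.encode()).hexdigest()[:8]


def flatten(row: dict) -> list:
    """Turn one table row into (row key, column, value) entries."""
    key = "::".join([
        partition_hash(row["user_id"]),   # partition key, hashed
        row["uploaded_timestamp"],        # clustering column
        row["video_id"],                  # clustering column
    ])
    return [(key, column, row[column]) for column in REGULAR_COLUMNS]


example = {
    "user_id": "u1", "uploaded_timestamp": "2017-05-01T12:00:00",
    "video_id": "v42", "email": "a@example.com", "first_name": "Ann",
    "last_name": "Lee", "title": "Demo", "description": "A sample video",
}
entries = flatten(example)
```

Every regular column of a row shares the same key prefix, so the entries for one video sort next to each other within the user's partition.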
Data	Model
• All	dynamism	appears	to	be	absent	from	the	data	
model
• All	data	has	to	be	modeled	up	front
• Columns	are	even	typed
Security
• Cassandra supports a security model similar to that
of relational databases, which supports controlling
access to keyspaces, tables, and rows.[1]
• Row	level	access	relies	on	exact	string	matches.[2]
• Column	level	permissions	are	not	yet	
implemented.[3]
1	http://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/security/secPermissions.html
2	http://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/security/secRlac.html
3	https://issues.apache.org/jira/browse/CASSANDRA-12859
Query	Support
• Cassandra	features	a	SQL	like	language	called	CQL
• Used	to	specify	table	schemas
• Used	for	querying
• Also	features	built-in	secondary	indexing
Query	Support:	Index	distribution
Query	Support
• hash(user_id)	implies	no	range	scans	over	partition	
keys
• only	lookups	on	partition	keys	or	clustering	
columns	are	efficient	without	secondary	indexing
Row                                          Col          Value
hash(user_id)::uploaded_timestamp::video_id  email        …
hash(user_id)::uploaded_timestamp::video_id  first_name   …
hash(user_id)::uploaded_timestamp::video_id  last_name    …
hash(user_id)::uploaded_timestamp::video_id  title        …
hash(user_id)::uploaded_timestamp::video_id  description  …
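A small model of why partition keys support only exact lookups while clustering columns support range scans. A Python dict plays the role of the hashed partition lookup; the function names and sample data are hypothetical.

```python
import bisect
from collections import defaultdict

# partition key -> sorted list of (uploaded_timestamp, video_id)
partitions = defaultdict(list)


def insert(user_id, uploaded_timestamp, video_id):
    """Rows are kept sorted by clustering columns inside each partition."""
    bisect.insort(partitions[user_id], (uploaded_timestamp, video_id))


def videos_between(user_id, start, end):
    """Exact lookup on the partition key, then a range scan on the timestamp."""
    rows = partitions.get(user_id, [])
    lo = bisect.bisect_left(rows, (start,))   # first row with timestamp >= start
    hi = bisect.bisect_left(rows, (end,))     # first row with timestamp >= end
    return rows[lo:hi]


insert("alice", "2017-01-05", "v1")
insert("alice", "2017-03-01", "v2")
insert("alice", "2017-02-10", "v3")
insert("bob", "2017-02-01", "v9")
```

There is no equivalent of `videos_between` across user ids: hashed partitions have no useful order, so "all users from a to m" would mean visiting every partition.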
Query	Support:	Caveats
• A query that does not include the partition key requires
filtering (not allowed by default)
• It is possible to order partition keys to allow range
scans, but this is not on by default
• A query that does not include all the clustering columns
is rejected unless filtering is on
• Some range scans on clustering columns are possible
Query	Support:	Caveats
• Queries	on	secondary	index	restricted	to	equality	
or	‘contains’
• Non-indexed	columns	can	be	part	of	a	query	with	
additional	filtering
• When	querying	secondary	index,	Cassandra	must	
query	all	partitions	but	avoids	naively	querying	all	
at	once
Query	Support:	rounds	strategy
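A simplified sketch of querying a secondary index in rounds rather than hitting every partition range at once. The doubling of concurrency per round is an assumption made for illustration, not Cassandra's actual heuristic, and all names are hypothetical.

```python
def query_index_in_rounds(ranges, matches, limit):
    """Query token ranges in growing batches until `limit` rows are found.

    ranges  -- list of per-range row lists (stand-ins for remote nodes)
    matches -- predicate applied to each row
    Returns (rows, number_of_rounds).
    """
    results, rounds, position, concurrency = [], 0, 0, 1
    while position < len(ranges) and len(results) < limit:
        batch = ranges[position:position + concurrency]
        position += len(batch)
        rounds += 1
        for rows in batch:
            results.extend(r for r in rows if matches(r))
        concurrency *= 2  # assumption: widen the next round
    return results[:limit], rounds


sample_ranges = [["a1"], [], ["a2", "b1"], ["a3"], [], [], ["a4"]]
```

A query with a small limit can stop after a round or two, while a query that matches few rows still ends up visiting every range.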
Query	Support:	Indexing	Caveats
Avoid	very	low	cardinality	index	
“e.g.	index	where	the	number	of	distinct	values	is	
very	low.	A	good	example	is	an	index	on	the	gender	
of	an	user.	On	each	node,	the	whole	user	population	
will	be	distributed	on	only	2	different	partitions	for	
the	index:	MALE	&	FEMALE.	If	the	number	of	users	
per	node	is	very	dense	(e.g.	millions)	we’ll	have	very	
wide	partitions	for	MALE	&	FEMALE	index,	which	is	
bad”
https://www.datastax.com/dev/blog/cassandra-native-secondary-index-deep-dive
Query	Support:	Indexing	Caveats
Avoid	very	high	cardinality	index.	
“For	example,	indexing	user	by	their	email	address	is	
a	very	bad	idea.	Generally	an	email	address	is	used	
by	at	most	1	user.	So	there	are	as	many	distinct	index	
values	(email	addresses)	as	there	are	users.	When	
searching	user	by	email,	in	the	best	case	the	
coordinator	will	hit	1	node	and	find	the	user	by	
chance.	The	worst	case	is	when	the	coordinator	hits	
all	primary	replicas	without	finding	any	answer	(0	
rows	for	querying	N/RF	nodes	!)”
https://www.datastax.com/dev/blog/cassandra-native-secondary-index-deep-dive
Query	Support:	Indexing	Caveats
Avoid	indexing	a	column	which	is	updated	(or	
removed	then	created)	frequently.	
By	design	the	index	data	are	stored	in	a	Cassandra	
table	and	Cassandra	data	structure	is	designed	for	
immutability.	Indexing	frequently	updated	data	will	
increase	write	amplification	(for	the	base	table	+	for	
the	index	table)
https://www.datastax.com/dev/blog/cassandra-native-secondary-index-deep-dive
Other	Gotchas
• Adding a replacement node can take a long time
(days) because data must replicate; using more, smaller
servers alleviates this
• Hinted handoffs can be problematic
• A large number of tables can be problematic
• People who run 9000 nodes (Netflix) spread them
across 100s of clusters, averaging 100 nodes per
cluster
Popularity
• Most	popular	‘wide	columnar	store’
• 8th most	popular	database	overall
• Popularity	has	plateaued	somewhat
https://db-engines.com/en/system/Cassandra
So	which	to	use?
[Chart: Relative Usage by Declaration, Cassandra vs. HBase. Categories: ad, built on, cloud, enterprise, gaming, marketing, mobile, social, web]
https://en.wikipedia.org/wiki/Apache_Cassandra
https://hbase.apache.org/poweredbyhbase.html
[Chart: Relative Usage By Job Postings, Accumulo vs. HBase vs. Cassandra, 0% to 100%. Categories: web, cloud, retail, telecom, enterprise, finance, defense]
https://www.indeed.com/
Ideas	for	Accumulo
• Security	is	still	good,	but	might	not	be	a	strong	
enough	differentiator
• Grow	community	support
• Offer secondary indexing with a high-level query
language?
Questions?
Thanks!
Aaron	Cordova,	Koverse
@aaroncordova

More Related Content

What's hot

HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseenissoz
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache RangerDataWorks Summit
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security ArchitectureOwen O'Malley
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?DataWorks Summit
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific DashboardCeph Community
 
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...Simplilearn
 
Apache Hadoop YARNとマルチテナントにおけるリソース管理
Apache Hadoop YARNとマルチテナントにおけるリソース管理Apache Hadoop YARNとマルチテナントにおけるリソース管理
Apache Hadoop YARNとマルチテナントにおけるリソース管理Cloudera Japan
 
Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived Vinoth Chandar
 
Ozone- Object store for Apache Hadoop
Ozone- Object store for Apache HadoopOzone- Object store for Apache Hadoop
Ozone- Object store for Apache HadoopHortonworks
 
Apache Kuduは何がそんなに「速い」DBなのか? #dbts2017
Apache Kuduは何がそんなに「速い」DBなのか? #dbts2017Apache Kuduは何がそんなに「速い」DBなのか? #dbts2017
Apache Kuduは何がそんなに「速い」DBなのか? #dbts2017Cloudera Japan
 
Hadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the FieldHadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the FieldDataWorks Summit
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowJulien Le Dem
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 

What's hot (20)

HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security Architecture
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DMUpgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard
 
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
 
Apache Hadoop YARNとマルチテナントにおけるリソース管理
Apache Hadoop YARNとマルチテナントにおけるリソース管理Apache Hadoop YARNとマルチテナントにおけるリソース管理
Apache Hadoop YARNとマルチテナントにおけるリソース管理
 
Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived
 
Ozone- Object store for Apache Hadoop
Ozone- Object store for Apache HadoopOzone- Object store for Apache Hadoop
Ozone- Object store for Apache Hadoop
 
Apache Kuduは何がそんなに「速い」DBなのか? #dbts2017
Apache Kuduは何がそんなに「速い」DBなのか? #dbts2017Apache Kuduは何がそんなに「速い」DBなのか? #dbts2017
Apache Kuduは何がそんなに「速い」DBなのか? #dbts2017
 
Hadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the FieldHadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the Field
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Hadoopエコシステムのデータストア振り返り
Hadoopエコシステムのデータストア振り返りHadoopエコシステムのデータストア振り返り
Hadoopエコシステムのデータストア振り返り
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
Kudu Deep-Dive
Kudu Deep-DiveKudu Deep-Dive
Kudu Deep-Dive
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 

Similar to Comparing Accumulo, Cassandra, and HBase

Time to Science, Time to Results. Accelerating Scientific research in the Cloud
Time to Science, Time to Results. Accelerating Scientific research in the CloudTime to Science, Time to Results. Accelerating Scientific research in the Cloud
Time to Science, Time to Results. Accelerating Scientific research in the CloudAmazon Web Services
 
Acunu Whitepaper v1
Acunu Whitepaper v1Acunu Whitepaper v1
Acunu Whitepaper v1Acunu
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache KuduJeff Holoman
 
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016StampedeCon
 
AWS Cloud for HPC and Big Data
AWS Cloud for HPC and Big DataAWS Cloud for HPC and Big Data
AWS Cloud for HPC and Big Datainside-BigData.com
 
Red Hat Ceph Storage: Past, Present and Future
Red Hat Ceph Storage: Past, Present and FutureRed Hat Ceph Storage: Past, Present and Future
Red Hat Ceph Storage: Past, Present and FutureRed_Hat_Storage
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...confluent
 
Big Data and its emergence
Big Data and its emergenceBig Data and its emergence
Big Data and its emergencekoolkalpz
 
Infinitely Scalable Clusters - Grid Computing on Public Cloud - London
Infinitely Scalable Clusters - Grid Computing on Public Cloud - LondonInfinitely Scalable Clusters - Grid Computing on Public Cloud - London
Infinitely Scalable Clusters - Grid Computing on Public Cloud - LondonHentsū
 
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...Data Con LA
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseJames Serra
 
Optimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudOptimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudQubole
 
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016Mladen Kovacevic
 
Webinar: The Performance Challenge: Providing an Amazing Customer Experience ...
Webinar: The Performance Challenge: Providing an Amazing Customer Experience ...Webinar: The Performance Challenge: Providing an Amazing Customer Experience ...
Webinar: The Performance Challenge: Providing an Amazing Customer Experience ...DataStax
 
Cloud Computing .ppt
Cloud Computing .pptCloud Computing .ppt
Cloud Computing .pptPrukaBay
 

Similar to Comparing Accumulo, Cassandra, and HBase (20)

Time to Science, Time to Results. Accelerating Scientific research in the Cloud
Time to Science, Time to Results. Accelerating Scientific research in the CloudTime to Science, Time to Results. Accelerating Scientific research in the Cloud
Time to Science, Time to Results. Accelerating Scientific research in the Cloud
 
Acunu Whitepaper v1
Acunu Whitepaper v1Acunu Whitepaper v1
Acunu Whitepaper v1
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
 
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016
 
AWS Cloud for HPC and Big Data
AWS Cloud for HPC and Big DataAWS Cloud for HPC and Big Data
AWS Cloud for HPC and Big Data
 
Red Hat Ceph Storage: Past, Present and Future
Red Hat Ceph Storage: Past, Present and FutureRed Hat Ceph Storage: Past, Present and Future
Red Hat Ceph Storage: Past, Present and Future
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
 
Big Data and its emergence
Big Data and its emergenceBig Data and its emergence
Big Data and its emergence
 
Infinitely Scalable Clusters - Grid Computing on Public Cloud - London
Infinitely Scalable Clusters - Grid Computing on Public Cloud - LondonInfinitely Scalable Clusters - Grid Computing on Public Cloud - London
Infinitely Scalable Clusters - Grid Computing on Public Cloud - London
 
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
 
EFFICIENT TRUSTED CLOUD STORAGE USING PARALLEL CLOUD COMPUTING
EFFICIENT TRUSTED CLOUD STORAGE USING PARALLEL CLOUD COMPUTINGEFFICIENT TRUSTED CLOUD STORAGE USING PARALLEL CLOUD COMPUTING
EFFICIENT TRUSTED CLOUD STORAGE USING PARALLEL CLOUD COMPUTING
 
Optimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudOptimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public Cloud
 
Computer project
Computer projectComputer project
Computer project
 
Scaling Up vs. Scaling-out
Scaling Up vs. Scaling-outScaling Up vs. Scaling-out
Scaling Up vs. Scaling-out
 
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
Introducing Apache Kudu (Incubating) - Montreal HUG May 2016
 
Webinar: The Performance Challenge: Providing an Amazing Customer Experience ...
Webinar: The Performance Challenge: Providing an Amazing Customer Experience ...Webinar: The Performance Challenge: Providing an Amazing Customer Experience ...
Webinar: The Performance Challenge: Providing an Amazing Customer Experience ...
 
WTIA Cloud Computing Series - Part I: The Fundamentals
WTIA Cloud Computing Series - Part I: The FundamentalsWTIA Cloud Computing Series - Part I: The Fundamentals
WTIA Cloud Computing Series - Part I: The Fundamentals
 
ppt2.pdf
ppt2.pdfppt2.pdf
ppt2.pdf
 
Cloud Computing .ppt
Cloud Computing .pptCloud Computing .ppt
Cloud Computing .ppt
 

Recently uploaded

꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
Comparing Accumulo, Cassandra, and HBase