Apache Accumulo, originally developed by the National Security Agency and now an Apache Software Foundation project, builds upon Google's Bigtable design to provide a scalable, lightly-structured database capability complementing the ubiquitous Hadoop environment. The core capabilities of Accumulo include cell-level security, flexible schemas, real-time analytics, bulk I/O, and linear scalability beyond trillions of entries and petabytes of data. These new capabilities lead to techniques that unlock the power of Big Data, but don't fit into traditional database design patterns. Learn about the advantages of Apache Accumulo and how it fits into the Hadoop and NoSQL ecosystem.
Presenter: Adam Fuchs, CTO, sqrrl
How To Become A Big Data Engineer? Edureka Edureka!
** Big Data Masters Training Program: https://www.edureka.co/masters-program/big-data-architect-training **
This edureka PPT on "How to become a Big Data Engineer" is a complete career guide for aspiring Big Data Engineers. It includes the following topics:
Who is a Big Data Engineer?
What does a Big Data Engineer do?
Big Data Engineer Responsibilities
Big Data Engineer Skills
Big Data Engineering Learning Path
Just the sketch: advanced streaming analytics in Apache Metron DataWorks Summit
Doing advanced analytics in streaming architectures presents unique challenges around the tradeoff between having more context and performance. Typically, performance and scalability requirements mandate that each message in a stream be operated on without the context of other messages in the stream that may have come before. In this talk, we will discuss using sketching algorithms to engineer a compromise that allows us to consider historical state without compromising scalability.
What we found analyzing the capabilities of many similar SIEMs and cybersecurity platforms is that a good portion of the advanced analytics boil down to simple rules enriched with the ability to do statistical baselining, set existence, and set cardinality computations. These operations are necessarily difficult to do in-stream, so often they're done after the fact. We look at ways to open up these analytics to stream computation without sacrificing scalability.
Specifically, we will introduce the infrastructure built for Apache Metron to perform these kinds of tasks. We will cover the novel integration between Apache Storm and Apache HBase, orchestrated by a custom domain-specific language called Stellar, which takes the sting out of constructing sketches and using them to accomplish simple and more advanced analytics such as statistical outlier analysis in stream. CASEY STELLA, Principal Software Engineer, Hortonworks
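To make the "set existence" idea concrete, here is a minimal Bloom filter sketch in Python. It is an illustration of the general technique only, not Metron's or Stellar's implementation; the class name, the default sizes, and the sample IP addresses are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Fixed-size probabilistic set: no false negatives, some false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True means "possibly seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical usage: remember which source IPs a stream has already produced.
seen = BloomFilter()
for ip in ["10.0.0.1", "10.0.0.2"]:
    seen.add(ip)
print(seen.might_contain("10.0.0.1"))  # True: added items are always reported
```

The memory footprint stays constant no matter how many messages pass through, which is the property that lets historical state be consulted in-stream without giving up scalability.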
What is Splunk? At the end of this session you’ll have a high-level understanding of the pieces that make up the Splunk Platform, how it works, and how it fits in the landscape of Big Data. You’ll see practical examples that differentiate Splunk while demonstrating how to gain quick time to value.
Big Data is an increasingly powerful enterprise asset, and this talk will explore the relationship between big data and cyber security, and how we preserve privacy whilst exploiting the advantages of data collection and processing. Big Data technologies provide both governments and corporations powerful tools to offer more efficient and personalized services. The rapid adoption of these technologies has of course created tremendous social benefits. Unfortunately, an unwanted side effect is the potential rich pickings available to those with malicious intentions. Increasingly, the sophisticated cyber attacker is able to exploit the rich array of public data to build detailed profiles of their adversaries to support their malicious intentions.
Harnessing the Power of Apache Hadoop Series Cloudera, Inc.
How to Manage Your Apache Hadoop Lifecycle.
So you’ve got Apache Hadoop in development. Now what? In this webinar, Cloudera’s VP of Products Charles Zedlewski will explain how to plan for and manage the Apache Hadoop lifecycle inside a Cloudera deployment.
Sqrrl February Webinar: Breaking Down Data Silos Sqrrl
In this talk, Adam Fuchs, the CTO of Sqrrl and co-founder of the Accumulo project, discusses some of the lessons learned for properly architecting, applying, and managing cell-level security labels in customer environments.
Adam Fuchs' presentation slides on what's next in the evolution of BigTable implementations (transactions, indexing, etc.) and what these advances could mean for the massive database that gave rise to Google.
These slides show:
1. How to obtain code coverage information for Java code
2. What kind of code coverage it is possible to get
3. Is 100% block coverage feasible, is it useful
4. How the code coverage could be used for more than discovering a percentage of uncovered code
Solution Use Case Demo: The Power of Relationships in Your Big Data InfiniteGraph
In this security solution demo, we have integrated Oracle NoSQL DB with InfiniteGraph to demonstrate the power of using the right tools for the solution. By integrating the key value technology of Oracle with the InfiniteGraph distributed graph database, we are able to create new views of existing Call Detail Record (CDR) details to enable discovery of connections, paths and behaviors that may otherwise be missed.
Discover how to add value to your existing Big Data to increase revenues and performance!
For users of Hadoop, MapReduce is new territory. MapReduce design patterns are all about documenting the knowledge and lessons learned of the seasoned Hadoop developer so that new developers can leverage the experts’ experience in solving problems. This talk outlines a few of the most popular patterns and gives an overview of the rest.
After this session you will be able to:
Objective 1: Understand what kinds of problems are solvable by Hadoop and MapReduce.
Objective 2: Understand why Hadoop engineers need to know what MapReduce Design Patterns are and what they are useful for day-to-day.
Objective 3: Begin to understand how to summarize, reorganize, and search through your data with Hadoop and MapReduce.
Oracle Active Data Guard: Best Practices and New Features Deep Dive Glen Hawkins
Oracle Data Guard and Oracle Active Data Guard have long been the answer for the real-time protection, availability, and usability of Oracle data. This presentation provides an in-depth look at several key new features that will make your life easier and protect your data in new and more flexible ways. Learn how Oracle Active Data Guard 19c has been integrated with Oracle Database In-Memory and offers a faster application response after a role transition. See how DML can now be redirected from an Oracle Active Data Guard standby to its primary for more flexible data protection in today’s data centers or your data clouds. This technical deep dive on Active Data Guard is designed to give you a glimpse into upcoming new features brought to you by Oracle Development.
Leveraging Threat Intelligence to Guide Your Hunts Sqrrl
This webinar training session covers everything from what threat intelligence is to specific examples of how to hunt with it: applying intel during a tactical hunt and what you should be looking out for when searching for adversaries on your enterprise network. It is taught by Keith Gilbert, an experienced threat researcher with a background in Digital Forensics and Incident Response.
How to Hunt for Lateral Movement on Your Network Sqrrl
Once inside your network, most cyber-attacks go sideways. They progressively move deeper into the network, laterally compromising other systems as they search for key assets and data. Would you spot this lateral movement on your enterprise network?
In this training session, we review the various techniques attackers use to spread through a network, which data sets you can use to reliably find them, and how data science techniques can be used to help automate the detection of lateral movement.
Machine Learning for Incident Detection: Getting Started Sqrrl
This presentation walks you through the uses of machine learning in incident detection and response, outlining some of the basic features of machine learning and specific tools you can use.
Watch the presentation with audio here: https://www.youtube.com/watch?v=4pArapSIu_w
Building a Next-Generation Security Operations Center (SOC) Sqrrl
So, you need to build a Security Operations Center (SOC)? What does that mean? What does the modern SOC need to do? Learn from Dr. Terry Brugger, who has been doing information security work for over 15 years, including building out a SOC for a large Federal agency and consulting for numerous large enterprises on their security operations.
Watch the presentation with audio here: http://info.sqrrl.com/sqrrl-october-webinar-next-generation-soc
User and Entity Behavior Analytics using the Sqrrl Behavior Graph Sqrrl
UEBA leverages advanced statistical techniques and machine learning to surface subtle behaviors that are indicative of attacker presence. In this presentation, Sqrrl's Director of Data Science, Chris McCubbin, and Sqrrl's Director of Products, Joe Travaglini, provide an overview of how machine learning and UEBA can be used to detect cyber threats using Sqrrl's Behavior Graph.
Watch the presentation with audio here: http://info.sqrrl.com/april-2016-ueba-webinar-on-demand
Threat Hunting Platforms (Collaboration with SANS Institute) Sqrrl
Traditional security measures like firewalls, IDS, endpoint protection, and SIEMs are only part of the network security puzzle. Threat hunting is a proactive approach to uncovering threats that lie hidden in your network or system, that can evade more traditional security tools. Go in-depth with Sqrrl and SANS Institute to learn how hunting platforms work.
Watch the recording with audio here: http://info.sqrrl.com/sans-sqrrl-threat-hunting-webcast
Sqrrl and IBM: Threat Hunting for QRadar Users Sqrrl
This joint webinar, in collaboration with IBM, offers a look at the industry leading Threat Hunting App for IBM QRadar. By combining the threat detection capabilities of QRadar and Sqrrl, security analysts are armed with advanced analytics and visualization to hunt for unknown threats and more efficiently investigate known incidents.
Watch the training with audio here: http://info.sqrrl.com/sqrrl-ibm-threat-hunting-for-qradar-users
Threat Hunting for Command and Control Activity Sqrrl
Sqrrl's Security Technologist Josh Liburdi provides an overview of how to detect C2 through a combination of automated detection and hunting.
Watch the presentation with audio here: http://info.sqrrl.com/threat-hunting-for-command-and-control-activity
Today's threats demand a more active role in detecting and isolating sophisticated attacks. This must-see presentation provides practical guidance on modernizing your SOC and building out an effective threat hunting program. Ed Amoroso and David Bianco discuss best practices for developing and staffing a modern SOC, including the essential shifts in how to think about threat detection.
Watch the presentation with audio here: http://info.sqrrl.com/webinar-modernizing-your-security-operations
Threat Hunting vs. UEBA: Similarities, Differences, and How They Work Together Sqrrl
This presentation explains how security teams can leverage hunting and analytics to detect advanced threats faster, more reliably, and with common analyst skill sets. Watch the presentation with audio here: http://info.sqrrl.com/threat-hunting-and-ueba-webinar
In this training session, two leading security experts review how adversaries use DNS to achieve their mission, how to use DNS data as a starting point for launching an investigation, the data science behind automated detection of DNS-based malicious techniques and how DNS tunneling and DGA machine learning algorithms work.
Watch the presentation with audio here: http://info.sqrrl.com/leveraging-dns-for-proactive-investigations
If you follow the trade press, one theme you hear over and over again is that organizations are drowning in alerts. It’s true that we need technological solutions to prioritize and escalate the most important alerts to our analysts, but the humans have a critical part to play in this process as well. The quicker they are able to make decisions about the alerts they review, the better they are able to keep up. An incident responder’s most common task is alert triage, the process of investigation and escalation that ultimately results in the creation of security incidents. As crucial as this process is, there has been remarkably little written about how to do it correctly and efficiently. In this presentation, learn incident response best practices from Sqrrl security expert David Bianco.
Slides from the webinar led by Ely Kahn and Luis Maldonado discussing strategies to reduce Mean Time to Know in detecting cybersecurity attacks, threats, or data breaches.
Sqrrl Enterprise: Big Data Security Analytics Use Case Sqrrl
Organizations are utilizing Sqrrl Enterprise to securely integrate vast amounts of multi-structured data (e.g., tens of petabytes) onto a single Big Data platform and then are building real-time applications using this data and Sqrrl Enterprise’s analytical interfaces. The secure integration is enabled by Accumulo’s innovative cell-level security capabilities and Sqrrl Enterprise’s security extensions, such as encryption.
Benchmarking The Apache Accumulo Distributed Key–Value Store Sqrrl
This paper presents results of benchmarking the Apache Accumulo distributed table store using the continuous test suite included in its open source distribution.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0! SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Generative AI Deep Dive: Advancing from Proof of Concept to Production Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024 Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
GraphRAG is All You need? LLM & Knowledge Graph Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Communications Mining Series - Zero to Hero - Session 1 DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Essentials of Automations: The Art of Triggers and Actions in FME Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf Paige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring & observability to the purview of ops, infra and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs Alex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Sort order applies across all keys.
Columns can differ across rows.
Values can have many different types.
An entry can convey information with an empty value.
Each key has a visibility: a given read will see a subset of the data.
Visibility is part of key uniqueness: PSYCH_JD is withholding information from JD.
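The points above can be sketched with a small in-memory model in Python. This is only an illustration under simplifying assumptions: it does not use the Accumulo client API, the visibility is reduced to a single label (real Accumulo visibilities are boolean expressions over labels), timestamps are omitted, and the rows, columns, and values are hypothetical.

```python
from typing import NamedTuple


class Key(NamedTuple):
    row: str
    family: str
    qualifier: str
    visibility: str  # simplified: one label; "" means visible to every reader


def scan(table, authorizations):
    """Return entries in sorted key order that the reader is authorized to see."""
    return [
        (key, value)
        for key, value in sorted(table.items())
        if key.visibility == "" or key.visibility in authorizations
    ]


# Hypothetical data: two entries differ only in visibility, so both can coexist.
table = {
    Key("jd", "diagnosis", "note", "PSYCH_JD"): "clinician-only note",
    Key("jd", "diagnosis", "note", "JD"): "summary shared with JD",
    Key("jd", "visit", "checked_in", ""): "",  # empty value still conveys a fact
    Key("ab", "name", "first", ""): "Alice",   # different columns per row
}

for key, value in scan(table, {"JD"}):
    print(key, repr(value))
```

A reader holding only the `JD` authorization gets three entries back in sorted order; the `PSYCH_JD` entry is silently filtered out, which is the "given read will see a subset of the data" behaviour described in the notes.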
Tablet Servers have 4 primary functions:
- Hosting RPCs (read, write, etc.)
- Managing resources (RAM, CPU, File I/O, etc.)
- Scheduling background tasks (compactions, caching, etc.)
- Handling key/value pairs (via Iterators)
Because Accumulo doesn’t use hashing to assign key-value pairs to servers, it needs to store the mapping of TabletServer-to-Tablet. This mapping is stored in another table in Accumulo called the Metadata table. A client need only scan a portion of the Metadata table to find which TabletServers have the Tablets it wants. (This is described as a binary search through the metadata hierarchy; the original notes flag this point as needing confirmation.)
The Metadata table’s TabletServer-to-Tablet assignments must also be stored somewhere. These are written to the first Tablet of the Metadata table, called the Root Tablet. However, you may notice that the Root Tablet itself is stored in Accumulo, which is somewhat of a circular dependency. That’s what we use ZooKeeper for: the location of the Root Tablet is always known to ZooKeeper.
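The lookup chain described above (ZooKeeper, then the Root Tablet, then a Metadata tablet, then the Tablet Server hosting the data) can be sketched as follows. This is a toy model, not Accumulo's client code: the server names, split points, and dictionary key are hypothetical, real Metadata rows also encode the table ID, and the binary search over sorted end rows is an assumption of this sketch.

```python
from bisect import bisect_left


class TabletMap:
    """Maps a row to the server hosting the tablet whose range contains it."""

    def __init__(self, split_points, servers):
        # split_points are sorted, inclusive tablet end rows; the last tablet
        # has no end row, so there is one more server than split points.
        assert len(servers) == len(split_points) + 1
        self.split_points = split_points
        self.servers = servers

    def locate(self, row):
        # Binary search for the first tablet whose end row is >= row.
        return self.servers[bisect_left(self.split_points, row)]


# Step 1: ZooKeeper always knows where the Root Tablet lives.
ZOOKEEPER = {"root_tablet_location": "tserver-1"}

# Step 2: the Root Tablet says which server hosts each Metadata tablet.
ROOT_TABLET = TabletMap(["m"], ["tserver-2", "tserver-3"])

# Step 3: each Metadata tablet says which server hosts each user tablet.
METADATA_TABLETS = {
    "tserver-2": TabletMap(["g"], ["tserver-4", "tserver-5"]),
    "tserver-3": TabletMap(["t"], ["tserver-6", "tserver-7"]),
}


def locate(row):
    """Return the servers contacted, ending with the one that holds the row."""
    root_server = ZOOKEEPER["root_tablet_location"]
    metadata_server = ROOT_TABLET.locate(row)  # would be an RPC to root_server
    data_server = METADATA_TABLETS[metadata_server].locate(row)
    return [root_server, metadata_server, data_server]


print(locate("kiwi"))  # ['tserver-1', 'tserver-2', 'tserver-5']
```

Because every level is a sorted range map rather than a hash table, a client only needs to read the small slice of metadata covering the rows it cares about.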
Apache Accumulo runs on top of Hadoop and ZooKeeper. It relies on HDFS for storage of data and ZooKeeper for:
- storage of config data (location of metadata Root Tablet)
- locking of tablets
ZooKeeper uses quorum consistency algorithms for high availability, to prevent a single point of failure.
ZooKeeper itself is actually a K/V datastore, but holds very little data.
Although we’ll be running ZK on the same machines as Accumulo and Hadoop, it’s recommended that the quorum of servers be separate.