HopsFS provides a 10x performance increase over HDFS by externalizing the HDFS NameNode metadata state to NDB, a NewSQL in-memory database. This allows the NameNode to scale beyond a single server by distributing the metadata across the NDB cluster. HopsFS also leverages implicit locking, path component caching, and heterogeneous storage including erasure coding to further optimize performance. Hops-YARN similarly improves YARN scalability by storing the ResourceManager state in NDB.
Hadoop is a well-known framework for big data processing. It implements MapReduce for processing and uses a distributed file system, the Hadoop Distributed File System (HDFS), to store data. HDFS provides fault-tolerant, distributed, and scalable storage for big data so that MapReduce jobs can easily operate on it. Understanding how data is stored in HDFS is essential for researchers working on optimizing big data storage and processing with Hadoop. The aim of this presentation is to describe the architecture and process flow of HDFS. It highlights the prominent features this file system provides for executing MapReduce jobs, describes the process flows through which HDFS achieves its design objectives, and elaborates on future research directions for exploring and improving HDFS performance.
Hadoop & HDFS Architecture - Ravi Namboori
HDFS Architecture: An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients.
The figure, by Cisco evangelist Ravi Namboori, illustrates this architecture.
An insight into NoSQL solutions implemented at RTV Slovenia and elsewhere, what problems we are trying to solve and an introduction to solving them with Redis.
Talk given at #wwwh @ Ljubljana, 30.1.2013 by me, Tit Petric
Hadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks (Cloudera, Inc.)
Scalability of the NameNode has been a key issue for HDFS clusters. Because the entire file system metadata is stored in memory on a single NameNode, and all metadata operations are processed on this single system, the NameNode both limits the growth in size of the cluster and makes the NameService a bottleneck for the MapReduce framework as demand increases. This presentation will describe the features and implementation of HDFS Federation scheduled for release with Hadoop-0.23.
MyRocks is an open-source, LSM-based MySQL database created by Facebook. These slides give an overview of MyRocks and how we deployed it at Facebook, as of 2017.
Using Anaconda to light up dark data. My talk given to the Berkeley Institute of Data Science describing Anaconda and the Blaze ecosystem for bringing a virtual analytical database to your data.
Since April 2016, Spark-as-a-service has been available to researchers in Sweden from the Swedish ICT SICS Data Center at www.hops.site. Researchers work in an entirely UI-driven environment on a platform built with only open-source software.
Spark applications can be either deployed as jobs (batch or streaming) or written and run directly from Apache Zeppelin. Spark applications are run within a project on a YARN cluster with the novel property that Spark applications are metered and charged to projects. Projects are also securely isolated from each other and include support for project-specific Kafka topics. That is, Kafka topics are protected from access by users that are not members of the project. In this talk we will discuss the challenges in building multi-tenant Spark streaming applications on YARN that are metered and easy to debug. We show how we use the ELK stack (Elasticsearch, Logstash, and Kibana) for logging and debugging running Spark streaming applications, how we use Grafana and Graphite for monitoring Spark streaming applications, and how users can debug and optimize terminated Spark Streaming jobs using Dr Elephant. We will also discuss the experiences of our users (over 120 users as of Sept 2016): how they manage their Kafka topics and quotas, patterns for how users share topics between projects, and our novel solutions for helping researchers debug and optimize Spark applications.
To conclude, we will also give an overview on our course ID2223 on Large Scale Learning and Deep Learning, in which 60 students designed and ran SparkML applications on the platform.
Data scientists spend too much of their time collecting, cleaning and wrangling data as well as curating and enriching it. Some of this work is inevitable due to the variety of data sources, but there are tools and frameworks that help automate many of these non-creative tasks. A unifying feature of these tools is support for rich metadata for data sets, jobs, and data policies. In this talk, I will introduce state-of-the-art tools for automating data science and I will show how you can use metadata to help automate common tasks in Data Science. I will also introduce a new architecture for extensible, distributed metadata in Hadoop, called Hops (Hadoop Open Platform-as-a-Service), and show how tinker-friendly metadata (for jobs, files, users, and projects) opens up new ways to build smarter applications.
This presentation talks about how cloud computing is Big Data's best friend and how AWS Cloud components fit in to complete your Big Data life cycle.
Agenda:
- How big is Big Data actually growing?
- How the Cloud has the potential to become Big Data's best friend
- A tour of the Big Data life cycle
- How AWS Cloud components fit into this life cycle
- A case study of our log analytics tool, Cloudlytics, built as a Big Data implementation on AWS Cloud
Cortana Analytics Workshop: Azure Data Catalog - MSAdvAnalytics
Julie Strauss. This session introduces the newest services in the Cortana Analytics family. The Azure Data Catalog is an enterprise-wide metadata catalog that enables self-service data source discovery. Data Catalog is a fully managed service that stores, describes, indexes, and provides information on how to access any registered data source in your organization. This session presents an overview of the Data Catalog and how – by using it to register, enrich, discover, understand and consume data sources – you can close the gap between those seeking information and those creating it.
In this talk, we will present a new distribution of Hadoop, Hops, that can scale the Hadoop Filesystem (HDFS) by 16X, from 70K ops/s to 1.2 million ops/s on Spotify's industrial Hadoop workload. Hops is an open-source distribution of Apache Hadoop that supports distributed metadata for HDFS (HopsFS) and the ResourceManager in Apache YARN. HopsFS is the first production-grade distributed hierarchical filesystem to store its metadata normalized in an in-memory, shared-nothing database. For YARN, we will discuss optimizations that enable 2X throughput increases for the Capacity scheduler, enabling scalability to clusters with >20K nodes. We will discuss the journey of how we reached this milestone, including some of the challenges involved in efficiently and safely mapping hierarchical filesystem metadata state and operations onto a shared-nothing, in-memory database. We will also discuss the key database features needed for extreme scaling, such as multi-partition transactions, partition-pruned index scans, distribution-aware transactions, and the streaming changelog API. Hops (www.hops.io) is Apache-licensed open source and supports a pluggable database backend for distributed metadata, although it currently only supports MySQL Cluster as a backend. Hops opens up new directions for Hadoop when metadata is available for tinkering in a mature relational database.
Big Data in Containers: Hadoop and Spark in Docker and Mesos - Heiko Loewe
3 examples of containerized Big Data analytics:
1. Installation with Docker and Weave for small and medium deployments
2. Hadoop on Mesos with Apache Myriad
3. Spark on Mesos
A brief introduction to Hadoop distributed file system. How a file is broken into blocks, written and replicated on HDFS. How missing replicas are taken care of. How a job is launched and its status is checked. Some advantages and disadvantages of HDFS-1.x
Less is More: 2X Storage Efficiency with HDFS Erasure Coding - Zhe Zhang
Ever since its creation, HDFS has been relying on data replication to shield against most failure scenarios. However, with the explosive growth in data volume, replication is getting quite expensive: the default 3x replication scheme incurs a 200% overhead in storage space and other resources (e.g., network bandwidth when writing the data). Erasure coding (EC) uses far less storage space while still providing the same level of fault tolerance. Under typical configurations, EC reduces the storage cost by ~50% compared with 3x replication.
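The arithmetic behind these figures is easy to check. A minimal sketch follows; the RS(6,3) Reed-Solomon layout used below is an illustrative erasure-coding configuration, not a claim about any particular cluster:

```python
# Storage overhead arithmetic: replication vs. erasure coding.
# RS(6,3) = 6 data cells protected by 3 parity cells (illustrative).

def replication_overhead(replicas):
    """Extra storage beyond one copy, as a multiple of the data size."""
    return replicas - 1

def ec_storage_factor(data_units, parity_units):
    """Total cells stored per data cell under an RS(data, parity) layout."""
    return (data_units + parity_units) / data_units

# 3x replication: 2 extra copies, i.e. 200% overhead.
print(replication_overhead(3))            # 2

# RS(6,3): 9 cells stored for every 6 data cells, i.e. a 1.5x factor.
print(ec_storage_factor(6, 3))            # 1.5

# Saving of EC relative to 3x replication:
print(1 - ec_storage_factor(6, 3) / 3)    # 0.5, i.e. ~50% cheaper
```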
Daniel Krasner - High Performance Text Processing with Rosetta - PyData
This talk covers rapid prototyping of a high performance scalable text processing pipeline development in Python. We demonstrate how Python modules, in particular from the Rosetta library, can be used to analyze, clean, extract features, and finally perform machine learning tasks such as classification or topic modeling on millions of documents. Our style is to build small and simple modules (each with command line interfaces) that use very little memory and are parallelized with the multiprocessing library.
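The style described, small single-purpose steps fanned out with the multiprocessing library, can be sketched with the standard library alone. Rosetta's actual API is not shown here; `tokenize` and `featurize` are hypothetical stand-ins:

```python
# A minimal sketch of the pipeline style: small, memory-light steps
# parallelized with multiprocessing. Not Rosetta code; the step
# functions are illustrative stand-ins.
from multiprocessing import Pool

def tokenize(doc):
    # One tiny module: lowercase and split on whitespace.
    return doc.lower().split()

def featurize(tokens):
    # Another tiny module: bag-of-words counts, small per-doc memory.
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    return counts

def process(doc):
    # Compose the small steps into one per-document task.
    return featurize(tokenize(doc))

if __name__ == "__main__":
    docs = ["Big data is big", "Text processing at scale"]
    with Pool(processes=2) as pool:
        features = pool.map(process, docs)   # one task per document
    print(features[0])  # {'big': 2, 'data': 1, 'is': 1}
```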
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ... - Simplilearn
This video on Hadoop interview questions, part 1, will take you through general Hadoop questions and questions on HDFS, MapReduce, and YARN, which are very likely to be asked in any Hadoop interview. It covers all the topics on the major components of Hadoop. This Hadoop tutorial will give you an idea of the different scenario-based questions you could face, along with some multiple-choice questions. Now, let us dive into this Hadoop interview questions video and gear up for your next Hadoop interview.
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course has been designed to impart in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
This course will enable you to:
1. Understand the different components of the Hadoop ecosystem such as Hadoop 2.7, Yarn, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
2. Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
3. Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
4. Get an overview of Sqoop and Flume and describe how to ingest data using them
5. Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
6. Understand different types of file formats, Avro Schema, using Avro with Hive and Sqoop, and schema evolution
7. Understand Flume, its architecture, sources, sinks, channels, and configurations
8. Understand HBase, its architecture, data storage, and working with HBase. You will also understand the difference between HBase and RDBMS
9. Gain a working knowledge of Pig and its components
10. Do functional programming in Spark
11. Understand resilient distributed datasets (RDDs) in detail
12. Implement and build Spark applications
13. Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
14. Understand the common use-cases of Spark and the various interactive algorithms
15. Learn Spark SQL, creating, transforming, and querying Data frames
Learn more at https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training
My Presentation from the ILUG 2010 Belfast conference.
Here is the abstract: "Come to this session to learn, with real examples, how to read and understand those NSD files that can give you so much information for troubleshooting and debugging your Domino servers, Notes clients, and even your applications. You will learn how to find which files and documents have been open, which agents have been running, and how much memory was available when the last crash occurred. You will also see all kinds of tips and tricks around the system diagnostics that will allow you to troubleshoot problems faster and more effectively."
New to MongoDB? We'll provide an overview of installation, high availability through replication, scale out through sharding, and options for monitoring and backup. No prior knowledge of MongoDB is assumed. This session will jumpstart your knowledge of MongoDB operations, providing you with context for the rest of the day's content.
Most users know HDFS as the reliable store of record for big data analytics. HDFS is also used to store transient and operational data when working with cloud object stores, such as Azure HDInsight and Amazon EMR. In these settings - but also in more traditional, on premise deployments - applications often manage data stored in multiple storage systems or clusters, requiring a complex workflow for synchronizing data between filesystems to achieve goals for durability, performance, and coordination.
Building on existing heterogeneous storage support, we add a storage tier to HDFS to work with external stores, allowing remote namespaces to be "mounted" in HDFS. This capability not only supports transparent caching of remote data as HDFS blocks, it also supports (a)synchronous writes to remote clusters for business continuity planning (BCP) and supports hybrid cloud architectures.
This idea was presented at last year’s Summit in San Jose. Lots of progress has been made since then and active development is ongoing at the Apache Software Foundation on branch HDFS-9806, driven by Microsoft and Western Digital. We will discuss the refined design & implementation and present how end-users and admins will be able to use this powerful functionality.
With Hadoop-3.0.0-alpha2 being released in January 2017, it's time to have a closer look at the features and fixes of Hadoop 3.0.
We will have a look at Core Hadoop, HDFS, and YARN, and answer the emerging question of whether Hadoop 3.0 will be an architectural revolution like Hadoop 2 was with YARN & Co., or more of an evolution adapting to new use cases like IoT, machine learning, and deep learning (TensorFlow).
PyData Berlin 2023 - Mythical ML Pipeline - Jim Dowling
This talk is a mental map for building ML systems as ML Pipelines that are factored into Feature Pipelines, Training Pipelines, and Inference Pipelines.
Building Hopsworks, a cloud-native managed feature store for machine learning Jim Dowling
Cloud Native London talk about the control layer of Hopsworks.ai and our choice of cloud native services. We built our own multi-tenant services as cloud native services, for the most part.
Metadata and Provenance for ML Pipelines with Hopsworks Jim Dowling
This talk describes the scale-out, consistent metadata architecture of Hopsworks and how we use it to support custom metadata and provenance for ML Pipelines with Hopsworks Feature Store, NDB, and ePipe . The talk is here: https://www.youtube.com/watch?v=oPp8PJ9QBnU&feature=emb_logo
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyJim Dowling
Spark AI Summit Europe 2019 talk: Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy. How can you do directed search efficiently with Spark? The answer is Maggy - asynchronous directed search on PySpark.
Hopsworks at Google AI Huddle, SunnyvaleJim Dowling
Hopsworks is a platform for designing and operating End to End Machine Learning using PySpark and TensorFlow/PyTorch. Early access is now available on GCP. Hopsworks includes the industry's first Feature Store. Hopsworks is open-source.
Hopsworks in the cloud Berlin Buzzwords 2019 Jim Dowling
This talk, given at Berlin Buzzwords 2019, describes the recent progress in making Hopsworks a cloud-native platform, with HA data-center support added for HopsFS.
HopsFS: 10X HDFS Performance
1. HopsFS: 10X your HDFS with NDB
Jim Dowling
Associate Prof @ KTH
Senior Researcher @ SICS
CEO @ Logical Clocks AB
Oracle, Stockholm, 6th September 2016
www.hops.io
@hopshadoop
2. Hops Team
Active: Jim Dowling, Seif Haridi, Tor Björn Minde,
Gautier Berthou, Salman Niazi, Mahmoud Ismail,
Theofilos Kakantousis, Johan Svedlund Nordström,
Ermias Gebremeskel, Antonios Kouzoupis.
Alumni: Vasileios Giannokostas, Misganu Dessalegn,
Rizvi Hasan, Paul Mälzer, Bram Leenders, Juan Roca,
K “Sri” Srijeyanthan, Steffen Grohsschmiedt,
Alberto Lorente, Andre Moré, Ali Gholami, Davis Jaunzems,
Stig Viaene, Hooman Peiro, Evangelos Savvidis,
Jude D’Souza, Qi Qi, Gayana Chandrasekara,
Nikolaos Stanogias, Daniel Bali, Ioannis Kerkinos,
Peter Buechler, Pushparaj Motamari, Hamid Afzali,
Wasif Malik, Lalith Suresh, Mariano Valles, Ying Lieu.
3. Marketing 101: Celebrity Endorsements
*Turing Award Winner 2014, Father of Distributed Systems
"Hi! I'm Leslie Lamport* and even though you're not using Paxos, I approve this product."
9. Max Pause Times for NameNode Heap Sizes*
[Chart: max pause times (ms, log scale from 10 to 10,000) versus JVM heap size (GB: 50, 75, 100, 150)]
*OpenJDK or Oracle JVM
10. NameNode and Decreasing Memory Costs
[Chart: memory size (GB, 0 to 1,000) versus year (2016 to 2020)]
11. Externalizing the NameNode State
•Problem:
NameNode not scaling up with lower RAM prices
•Solution:
Move the metadata off the JVM Heap
•Move it where?
An in-memory storage system that can be efficiently
queried and managed. Preferably Open-Source.
•MySQL Cluster (NDB)
13. Pluggable DBs: Data Abstraction Layer (DAL)
[Diagram: the NameNode (Apache v2, hops-2.5.0.jar) calls the DAL API (Apache v2), implemented by NDB-DAL-Impl (GPL v2, dal-ndb-2.5.0-7.5.3.jar) or by another DB under another license.]
17. Concurrency Model: Implicit Locking
• Serializable FS ops using implicit locking of subtrees.
[Hakimzadeh, Peiro, Dowling, ”Scaling HDFS with a Strongly Consistent Relational Model for Metadata”, DAIS 2014]
18. Preventing Deadlock and Starvation
•Acquire FS locks in agreed order using FS Hierarchy.
•Block-level operations follow the same agreed order.
•No cycles => Freedom from deadlock
•Pessimistic Concurrency Control ensures progress
[Diagram: a Client's mv on /user/jim/myFile, a read, and a DataNode's block_report all take their NameNode locks in the same agreed order.]
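The ordering rule can be sketched as follows. This is a toy model, not HopsFS code: each operation locks the path components it needs from root to leaf, so every transaction follows the same global order and no wait-for cycle can form:

```python
# Toy sketch of deadlock-free locking: all operations acquire locks
# in the agreed filesystem-hierarchy order (root to leaf).
import threading

locks = {}                       # path component -> lock
registry_guard = threading.Lock()

def lock_for(path):
    # Lazily create one lock per path component.
    with registry_guard:
        return locks.setdefault(path, threading.Lock())

def path_components(path):
    """'/user/jim/myFile' -> ['/', '/user', '/user/jim', '/user/jim/myFile']"""
    out, cur = ["/"], ""
    for part in [p for p in path.split("/") if p]:
        cur += "/" + part
        out.append(cur)
    return out

def acquire_in_order(path):
    """Lock root-to-leaf; a single agreed order means no cycles,
    hence no deadlock."""
    acquired = []
    for component in path_components(path):
        lk = lock_for(component)
        lk.acquire()
        acquired.append(lk)
    return acquired

def release(acquired):
    # Release in reverse (leaf-to-root) order.
    for lk in reversed(acquired):
        lk.release()
```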
19. Per Transaction Cache
•Reusing the HDFS codebase resulted in too many
roundtrips to the database per transaction.
•We cache intermediate transaction results at
NameNodes (i.e., snapshot).
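The idea of a per-transaction snapshot can be sketched as a read-through cache; the names below are illustrative, not the HopsFS API:

```python
# Toy sketch of a per-transaction cache: each row is fetched from the
# database at most once per transaction; later reads hit the local
# snapshot, cutting round trips.
class TransactionContext:
    def __init__(self, db):
        self.db = db            # anything with a get(key) method
        self.snapshot = {}      # key -> row, valid only for this txn
        self.round_trips = 0

    def read(self, key):
        if key not in self.snapshot:
            self.snapshot[key] = self.db.get(key)   # one round trip
            self.round_trips += 1
        return self.snapshot[key]

class FakeDB:
    def __init__(self, rows): self.rows = rows
    def get(self, key): return self.rows.get(key)

txn = TransactionContext(FakeDB({"inode:7": {"name": "myFile"}}))
txn.read("inode:7")
txn.read("inode:7")         # served from the snapshot
print(txn.round_trips)      # 1
```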
20. Sometimes, Transactions Just ain’t Enough
•Large Subtree Operations (delete, mv, set-quota)
can’t always be executed in a single Transaction.
•4-phase Protocol
• Isolation and Consistency
• Aggressive batching
• Transparent failure handling
• Failed ops retried on new NN.
• Lease timeout for failed clients.
21. Leader Election using NDB
•Leader to coordinate replication/lease management
•NDB as shared memory for Leader Election of NN.
[Niazi, Berthou, Ismail, Dowling, ”Leader Election in a NewSQL Database”, DAIS 2015]
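The shared-memory idea from the DAIS'15 paper can be sketched in miniature. The real protocol runs transactionally inside NDB; the in-memory table and lease value below only illustrate the rule (live node with the smallest id leads):

```python
# Toy sketch of lease-based leader election over shared storage:
# every NameNode heartbeats a row; the live node with the smallest
# id is the leader.
import time

LEASE = 2.0  # seconds a heartbeat stays valid (illustrative)

class SharedTable:
    def __init__(self):
        self.rows = {}                      # nn_id -> last heartbeat
    def heartbeat(self, nn_id):
        self.rows[nn_id] = time.monotonic()
    def leader(self, now=None):
        now = time.monotonic() if now is None else now
        live = [i for i, ts in self.rows.items() if now - ts < LEASE]
        return min(live) if live else None  # smallest live id leads

table = SharedTable()
table.heartbeat(2)
table.heartbeat(1)
print(table.leader())   # 1, the smallest id among live NameNodes
```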
22. Path Component Caching
•The most common operation in HDFS is resolving
pathnames to inodes
- 67% of operations in Spotify’s Hadoop workload
•We cache recently resolved inodes at NameNodes so
that we can resolve them using a single batch
primary key lookup.
- We validate cache entries as part of transactions
- The cache converts O(N) round trips to the database to
O(1) for a hit for all inodes in a path.
23. Path Component Caching
•Resolving a path of length N gives O(N) round-trips
•With our cache, O(1) round-trip for a cache hit
[Diagram: without the cache, the NameNode resolves /user/jim/myFile with three sequential calls to NDB: getInode(0, "user"), getInode(1, "jim"), getInode(2, "myFile"); with the cache, getInodes("/user/jim/myFile") is answered locally and checked with a single batched validateInodes([(0, "user"), (1, "jim"), (2, "myFile")]) call.]
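The round-trip counting above can be illustrated with a toy resolver (illustrative names, not the HopsFS API): a cold path costs one lookup per component, a warm path costs one batched validation:

```python
# Toy sketch of the path-component cache: a miss is O(N) round trips
# (one per component); a hit is O(1), a single batched validation.
class PathCache:
    def __init__(self, db):
        self.db = db              # (parent_id, name) -> inode id
        self.cache = {}           # full path -> [(parent_id, name), ...]
        self.round_trips = 0

    def resolve(self, path):
        if path in self.cache:
            self.round_trips += 1          # one batched validate call
            return [self.db[k] for k in self.cache[path]]
        keys, inodes, parent = [], [], 0   # 0 = root inode id
        for name in [p for p in path.split("/") if p]:
            key = (parent, name)
            self.round_trips += 1          # one lookup per component
            parent = self.db[key]
            keys.append(key)
            inodes.append(parent)
        self.cache[path] = keys
        return inodes

db = {(0, "user"): 1, (1, "jim"): 2, (2, "myFile"): 3}
pc = PathCache(db)
pc.resolve("/user/jim/myFile")   # cold: 3 round trips
pc.resolve("/user/jim/myFile")   # warm: 1 batched round trip
print(pc.round_trips)            # 4
```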
24. Hotspots
•Mikael saw 1-2 maxed out LDM threads
•Partitioning by parent inodeId meant
fantastic performance for ‘ls’
- Partition-pruned index scans
- At high load hotspots appeared at the
top of the directory hierarchy
•Current Solution:
- Cache the root inode at NameNodes
- Pseudo-random partition key for top-level
directories, but keep partition by parent
inodeId at lower levels
- At least 4x throughput increase!
[Diagram: directory tree with / at the root, /Users and /Projects below it, then /NSA and /MyProj, then /Dataset1 and /Dataset2.]
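The partitioning rule described above can be sketched as a small function. The hash choice and the depth cutoff are illustrative assumptions, not the actual HopsFS implementation:

```python
# Toy sketch of the hotspot fix: top-level directories get a
# pseudo-random partition key (spreading load off the hot root),
# while deeper inodes keep partition-by-parent so that listing a
# directory remains a partition-pruned scan.
import zlib

NUM_PARTITIONS = 8     # illustrative
TOP_LEVEL_DEPTH = 1    # children of the root (illustrative cutoff)

def partition_key(depth, parent_id, name):
    if depth <= TOP_LEVEL_DEPTH:
        # Pseudo-random: derived from the inode's own name.
        return zlib.crc32(name.encode()) % NUM_PARTITIONS
    # Deeper levels: siblings share a partition, so 'ls' reads one partition.
    return parent_id % NUM_PARTITIONS

# Siblings under the same deep directory land in the same partition.
assert partition_key(3, 42, "a") == partition_key(3, 42, "b")
```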
25. Scalable Block Reporting
•On 100PB+ clusters, internal maintenance protocol
traffic makes up much of the network traffic
•Block Reporting
- Leader Load Balances
- Work-steal when exiting
safe-mode
[Diagram: DataNodes report Blocks to NameNodes, which persist Blocks and SafeBlocks in NDB; the Leader load-balances the reports, and NameNodes work-steal when exiting safe mode.]
33. NDB Performance Lessons
•NDB is quite stable!
•ClusterJ is (nearly) good enough
- sun.misc.Cleaner has trouble keeping up at high
throughput – OOM for ByteBuffers
- Transaction hint behavior not respected
- DTO creation time affected by Java Reflection
- Nice features would be:
• Projections
• Batched scan operations support
• Event API
•Event API and Asynchronous API needed for
performance in Hops-YARN
34. Heterogeneous Storage in HopsFS
•Storage Types in HopsFS: Default, EC-RAID5, SSD
- Default: 3X overhead - triple replication on spinning disks
- SSD: 3X overhead - triple replication on SSDs
- EC-RAID5: 1.4X overhead with low reconstruction overhead!
37. ePipe: Indexing HopsFS’ Namespace
[Diagram: MetaData Designer and MetaData Entry tables in NDB are streamed over the NDB Event API into ElasticSearch for free-text search.]
Polyglot Persistence: the distributed database is the single source of truth, and foreign keys ensure the integrity of Extended Metadata.
40. ResourceManager – Monolithic but Modular
[Diagram: the ResourceManager bundles the ApplicationMaster Service, ResourceTracker Service, Scheduler, Client Service, Admin Service, and Security, serving YARN Clients, NodeManagers, and App Masters. Cluster State is persisted to NDB by the HopsResourceTracker and HopsScheduler over ClusterJ and the Event API (~2k ops/s and ~10k ops/s).]
43. Hopsworks – Project-Based Multi-Tenancy
•A project is a collection of
- Users with Roles
- HDFS DataSets
- Kafka Topics
- Notebooks, Jobs
•Per-Project quotas
- Storage in HDFS
- CPU in YARN
• Uber-style Pricing
•Sharing across Projects
- Datasets/Topics
[Diagram: a project groups HDFS DataSets (dataset 1 … dataset N) and Kafka Topics (Topic 1 … Topic N).]
47. Summary
•HopsFS is the world’s fastest, most scalable HDFS
implementation
•Powered by NDB, the world’s fastest database
•Thanks to Mikael, Craig, Frazer, Bernt and others
•Still room for improvement….
www.hops.io
I am going to talk about realizing Bill Gates's vision for a filesystem in the Hadoop ecosystem.
“WinFS was an attempt to bring the benefits of schema and relational databases to the Windows file system. …The WinFS effort was started around 1999 as the successor to the planned storage layer of Cairo and died in 2006 after consuming many thousands of hours of efforts from really smart engineers.”
[Brian Welcker]**
**http://blogs.msdn.com/b/bwelcker/archive/2013/02/11/the-vision-thing.aspx
The kind of challenges you have with the NN are managing large clusters and configuring the NN.
Slope of the bottom line is based on improvements in garbage collection technology – Azul JVM, Shenandoah, etc.
Slope of the top line is based on Moore’s Law.
Apache Spark already moving in this direction – Tachyon
The NameNode has multi-reader, single-writer concurrency semantics.
Operations that would hold the write lock for too long, starving clients, are not executed atomically. For example, deleting a directory subtree with millions of files, involves deleting batches of files, yielding the global lock for a period, then re-acquiring it, to continue the operation.
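The behaviour described in that note can be sketched in miniature; purely illustrative, with batch size and lock type as assumptions:

```python
# Toy sketch of a large subtree delete: work in batches, releasing
# and re-acquiring the global lock between batches so other clients
# are not starved.
import threading

def delete_subtree(files, global_lock, batch_size=1000):
    deleted = 0
    while deleted < len(files):
        with global_lock:                       # re-acquired per batch
            batch = files[deleted:deleted + batch_size]
            # ... delete `batch` from the namespace here ...
            deleted += len(batch)
        # Lock released here: readers and writers can make progress.
    return deleted

lock = threading.Lock()
print(delete_subtree(list(range(2500)), lock, batch_size=1000))  # 2500
```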
With a global lock, it's easy. If something is not atomic, you have to handle all possible failures.
This is the only new Protocol Buffer message we added to the DataNodes.
Reconstruction reads are expensive.
The Resource Manager (RM) is a bottleneck.
Zookeeper throughput not high enough to persist all RM state
Standby resource manager can only recover partial state
All running jobs must be restarted.
RM state not queryable.
The RM is a State-Machine. Almost no session state to manage.
Privileges: upload/download data, run analysis jobs. It is like an RBAC solution. All access is via HopsWorks.