Network attached storage (NAS) is a file storage device connected directly to a computer network that provides file-level access to stored data. NAS uses common network protocols like NFS to allow files to be accessed over the network. A NAS device contains disk storage and runs its own operating system to provide file storage functionality to clients. Benefits of NAS include easy sharing of files across a network and low cost to add additional storage capacity.
The document proposes a cloud environment for backup and data storage using remote servers that can be accessed through the Internet. It involves using the disks of cluster nodes as a global storage system with PVFS2 parallel file system for improved performance. The proposed system aims to increase data availability and reduce information loss by storing data on a private cloud using PVFS2 and developing a multiplatform client application for fast data transfer. It allows reuse of existing infrastructure to reduce costs and gives users experience of managing a private cloud.
EMC SAN provides benefits such as high availability and manageability, improved application performance through dedicated storage networks, fast scalability through centralized storage, and better data replication and recovery options. Case studies show that EMC SAN solutions can help businesses reduce costs through storage consolidation, improve business continuity through centralized data management, and increase business flexibility to support growth. EMC SAN migration services help minimize business impact through detailed planning and by reducing downtime during implementation.
Storage devices are used to store data outside of a computer's main memory. There are different types of storage including primary storage like RAM and cache that is directly accessible by the CPU. Secondary storage like hard disks requires accessing through input/output channels. Tertiary storage uses robotic mechanisms to store data offline. Linux uses disk partitioning to organize storage across physical disks using schemes like MBR and GPT. Logical volumes and RAID provide additional abstraction and redundancy. Network storage solutions like NAS export file systems over a network while SANs export block storage using protocols like Fibre Channel and iSCSI.
The document provides an overview of storage concepts including:
1) It defines online, nearline and offline storage and their characteristics.
2) It discusses the evolution of storage technologies from DAS to SAN and some advantages of SAN such as increased performance and scalability.
3) It describes some common storage components and technologies used in SAN implementations like HBAs, switches, fabrics and replication.
This document provides an overview of various data storage technologies including RAID, DAS, NAS, and SAN. It discusses RAID levels like RAID 0, 1, 5 which provide data striping and redundancy. Direct attached storage (DAS) connects directly to servers but cannot be shared, while network attached storage (NAS) uses file sharing protocols over IP networks. Storage area networks (SAN) use dedicated storage networks like Fibre Channel and iSCSI to provide block-level access to consolidated storage. The key is choosing the right solution based on capacity, performance, scalability, availability, data protection needs, and budget.
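The redundancy that RAID 5 provides can be sketched in a few lines: the parity stripe is the bytewise XOR of the data stripes, so any single lost stripe can be rebuilt from the survivors. This is a toy illustration of the parity idea only, not a real RAID implementation; the stripe contents and four-disk layout are invented for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks (how RAID 5 computes parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data stripes plus one parity stripe, as on a 4-disk RAID 5 set.
stripes = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(stripes)

# Simulate losing stripe 1: rebuild it from the surviving stripes + parity.
rebuilt = xor_blocks([stripes[0], stripes[2], parity])
assert rebuilt == stripes[1]
```

The same XOR property is why a RAID 5 array survives exactly one disk failure: with two stripes missing, the equation has two unknowns and cannot be solved.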
The document provides an introduction to network attached storage (NAS). It discusses the basics of NAS including how it works, differences from SAN storage, features like snapshots and global namespace, and common environments where NAS is used. It also summarizes IBM's NAS solutions including the SONAS enterprise NAS platform and N series unified storage platform, and notes that real-time compression can increase storage efficiency by up to 80% without impacting performance.
A brief study on Storage Area Network (SAN), SAN architecture & its importance. It focuses on the techniques and the technologies that have evolved around SAN & its Security.
Class lecture by Prof. Raj Jain on Storage Virtualization. The talk covers Disk Arrays, Data Access Methods, SCSI (Small Computer System Interface), Advanced Technology Attachment (ATA), ESCON and FICON, Fibre Channel, Fibre Channel Devices, Fibre Channel Protocol Layers, Fibre Channel Flow Control, Fibre Channel Classes of Service, What is Storage Virtualization?, Benefits of Storage Virtualization, Virtualizing Storage, RAID Levels, Nested RAIDs, Synchronous vs. Asynchronous Replication, Virtual Storage Area Network (VSAN), Physical Storage Network, Virtual Storage Network, SAN vs. NAS, iSCSI (Internet Small Computer System Interface), iFCP (Internet Fibre Channel Protocol), FCIP (Fibre Channel over IP), FCoE (Fibre Channel over Ethernet), Virtual File Systems. Video recording available on YouTube.
Network attached storage different from traditional file servers & implemen... - IAEME Publication
This document discusses Network Attached Storage (NAS), specifically Windows-based NAS. It begins by defining NAS and how it differs from traditional file servers and direct attached storage. It then discusses the advantages of NAS such as expandability, fault tolerance, and easier management. The document outlines several key technologies that Windows-based NAS utilizes, such as high availability features, easy deployment and management through a web interface, and seamless integration with Active Directory. It concludes by discussing NAS servers and their benefits like rapid installation, support for heterogeneous environments, server consolidation, and improved server performance.
Cloud Computing System models for Distributed and cloud computing & Performan... - hrmalik20
Topics covered: Advantage of Clouds over Traditional Distributed Systems, Clouds, Service-Oriented Architecture (SOA) Layered Architecture, Performance Metrics and Scalability Analysis, System Efficiency, Performance Challenges in Cloud Computing, What is cloud computing and why is it distinctive?, Cloud Service Delivery Models and Their Performance Challenges, Cloud computing security, What does Cloud Computing Security mean, Cloud Security Landscape, Distinctions between Security and Privacy, Energy Efficiency of Cloud Computing, How energy-efficient is cloud computing?
Network attached storage (NAS) allows multiple users to access files over a local area network. A NAS device contains one or more hard drives configured in a RAID array for redundancy. It connects directly to the network and has its own IP address. NAS provides a simple way for organizations to centralize, share, and protect their data. Common uses of NAS include file sharing, email storage, and databases. Maintenance includes monitoring performance, addressing failures, tuning storage usage, and supporting users. Future developments aim to improve NAS speed, flexibility, and functionality for high-security environments.
Block Level Storage vs File Level Storage - Pradeep Jagan
A video management system is responsible for accessing, controlling, and managing video content across an Internet Protocol network.
This document discusses data management strategies in a virtualized environment. It covers topics such as storage design impacts on reliability, availability and scalability. It also discusses VMware backup challenges and solutions like VMware Consolidated Backup (VCB), vStorage APIs for Data Protection (VADP), and vStorage APIs for Array Integration (VAAI). Specific solutions mentioned include data deduplication, thin provisioning, replication and snapshots.
Cloud computing has spawned a new taxonomy for IT. Ubuntu explains 50 key terms to help DevOps and IT professionals to lead their organizations through the journey to the cloud.
The document discusses storage area networks (SANs) and fiber channel technology. It provides background on SANs and how they function as a separate high-speed network connecting storage resources like RAID systems directly to servers. It then covers SAN topologies using fiber channel, including point-to-point, arbitrated loop, and fabric switch configurations. Finally, it discusses planning, managing and the management perspective of SANs in the data center.
Virtualization allows the abstraction and isolation of hardware resources and the sharing of those resources. It enables higher-level functions and services to operate independently of the underlying physical hardware. There are different types of virtualization including hardware, storage, and network virtualization. Virtualization provides benefits such as increased hardware utilization, reduced costs, improved flexibility, and greater security.
The document discusses storing SQL Server database files on a network attached storage (NAS) device. It describes how SQL Server maps databases to operating system files, including primary data files, secondary data files, and log files. It then explains that NAS allows hard disk storage to be added to a network without shutting down servers, and that SQL Server databases can be configured to store files on a NAS by enabling trace flag 1807 and identifying a file share with full access.
Understanding NAS (network attached storage) - sagaroceanic11
The document discusses network attached storage and storage area networks. It covers various storage models including direct attached storage (DAS), network attached storage (NAS), storage area networks (SANs) and content addressed storage (CAS). For SANs specifically, it describes the key components which include host bus adapters, fibre cabling, fibre channel switches/hubs, storage arrays and management systems. It also discusses SAN connectivity, topologies, management functions and deployment examples.
Innovation for Participation - Paul De Decker, Sun Microsystems - robinwauters
The document discusses Sun Microsystems' strategy of providing an open source software stack called Solaris AMP (Apache, MySQL, PHP) that is optimized to run on their Solaris operating system. It promotes the benefits of the Solaris operating system and tools to help speed development and deployment. Additionally, it outlines Sun's approach of providing many free and open source software options along with support services to gain customers.
Storage Area Network (SAN) is a dedicated, high-speed network that connects servers to storage devices like disks, disk arrays, and tapes. A SAN provides centralized storage that can be accessed by multiple servers, providing high capacity, high availability, and scalability compared to Direct Attached Storage. Fiber Channel is commonly used as the networking technology for SANs, allowing blocks of data to be accessed by servers over the high-speed SAN fabric.
Cloud computing enables users to access applications, database resources, and other high-end infrastructure over the internet without worrying about maintenance or management of actual resources. It provides services over public and private networks. There are different deployment models like public, private, hybrid, and community clouds. Service models include Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Cloud computing offers many benefits like cost-effectiveness, mobility, and collaboration for businesses.
The document discusses how cloud computing and virtualization can support grid infrastructures. It introduces key concepts like virtualization platforms, distributed virtual machine management, and provisioning virtual resources as a cloud service. The RESERVOIR project aims to integrate these technologies with grid computing to provide dynamic, on-demand access to resources like a utility. Virtualization can help address barriers to adopting grid computing by isolating workloads and dynamically allocating resources.
Cloud computing provides on-demand access to computing resources like applications and storage over the Internet. It has various deployment models including public, private, hybrid and community clouds. The main service models are Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). IaaS provides fundamental computing resources, PaaS supplies platforms for application development, and SaaS delivers software applications to users.
The webinar discusses multi-tenant business intelligence in a cloud computing environment. It defines multi-tenancy as a software architecture where a single instance of an application serves multiple organizations. The webinar then covers use cases for multi-tenant BI and the benefits of the approach. It also outlines four main approaches to multi-tenant BI and the steps to onboard new clients for each approach.
Network attached storage (NAS) is a file-level computer data storage server connected to a computer network providing data access to a heterogeneous group of clients. It is designed to be easy to set up and maintain while providing reliable storage that scales easily. NAS uses standard network protocols like NFS and CIFS to provide file sharing capabilities without requiring dedicated server hardware or software. This makes NAS simpler and more cost-effective than traditional server-based storage while offering high performance and reliability for small or large storage needs.
SAN vs NAS vs DAS: Decoding Data Storage Solutions - MaryJWilliams2
Discover the advantages and differences of SAN, NAS, and DAS storage solutions. With our detailed comparison and insights, you'll be able to determine which data storage system suits your needs best.
For more information visit: https://stonefly.com/blog/san-vs-nas-vs-das-a-closer-look/
In this presentation, we discuss in detail the challenges of managing IT infrastructure, with a focus on server sizing, storage capacity planning, and internet connectivity. We also discuss how to set up a security architecture and a disaster recovery plan.
To know more about Welingkar School’s Distance Learning Program and courses offered, visit:
http://www.welingkaronline.org/distance-learning/online-mba.html
Introduction to Enterprise Data Storage, Direct Attached Storage, Storage Ar... - ssuserec8a711
1. Cloud storage systems store multiple copies of data across many servers in various locations so that if one system fails, the data can be accessed from another location.
2. Storage providers use virtualization software to aggregate storage assets from various devices into a single cloud storage system called StorageGRID.
3. StorageGRID creates a virtualization layer that retrieves storage from different storage devices and manages it through a common file system interface over the internet.
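The multi-copy behavior described above can be sketched as a toy model: an object is written to several stores, and a read falls back to any surviving replica. In-memory dicts stand in for remote servers here; the class and method names are invented for illustration, and a real system such as StorageGRID is of course far more involved.

```python
class ReplicatedStore:
    """Toy model of cloud storage that keeps copies on several 'servers'."""

    def __init__(self, n_replicas=3):
        self.servers = [dict() for _ in range(n_replicas)]

    def put(self, key, data):
        for server in self.servers:      # write every replica
            server[key] = data

    def get(self, key):
        for server in self.servers:      # read the first live copy found
            if key in server:
                return server[key]
        raise KeyError(key)

store = ReplicatedStore()
store.put("backup.tar", b"payload")
store.servers[0].clear()                 # simulate one location failing
assert store.get("backup.tar") == b"payload"  # data survives elsewhere
```

The same fallback-read pattern is what lets a provider serve data after losing an entire site, at the cost of storing the payload several times.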
A SAN (Storage Area Network) is a network designed to transfer data from servers to storage targets as an alternative to directly attached storage. The document defines SAN architecture, which accesses storage at the block level and provides high performance, shared storage with good management tools. It discusses various SAN technologies like Fiber Channel and IP-based solutions. SANs connect storage subsystems, while NAS uses a general network to connect file-based storage. The document also covers SAN topologies, virtualization, protocols, advantages and disadvantages.
The document discusses data storage and cloud computing. It provides an overview of different types of data storage, including direct attached storage (DAS), network attached storage (NAS), and storage area networks (SANs). It also describes different classes of cloud storage, such as unmanaged and managed cloud storage. The document outlines some of the challenges of cloud storage and how cloud providers create virtual storage containers to manage data storage in the cloud.
This document provides an overview of the key differences between storage area networks (SANs) and network attached storage (NAS). It explains that a NAS is a single network storage device that operates on data files, while a SAN connects multiple storage devices that can share data. The document also discusses how NAS and SAN connect and communicate with other devices on the network and how they are sometimes used together in a unified SAN configuration.
The document provides an overview of storage networking concepts including network attached storage (NAS), storage area networks (SANs), and RAID. It discusses the differences between server attached storage (SAS), NAS, and SANs. NAS uses file-level protocols to access storage over IP networks, while SANs use block-level protocols over dedicated fiber channel networks. RAID configurations like RAID 5 provide data redundancy through parity striping.
Direct attached storage (DAS) involves connecting storage devices like hard disk drives directly to a server without a storage network. This provides exclusive access to the disks for the server but has limitations in scalability and availability. Storage area networks (SANs) address these issues by connecting multiple servers and storage devices via a high-speed dedicated network using fiber channel technology. This allows for centralized management of storage that can be dynamically allocated and accessed simultaneously by multiple servers.
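The block-level access that DAS and SAN expose (as opposed to the file-level access of NAS) can be sketched as fixed-size reads and writes addressed by block number rather than by file name. This is an illustrative model only; the 512-byte block size is a common convention, and the class is invented for the example.

```python
BLOCK_SIZE = 512  # a common logical block size

class BlockDevice:
    """Toy block device: storage is addressed by block number, not file name."""

    def __init__(self, n_blocks):
        self.data = bytearray(n_blocks * BLOCK_SIZE)

    def write_block(self, lba, payload):
        assert len(payload) == BLOCK_SIZE  # block I/O is always full blocks
        off = lba * BLOCK_SIZE
        self.data[off:off + BLOCK_SIZE] = payload

    def read_block(self, lba):
        off = lba * BLOCK_SIZE
        return bytes(self.data[off:off + BLOCK_SIZE])

dev = BlockDevice(n_blocks=8)
dev.write_block(3, b"x" * BLOCK_SIZE)
assert dev.read_block(3) == b"x" * BLOCK_SIZE
```

On a SAN, a protocol such as iSCSI or Fibre Channel carries exactly this kind of block request over the network; the server's own file system, not the storage device, decides which blocks belong to which file.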
A presentation for FY and SY students about basic knowledge of NAS, which includes:
1. Introduction of NAS
2. Applications
3. Benefits
4. Advantages
5. Disadvantages
6. NAS vs SAN
7. Future of NAS
Complete configuration of a SAN using an ESXi environment, with an installation guide. With these slides, you will be able to configure a storage area network.
This configuration helps users configure ESXi 4 and ESXi 3.0 servers.
Software-defined storage abstracts storage resources from physical hardware for greater flexibility and programmability. Storage virtualization pools physical storage into a single virtual storage device that is easier to manage. Hyperconverged storage bundles compute, storage, and networking resources together for simpler management. An essential IT disaster recovery program anticipates disasters, plans responses, and enables quick resumption of operations.
What is a Network-Attached Storage device and how does it work?
A network-attached storage device, or NAS for short, is a specialised type of computer designed to provide file-based data storage services to a computer network. In contrast to a standard desktop or laptop PC, which typically stores its data on an internal hard drive, a NAS device contains one or more large-capacity drives that are accessible by all devices on the network. This makes it an ideal solution for centrally storing and sharing files among multiple users. For more on what a NAS is and how it works, visit:
https://stonefly.com/blog/network-attached-storage-appliance-practicality-and-usage
What is Network Attached Storage Used for?
In summary, NAS network storage is used for unstructured data: surveillance videos, files, backups, snapshots, emails, and so on. Network attached storage is also very handy for HPC (high-performance computing) requirements. For more information, visit: https://stonefly.com/blog/network-attached-storage-appliance-practicality-and-usage
Maybe your business has outgrown its file server and you’re thinking of replacing it. Or perhaps your server is dated and not supporting your business like it should, so you’re considering moving to the cloud. It might be that you’re starting a new business and wondering if an in-house server is adequate or if you should adopt cloud technology from the start.
Regardless of why you’re debating an in-house server versus a cloud-based server, it’s a tough decision that will impact your business on a daily basis. We know there’s a lot to think about, and we’re here to help show why you should consolidate your file servers and move your data to the cloud.
In this webinar with Talon Storage Solutions, we covered:
-Challenges of using a physical file server
-Benefits of using a cloud file server
-Current State of the File Server market
-Reference Architecture examples for cloud file servers
-Demo: how to architect a cloud file server with highly-available storage
Learn more at https://www.softnas.com
Storage Virtualization: Towards an Efficient and Scalable Framework
Enterprises in the corporate world demand high-speed data protection for all kinds of data. Issues such as complex server environments with high administrative costs and low data protection have to be resolved. In addition to data protection, enterprises demand the ability to recover and restore critical information in various situations. Traditional storage management solutions such as direct-attached storage (DAS), network-attached storage (NAS), and storage area networks (SAN) have been devised to address such problems. Storage virtualization is the emerging technology that addresses the underlying complexity of physical storage by introducing the concept of cloud storage environments. This paper covers the DAS, NAS, and SAN approaches to storage management and emphasizes the benefits of storage virtualization. The paper discusses a potential cloud storage structure on which a storage virtualization architecture will be proposed.
In this presentation we talk about:
What is PACS (Picture Archiving and Communication System).
Functions carried out by PACS.
Storage Devices in PACS
RAID Techniques
Cloud Based PACS
Direct Attached Storage - Information Storage and Management
This presentation contains slides on DAS.
Direct-attached storage (DAS) is an architecture where storage connects directly to servers. Applications access data from DAS using block-level access protocols. DAS is ideal for localized data access and sharing in environments that have a small number of servers.
This document discusses Hitachi's Unified Storage (HUS) and Hitachi NAS Platform (HNAS) solutions for file storage. It summarizes that these solutions provide high performance, scalability, and efficiency to help organizations consolidate more file data using less storage. This allows organizations to reduce costs through features like deduplication while improving productivity. The solutions include a range of models and flexibility to address various workload sizes and requirements.
Cloud storage infrastructures
1. Cloud Storage Infrastructures
Prof. NIKHILKUMAR B SHARDOOR
Department of Computer Science Engineering.
School of Engineering, MIT ADT University, Pune.
2. Content
• Introduction to Cloud Storage Infrastructure.
• Direct-Attached Storage (DAS) architecture.
• Storage Area Network (SAN) attributes: components, topologies, connectivity options, and zoning.
• SANs: FC protocol stack, addressing, flow control.
• Network-Attached Storage (NAS): components and protocols.
• IP Storage Area Network (IP SAN): iSCSI, FCIP, and FCoE architecture.
• Content-Addressed Storage (CAS): elements, storage, and retrieval processes.
• Server architectures: stand-alone, blades, stateless, clustering.
• Cloud file systems: GFS and HDFS, BigTable, HBase, and Dynamo.
3. Introduction
Cloud Storage: Cloud storage is a service model in which data is maintained, managed, and backed up remotely and made available to users over a network (typically the Internet).
Cloud Storage Infrastructure: A cloud storage infrastructure is the hardware and software framework that supports the computing requirements of a private or public cloud storage service. Both public and private cloud storage infrastructures are known for their elasticity, scalability, and flexibility.
Cloud General Architecture: Cloud storage architectures are primarily about delivering storage on demand in a highly scalable, multi-tenant way. They consist of a front end that exports an API to access the storage.
4. Cloud Storage Architecture
Characteristic — Description
Manageability — The ability to manage a system with minimal resources.
Access method — The protocol through which the cloud storage is exposed.
Performance — Performance as measured by bandwidth and latency.
Multi-tenancy — Support for multiple users (or tenants).
Scalability — The ability to scale gracefully to meet higher demand or load.
Data availability — A measure of a system's uptime.
Control — The ability to control a system, in particular to configure it for cost, performance, or other characteristics.
Storage efficiency — A measure of how efficiently the raw storage is used.
Cost — A measure of the cost of the storage (commonly in dollars per gigabyte).
Fig. General Cloud Architecture
5. Cloud Storage Types
• DAS - Direct Attached Storage.
• NAS - Network Attached Storage.
• SAN - Storage Area Network.
Which storage technology should I use for my business application?
6. Cloud Storage Infrastructure – Direct Attached Storage(DAS)
• DAS stands for Direct Attached Storage and, as the name suggests, it is an architecture where storage connects directly to hosts.
• Examples of DAS include hard drives, SSDs, optical disc drives, and external storage drives.
• DAS is ideal for localized data access and sharing in environments with a small number of servers, for instance small businesses and departments.
• Applications access data through block-level access protocols, and DAS can also be used in combination with SAN and NAS.
7. Cloud Storage Infrastructure – Direct Attached Storage(DAS)
Based on the location of the storage devices with respect to the host, DAS can be classified as internal or external.
Internal DAS: The storage device is connected to the host internally by serial or parallel buses. Most internal buses have distance limitations, can only be used for short-distance connectivity, and can connect only a limited number of devices. They can also hamper maintenance because they occupy a large amount of space inside the server.
External DAS: The server connects directly to external storage devices, using the SCSI or FC protocol to communicate between host and storage. External DAS overcomes the distance and device-count limitations of internal DAS and also provides central administration of the storage devices.
8. Cloud Storage Infrastructure – Direct Attached Storage(DAS)
Why and why not to go for DAS?
Why to go for DAS:
• It requires a lower investment than other storage architectures.
• Less hardware and software are needed to set up and operate DAS.
• Configuration is simple and deployment is easy.
• Managing DAS is easy because host-based tools, such as the host OS, are used.
Why not to go for DAS:
• The major limitation of DAS is that it doesn't scale well: it restricts the number of hosts that can be directly connected to the storage.
• Limited bandwidth in DAS caps the available I/O processing capability, and when that capability is reached, service availability may be compromised.
• It does not make optimal use of resources because it cannot share front-end ports.
9. Cloud Storage Infrastructure –Network Attached Storage(NAS)
NAS is a file-level computer data storage server connected to a network, providing data access to a diverse group of clients.
NAS is specialized for its task by its hardware, its software, or both, and provides the advantage of server consolidation by removing the need for multiple file servers.
NAS also uses its own OS, which works with its own peripheral devices. A NAS operating system is optimized for file I/O and therefore performs file I/O better than a general-purpose server. It uses protocols such as TCP/IP, CIFS, and NFS for data transfer and for accessing remote file services.
Components of NAS
• A NAS head, which is basically a CPU and memory.
• One or more network interface cards (NICs).
• An optimized operating system.
• Protocols for file sharing (NFS or CIFS).
• Protocols to connect and manage storage devices, such as ATA, SCSI, or FC.
10. Cloud Storage Infrastructure –Network Attached Storage(NAS)
• A centralized storage device for storing data on a network.
• Typically has multiple hard drives in a RAID configuration.
• Attaches directly to a switch or router on the network.
• Commonly used in small businesses.
Drawbacks
• Single point of failure.
Fig: Network Attached Storage
11. Cloud Storage Infrastructure –Storage Area Network(SAN)
• A storage area network (SAN) provides access to consolidated, block-level data storage that is accessible by applications running on any of the networked servers.
• It carries data between servers (hosts) and storage devices through Fibre Channel switches.
• A SAN helps organizations connect geographically isolated hosts and provides robust communication between hosts and storage devices.
• In a SAN, each server and storage device is linked through a switch, and the SAN provides features such as storage virtualization, quality of service, security, and remote sensing.
Components of SAN: cabling, host bus adapters (HBAs), and switches.
• Cabling is the physical medium used to establish a link between SAN devices.
• An HBA, or host bus adapter, is an expansion card that fits into an expansion slot in a server.
• A switch handles and directs traffic between network devices: it accepts traffic and transmits it to the desired endpoint device.
12. Cloud Storage Infrastructure –Storage Area Network(SAN)
• A special high-speed network that stores and provides access to large amounts of data.
• SANs are fault tolerant.
• Data is shared among several disk arrays.
• Servers access the data as if it were on a local drive.
• Uses the iSCSI (cheaper) and FC (more expensive) protocols.
• SANs are not affected by ordinary network traffic.
• Highly scalable, highly redundant, and high speed (interconnected with Fibre Channel).
• Expensive.
Fig: Storage Area Network
13. Cloud Storage Infrastructure –Key Difference between DAS, NAS and SAN
• DAS - Direct Attached Storage.
- Usually disk or tape, directly attached by a cable to the computer processor. (The hard disk drive inside a PC or a tape drive attached to a single server are simple types of DAS.)
- I/O requests (also called protocols or commands) access the devices directly.
• NAS - Network Attached Storage.
- A NAS device ("appliance"), usually an integrated processor plus disk storage, is attached to a TCP/IP-based network (LAN or WAN) and accessed using specialized file access/file sharing protocols.
- File requests received by a NAS are translated by the internal processor into device requests.
• SAN - Storage Area Network.
- Storage resides on a dedicated network, and I/O requests access the devices directly.
- Uses Fibre Channel media, providing an any-to-any connection for processors and storage on that network.
- Ethernet media using an I/O protocol called iSCSI is also emerging.
14. DAS, NAS, SAN - Best Case Scenario vs Worst Case Scenario
DAS
Best case: DAS is ideal for small businesses that only need to share data locally, have a defined, non-growth budget to work with, and have little to no IT support to maintain a complex system.
Worst case: DAS is not a good choice for businesses that are growing quickly, need to scale quickly, need to share across distance and collaborate, or must support a lot of system users and activity at once.
NAS
Best case: NAS is perfect for SMBs and organizations that need a minimal-maintenance, reliable, and flexible storage system that can quickly scale up as needed to accommodate new users or growing data.
Worst case: Enterprise organizations with server-class devices that need to transfer block-level data over a Fibre Channel connection may find that NAS can't deliver everything that's needed; maximum data-transfer rates could be a problem with NAS.
SAN
Best case: SAN is best for block-level sharing of mission-critical files or applications at data centers or large-scale enterprise organizations.
Worst case: SAN can be a significant investment and is a sophisticated solution typically reserved for serious large-scale computing needs. A small-to-midsize organization with a limited budget and few IT staff or resources likely wouldn't need a SAN.
15. Storage Networking (FC, iSCSI, FCoE)
Fibre Channel (FC) is a technology for transmitting data between computer devices at data rates of up to 20 Gbps today, with higher rates expected in the near future.
• Fibre Channel began in the late 1980s as part of the IPI (Intelligent Peripheral Interface) Enhanced Physical Project to increase the capabilities of the IPI protocol. That effort widened to investigate other interface protocols as candidates for augmentation. In 1998 Fibre Channel was approved as a project, and it has since become an industry standard.
iSCSI (Internet Small Computer System Interface) is a storage networking standard used to link different storage facilities.
• iSCSI is used to transmit data over local area networks, wide area networks, or the Internet. It enables location-independent data storage and retrieval and is one of the two main approaches to storage data transmission over IP networks.
Fibre Channel over IP (FCIP) translates Fibre Channel control codes and data into IP packets for transmission between geographically distant Fibre Channel SANs.
16. FCoE Benefits
• Mapping of Fibre Channel frames over Ethernet
• Fibre Channel enabled to run on a lossless Ethernet network
• Wire the server only once
• Fewer cables and adapters
• Software provisioning of I/O
• Interoperates with existing Fibre Channel SANs
• No gateway; stateless
iSCSI Benefits
• SCSI transport protocol that operates over TCP
• Encapsulation of SCSI command descriptor blocks and data in TCP/IP byte streams
• Wire the server only once
• Fewer cables and adapters
• New operational model
• Broad industry support; OS vendors support their iSCSI drivers, gateways (routers, bridges), and native iSCSI storage arrays
18. • FCIP uses a tunnel to transfer data between networks; the traffic it carries is SCSI traffic encapsulated in Fibre Channel frames.
• FCoE was developed to simplify switches and consolidate I/O compared with FCIP. It replaces FC links with high-speed Ethernet links between the devices that make up the network.
• iFCP is a newer standard that broadens the way data can be transferred over the Internet, combining elements of the FCIP and iSCSI protocols.
For more details, refer to:
Link 1 https://www.cisco.com/c/en/us/products/collateral/switches/nexus-5000-series-switches/white_paper_c11-495142.html
link 2: http://www.provision.ro/storage-infrastructure/storage-networking-fc-iscsi-fcoe#pagei-1|pagep-1|
19. Summary:
• FCoE was not designed to make iSCSI obsolete. iSCSI has many applications that FCoE does not cover, in particular in low-end systems and in small, remote branch offices, where IP connectivity is of paramount importance.
• Some customers have limited I/O requirements in the 100-Mbps range, and iSCSI is just the right solution for them. This is
why iSCSI has taken off and is so successful in the SMB market: it is cheap, and it gets the job done.
• Large enterprises are adopting virtualization, have much higher I/O requirements, and want to preserve their investments and
training in Fibre Channel. For them, FCoE is probably a better solution.
• FCoE will take a large share of the SAN market. It will not make iSCSI obsolete, but it will reduce its potential market.
20. Cloud File System
A cloud file system is a distributed file system that allows many clients to access data and supports operations on that data.
A file system also ensures security in terms of confidentiality, availability, and integrity.
Types of Cloud File System
• GFS - Google File System.
• HDFS- Hadoop Distributed File System.
• BigTable
• HBase
• Dynamo
21. Cloud File System: Google File System
Fig: Architecture of GFS
• GFS is a proprietary distributed file system developed by Google for its own use.
• GFS is used to store and process huge volumes of data in a distributed manner.
• GFS consists of a single master and multiple chunk servers.
• Files are divided into fixed-size chunks.
• Each chunk holds 64 MB of data.
• Each chunk is replicated on multiple chunk servers (3 by default). Even if a chunk server crashes, the file's data is still present on other chunk servers.
22. Cloud File System: Google File System
Files are divided into fixed-size chunks of 64 MB each.
Each chunk is replicated on multiple chunk servers (3 by default), so even if a chunk server crashes, the file's data is still present on other chunk servers.
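The chunking and replication described on these two slides can be sketched in a few lines of Python. The 64 MB chunk size and 3x replication factor come from the slides; the round-robin placement policy and all names here are illustrative assumptions, not GFS's actual placement algorithm:

```python
CHUNK_SIZE = 64 * 1024 * 1024  # GFS default chunk size: 64 MB
REPLICAS = 3                   # default replication factor (from the slide)

def place_chunks(file_size, chunk_servers):
    """Split a file into 64 MB chunks and assign each chunk to
    REPLICAS distinct chunk servers. Round-robin placement is a
    toy policy for illustration, not what GFS actually uses."""
    num_chunks = -(-file_size // CHUNK_SIZE)  # ceiling division
    return {
        chunk_id: [chunk_servers[(chunk_id + r) % len(chunk_servers)]
                   for r in range(REPLICAS)]
        for chunk_id in range(num_chunks)
    }

# A 200 MB file needs 4 chunks; each chunk lives on 3 of the 5 servers,
# so losing any one server still leaves 2 replicas of every chunk.
servers = ["cs1", "cs2", "cs3", "cs4", "cs5"]
layout = place_chunks(200 * 1024 * 1024, servers)
```

The point of the sketch is the availability argument on the slide: because every chunk has replicas on distinct servers, a single chunk-server crash never loses data.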
23. Cloud File System: HDFS
• HDFS is an Apache project; deployments at Yahoo, Facebook, IBM, and others are based on HDFS.
• HDFS is the storage layer of Hadoop, used to store and process huge volumes of data across multiple data nodes.
• It is designed to run on low-cost hardware and provides data across multiple Hadoop clusters.
• It has high fault tolerance and high throughput.
• A large file is broken down into smaller blocks of data, with a default block size of 128 MB that can be increased as required.
• Multiple copies of each block are stored across the cluster in a distributed manner, on different nodes.
Fig: Architecture of HDFS
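A quick way to see what the 128 MB default block size and the multiple block copies mean in practice is a small footprint calculation. This is a sketch only: the 3x replication factor is an assumption (a common default, not stated on the slide), and the function name is invented for illustration:

```python
BLOCK_SIZE_MB = 128   # HDFS default block size (from the slide)
REPLICATION = 3       # assumed replication factor (common default)

def hdfs_footprint(file_size_mb):
    """Return (block count, raw MB consumed) for one file stored
    in HDFS with REPLICATION copies of every block."""
    blocks = -(-file_size_mb // BLOCK_SIZE_MB)  # ceiling division
    return blocks, file_size_mb * REPLICATION

# A 1 GB (1024 MB) file becomes 8 blocks and, with three copies of
# every block, consumes 3072 MB of raw cluster capacity.
blocks, raw_mb = hdfs_footprint(1024)
```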
24. Cloud File System: GFS vs HDFS
GFS and HDFS are similar in many aspects and are both used for storing large data sets. There are a few aspects in which they differ. The key aspects are below:
Load division: GFS comprises a single master node and multiple chunk servers. HDFS has a single NameNode and multiple DataNodes in the file system.
Block size: GFS stores its data in blocks with a default size of 64 MB. HDFS divides data into blocks with a default size of 128 MB.
Data chunk storage location: GFS checks all the chunk servers at startup and does not maintain a record of the replication information of any particular data chunk. HDFS maintains a record of all DataNode information in the NameNode.
25. Cloud File System: GFS vs HDFS
Atomic record appends: GFS provides an append option along with an offset option; users can append to the same file at different offsets, an approach that gives GFS a random read/write ability that HDFS lacks. HDFS can append one file to another but does not provide an offset option.
Data integrity: GFS chunk servers use checksums to detect corruption of the stored data; corruption can also be checked by comparing replicas of the files. HDFS checks the contents of HDFS files when a file is corrupted, using client software that applies checksum checking.
Deletion: In GFS, the resources of deleted files are not reclaimed immediately as they are in HDFS; instead the files are kept separately and are forcibly removed if they are not deleted within three days. In HDFS, deleted files are moved directly into a particular folder and are then removed by a garbage collector.
Snapshot: GFS allows individual files and directories to be snapshotted. HDFS can take up to 65,536 snapshots of each file.
26. Cloud File System: BigTable
• Bigtable is a compressed, high-performance, proprietary data storage system built on the Google File System, developed by Google.
• Designed to scale to a very large size: petabytes of data across thousands of servers.
• Used for many Google projects: web indexing, Personalized Search, Google Earth, Google Analytics, Google Finance.
• A flexible, high-performance solution for all of Google's products.
Goals
• Allow asynchronous processes to continuously update different pieces of data.
• Provide access to the most current data at any time.
• Need to support:
• Very high read/write rates (millions of ops per second)
• Efficient scans over all or interesting subsets of data
• Efficient joins of large one-to-one and one-to-many datasets
• Often want to examine data changes over time, e.g. the contents of a web page over multiple crawls.
27. Building Blocks
• Building blocks:
• Google File System (GFS): Raw storage
• Scheduler: schedules jobs onto machines
• Lock service: distributed lock manager
• MapReduce: simplified large-scale data processing
• How BigTable uses the building blocks:
• GFS: stores persistent data (SSTable file format for storage of data)
• Scheduler: schedules jobs involved in BigTable serving
• Lock service: master election, location bootstrapping
• MapReduce: often used to read/write BigTable data
28. Basic Data Model
• A BigTable is a sparse, distributed, persistent, multi-dimensional sorted map:
(row, column, timestamp) -> cell contents
• Good match for most Google applications
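The map above can be modeled directly as a Python dictionary keyed by (row, column, timestamp). This is a toy illustration of the data model only; the function names are invented and bear no relation to the real Bigtable API:

```python
# A toy in-memory model of the map above: "sparse" because only
# cells that actually exist occupy space in the dictionary.
table = {}

def put(row, column, value, timestamp):
    table[(row, column, timestamp)] = value

def get_latest(row, column):
    """Return the most recent value stored for a (row, column) pair."""
    versions = [(ts, v) for (r, c, ts), v in table.items()
                if r == row and c == column]
    return max(versions)[1] if versions else None

# Two timestamped versions of a web page's contents: the cell keeps
# both, and a lookup picks the newest by default.
put("com.cnn.www", "contents:", "<html>v1</html>", timestamp=1)
put("com.cnn.www", "contents:", "<html>v2</html>", timestamp=2)
```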
29. WebTable Example
• Want to keep copy of a large collection of web pages and related information
• Use URLs as row keys
• Various aspects of web page as column names
• Store contents of web pages in the contents: column under the timestamps when they were fetched.
30. Rows
• Name is an arbitrary string
• Access to data in a row is atomic
• Row creation is implicit upon storing data
• Rows ordered lexicographically
• Rows close together lexicographically usually on one or a small number of machines
• Reads of short row ranges are efficient and typically require communication with a small number of
machines.
• Can exploit this property by selecting row keys so they get good locality for data access.
• Example:
math.gatech.edu, math.uga.edu, phys.gatech.edu, phys.uga.edu
VS
edu.gatech.math, edu.gatech.phys, edu.uga.math, edu.uga.phys
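The point of the example: reversing the DNS components of a hostname makes pages from the same domain sort next to each other, so they land on the same few machines. A minimal sketch of that key transformation (hostnames are illustrative):

```python
def locality_row_key(host):
    """Reverse the DNS components of a hostname so that pages from
    the same domain sort adjacently in lexicographic row order."""
    return ".".join(reversed(host.split(".")))

hosts = ["math.gatech.edu", "phys.uga.edu", "math.uga.edu", "phys.gatech.edu"]
print(sorted(locality_row_key(h) for h in hosts))
# gatech.edu pages now sort next to each other, then uga.edu pages
```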
31. Columns
• Columns have two-level name structure:
• family:optional_qualifier
• Column family
• Unit of access control
• Has associated type information
• Qualifier gives unbounded columns
• Additional levels of indexing, if desired
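The two-level `family:optional_qualifier` naming can be sketched with a small parser. The column names below are illustrative examples, not part of any fixed schema:

```python
def split_column(name):
    """Split a Bigtable/HBase-style column name into (family, qualifier).
    The qualifier may be empty, as in "contents:"."""
    family, sep, qualifier = name.partition(":")
    if not sep:
        raise ValueError("column name must contain a 'family:' prefix")
    return family, qualifier

print(split_column("anchor:cnnsi.com"))  # family "anchor", qualifier "cnnsi.com"
print(split_column("contents:"))         # family "contents", empty qualifier
```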
32. Timestamps
• Used to store different versions of data in a cell
• New writes default to current time, but timestamps for writes can also be set explicitly by clients
• Lookup options:
• “Return most recent K values”
• “Return all values in timestamp range (or all values)”
• Column families can be marked w/ attributes:
• “Only retain most recent K values in a cell”
• “Keep values until they are older than K seconds”
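The two lookup options above ("most recent K" and "timestamp range") can be sketched for a single cell as follows; this is a toy model of the lookup semantics, not the real API:

```python
def lookup(versions, k=None, time_range=None):
    """Toy version lookup for one cell.
    versions: dict {timestamp: value}
    k: return only the most recent k values
    time_range: (start, end) inclusive filter on timestamps"""
    items = sorted(versions.items(), reverse=True)  # newest first
    if time_range is not None:
        lo, hi = time_range
        items = [(t, v) for t, v in items if lo <= t <= hi]
    if k is not None:
        items = items[:k]
    return [v for _, v in items]

cell = {1: "crawl-1", 5: "crawl-2", 9: "crawl-3"}
print(lookup(cell, k=2))                # ['crawl-3', 'crawl-2']
print(lookup(cell, time_range=(1, 5)))  # ['crawl-2', 'crawl-1']
```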
33. Cloud File System: HBase and Dynamo
• HBase is a distributed column-oriented database built on top
of the Hadoop file system. It is an open-source project and is
horizontally scalable.
• HBase has a data model similar to Google's Bigtable and is
designed to provide quick random access to huge amounts of
structured data. It leverages the fault tolerance provided by the
Hadoop Distributed File System (HDFS).
• It is a part of the Hadoop ecosystem that provides random
real-time read/write access to data in the Hadoop File System.
• One can store data in HDFS either directly or through
HBase. Data consumers read/access the data in HDFS
randomly using HBase. HBase sits on top of the Hadoop File
System and provides read and write access.
34. Features of HBase
• HBase is linearly scalable.
• It has automatic failure support.
• It provides consistent reads and writes.
• It integrates with Hadoop, both as a source
and a destination.
• It has an easy Java API for clients.
• It provides data replication across clusters.
Where to Use HBase
•Apache HBase is used to have random, real-time read/write
access to Big Data.
•It hosts very large tables on top of clusters of commodity
hardware.
•Apache HBase is a non-relational database modeled after
Google's Bigtable. Just as Bigtable runs on top of the Google File
System, Apache HBase runs on top of Hadoop and HDFS.
Applications of HBase
•It is used whenever there is a need to support write-heavy applications.
•HBase is used whenever we need to provide fast random access to available data.
•Companies such as Facebook, Twitter, Yahoo, and Adobe use HBase internally.
35. Architecture of HBase
• HBase has three major components: the client library, a master server, and region servers.
36. Architecture of HBase
• Region servers can be added or removed as per requirement.
The master server -
• Assigns regions to the region servers and takes the
help of Apache ZooKeeper for this task.
• Handles load balancing of the regions across region
servers. It unloads the busy servers and shifts the
regions to less occupied servers.
• Maintains the state of the cluster by negotiating the
load balancing.
• Is responsible for schema changes and other metadata
operations such as creation of tables and column
families.
37. Regions
• Regions are tables that are split up and spread across the region servers.
Region server
• The region servers have regions that -
• Communicate with the client and handle data-related operations.
• Handle read and write requests for all the regions under them.
• Decide the size of the region by following the region size thresholds.
ZooKeeper
• ZooKeeper is an open-source project that provides services like maintaining configuration information,
naming, providing distributed synchronization, etc.
• ZooKeeper has ephemeral nodes representing different region servers. Master servers use these nodes to
discover available servers.
• In addition to availability, the nodes are also used to track server failures or network partitions.
• Clients locate region servers via ZooKeeper and then communicate with them directly.
• In pseudo-distributed and standalone modes, HBase itself will take care of ZooKeeper.
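The region mechanism above amounts to splitting the table's sorted row-key space into contiguous ranges and assigning each range to a region server. A toy sketch of how a client-side lookup could map a row key to its server (the boundaries and server names are made up for illustration):

```python
import bisect

# Toy sketch of a table split into regions by row-key range.
# Region start keys are sorted; each region is hosted by one server.
region_start_keys = ["", "g", "p"]        # region i covers [start_i, start_i+1)
region_servers = ["rs1", "rs2", "rs3"]    # hypothetical server names

def find_region_server(row_key):
    """Return the server whose region's key range contains row_key."""
    idx = bisect.bisect_right(region_start_keys, row_key) - 1
    return region_servers[idx]

print(find_region_server("apple"))  # rs1: range "" .. "g"
print(find_region_server("hbase"))  # rs2: range "g" .. "p"
print(find_region_server("zoo"))    # rs3: range "p" .. end
```

When a region grows past its size threshold it splits into two, which in this model just means inserting a new start key and server entry.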
38. Dynamo
• Amazon DynamoDB is a fully managed
NoSQL database service that lets you create
database tables that can store and retrieve any
amount of data.
• It automatically manages the data traffic of
tables over multiple servers and maintains
performance.
• It also relieves the customers from the burden
of operating and scaling a distributed database.
• Hardware provisioning, setup, configuration,
replication, software patching, cluster scaling, etc.
is managed by Amazon
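The automatic spreading of a table's traffic over multiple servers rests on hash-based partitioning by key. A minimal sketch of the idea (the node names are invented for illustration; real DynamoDB manages partitioning and replication internally):

```python
import hashlib

# Toy sketch of hash partitioning: each item's partition key is hashed,
# and the hash determines which storage node holds the item.
NODES = ["node-a", "node-b", "node-c"]  # hypothetical storage nodes

def node_for_key(partition_key):
    """Deterministically map a partition key to a storage node."""
    digest = hashlib.md5(partition_key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

for key in ["user#1", "user#2", "user#3"]:
    print(key, "->", node_for_key(key))
```

Because the mapping depends only on the key, any front-end server can route a request to the right node without coordination, which is what keeps latency stable as the table grows.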
39. Benefits of DynamoDB
• Managed service − Amazon DynamoDB is a managed service. There is no need to hire experts to
manage the NoSQL installation. Developers need not worry about setting up and configuring a distributed
database cluster, managing ongoing cluster operations, etc. It handles all the complexities of
scaling, and partitions and re-partitions data over more machine resources to meet I/O performance
requirements.
• Scalable − Amazon DynamoDB is designed to scale. There is no need to worry about predefined
limits to the amount of data each table can store. Any amount of data can be stored and retrieved.
DynamoDB automatically spreads the data across more servers as the table grows.
• Fast − Amazon DynamoDB provides high throughput at very low latency. As datasets grow,
latencies remain stable due to the distributed nature of DynamoDB's data placement and request
routing algorithms.
40. • Durable and highly available − Amazon DynamoDB replicates data across at least three
different data centers. As a result, the system operates and serves data even under
various failure conditions.
• Flexible − Amazon DynamoDB allows creation of dynamic tables, i.e. a table can
have any number of attributes, including multi-valued attributes.
• Cost-effective − You pay only for what you use, with no minimum charges. Its
pricing structure is simple and easy to calculate.