The document provides an overview of key concepts covered in a GPFS 4.1 system administration course, including backups using mmbackup, SOBAR integration, snapshots, quotas, clones, and extended attributes. The document includes examples of commands and procedures for administering these GPFS functions.
10. Course materials may not be reproduced in whole or in part without the prior written permission of IBM. 8.0
Scale out Backup and Recovery (SOBAR)
11.
GPFS Scale Out Backup and Recovery
• SOBAR relies on two well-integrated components
– TSM HSM capability to premigrate files
– GPFS capability to dump a file system image
• TSM HSM integrates with the GPFS policy engine
– Allows files that have changed between backup cycles to be premigrated (backed up)
– Versioning is not possible
• The GPFS file system image includes all file system metadata (inodes, etc.)
– The file system image is backed up to TSM
• For recovery, the file system image is re-applied to a new GPFS file system
– All data appears migrated and can be recalled
• SOBAR provides disaster protection
– Backup data resides in the TSM server
[Diagram: a GPFS cluster running TSM HSM and SOBAR, connected over the LAN to TSM servers with tape.]
12.
GPFS SOBAR Characteristics
• Recovery of the GPFS file system includes all directories and files in stub format
– The most recent files can be selectively recalled
• High backup scalability:
– Leverages the GPFS policy engine for fast file identification
– Files are backed up incrementally forever
• High restore performance
– Only file metadata is applied, without transferring file data
– File data resides on the TSM server, and recall happens on demand
• Lifts the ACL/extended attribute limitation of the TSM server
– Complete inode information is part of the image file
• Requires the TSM HSM and backup clients to be licensed and installed
• No versioning possible
13.
TSM HSM file states
[Diagram: HSM client on the GPFS cluster and the TSM server. A resident file becomes premigrated via premigrate (file data is copied to the TSM server under an Object ID / DMAPI handle while the full file remains) and migrated via migrate (only a stub with the Object ID / DMAPI handle remains); recall returns a file to the resident state. After an image restore, files appear migrated (migstate=yes) as stubs with a new DMAPI handle.]
14.
Scale Out Backup And Restore – Backup process
Protected data: file data; directory data and directory tree relations; metadata (inodes and ACLs); file system and cluster configuration data.
• Continuously: premigrate all file data to the TSM server using TSM for Space Management and the GPFS policy engine
• Backup step 1: collect and back up the file system configuration to the TSM server using the TSM client
• Backup step 2: create the file system image files and back them up to the TSM server using the TSM client
15.
Scale Out Backup And Restore – Recovery process
Protected data: file data; directory data and directory tree relations; metadata (inodes and ACLs); file system and cluster configuration data.
• Recovery step 1: restore the file system configuration from the TSM server and recreate the file system manually
• Recovery step 2: mount the file system and restore the file system image files from the TSM server; file system metadata and the directory tree are recreated automatically
• Recovery step 3: enable space management and start production; recall file data on demand and in the background using the GPFS policy engine
21.
Comparison of GPFS backup methods

Characteristic                  | Snapshot                 | mmbackup            | SOBAR
--------------------------------|--------------------------|---------------------|--------------------------------------
RTO (Recovery Time Objective)   | Low                      | High (reading tape) | Medium (partial tape read, on demand)
RPO (Recovery Point Objective)  | Low                      | Medium - High       | Medium
Backup window                   | Low                      | High                | Medium
Versioning                      | Yes (multiple snapshots) | Yes                 | No (stubbed)
Disaster protected              | No                       | Yes                 | Yes
Complete restore                | Yes                      | Maybe               | Yes
Backup to tape                  | No                       | Yes                 | Yes
Integration with ILM            | No                       | Yes                 | Yes
22.
Clones
34.
Notes on TSM & HSM & Extended Attributes
This one-hour session on Spectrum Scale and ESS features provides awareness of the solutions and their competitive advantages, and how clients use them to improve their data and object management. It includes some discussion of limitations and best practices, as well as a basic understanding of extended metadata, backup integration and its requirements, SOBAR, quotas, snapshots, and clones, from both a sales perspective and a basic implementation perspective.
There are six basic ways to back up data with GPFS:
1. Replication to a second cluster (likely the best option; almost instantly available after a disaster)
2. TSM backup to tape or VTL
3. SOBAR (using TSM and HSM for tape-optimized disaster recovery)
4. Snapshots (point-in-time copies of files, or file versioning)
5. File cloning
6. Client-mounted exports used to back up data to any common backup system
Let’s take a minute to discuss TSM Backups.
Notes:
GPFS can back up either the GPFS configuration or file data, but standard file system backups alone do not provide disaster recoverability.
Two things remain essential considerations for backup: the GPFS configuration and the file data.
The configuration consists of all the cluster configuration information as well as the file system configuration information.
The file data consists of data and metadata.
Notes:
Before talking about backing up configuration data, we need to look at a feature of GPFS called a “user exit” for backups.
A user exit is an event-triggered script. The script determines which event triggers its execution.
Instructor notes:
Purpose — Describe user exits.
Details —
Additional information —
Transition statement — However you back up your data, there are some files you want to be sure to back up.
Notes:
Backing up the GPFS file system configuration is extremely important. This can be done as the slide details.
If your intent is to back up data for individual file recovery, that is one thing; but if you want to restore a file system, you will need the information that sets it up for you.
Instructor notes:
Purpose — Describe backing up the GPFS configuration.
Details —
Additional information —
Transition statement — Let’s remember the cluster configuration.
Notes:
There is a copy of the mmsdrfs file on every node in the cluster to cover some failure scenarios, but it is always a good idea to back it up.
You can back it up in the file system, or use the user exit to make sure it is backed up when the file system configuration changes.
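As a concrete sketch of that idea, the helper below (hypothetical, not an IBM-supplied tool; the default paths are assumptions) copies the mmsdrfs file into a timestamped backup directory, and could be invoked from a user exit:

```shell
# backup_mmsdrfs: copy the cluster configuration file (by default
# /var/mmfs/gen/mmsdrfs) into a timestamped backup directory.
# Both paths can be overridden, e.g. for testing outside a cluster.
backup_mmsdrfs() {
    src="${1:-/var/mmfs/gen/mmsdrfs}"
    dest_dir="${2:-/var/backups/gpfs-config}"
    mkdir -p "$dest_dir" || return 1
    stamp=$(date +%Y%m%d-%H%M%S)
    cp "$src" "$dest_dir/mmsdrfs.$stamp" || return 1
    # Print the backup path so a calling script can log it
    echo "$dest_dir/mmsdrfs.$stamp"
}
```
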
Instructor notes:
Purpose — Describe backup and restoring of the cluster configuration.
Details —
Additional information —
Transition statement — Let’s look at GPFS and TSM.
Notes:
Use the mmbackup command to back up a GPFS file system to a backup server. mmbackup takes a temporary snapshot named .mmbuSnapshot of the specified file system and backs this snapshot up to a back-end data store. Accordingly, the files backed up by the command will be stored in the directory /Device/.snapshots/.mmbuSnapshot in the remote data store. This command may be issued from any GPFS node in the cluster to which the file system being backed up belongs, and on which the file system is mounted.
As part of a backup strategy you should also back up your configuration files that are not stored in GPFS.
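A minimal invocation might look like the following sketch; the device name gpfs0 and the node class are illustrative, and a configured TSM backup-archive client on the cluster is assumed:

```shell
# Incremental backup of file system gpfs0 to the configured TSM server;
# only files changed since the last backup are sent.
mmbackup gpfs0 -t incremental

# First-time (full) backup, letting additional nodes participate
# in the scan ('backupNodes' is an illustrative node class).
mmbackup gpfs0 -t full -N backupNodes
```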
Instructor notes:
Purpose — Describe using GPFS with TSM.
Details —
Additional information —
Transition statement — Let’s review what we’ve learned to this point.
We often get the question of whether we support NetBackup or some other third-party backup solution. Although we do not have plug-in backup capabilities, it is possible to design a solution that supports third-party backup by using the policy engine to build the work list for that backup system, which then backs up our file system from a file system mounted on its media servers.
Scale Out Backup and Restore (SOBAR) is a specialized mechanism for data protection against disaster, only for GPFS™ file systems that are managed by Tivoli® Storage Manager (TSM) Hierarchical Storage Management (HSM). So you do need TSM and HSM licensing, and storage pools for both TSM and HSM, to use it.
For such systems, the opportunity exists to premigrate all file data into the HSM storage and take a snapshot of the file system structural metadata, and save a backup image of the file system structure. This metadata image backup, consisting of several image files, can be safely stored in the backup pool of the TSM server and later used to restore the file system in the event of a disaster.
The SOBAR utilities include the commands mmbackupconfig, mmrestoreconfig, mmimgbackup, and mmimgrestore. The mmbackupconfig command will record all the configuration information about the file system to be protected and the mmimgbackup command performs a backup of GPFS file system metadata. The resulting configuration data file and the metadata image files can then be copied to the TSM server for protection.
In the event of a disaster, the file system can be recovered by recreating the necessary NSD disks, restoring the file system configuration with the mmrestoreconfig command, and then restoring the image of the file system with the mmimgrestore command. Note that the mmrestoreconfig command must be run prior to running the mmimgrestore command.
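Under those assumptions, a SOBAR cycle might be sketched as follows. The device name, output file, and work directory are illustrative, and the dsmc step stands in for however the configuration and image files are sent to the TSM server:

```shell
# --- Backup side ---
# 1. Record the file system configuration.
mmbackupconfig gpfs0 -o /tmp/gpfs0.config

# 2. Create the metadata image files in a work directory.
mmimgbackup gpfs0 -g /gpfs0/.imgbackup

# 3. Send the configuration file and image files to the TSM server
#    with the ordinary TSM backup-archive client.
dsmc incremental /tmp/gpfs0.config /gpfs0/.imgbackup/

# --- Restore side (after recreating the NSDs) ---
mmrestoreconfig gpfs0 -i /tmp/gpfs0.config   # must run before mmimgrestore
mmimgrestore gpfs0 -g /gpfs0/.imgbackup
```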
The extended attributes of every file in GPFS list its HSM migration state: resident, premigrated, or migrated.
SOBAR will reduce the time needed for a complete restore by utilizing all available bandwidth and all available nodes in the GPFS cluster to process the image data in a highly parallel fashion. It will also permit users to access the file system before all file data has been restored, thereby minimizing the file system down time. Recall from HSM of needed file data is performed automatically when a file is first accessed.
The first step of the backup process is to collect and back up the file system configuration to TSM using the TSM client. The second step is to create the file system image and back it up to the TSM server, again using the TSM client. Throughout, HSM and the GPFS policy engine continuously premigrate all file data to the TSM server (incremental forever).
To restore, begin with the file system configuration and recreate the file system manually; then mount the file system and restore the image files from TSM, which automatically recreates the metadata and directory trees. Finally, enable space management and start production; clients then recall file data on demand and in the background using the GPFS policy engine.
One limitation to note is that these commands cannot be run from a Windows node in a GPFS cluster.
To review, this chart walks through the entire process of a SOBAR backup.
A snapshot of an entire GPFS™ file system can be created to preserve the contents of the file system at a single point in time.
Snapshots of the entire file system are also known as global snapshots. A snapshot is an instantaneous capture of the state of the metadata that points to a set of data blocks.
The storage overhead for maintaining a snapshot is keeping a copy of data blocks that would otherwise be changed or deleted after the time of the snapshot.
Snapshots provide an online backup capability that allows easy recovery from common problems such as accidental deletion of a file, and comparison with older versions of a file.
However, because snapshots are not copies of the entire file system, they should not be used as protection against media failures.
Notes:
A snapshot is a logical, read-only copy of the file system or fileset (and all of its data) at a point in time.
A file system or an independent fileset can be captured with a snapshot.
Dependent filesets can only be captured by a snapshot of their parent file system.
Transition statement — Let’s see how snapshots work.
Notes:
Various commands allow for the administration of snapshots:
Create a snapshot: mmcrsnapshot Device Directory [-j Fileset]
Viewing snapshot information: mmlssnapshot Device [-d [--block-size {BlockSize | auto}]] [-s {all | global | Snapshot[,Snapshot...]} | -j Fileset[,Fileset...]]
Delete a snapshot: mmdelsnapshot Device Directory [-N {Node[,Node...] | NodeFile | NodeClass}] (the -N parameter allows for a faster snapshot delete)
Restore a file system from a snapshot: most of the time a restore from snapshot is partial, using ordinary file system commands to copy from the snapshot directory to the active area.
To restore the entire file system from a snapshot: mmrestorefs Device Directory [-c] However, this can overwrite files that were intentionally changed.
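Putting these commands together, a snapshot lifecycle might look like this sketch (the file system, fileset, snapshot names, and node class are illustrative):

```shell
# Create a global snapshot of file system gpfs0.
mmcrsnapshot gpfs0 snap_20240601

# Create a snapshot of an independent fileset.
mmcrsnapshot gpfs0 fssnap1 -j fileset1

# List existing snapshots.
mmlssnapshot gpfs0

# Delete a snapshot, spreading the work across several nodes for speed.
mmdelsnapshot gpfs0 snap_20240601 -N nsdNodes
```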
Instructor notes:
Purpose — Describe snapshot administration.
Details —
Additional information —
Transition statement — One use for a snapshot is for running a point in time backup. Let’s look at accessing snapshot data.
Notes:
Snapshots are accessible through the “.snapshots” sub-directory in the file system root directory. This location can be changed.
One common false assumption:
When snapshots are present, deleting files from the active file system does not always result in any space actually being freed; rather, blocks may be pushed to the previous snapshot.
To see the space truly reclaimed from those deletions, all snapshots referencing the files must be deleted.
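For example, recovering a single accidentally deleted file is typically just a copy out of the snapshot directory (all paths here are illustrative):

```shell
# Snapshots appear read-only under .snapshots in the file system root.
cp /gpfs0/.snapshots/snap_20240601/home/alice/report.txt \
   /gpfs0/home/alice/report.txt
```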
Cloning a file is similar to creating a copy of a file, but the creation process is faster and more space efficient because no additional disk space is consumed until the clone or the original file is modified. Multiple clones of the same file can be created with no additional space overhead. You can also create clones of clones.
Read the chart
GPFS clones are often used for VMs, as well as by test, QA, and development teams that each need to work independently from the same base state of files.
They are especially useful when you want to save time and capacity, as only the changes are actually written.
Creating clones is a simple process; however, it is file based. File systems and filesets cannot be cloned without complex use of the policy engine to process a work list of files to clone.
To create a read-only snapshot of a file to be cloned, issue “mmclone snap file1 snap1”.
To create a writable clone copy of the clone parent, issue “mmclone copy snap1 file2”.
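A small sketch of the clone workflow, using illustrative file names (for example, a VM base image shared by test and development):

```shell
# Create a read-only clone parent (snapshot) from a base image.
mmclone snap vmbase.img vmbase.snap

# Create two independent writable clones of it; disk space is only
# consumed as the clones diverge from the parent.
mmclone copy vmbase.snap vm-test.img
mmclone copy vmbase.snap vm-dev.img

# Show the clone relationships.
mmclone show vmbase.snap vm-test.img vm-dev.img
```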
The GPFS™ quota system helps you to control the allocation of files and data blocks in a file system.
Quotas are enabled by the system administrator when control over the amount of space used by the individual users, groups of users, or individual filesets is required.
GPFS quotas can be defined on
Individual users
Groups of users
Individual filesets
There are currently two approaches to quotas:
Enforced: These are traditional quotas where a soft and hard limit is set
Pros: Automatically enforces hard limits
Cons: Performance overhead on each allocation and file create
Usage reports: Use the sample utility (or your own tool) to report usage on a periodic basis (commonly used for chargeback-like purposes).
Pros: Short batch run can be done at off time, no performance impact on allocation or file create
Cons: No hard limits, enforcement is through a mechanism like “nag” emails
Transition statement — Usage-report quotas are run in batch mode (by cron, for example), and you can customize the reports. Let’s take a look at how the “traditional” quota mechanism works.
Notes:
This is the process for setting up quotas
1. Set the -Q parameter on the file system
2. It is not required to set default quotas
3. Quotas can be set on users/groups/fileset and based on allocated bytes and/or number of inodes
4. Hard quotas are enforced automatically; tools are provided to report on quota usage.
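The four steps above can be sketched as follows (fs1 and the user name alice are hypothetical examples):

```shell
# 1. Enable quota enforcement on the file system
mmchfs fs1 -Q yes

# 2. (Optional) activate and edit default user quotas
mmdefquotaon -u fs1
mmdefedquota -u fs1

# 3. Set an explicit quota for one user (opens the quota file in an editor)
mmedquota -u alice

# 4. Report current quota usage for all users
mmrepquota -u fs1
```

Step 2 is optional, as noted above; users without an explicit quota then inherit the defaults.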
Transition statement — Let’s walk through each step of this process.
This is a file system parameter that can be set on creation or after the file system is created.
ProtoUser is a user that is used as a prototype for setting quotas on other users. For example, you can have an HR user prototype that sets quotas the same for all HR employees.
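A minimal sketch of applying a prototype, assuming the mmedquota -p option and the user names shown (hr_proto, hr_user1, hr_user2 are invented examples):

```shell
# Copy hr_proto's quota limits to other users; -p names the prototype
mmedquota -u -p hr_proto hr_user1 hr_user2
```

This avoids editing each user's quota file individually when many users share the same limits.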
Transition statement — Now that quotas are enabled you can set default quotas
Notes:
By default, user and group quota limits are enforced across the entire file system. Optionally, the scope of quota enforcement can be limited to individual fileset boundaries.
Transition statement — You can use default quotas, individual quotas or a mix.
mmedquota opens a “quota file” for that user, group, or fileset and allows you to edit it using an editor such as vi.
Once the file is saved/closed the changes take effect.
Transition statement — Now that quotas are set you use mmrepquota to see the current quota usage.
In review, you can use two methods of implementing quotas:
Traditional quotas with hard stops
Or a reporting method using the high performance metadata interface
Remember that checking quotas is a metadata-intensive operation, and it is best practice to check quotas in off-peak hours on a very busy system.
Another command for reporting Quota usage is mmlsquota.
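Side by side, the two reporting commands look like this (fs1 and alice are example names):

```shell
# Per-user view of quota usage and limits
mmlsquota -u alice fs1

# Administrator-level report covering all users on fs1
mmrepquota -u fs1
```

mmlsquota is typically what an individual user runs; mmrepquota gives the administrator the full picture.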
Transition statement — Let’s look at Extended Attributes
When HSM is integrated with TSM to provide tape-based archive and a file is migrated to tape, it is (by default) first backed up through TSM, and a stub file is left on the GPFS file system. The stub file contains the necessary metadata and a portion of the data to allow the file to be recalled from tape.
When a migrated file is accessed, it is recalled to the local file system to replace the stub. The recall is automatic or selective depending on how it is initiated. Manage recalls carefully and avoid large-scale recalls, because they can hang things up as they exhaust GPFS worker threads waiting for tapes to be loaded into drives to retrieve the files.
HSM manages files as Resident (on disk), Pre-Migrated (disk and tape), and Migrated (file on tape with a stub on disk).
Understanding Extended Attributes can be helpful in understanding things such as if the file is backed up, if snapshots have been applied, if data is migrated into HSM pools, what storage pool the data lives in and the state of replication for the data and the metadata. Learning to use the “mmlsattr” command can prove useful in validating assumptions. Reading attributes of files will not recall them from tape.
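A quick sketch of the command mentioned above (the file path is an invented example):

```shell
# List extended attributes and flags for a file; -L gives the long form,
# including storage pool, fileset, and Misc attributes such as ARCHIVE.
# Reading attributes does not trigger a recall from tape.
mmlsattr -L /gpfs/fs1/data/sample.dat
```

The Misc attributes field is where the HSM states discussed on the next slides (resident, migrated, pre-migrated) show up.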
Note the resident file report appears as ARCHIVE above. A resident file lives on disk.
Next slide..
A migrated file shows as ARCHIVE OFFLINE, with dmapi Object, ID, and Region. A migrated file lives on tape (with a stub file on disk); by default, HSM requires that the file is backed up before it moves to tape archive (this can be overridden).
A pre-migrated file shows ARCHIVE with dmapi Region, Mig, and ID.