Ocfs2 storage
1. Oracle VM & Advanced Storage Capabilities
Presented by Tim Krupinski, Solution Architect
2. Cloning Types
Use Cases for Cloning
• Consistent deployment of servers from “Golden” template
• Create a stable point to which you can fail back in case of patching gone awry
• Refreshing lower environments from Production
OVM Types of Cloning
• Supports cloning of individual virtual disks along with entire VMs
• For disk cloning, the following types are supported:
• Non-sparse (Traditional copy)
• Sparse (Smarter copy)
• Thin (Advanced copy)
3. Traditional Clone
Traditional “Non Sparse” cloning is a bit-for-bit copy at the filesystem level.
Some drawbacks:
• Copy time depends on disk size and grows as the disk grows
• Generates lots of read/write
• Disk needs to be offline to ensure consistent copy
Some use cases:
• Good for copying a disk suffering from filesystem or some other storage corruption
• Uncomplicated: storage measurements are straightforward and easy to understand
• Completely unambiguous regarding storage utilization
4. Sparse Cloning
A Sparse Clone is a bit smarter than a standard copy. It only copies actual data, and
bypasses the reading and writing of storage blocks allocated but not yet used.
For example:
● A VM has a “backup” disk provisioned that is 100GB in size
● However, backups only consume 10GB of storage
● Sparse copy will only copy 10GB of bits, and bypasses the rest
Faster than a traditional Non Sparse clone, but it still suffers the same consistency
drawbacks when copying from a live volume
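The "copy only actual data" idea can be illustrated outside of Oracle VM. The sketch below is not how OVM implements sparse cloning; it is a minimal Python illustration that assumes a Linux host and a file system that reports holes through SEEK_DATA/SEEK_HOLE. The function name sparse_copy is made up for this example.

```python
import os

def sparse_copy(src_path, dst_path, chunk=1 << 20):
    """Copy only the data regions of src_path; holes stay unallocated in dst_path.

    Returns the number of bytes actually read and written.
    """
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    copied = 0
    try:
        size = os.fstat(src).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next region that holds real data.
                start = os.lseek(src, offset, os.SEEK_DATA)
            except OSError:
                break  # no data after this offset, only a trailing hole
            # Find where that data region ends (the next hole, or end of file).
            end = os.lseek(src, start, os.SEEK_HOLE)
            pos = start
            while pos < end:
                buf = os.pread(src, min(chunk, end - pos), pos)
                if not buf:
                    break
                written = os.pwrite(dst, buf, pos)
                pos += written
                copied += written
            offset = end
        # Give the destination the same apparent size as the source.
        os.ftruncate(dst, size)
    finally:
        os.close(src)
        os.close(dst)
    return copied
```

For the 100GB backup disk above, a copy like this would read and write roughly the 10GB of real data. It still reads every used block, which is why the time grows with the amount of data and why the source should not be changing during the copy.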
5. Thin Cloning
Thin Cloning is the most advanced type of clone
• It is Instant, regardless of the size of the volume.
• It can be used on live volumes
• Does not generate unnecessary read/write cycles.
• Enables overcommitment of virtual disks given the overall size of an Oracle VM repository
6. Differences
Unlike a Sparse or Non-Sparse clone, a Thin Clone can only be
created in the repository in which the volume being cloned resides.
Thin Clones use REFLINKs. When a thin clone is created, the File
System copies the links to the volume's data rather than the data
itself, so both Inodes point at the same blocks. When data is
updated on either the clone or the volume, OCFS2 tracks the change.
Because of REFLINKs, you can clone a 1 TiB volume in a
Repository that only has 1.5 TiB allocated. Even if the volume
is 90% full, Thin Cloning enables you to have N+1 copies of
the cloned file.
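A toy model can make the reflink behaviour concrete. The sketch below is not OCFS2 code: ThinRepo and its methods are invented for illustration, and it assumes simple copy-on-write at block granularity. It shows why a clone costs nothing at creation time and why later writes to shared blocks consume new space.

```python
class ThinRepo:
    """Toy copy-on-write repository (illustrative only, not the OCFS2 design)."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.refcount = {}   # extent id -> number of files referencing it
        self.files = {}      # file name -> {block index: extent id}
        self._next_id = 0

    def used(self):
        """Blocks physically allocated in the repository."""
        return len(self.refcount)

    def _alloc(self):
        if self.used() >= self.capacity:
            raise OSError("repository full")
        eid = self._next_id
        self._next_id += 1
        self.refcount[eid] = 1
        return eid

    def write(self, name, block):
        """Write one block; a shared block is copied before it is changed."""
        blocks = self.files.setdefault(name, {})
        old = blocks.get(block)
        if old is not None and self.refcount[old] == 1:
            return  # exclusively owned, overwrite in place
        new = self._alloc()
        if old is not None:
            self.refcount[old] -= 1  # the other file keeps the old extent
        blocks[block] = new

    def clone(self, src, dst):
        """Thin clone: copy the block map only and bump the reference counts."""
        self.files[dst] = dict(self.files[src])
        for eid in self.files[dst].values():
            self.refcount[eid] += 1


# Scale the example above down: 1 block = 0.1 TiB, so the repository holds 15.
repo = ThinRepo(capacity_blocks=15)
for b in range(9):            # a 1 TiB volume that is 90% full
    repo.write("volume", b)
repo.clone("volume", "clone")
print(repo.used())            # still 9 blocks right after the clone
```

In this model the repository fills only once six of the shared blocks have been rewritten; the seventh rewrite raises "repository full".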
7. Real Deal on Thin Clones
OVM Manager 3.3.1 requires manually refreshing the repository view;
otherwise you run the risk of seeing stale information, for example when you
want to see whether you are approaching a storage threshold.
On the server, df is pretty good at reporting the size of an OVS repository
• But what if we want to know something more specific, like whether
restoring a thin clone is going to tap out the available storage in a
repository?
• What if you want to see the relationship of shared extents between a clone
and an original?
Using a powerful solution like thin cloning requires us to dig deeper, and
understand mechanics at the OCFS2 layer.
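The df-level check is easy to script. The sketch below is one possible way to do it with Python's standard library; the function names and the 80% default are assumptions made for this example, and the repository mount point has to be supplied by the caller.

```python
import shutil

def repo_usage_percent(path):
    """Percentage of the file system at `path` that is in use (roughly what df shows)."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def near_threshold(path, threshold_percent=80.0):
    """True when usage has reached the given threshold."""
    return repo_usage_percent(path) >= threshold_percent
```

A check like this only sees the repository as a whole. It cannot say how much of a given virtual disk is shared with its clone, which is the question the following slides address.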
8. Caveats of Thin Cloning
With Great Power comes Great Responsibility!
Left unchecked, thin cloning can result in an unexpected explosion
of storage usage
(A fast way to shut down all VMs)
Traditional tools are inadequate for reporting the actual size of virtual
disk images in OCFS2; du and ls give inconsistent results
Which view does Oracle VM Manager reflect?
9. du & ls problems
Not fully aware of OCFS2 capabilities
• Sparse files
• REFLINKs
Attributes full allocation to each thin clone, irrespective of actual
usage
Total Size by Tool
du 5.7TB
ls 6.4TB
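The two totals differ because the tools read different per-file numbers. As a rough illustration (not the tools' actual source), ls -l shows a file's apparent length and du sums the blocks reported as allocated to it; both can be read from Python with os.stat. The helper names below are made up for this sketch.

```python
import os

def apparent_and_allocated(path):
    """Return (apparent size, allocated size) in bytes for one file.

    Apparent size is what `ls -l` shows; allocated size is what `du` adds up.
    st_blocks is counted in 512-byte units.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512

def totals(paths):
    """Sum both views over several files, the way a per-file tool would."""
    apparent = allocated = 0
    for p in paths:
        size, blocks = apparent_and_allocated(p)
        apparent += size
        allocated += blocks
    return apparent, allocated
```

Both numbers are per-file views. A block shared by an original and its thin clone is counted once for each file, so neither total reflects what the repository really holds.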
10. Client Example
Disk1                     100GB
Disk2                     1113GB
Disk3                     766GB
Disk4                     1885GB
Disk5                     1888GB
Size Allocated to Repo    5400GB
TOTAL SIZE                5652GB
You sometimes must manually refresh repositories in the OVM Manager to get an
accurate total; otherwise it will be incorrect
There’s a shortcut to what’s really being consumed...
11. O2 Utilities & Shared DU
shared-du tool designed by Sunil Mushran
Not available in standard Yum repositories or the ULN
• Download from Oracle OSS FTP
• Provides an apparent and actual size of files, accounting for sparse and reflinks
• Use to get real-time information on shared disk sizing and growth
ocfs2-tools developed by Oracle and maintained in Unbreakable Linux Network
repositories
• Allows you to view file “holes”
• Calculates amount of shared reflinks between cloned files
12. Use Case for shared-du
A Virtual Disk used for a filesystem that holds DB extracts is cloned.
The virtual disk continues to be used, and as more read/write operations occur,
the original diverges from the Thin Clone in the amount of data it can share
We can use Shared-Du, along with o2info, to examine the relationship between
the clone and its original. From this we can see the true amount of storage
being used
13. Shared-Du example (After 12 hours)
Tells us the Actual Size along with the Apparent Size
Actual Size is the amount of data the file is using which is not shared
Apparent Size is the amount of data it appears to be using
These .img files are Virtual Disks for a VM, with one being a thin clone of the other; shared-du
measured them after fs activity occurred
1. Original disk is 1.9TB, but is in fact using 988 GB of data
2. Cloned disk is still its original size, at 1.9TB
3. 2.8 TB is the total amount of storage actually used, whereas the
filesystem reports two 1.9TB disks, totaling 3.8TB
16. So What?
Enables you to plan for storage usage in the event of recovery from thin clones
• Total repository usage will grow as data is changed on the volumes
For example:
1. You refresh Dev from Production, both having a 100GB disk for /u02
2. Total repository storage is 200GB
3. Your production system has:
a. / - 10GB
b. /u01 - 20GB
c. /u02 - 100GB
d. total - 130GB
4. Take a Thin Clone of Production
5. Each VM thinks it has 130GB to grow into. At first this is OK, but over time,
without monitoring, the repository can reach 100% full
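The arithmetic behind this example can be sketched in a few lines. This is a simplified planning model, not output from any Oracle tool: it assumes usage equals one full copy of the data plus whatever fraction of each clone no longer shares blocks with the original, and both function names are invented for illustration.

```python
def thin_clone_usage(volume_gb, clones, diverged_fraction):
    """Estimated repository usage in GB.

    One full copy of the data, plus the diverged (no longer shared)
    fraction of each thin clone.
    """
    if not 0.0 <= diverged_fraction <= 1.0:
        raise ValueError("diverged_fraction must be between 0 and 1")
    return volume_gb + clones * volume_gb * diverged_fraction

def fraction_until_full(repo_gb, volume_gb, clones):
    """Diverged fraction at which the repository reaches 100%, capped to 0..1."""
    if clones == 0:
        return 1.0
    fraction = (repo_gb - volume_gb) / (clones * volume_gb)
    return max(0.0, min(1.0, fraction))


# The example above: 200GB repository, 130GB of production disks, one thin clone.
print(thin_clone_usage(130, 1, 0.0))   # 130.0 GB right after the clone
print(thin_clone_usage(130, 1, 1.0))   # 260.0 GB if everything diverges
```

With these numbers the repository is full once a little over half of the data (70GB of the 130GB) has diverged, well before either VM fills its own disks.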