This document discusses critical convergence in three key areas: the convergence of front and back office goals and strategies, the convergence of IT architectural models, and the convergence of intelligent data management. It argues that purpose-built infrastructure designed around business needs, rather than vendor marketing, is key to addressing current challenges around cost, compliance, risk management and growth. Properly managing and routing data across such an infrastructure requires understanding data hosting requirements.
2. Convergence /kən-vər-jən(t)s/
• Noun. L.L. convergere "to incline together"
from com- "together" + vergere "to bend"
1. The act of converging and especially moving toward
union or uniformity ; especially coordinated
movement of the two eyes so that the image of a
single point is formed on corresponding retinal areas
2. The state or property of being convergent
3. Independent development of similar characters (as of
bodily structure of unrelated organisms or cultural
traits) often associated with similarity of habits or
environment
4. The merging of distinct technologies, industries, or
devices into a unified whole
5. Mathematics The property or manner of approaching
a limit, such as a point, line, function, or value.
3. Great Marketing Term
• Can be loaded with any value you want
• Three meanings I want to explore
– Convergence of front and back office goals
and strategies
– Convergence of IT architectural models
– Convergence of drivers around intelligent
data management
• These constitute “critical convergence”
areas…
4. “Critical” Convergence?
• “Critical” connotes normative value: the
way things ought or need to be…
– Business and IT need to be realigned in most
companies for productive work to be done
– IT infrastructure needs to be purpose-built to
business requirements and components and
processes must be commonly manageable
– Data assets need to be handled much more
intelligently and granularly than they are today
• Says who? Trust me, it isn’t only me.
6. Situation Today in Too Many Firms
• Current thinking: IT costs too much with no
demonstrable return on investment
– Committees formed to prioritize purchases
(often without IT representation)
– TCO hammer makes all IT costs look like nails
• Leading to frantic adoption of “silver bullet”
methodologies: ITIL, ISO, CoBIT, SOA, etc.
• Generating more frustration at the top…and
bottom of the organization
– CIO office has a revolving door
– Laissez faire attitudes regarding product
suitability – “It’s not my money.”
– Uptick in outsourcing/in-sourcing – despite fact
that a problem cannot be outsourced
7. Speaking Different Languages
• The Front Office focused on
– Cost-Containment
– Compliance & Governance
– Continuity & Risk Management
– Top Line Growth
• The Back Office focused on
– Infrastructure Efficiency
– Performance Management
– Service Level Improvement
– New Information Products
8. Current “C-4” Challenges Can (and Should)
Unite Us
• The four issues that are front of mind for business IT planners:
9. Not Just a Reflection of the Current Economy…
• The C-4 issues have been building
steadily for decades, a function of…
1. Failure to “purpose-build”
infrastructure…especially in the
distributed computing environment
2. Failure to instrument infrastructure
for monitoring, measurement and
management
3. Failure to manage data effectively
The current recession has only underscored the problems and, perhaps,
given us a “pause” to think strategically about ways for solving them…
10. The Front and Back Office Need to Converge
• Common language needed to describe
– Challenges and opportunities
– Strategic vision and goals
– Objectives and success metrics
• More effective division of labor
– Business value case development
– Product selection
– Implementation plans
• In an atmosphere of mutual respect…
12. Back to Basics
• What is “purpose-building”
infrastructure?
– n. a kind of “intelligent design” – a deliberate
and well conceived matching of infrastructure
“services” (resources and functions) to meet
well-defined business needs…
– v. the act of buying or developing and deploying
technology that serves a business need and not
simply outsourcing your thinking process to
vendors whose top line growth is more
important than your top line growth
13. NOT a New Idea:
Data “DNA” Inherited from Business Processes and
Applications
15. Chances are Good That Your Infrastructure Was Originally
Built That Way…
• Might have started with a data-centric, business
process-focused view, but it rarely stayed that way
over the years
– Management changed: Different masters bring
different views…
– Technology changed: Distributed systems, TCP/IP, the
Internet, the “V” word, et al…
– Business changed: Mergers and acquisitions, new
products, new go to market strategies…
– Attitudes changed: “IT is our strategic differentiator,”
“Bricks and Sticks are Dead,” “Does IT Matter?,” “Legal
is running IT now,” “Web 2.0,” “Enterprise 2.0,” “Mash-
ups will replace IT applications,” Clouds will replace
data centers, blah blah blah…
16. Biggest Problems Attributable to
Triumph of Marketecture
• Marketecture: n. that which otherwise good
architecture becomes when subjected to
vendor marketing departments; see kluge
– The rise of distributed systems: mainframe
deconstruction gone awry…
– The network is the computer: distributed computing
gone awry…
– SANs: storage infrastructure gone awry
– x86 server virtualization: LPARs gone awry
– “Cloud computing:” multi-tenancy gone awry
Purpose-Built Tech
“Cool” Tech
MARKETING
17. Marketecture Myths Abound
• “IT isn’t about building and managing
infrastructure anymore, it’s about
managing a few trusted vendors…”
• “Smarter systems reduce management
burden (and its associated labor costs)
and fit the needs of all businesses well…”
• “Resource utilization efficiency is best
achieved through server virtualization…”
• “Why deal with problems? Source your
services from a cloud…”
18. And They Resonate Because…
• Leading vendors tend to sell to the
front office, not to the back office
(where IT lives)…
– Eschewing “tech speak” and framing pitch
in terms of business problem solving
– Purchasing mind share with PPV analyst
reports
– Casting aspersions on internal IT
competency
– Plus, the usual schmoozing…
20. Latest Challenges to Purpose-Built
• Return to the “mainframe-
centric” model of the universe
• Rise of the smart storage array
• Hypervisors as generic
operating systems
• Standards-free “Cloud
Computing” paradigms
21. Mainframes and Mini-Me’s
• Mainframes enjoying long-deserved
renaissance
– Workload shifting to mainframes like never
before
– Most companies deferring or reversing MF
decommission plans
• Prompting the appearance of mainframe
mini-me’s
– Dedicated proprietary hardware stovepipes
– De facto standards
– Proprietary software protocols for
interconnecting bus
22. Fundamental Question:
What is the Proper Order of the IT Universe?
• Mainframe centric
• Network centric
• Storage centric
Data centric
23. So, Is “Purpose-Built” a Matter of Interpretation?
HP claims that its Oracle platform is purpose-built: to host an
Oracle database (Q: Are all Oracle DBs the same?)
EMC claims that NetApp makes a mistake by building one-size-fits-
all storage platforms, which they have learned do not meet the
needs of every application: hence, they offer at least four storage
product lines, calling this a purpose-built strategy (Note: most
value-add functions require homogenous EMC infrastructure)
Xiotech has a Lego™ building block storage array that works with
any controller and lacks embedded software functionality found in
“value-add” arrays:
Cobble together various ISE bricks and use software components
from third party vendors in their ecosystem to cobble together exactly
the capacity + software functionality set you need
Commonly manageably using open W3C Web Services standards…
24. Storage is Key to Purpose-Built Design
• 33 to 70 cents of every dollar spent
on IT hardware today
• 300% capacity growth expected
between 2007 and 2011
• Biggest power pig in the data center
• Where data goes to sleep…
25. Data Re-Reference Rates Already Baked Into
Some Product Categories
ProbabilityofReuse
100
0
“Primary Storage”
“Secondary Storage”
Accesses to Data
0 days 30+ days 90+ 1 yr ∞Time
milliseconds seconds minutes hours daysAccess
Speed
“Retention Storage”“Tertiary Storage”
26. Embedded Value-Add Software: A Good Idea?
• The array controller is getting bloated
– Certain functions need to be performed close
to the spindles themselves; most don’t
– Dynamic established in market: must add new
“value” to controllers every six months or you
look like you are standing still
– Vendors argue pre-integration value; stifle open
discussion of performance deficits
• Driving up cost
– You pay for all functionality, whether you use it
or not
– Management is obfuscated: need to hire more
monks when you buy a new array
27. One Practical Consequence Worth Noting
1980 2010
Sub-MB
TB+
CAPACITY
(ArealDensity=Gbperin2)
COSTPERGB
HIGH
LOW
ARRAY COSTS ACCELERATING
Source: “Avoiding the Storage Crunch,”
Jon Toigo, Scientific American
28. Placed in a SAN, Now You’re
Talking Real Money
• SANs aren’t networks: ENSA has never
been brought to market
• SAN switch ports remain pricey,
underutilized (sub 15% in most cases),
accelerating price for primary storage to as
high as $150/GB, secondary storage to
$110/GB, tertiary to $50-$80/GB
• At those prices, and anticipated capacity
growth…
29. More Bad News
• Server virtualization isn’t the
big fix everyone is hoping for…
• Nothing wrong in concept:
mainframe LPARs and PRISM
around for a couple of decades
• But x86 extent code for multi-
tenancy not fully ripened
• And hypervisors confront a two-
front war…
30. A War on Two Fronts…
• Good hypervisors can do a great
job with x86 extent code…
• The challenge to stability…
– From above: applications
making “illegal” resource calls
– From below: value-add
functionality proffered by storage
vendors (thin provisioning, for
one) can compromise expected
resource request fulfillment
App software vendors
don’t write code to
support VM hosting…
Many storage vendors are
“adding value” by obfuscating
actual capacity…
31. Latest Elephant in the Room
• Storage vendors cozying up to server virtualization vendors
using storage resource utilization enhancement speak:
– “3PAR has worked with VMware on its new adaptive queue depth
algorithm… dynamically adjusts the LUN queue depth in the VMkernel
I/O stack to minimize the impact of I/O congestion.”
– But, “Queue depth is just one element which regulates what traffic is
allowed onto the "SAN Freeway" but the entire SYSTEM is what needs to
be balanced AND MEASURED… NetWisdom is like the ‘Eye in the Sky’
that sees everything. We collect 65 metrics. Latency is very important.”
• Two effects:
– More hype about storage management advantages of x86
virtualization, stalling work on real solutions
– Multiple proprietary attachment and I/O load
balancing/routing APIs inspiring abandonment of networked
storage vision
32. Is Server Virtualization Really Ready to
be the Cloud OS?
• This week’s trend or a valid model?
– Service bureau computing begat mainframe
multi-tenancy begat ASPs begat Cloud storage…
– Electronic vaults begat SSPs…
• Questions abound:
– Can cloud providers do storage more
efficiently than you can?
– Can cloud provider SLAs be trusted?
– Is data secure in a cloud?
– Do clouds present governance issues?
– Why is man born only to suffer and die?
34. Realizing a Purpose-Built Infrastructure Requires…
• “Deconstructionalist” thinking
– Willingness to let go of stovepipe vendors
– Willingness to cobble together required services
from best of breed suppliers rather than one-stop-
shop providers
– Desire to achieve best efficiency at lowest cost
• Embrace of common, standards-based
approach to managing of infrastructure
• Understanding of data hosting requirements
and intelligent routing of I/O across
infrastructure (we will get to this one)
35. A Quick Historical Perspective
• Tendency to view mainframe storage
management as inherently superior to
distributed systems SRM
• In the late 1970s, IBM blessed the world
with SMS and HSM, bringing order to
the mainframe storage universe
– Mainframe storage allocation efficiency
is 4x that of distributed storage
allocation efficiency
– Mainframe storage utilization efficiency
is 3-4x that of distributed storage
Mainframe Storage CAE: ~80%*
Distributed Storage CAE: ~20%
Mainframe Storage CUE: ~70%**
Distributed Storage CUE: ~17-20%***
*Assumes SMS
**Assumes HSM
***Depends on Server OS
36. IBM’s Contribution Was a Blessing
• But it was also limited to IBM
architecture…
– IBM Cosmology: Mainframe at the center of
the IT universe
• Leveraged de facto standards and well-
defined architectural model
• Homogeneity: all storage must conform to
IBM rules
• Sets up a rigid workflow model: data goes
from system memory to DASD to tape (or
virtual tape) over time
– Actual goal: sell more DASD, while
maintaining sparse labor force
37. HSM Added Value:
Classes for Policy-Based Data Movement
STORAGE
ADMINISTRATOR
END USERS APPLICATIONS
STORAGE GROUP
38. Good Enough Then, but Now?
• Mainframe world in transition
– More, varied workloads
– Different data management needs
– Consolidation of servers into LPARs
• However, SMS/HSM have not fully
transitioned into the z-Series world
– Still part of z/OS, but not part of z/Linux
– HSM does not extend to z/Linux environments
39. Open Systems Corollaries
• “Trust your vendors to solve your management
problems.”
• Smarter arrays do it all
– Capacity management through embedded tiering
algorithms
– Capacity management through embedded thin
provisioning algorithms
– Capacity management through embedded
compression and block de-duplication algorithms
• Hypervisors on the server allocate storage
resources to apps-in-VMs elegantly
– The rise of mainframe wannabes
– The rise of cloud storage
40. “See that green light? I know I’ve got a
problem if it turns red.”
• Primed to see things this way from early on…
– Common view in small shops with only one storage platform from
one vendor
– Indicator lights, self-articulating web pages with status and
configuration data, or a utility CD provides a solid “storage
management” story
– Meets the standard requirements of storage management
Provides configuration control
Monitors device status and capacity
Reports when problems occur so they can be addressed
Easy to use, ready out of the box, no muss, no fuss
41. But Element Managers Become a Problem
When…
• Volume of data grows…
• The number of storage products increases…
• The storage becomes heterogeneous…
– Products deployed from different vendors
– Different products deployed from the same vendor
• Storage becomes “smart”…
– Vendors build “spoofs” into gear to improve performance
(example: NVRAM caching)
– Array controller-embedded “value-add” software obfuscates
real monitoring of capacity
42. But On-Array Thin Provisioning will Take Care of All That
• A storage capacity “shell game”
– Leverage “reserved-but-not-allocated”
space to over-allocate disk capacity
– No standards, only as effective as disk
space demand forecasting algorithm
– Can defer need to buy more storage
– But data at risk of a “margin call”
• Not a replacement for storage
management…
Thin is In, or Is It?
Survey of 249 Companies
• 79% expect TP to deliver improved capacity
allocation efficiency
• 49% expect easier provisioning with less
disruption
Conversely,
• 44% concerned that TP will result in greater
risk of running out of storage
• 43% worry that TP adds complexity
• 42% worry about a lack of capacity
management tools
43. Okay. So, Compression and De-Dupe:
That’ll Fix It.
• Quick definition: reduction in the
number of bits to represent data;
can also mean:
– Removing/stubbing bit patterns
– Storing only change bytes or bits in
versioning system
– Elimination of file duplicates
• Ask a vendor about the differences
and make some coffee…
• Purported Management value:
– Squeeze more data onto media
– Defer need to buy more capacity
• Issues:
– Your mileage may vary
– Possible compliance issues
– More equipment to power
• Over time, even a trash compactor
fills a landfill…
• Replacement for storage
management?
The new Marlboro 72s may be
smaller than old Reds, and less
expensive to buy, but they will
eventually kill us just the same…
44. Ahh, But Server Virtualization is Coming to the Rescue
• Leveraging x86 chip code extents to stack
virtual machines in a single server chassis…
– To consolidate assets
– To improve resource allocation efficiency
(abysmally poor in distributed systems)
– To simplify storage management and automate
data movements (hypervisor controls storage)
• Hmm. We talked about this already at
some length, but…
45. The Simple Truth
• Good stuff: x86 Hype has focused
attention on need for storage
management from a systemic
viewpoint
• Problems
• Industry resists common virtualization
standards: x86 chip extensions alone are not
enough
• Even if Hypervisor is “well behaved,” not
necessarily the case with application
software: performance issues result
• Storage vendors keep throwing curves,
destabilizing VM stack from below
46. Plus, the Problem of “Dark Storage”
• Early on, storage associated with VMs was double-
counted…since resolved by leading SRM tool makers
• Now, we have the problem of “dark storage”
– A TB allocated to server admin
– Server admin applies file system to 500 GB, forgets about his
“reserved capacity”
– Storage admin believes 1TB allocated, Server admin believes
he has 500 GB – 500 GB dark and beyond detection
• Virtualization doesn’t simplify storage management, it
adds an abstraction layer that can confuse and obfuscate
capacity management and data management…
47. What About All of those
SAN Management Standards?
• Hardware and cabling
– FC Standards (T-11 ANSI) defined by and provide
wiggle room for vendors
– No guarantee of interoperability:
switch/HBA/controller firmware upgrades can
make the SAN disappear
• SMI-S from SNIA
– A bust: implemented on a fraction of hardware
platforms
– Better management data from APIs and SNMP
MIBs even when hardware is SMI-enabled
• Kind of feels like the storage industry’s heart isn’t
in it…
48. Lack of Strategic Management in Distributed
Systems is Lagging
• Wild West World of Storage Standards combined
with Vendor Proprietary Value-Add designs in a
Heterogeneous Environment equals…
– Lack of visibility and huge resource waste
– No means to anticipate problems or requirements
– No effective means to police vendor claims
– Increase in storage-related downtime
– More time spent fire-fighting tactical issues
• Issues creep up over time: Frog Boiling in a Pot
49. Contributing Significantly to TCO
Backup and
Recovery
Hardware
Management
Scheduled
Downtime
Administration
Environmental
Hardware
Acquisition
30%
3%
20%
13%
~14%
20%
COST
TO
MANAGE
COST
TO
ACQUIRE
Administration
(labor) and other
“soft costs”
equal to 4x
acquisition costs
annually.
Acquisition cost
is only 20% of
TCO
Source: Gartner
50. Bottom Line
• Purpose-building infrastructure to realign IT
with business will not be easy
– May need to abandon “comfort zones” with
“love brands”
– May need to revisit build vs buy decisions
– Decide on management paradigm first, and
promote management via preferred approach to
a primary selection criteria
– No forklift upgrades: phase in as part of normal
hardware refresh
– Work with an integrator who isn’t just an order
taker for brand name vendors
• It starts with a strategic vision and some
business-savvy…
51. My Preferred Paradigm: Business-Centric,
Standards-Based and Data-Focused
• Data focused? Effective infrastructure management is
required to get to the Holy Grail -- Data Lifecycle
Management -- across purpose-built storage
infrastructure
– For real D/ILM to happen, storage classes need to be
defined and operational status monitored continuously
– Selecting or building storage using heterogeneous
components and fielding services on networked
platforms (aka appliances, routers, gateways, etc.)
makes management a must-have
• W3C Web Services-based management support by
gear vendors would make for an easily deployed and
integrated enterprise management solution (covering
servers, networks and storage)
SOAP/XML could provide the
“universal glue” required to auto-
provision and auto-protect
applications with infrastructure-based
resources and services.
Already supported by a broad range
of applications, OS software, network
hardware…and now storage.
53. Intelligent Data Management is
No Longer Optional
• Data Management: “The undiscovered country”
• Absolutely essential to
– Align business processes with IT investments and to
demonstrate business value
– Achieve sustainability in service-oriented strategy
– Realize capacity utilization efficiency and contain costs
– Comply with regulatory mandates
– Protect data effectively
– Green IT operations
54. In the Absence of Data Management
• Storage is a junk drawer – as much as 70%
of capacity is wasted
• After normalizing over 10,000 storage
assessments, we find
– 30% of the data occupying spindles is
useful
– 40% is inert (needed for retention, but
never accessed)
– 15% of capacity is allocated but unused
(often “dark storage”)
– 10% is orphan data whose owner/server
no longer exist
– 5% (low estimate) is contraband
Source: Randy Chalfant, CTO, Sun Microsystems,
from Making IT Matter (Chalfant/Toigo, 2008)
55. Sorting the Storage Junk Drawer
• Conceptually, a straightforward
undertaking
1. Classify your data
2. Create policies for data handling
by class
3. Instrument your infrastructure for
policy-based data movement
4. Move your data around per policy
• Then, as in writing, just open a vein
and bleed on the page…
56. Classification requires Information Collection and Analysis
• Technically speaking, you need to perform a
business process deconstruction analysis
– Start with a few key business processes: senior
management will probably be able to point you to the
right ones
– Deconstruct processes into tasks and tasks into
workflows, then identify data associated with a workflow
– Identify data handling requirements associated with each
workflow’s data (retention requirements, criticality,
disposal requirements, encryption requirements, etc.)
– Record everything (a spreadsheet works well)
• (This is a wildly profitable service offering for
solution providers…)
BUSINESS PROCESS 1
TASK 1.1
TASK 1.2
TASK 1.n
WORKFLOW 1.1.1
WORKFLOW 1.1.2
WORKFLOW 1.1.n
DATA
DATA
DATA
DATA
57. Classification Simplified
Data Name
Workflow ID
UpdateFrequency
AccessFrequency
Staleby
SpecialRetention
DeletionRequirement
ReferenceData?Criticality
Security
OtherRequirement
OtherRequirement
Data Name
Workflow ID
UpdateFrequency
AccessFrequency
Staleby
SpecialRetention
DeletionRequirement
ReferenceData?Criticality
Security
OtherRequirement
OtherRequirement
Just sort the spreadsheet
and basic data classes
separate out like a parfait…
58. Shampoo, Rinse, Repeat
• Often heard complaints…
– “What if there are a lot of tasks, workflows
and data inputs and outputs?”
– “I have thousands of business processes: you
really think we can do this in a timely way?”
– “Jon, this is your classic waterfall approach,
which has already come under criticism in
fields like application development. Isn’t it
easier just to backup/mirror/encrypt
everything – all on disk drives?“
– Why is man born only to suffer and die?
• OMG, you folks complain a lot…
59. But I Feel Your Pain…
• Truth be told, there aren’t a lot of short cuts
• Hazards of NOT classifying data by business
process-related criteria include
– Application of inappropriate or overly expensive
data hosting and protection strategies to relatively
unimportant or low priority data assets
– Conversely, the failure to include mission critical
data assets in protection schemes
– Lack of predictable time-to-data outcomes from
ANY data protection scheme
– None of the extra business value that should derive
from data analysis
• Any way you cut it, the results can’t be good
60. A Few Tools that Can Help
• Tek-Tools Storage Profiler
– Lightweight deployment
– Good storage assessment reporting for identifying
dupes and dreck
• Digital Reef
– First “deep blue math” e-discovery algorithm that
works
– Aimed at information governance and storage
capacity reclamation
• NSM (Novell Storage Manager)
– Classify data by user profile
– Microsoft Active Directory support improving
– Robust reporting on the way
61. Just a Gol’ Dern Second…
• You are talking about an “SRM” tool (Tek-
Tools), an information governance tool (Digital
Reef) and a file lifecycle management tool
(Novell): what has that got to do with a C-4
Strategy?
• Each tool provides the means to
– Locate data assets on infrastructure
– Cull out the junk data and facilitate the
introduction of intelligent data management to
develop policy-based schemes for classifying
data for data protection service provisioning
– Simplify manual classification activity,
especially with respect to user files, which are
the largest component of overall data stored
by most companies today
62. Data Modeling in Action
Outage Costs
Detailed Summary
Regulatory Reqs:
(e.g., Privacy, Preservation,
Protection,
Discovery)
Criticality
Interdependencies
Interdependencies
Volume, Volatility,
Access Frequency
Hard and Soft Costs
Hard and Soft Costs
Hard and Soft Costs
Hard and Soft Costs
64. My Vision of Data Management
Convergence…
Data
Bus
Value-Add Software Functions
Hardware Resources
Status
Monitoring
I/O Routes
Provisioning
Policy Engine
65. Providing the Means to Realize C-4 Nirvana:
Automated Data Service Provisioning
Services
Requested for
Data Handling
Services
Status
Inventoried
“Route”
Established for
Services
Requested
Services Applied
to Data
Provisioning and
Protection
Policies Applied
App Seeks to
Write Data
66. Already Underway
• Toigo Partners International and the Data
Management Institute will shortly launch a C-4
Community Portal to aid in W3C Web Services
storage reference architecture design,
development and education: You Are Invited!
• Working groups will span technical and
nontechnical subjects and will combine
consumers, vendors and reseller/integrator
perspectives…
• Leading to the establishment of a C-4
Microfactory – a “farmer’s market” web site
where Web Services-enabled hardware and
software products can be discovered and
sourced for integrated solutions…
67. Join the Community Effort
Questions?
Thanks!jtoigo@toigopartners.com
www.c4project.org