1. Business opportunity or problem:
Nature of engagement:
China Guangfa Bank (CGB) was founded in 1988. Over more than 20 years of development,
CGB has grown from a regional bank into a national joint-stock commercial bank; by
the end of December 2010, its total assets amounted to RMB 814.7 billion.
CGB is one of IBM’s biggest FSS clients for Power servers. For storage
infrastructure, however, CGB is a loyal EMC client: it has purchased only EMC storage
for its core business systems, and IBM high-end storage is used only in non-core
business systems, such as the OA system.
Recently, CGB set out to build a new intermediate business platform to support
online banking, 24-hour phone banking, etc., and planned to build a
2-sites-3-datacenters DR structure. In this project, IBM had an opportunity to present
a total solution to the client. It was a good chance to win back the client.
Scope and complexity:
1. I drew a diagram of the banking IT structure and used it to discuss the project
scope and business requirements with the client.
2. Based on CGB’s requirements for server and storage architecture, I led the team in
setting IBM’s winning strategy and tactics. After several rounds of communication with
the client, they finally accepted my points. IBM winning points were highlighted in red:
CGB focused on the storage 2-sites-3-datacenters DR function for the intermediate
business platform, so IBM proposed DS8000 MGM (Metro/Global Mirror). This solution
differed little from EMC VMAX SRDF/Star; moreover, the VMAX 10K was priced lower
than the DS8000.
We needed a solution with unique value to win back the client. TPC-R can monitor a DR
environment, especially a 2-sites-3-datacenters DR infrastructure, and control
failover/failback from a GUI; EMC had no similar solution at that time. The client
could also use TPC to monitor DS8000 performance. I therefore emphasized ease of
management as a key point, so that CGB would accept the advantage of IBM’s total DR
solution.
For the performance requirement, EMC offered the client 4 SSDs. For IBM, configuring
16 SSDs would make the price a problem, so my strategy was to convince the client
not to use SSDs in the first stage.
Relationship and communication:
I was the storage technical leader in this project. I discussed the solution with
CGB’s IT management team and procurement team, and led them to understand which
technical points they should focus on and what the strengths of the IBM storage
solution were. After every meeting, I summarized the main points and emailed them to
the client; I wrote concise documents and explained the details, so they could use
these documents to report to their managers.
The good relationship helped me learn of every enquiry, issue, and problem that the
client was facing.
3. Solution
My role in the project
I was the solution leader in this case. I led the team in understanding the client’s
requirements, setting the strategy to beat the competitor, and designing a solution to
show IBM’s value. Besides whiteboard discussions and presentations, I wrote the MGM
POC test plan and results report according to the client’s requirements.
Key decisions
1. Lead the team to convince the client to focus on the DR management tool - TPC.
As mentioned under scope above, TPC-R and TPC were IBM’s powerful weapons to beat
EMC on the DR solution. I took the following actions to get the client to focus on
the management tool:
1. Before launching the DS8000 MGM POC, I used the “TPC-R 2-SITES-3-DATACENTER
DEMO” video to impress the client.
2. I used TPC to extract performance data from CGB’s existing DS8100 and analyzed
the DS8100 performance for the client. It showed how easy it is to use TPC for
storage performance reporting and analysis on the DS8000.
The sample I used to coach the CGB storage management team was as follows:
a. Check the TPC reports “Write Response Time (normal)” and “Read Response Time
(normal)”. Response time is the most direct criterion reflecting storage performance
behavior; read/write response time should be less than 15ms, and less than 10ms is
better.
b. Check the TPC report “Read Cache Hit Percentage (normal)”; the random I/O hit ratio
should normally be 70%-80% for good performance.
c. To evaluate whether the storage cache is sufficient, check “Write-cache Delay I/O
Rate”. If it is not zero, write I/O has had to wait for the write cache to flush data
to disk and make room, which raises write response time. In the records where
response time was >15ms, the delay I/O rate was high. Increasing cache lowers this
index; adding disks also helps the cache flush to disk faster and reduces latency.
Through my effort, the client came to believe that using GUI tools to monitor the
whole IT environment and its performance data was very important; they really needed
the TPC solution to make their storage infrastructure management more efficient.
Based on my analysis and proposal, this DS8100 was eventually upgraded to 128GB
memory and 6*16 15K 300GB FC disks.
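The three checks in a, b, and c can be written down as a small rule set. The following is a minimal sketch only: the field names are hypothetical and do not correspond to actual TPC report columns, and the thresholds are simply the ones quoted in the coaching notes above.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the coaching notes; everything else is illustrative.
RESPONSE_LIMIT_MS = 15.0   # read/write response time should stay below this
RESPONSE_GOOD_MS = 10.0    # below this is better
CACHE_HIT_MIN_PCT = 70.0   # lower end of the 70%-80% range for random I/O


@dataclass
class Sample:
    """One interval of performance data (hypothetical field names)."""
    read_ms: float
    write_ms: float
    read_cache_hit_pct: float
    write_cache_delay_rate: float  # delayed write I/Os per second


def review(sample: Sample) -> List[str]:
    """Return the findings for one sample; an empty list means no concern."""
    findings = []
    worst = max(sample.read_ms, sample.write_ms)
    if worst >= RESPONSE_LIMIT_MS:
        findings.append("response time at or above 15ms")
    elif worst >= RESPONSE_GOOD_MS:
        findings.append("response time acceptable but not under 10ms")
    if sample.read_cache_hit_pct < CACHE_HIT_MIN_PCT:
        findings.append("read cache hit ratio below 70%")
    if sample.write_cache_delay_rate > 0:
        findings.append("write-cache delay is non-zero: consider more cache or disks")
    return findings


if __name__ == "__main__":
    for line in review(Sample(8.0, 18.0, 60.0, 2.5)):
        print(line)
```

Running `review` over each interval exported from a performance report would flag the same records that the manual check picks out, such as those with response time above 15ms and a non-zero delay rate.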
2. Simplify the DS8000 MGM POC
Our client wanted to know the details of MGM and requested that IBM demonstrate all
the MGM scenarios. That would make the demonstration more complex and take more
effort, and more than 10 scenarios might confuse the client.
I led a short meeting with sales and ATS on this issue, and decided to simplify the
MGM POC and persuade the client to accept a POC of only the typical scenario - "A
crashes, B becomes primary storage; then B crashes, C becomes primary storage; A is
repaired, fails back, and the topology returns to A MM to B, B GM to C".
Referring to the DS8000 Copy Services Redbook, I wrote down all the scenarios and
explained to the client why a POC of the typical scenario was enough.
3 base scenarios: X->B->C, A->X->B, A->B->X; X means the storage crashed. The other
7 scenarios reduce to one of the 3 base scenarios under some conditions.
Through my careful reasoning, the client understood the MGM scenarios clearly and
agreed to our MGM POC plan. Also, I shared both the 15-scenarios table and the POC
report with my teammates for reuse in future projects.
3. Lead the response to the attack from EMC VMAX SSD
The intermediate business platform required high-performance storage. EMC offered the
client 4 SSDs on VMAX, which put pressure on IBM to provide SSDs on the DS8700. We
would have to configure at least 16 SSDs according to DS8000 best practice, and at
least 8 SSDs in e-config, so the price would be a big problem in beating EMC VMAX.
I analyzed the client’s performance requirements and led the sales team to convince
the client not to use SSDs:
a. I explained to the client that SSDs are usually used in read-sensitive
environments. Given CGB’s business, the whole intermediate system would be accessed
frequently and is both read- and write-sensitive. A few SSDs could indeed improve
performance for some of the data, but could not improve performance for the whole
system.
b. I used DiskMagic to demonstrate that a DS8700 with FC disks could lower TCO in this
project:
1. DS8700 configuration: 8 SSDs in RAID5, 256GB cache, 2 DAs
Sizing data profile: read: 70% 4KB; write 30% 4KB
Result: utilization of the 2 DAs 98.3%; maximum 32000 IOPS; 0.2ms response time
2. Under the same data profile and response time, a DS8700 with 80 15K FC disks
reached a maximum of 39000 IOPS, about 22% higher than the DS8700 with 8 SSDs. DA
utilization was 52.6%, lower than the 98.3% with SSDs. Choosing 15K FC would lower
TCO.
3. EMC VMAX configuration: 80 15K 300GB FC, 256GB cache - the same as the DS8700,
with the same data profile.
Result: VMAX could not sustain 39000 IOPS; its maximum was 16500 IOPS - about 58%
fewer IOPS than the DS8700 under the same configuration.
That meant VMAX was not as powerful as the DS8700, which indicated why EMC had to
propose SSDs in this project.
c. I found an article from IBM Lab that supported my analysis and showed it to the
client.
Through my in-depth analysis and good negotiation skills, the client accepted IBM’s
explanation and believed that the DS8700 could provide high performance without SSDs.
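The percentage comparisons in item b can be rechecked from the maximum IOPS figures quoted there. This is a simple arithmetic sketch, not DiskMagic output; the constant and function names are illustrative. From the rounded IOPS figures it gives roughly 22% more IOPS for the FC configuration than for the SSD configuration, and roughly 58% fewer IOPS for VMAX than for the DS8700 with FC.

```python
def pct_change(new: float, base: float) -> float:
    """Percentage change of `new` relative to `base`."""
    return (new - base) / base * 100.0


# Maximum IOPS figures as quoted from the DiskMagic runs in item b.
DS8700_SSD_IOPS = 32000   # DS8700, 8 SSDs in RAID5
DS8700_FC_IOPS = 39000    # DS8700, 80 x 15K FC
VMAX_FC_IOPS = 16500      # VMAX, 80 x 15K FC

fc_vs_ssd = pct_change(DS8700_FC_IOPS, DS8700_SSD_IOPS)
vmax_vs_ds8700 = pct_change(VMAX_FC_IOPS, DS8700_FC_IOPS)

print(f"DS8700 FC vs DS8700 SSD: {fc_vs_ssd:+.1f}%")
print(f"VMAX FC vs DS8700 FC:    {vmax_vs_ds8700:+.1f}%")
```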
Proposed solution
Structure:
After several rounds of internal discussion and client communication, I proposed 2
DS8700s: transaction layer - 128GB cache, 15*16 300GB 15K FC; front-end
layer - 64GB cache, 7*600GB 15K FC:
Performance estimation:
Based on CGB’s storage performance requirements, the intermediate business platform
needed to sustain >100 transactions per second; IOPS should be 40000 for the
transaction layer and 20000 for the front-end layer. I used DiskMagic to help the
client complete the DS8700 performance sizing.
For example, the transaction layer:
Disk utilization under 70% at 40000 IOPS;
Maximum 60000 IOPS.
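The headroom implied by the transaction-layer figures can be worked out directly. This is a rough sketch using only the two IOPS numbers quoted above; the ratio it computes is required load over estimated maximum, which is not the same metric as the disk utilization reported by DiskMagic, and the function name is illustrative.

```python
def headroom(required_iops: float, max_iops: float):
    """Return (load fraction, spare IOPS, growth allowed over the requirement)."""
    if required_iops <= 0 or max_iops <= 0:
        raise ValueError("IOPS values must be positive")
    load = required_iops / max_iops          # share of the estimated maximum in use
    spare = max_iops - required_iops         # IOPS left before the maximum
    growth = spare / required_iops           # growth the requirement can absorb
    return load, spare, growth


# Transaction layer: 40000 IOPS required, 60000 IOPS estimated maximum.
load, spare, growth = headroom(40000, 60000)
print(f"load {load:.0%}, spare {spare} IOPS, growth room {growth:.0%}")
```

With these figures the required load is about two thirds of the estimated maximum, leaving 20000 IOPS, or 50% growth over the stated requirement.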
Results
1) This was the first time that CGB used IBM storage in a core business system.
- The client purchased 2 IBM DS8700 storage systems.
- The client believed the DS8000 2-sites-3-datacenters solution was easy to manage.
- The client clearly understood the DS8000 hardware structure, functions, and
performance, and believed IBM provided a valuable solution for them.
- They upgraded the existing DS8100 with 6*12 300GB 15K FC, according to my DS8100
performance analysis.
2) I built a good relationship with the client through this project.
In this project, I handled the competitor’s challenges professionally through
in-depth analysis, solution design, and good communication skills. We built a good
relationship, and the client works more closely with us now.
3) Helped IBM gain more storage opportunities in CGB
After this case, IBM gained more opportunities in CGB, which is good for storage
sales. We are now working on 2 projects, a development environment backup project and
a production AS400 virtual tape library project, and are engaged in a virtualization
project.
4) I wrote documents - DS8700 performance advantages, the MGM POC result, and the 15
scenarios - that summarized DS8700 selling points, and shared them with the Southern
China storage TSS team. After this case, I also shared my experience of how to beat
EMC VMAX with the sales and TSS teams at the XiaMen early-bird training.
Lessons learned
1) In this project, I researched websites and read articles to understand what a
banking intermediate business platform is. I also consulted the FSS CITA to learn
about the concept of a data profile and anything special about an intermediate
business platform.
2) I learned the DS8000 cache algorithms from this project. Before this case, I
didn’t know what the DS8700’s selling points were; I thought the DS8700’s performance
was advanced only because it used POWER as its controller. After this case, having
read the DS8000 performance POC results, studied the details of the cache algorithms,
and used DiskMagic to estimate the DS8700’s maximum IOPS, I clearly understand that
the DS8700’s high performance comes from its being a precision instrument: a
well-designed structure with carefully designed algorithms. It is totally wrong to
think that the DS8000 is the same as mid-range storage just because it has only 2
storage controllers.
3) I improved my performance analysis skills through using TPC.