Flash storage solves the problem of application I/O wait time; Pepperl+Fuchs consolidates applications and databases on a single flash storage platform
Violin Memory DOAG (German Oracle User Group) Nov 2012
1. What is the real cost of User IO Wait?
Rob Bloemendal, Director Consultancy Engineering EMEA
November 20th, 2012
Violin Memory Inc. Proprietary 1
2. Agenda
Violin Memory company overview
Why did Pepperl+Fuchs choose Violin Memory
The real cost of application user I/O wait – Flash Storage
Other real-world use cases of application I/O wait
Q&A
3. The Violin Memory Company
• Founded 2005, first flash product in 2009
• 200+ top tier customers
• Toshiba investor / Strategic Supply & Roadmap Agreement
• Global Presence
– WW HQ in Silicon Valley USA, Offices in North
America, EMEA and APAC
– 450+ employees
• August 2011 - awarded Silicon Valley "Company of the Year"
– Previous winners: Google, Salesforce.com, YouTube, Twitter, MySQL
4. Run your business faster & more efficiently with Violin
• Accelerate your business critical applications
– 5x – 10x improvement in performance and latency
– 2x – 10x more users for the same infrastructure
– Survive business spikes without disruptions
• Lower your infrastructure and operational costs
– Physical consolidation of 10:1
– Software licensing reduction up to 50%
– Sub 12-month ROI
6. Pepperl+Fuchs
Applications: Siebel CRM & Witron SCM
Challenge: Performance (Traditional Storage Systems)
Requirement:
‒ Application Acceleration
‒ Consolidating applications and databases on one storage platform
‒ VMware virtualisation without impacting the performance of other applications
Strategy:
‒ Being able to do more without falling into the same trap
‒ Avoid scalability and performance issues over the next couple of years
7. Pepperl+Fuchs options:
Exadata:
‒ Pricing very high
‒ Alternative required
Violin Memory:
‒ Chosen alternative
Approach:
‒ Proof of Concept with both products
‒ Rule: no database or application tuning
‒ Result: Violin Memory best on price versus performance ratio
10. What if you could have so much resource…
It was more than you could actually use?
It was no longer constrained?
What if I/O were free?
11. Tuning Techniques Used
Time Consuming, Expensive, Short Term
Application I/O wait times are still high
12. Disk Storage – Constant Constraint
Underneath, it’s still a disk…
• To avoid disk performance issues, mitigation strategies must be used, such as:
– Short Stroking
– RAID (100s of options)
– Over Provisioning
– Quality of Service policies
– Resource Management
– Storage Tiering (Automatic!)
– Complex Buffering and Caching Algorithms
13. Flash Application Goals
[Diagram: Less Cost, Less Risk, More Agility]
• Increase Agility
– Real Time Data
– React faster
– Increase productivity
• Reduce Cost
– Do more with existing infrastructure
– Lower operational costs
– Reduce software licensing costs
• Reduce Risk
– Exceed SLAs
– Meet exceptional demands
15. The Technology Language Barrier
In the I.T. industry each area of specialism has its own terminology
‒ Consider a simple performance issue:
Application: User says “system is slow”
Database: Database Administrator says “wait time is high”
Server: OS Admin says system has “high IOWAIT”
Storage: Storage Admin says “latency is high”
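To make the OS administrator's view concrete, here is a minimal sketch of how an "IOWAIT" percentage can be derived. It assumes a Linux server, where the aggregate `cpu` line of `/proc/stat` lists cumulative ticks in the order user, nice, system, idle, iowait, irq, softirq, steal. The two sample lines are invented for illustration and the function name is hypothetical.

```python
def iowait_percent(before: str, after: str) -> float:
    """Share of CPU time spent in iowait between two 'cpu' lines from /proc/stat."""
    # Keep the first eight counters (user .. steal); the guest counters are
    # already included in user/nice, so adding them would double count.
    b = [int(x) for x in before.split()[1:9]]
    a = [int(x) for x in after.split()[1:9]]
    deltas = [later - earlier for earlier, later in zip(b, a)]
    total = sum(deltas)
    # iowait is the fifth counter (index 4).
    return 100.0 * deltas[4] / total if total else 0.0


# Invented samples taken a few seconds apart on a storage-bound server.
sample_before = "cpu 1000 0 500 8000 400 0 100 0 0 0"
sample_after = "cpu 1100 0 550 8300 900 0 150 0 0 0"

pct = iowait_percent(sample_before, sample_after)
print(f"iowait: {pct:.1f}% of CPU time")
```

On a real system the two lines would be read from `/proc/stat` a few seconds apart. Other operating systems expose the same idea through their own tools.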
16. Translating “Storage” Into “Application”
[Diagram labels: Application Acceleration, Application Scalability, Batch Throughput]
17. Latency – The Application Stealth Tax
18. Latency – The Application Stealth Tax
Calls to and from storage require the calling application to wait
The time spent waiting is the latency of the storage – it is lost time
[Timeline diagram with CPU and DISK rows: the CPU alternates BUSY, I/O WAIT, BUSY, I/O WAIT, BUSY]
19. Lost Application Time
This time is lost for every I/O call… for every user…
Busy applications perform millions of I/Os per hour
CPU: BUSY WAIT BUSY WAIT BUSY
Lost Application Time
DISK: I/O I/O
Sum of Latency
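The "sum of latency" idea can be put into numbers with a small sketch. It assumes I/O calls are issued one after another, so each call's latency is pure waiting. The call count and both latency figures are illustrative assumptions, not measurements.

```python
def lost_application_time_hours(io_calls: int, latency_seconds: float) -> float:
    """Total time spent waiting on storage, summed over every I/O call."""
    return io_calls * latency_seconds / 3600.0


IO_CALLS = 1_000_000          # hypothetical: one million I/O calls across all users
DISK_LATENCY_S = 0.007        # assumed 7 ms per call on spinning disk
FLASH_LATENCY_S = 0.0005      # assumed 500 microseconds per call on flash

disk_wait = lost_application_time_hours(IO_CALLS, DISK_LATENCY_S)
flash_wait = lost_application_time_hours(IO_CALLS, FLASH_LATENCY_S)

print(f"disk:  {disk_wait:.2f} hours of waiting")
print(f"flash: {flash_wait:.2f} hours of waiting")
```

The totals are summed across every user and session, so they can exceed the wall-clock time in which the calls were made.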
20. Flash Memory – Ultra Low Latency
With flash memory I/O calls are orders of magnitude faster
Time spent waiting for storage is dramatically reduced
[Timeline diagram with CPU and FLASH rows: the CPU alternates BUSY, WAIT, BUSY, WAIT, BUSY with narrow WAIT segments, one per flash I/O; label: Increased CPU Efficiency]
21. IOPS – The Application Ceiling
22. IOPS – The Application Ceiling
IOPS are needed for scaling applications
‒ Increasing users or volume of data, adding modules etc
At higher IOPS, response times increase
‒ Legacy disk systems have unpredictable performance
In application language:
1. Storage systems have an upper limit
2. The busier the system, the slower the response
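The two statements above can be illustrated with a textbook single-queue (M/M/1) model. This model is an assumption brought in for illustration; the slides do not use it and real arrays are more complex. The 5 ms service time is hypothetical.

```python
def mm1_response_time_ms(service_time_ms: float, offered_iops: float) -> float:
    """Mean response time of one M/M/1 queue at a given offered load."""
    max_iops = 1000.0 / service_time_ms        # 1. the upper limit of the device
    utilisation = offered_iops / max_iops
    if utilisation >= 1.0:
        raise ValueError("offered load reaches or exceeds the device's upper limit")
    # 2. the busier the system, the slower the response
    return service_time_ms / (1.0 - utilisation)


SERVICE_TIME_MS = 5.0  # hypothetical: 5 ms per I/O, so a ceiling of 200 IOPS

for iops in (50, 100, 150, 190):
    print(f"{iops:>3} IOPS -> {mm1_response_time_ms(SERVICE_TIME_MS, iops):.1f} ms")
```

Under these assumptions the response time doubles by half load and grows without bound as the load approaches the ceiling.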
23. Sustained Performance
Disk arrays behave unpredictably under load
Disk latency increases exponentially
Violin flash memory latency has a linear increase under load
[Chart: Latency (axis 0 to 14000) against IOPS (axis 0 to 70,000), with series for Flash and Disk]
Violin provides applications with predictable performance
24. Flash versus Disk
A typical 15k RPM SAS disk can service
around 200 IOPS
A Violin 6616 flash memory array can
service 1,000,000 IOPS
This is 5000x more in a single 3U unit
The equivalent disk array would fill a data
centre and require huge amounts of power
and cooling
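The 5000x figure follows directly from the two IOPS numbers on this slide. The sketch below reproduces the arithmetic. It ignores RAID overhead, caching and controller limits, which is a simplifying assumption.

```python
import math

DISK_IOPS = 200                 # typical 15k RPM SAS disk, per the slide
FLASH_ARRAY_IOPS = 1_000_000    # Violin 6616 array, per the slide

ratio = FLASH_ARRAY_IOPS / DISK_IOPS
disks_needed = math.ceil(FLASH_ARRAY_IOPS / DISK_IOPS)

print(f"one array delivers {ratio:.0f}x the IOPS of one disk")
print(f"disks needed to match it on raw IOPS: {disks_needed}")
```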
26. Application Throughput
Bandwidth describes the volume
of data that can be delivered to
and from a storage system
In the application world this is
critical for batch processing
Applications which regularly process large amounts of data require
bandwidth in order to complete processing within the batch window
As data volumes grow this becomes more difficult
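A rough sketch of the relationship between data volume, batch window and bandwidth follows. The data volumes, window length and sustained bandwidth are hypothetical, and the function names are illustrative only.

```python
def required_bandwidth_mb_s(data_gb: float, window_hours: float) -> float:
    """Sustained bandwidth needed to move data_gb within the batch window."""
    return data_gb * 1024 / (window_hours * 3600)


def fits_in_window(data_gb: float, window_hours: float, sustained_mb_s: float) -> bool:
    """True if the storage system's sustained bandwidth covers the requirement."""
    return required_bandwidth_mb_s(data_gb, window_hours) <= sustained_mb_s


WINDOW_HOURS = 8          # hypothetical overnight window
STORAGE_MB_S = 400        # hypothetical sustained bandwidth of the storage system

for data_gb in (10_000, 20_000):   # hypothetical volumes: today, and after growth
    need = required_bandwidth_mb_s(data_gb, WINDOW_HOURS)
    ok = fits_in_window(data_gb, WINDOW_HOURS, STORAGE_MB_S)
    print(f"{data_gb} GB needs {need:.0f} MB/s -> fits: {ok}")
```

Doubling the data volume doubles the required bandwidth for the same window, which is why growth makes the window harder to meet.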
27. Batch Windows
[Timeline: Friday, Saturday, Sunday, Monday; 8am Monday marks the start of the online day]
Windows are used to schedule batch runs for times when users will not
be affected, e.g. nights, weekends etc
Why Reduce Batch Windows?
‒ Overrunning into the next window has serious consequences
‒ Users could experience bad performance or the batch job may have to be cancelled
‒ Processing could fall behind, causing data to become stale
Risk: Protection from overrunning batch jobs
Agility: Process more data, run additional jobs
29. With Disk Storage - SAP BW
8 CPUs
Oracle 11.2.0.2.0
Non-ASM
HDS Disk Storage
20k SAP processes
SVC - IBM
VMWARE
Sync Replication
30. With Flash Storage - SAP BW
20 CPUs
Oracle 11.2.0.2.0
Non-ASM
40k SAP processes
SVC - IBM
VMWARE
Sync Replication
31. Before Violin - SAP SCM
Oracle 11.2.0.2
Non-ASM
2 x HP XP12K SAN
Avg. Random I/O 7ms
Avg. Direct Path Reads 22ms
32. Violin Final Test - SAP SCM
Oracle Version: 11.2.0.2
SAP BC/BC-MID/PP/QM/Specifics
Non-ASM
2 x VMA 3205 (mirrored)
• Avg. Random I/O latency 500µs
• Avg. Direct Path Read I/O latency 1ms
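The latency figures from the "Before Violin" and "Violin Final Test" slides can be compared with a short calculation. These are ratios of average storage latency only. How much faster the application itself runs depends on how much of its time was spent waiting on I/O.

```python
# Average latencies in milliseconds, taken from the two SAP SCM slides.
before_ms = {"random I/O": 7.0, "direct path reads": 22.0}
after_ms = {"random I/O": 0.5, "direct path reads": 1.0}   # 500 microseconds = 0.5 ms

speedup = {name: before_ms[name] / after_ms[name] for name in before_ms}

for name, factor in speedup.items():
    print(f"{name}: {before_ms[name]} ms -> {after_ms[name]} ms ({factor:.0f}x lower latency)")
```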
34. Application Journey – The way forward
Legacy (Constrained)
• Isolated Applications
• Complex and Slow
• Disparate Platforms
• Difficult To Manage
Accelerated (Unleashed)
• Simple To Deploy
• Increased Productivity
• Maximise ROI
• Exceed SLAs
Consolidated (Shared)
• Lower Operating Costs
• Reduced Complexity
• Higher Service Levels
• Reduced License Cost
Virtualised (Private Cloud)
• Highest Cost Savings
• Self-Service
• Increased Agility
• Reduced Complexity
35. What If…
• You could remove the I/O constraint from
your infrastructure?
• You could reduce the application wait time
by 5x – 10x?
• You could accelerate the performance of
your mission critical IT applications by
5x – 10x?
• You could reduce the Total Cost of
Ownership by 70%?
36. Customer Quote
Helmut Eckstein, Manager Global IT/SIS, Pepperl+Fuchs:
“This is a fast and stable system, we have been running it now for
more than a year and we haven’t had any disruption or any break.
The performance is still unbelievable”
Something that shows company growth (Employees, Capitalization, revenue???)
Toshiba relationship is critical to our competitive advantage.
Only Violin Memory and Apple Computer have guaranteed supply of NAND flash.
The roadmap agreement allows us to understand and work on the lowest levels of NAND technology, enabling vRAID and extreme performance.
Silicon Valley Company of the Year shows our potential.
HP relationship shows our credibility.
The most important question to ask here is: “How would this change your behaviour?”
If I/O stopped being a finite resource and effectively became infinite:
Think of all the things you could stop doing…
Think of all the things you could do now that you couldn’t do before…
Using disk means coping with poor performance. Coping strategies have to be put in place to try to mitigate this. Those strategies include Quality of Service (QoS) policies, Resource Management tools and the introduction of various complex buffering and caching algorithms in order to distribute the resource evenly. These mitigation strategies have their own cost, both in overhead and in management: for example, designing and maintaining a QoS policy for hundreds or thousands of consumers would require a large amount of administration.
RAID offers a massive set of permutations which need to be considered when provisioning disks: RAID level (0, 1, 2, 3, 4, 5, 6, 10), disk type (SAS, SATA, FC, SSD), disk size, stripe size etc. For each different file type these permutations must be considered, resulting in a lengthy design process which is complex and subjective.
DON’T LABEL THE APPLICATION GOALS HERE AS VIOLIN’S BECAUSE IT SOUNDS TOO MUCH LIKE SALES. TALK ABOUT THESE BEING APPLICATION GOALS IN GENERAL FOR FLASH STORAGE.
Violin Memory has three important goals within the application space:
Increase Agility: allowing you to do more with your applications, to react faster and be more dynamic. Business agility is about having the ability to react quickly and cost effectively to changes in the business environment.
Reduce Cost: flash memory frees the constraints of your existing infrastructure to allow more return on investment; as a storage solution it has lower operational costs (power, cooling, data centre footprint) and can allow for a reduction in CPU-based software licenses (e.g. Oracle Database).
Reduce Risk: by taking away the resource limits imposed by legacy disk storage systems, flash memory allows for greater protection against saturation points and the resulting consequences such as missed SLAs and system unavailability; the capability of Violin Memory flash memory arrays means that exceptional peaks in demand can be tolerated with negligible impact on performance.
For the Oracle user group presentations I am highlighting that the talk will cover the “Accelerate” part of the journey but explain that the journey leads to consolidate, virtualise and unleash.
Again, don’t talk about the application journey in the context of Violin because it is a sales push. Talk about the application journey in relation to flash. I have changed the name to The Flash Application Journey.
The Violin Application Journey begins with disparate applications which are being held back by legacy infrastructure issues such as I/O constraints. These applications are often running on complex systems which have many different exceptions in the way they are managed. Customers often view the virtualised, private-cloud style goal as a utopia which is beyond their reach. Violin allows that journey to take place without the need for massive investment in a complete new hardware stack.
In the first phase of acceleration, applications are unleashed from the constraints inherent in disk technology and allowed to perform at the true speed of flash memory. This allows users to get more results and process more data, whilst existing server infrastructure is given a new lease of life. Issues such as SLAs no longer remain a priority as the performance of reports and batch runs is improved many times over. As a result, the customer has room to manoeuvre and start planning the implementation of the next stage.
In the consolidation stage, customers begin to migrate systems on to a single, consolidated environment. This allows for great reductions in operating costs and in particular Oracle licenses. Due to the standardised nature of consolidation environments (i.e. every system looks the same and is managed in the same way), systems become more manageable and service levels increase as a result. Violin provides a key advantage here in allowing for a greater density, i.e. more systems to be migrated to each physical platform.
The final stage is virtualisation, which gives customers the ability to provide their end users with self-service options. Virtualisation brings a wealth of advantages: automatic provisioning of environments, cloning of environments for backups, test or development systems, and migration of virtual machines to limit the impact of planned maintenance. The performance impact of the virtualisation layer is minimised by the performance of Violin flash memory, which again allows the density of virtual to physical systems to be maximised.
The use of Violin flash memory allows the customer to unlock their existing infrastructure and then achieve the maximum cost savings from consolidating and virtualising.
In the IT industry there are many areas of specialist expertise. For example, in a typical application deployment there will be end users using an application… which runs on a database… which runs on a number of servers… which utilise data kept on a storage system. It is commonplace for each of these different entities to have differing sets of personnel administering them. It is also common for each of these different administration roles to use different terminology to describe events which impact many or all layers of this stack.
In the example, an application user complains that the system appears to be “slow” based on their perception of the application’s performance. A report of this type would usually work its way down the stack, being checked at each level. So in this instance the database administrator checks to see if there are performance issues on the database and reports that there is a high amount of “wait time” associated with I/O. This may then be checked by the Operating System Administrator who confirms that the server shows a high amount of “IOWAIT”. Finally this falls to the Storage Administrator who checks the storage system and reports that, due to the storage system being pushed to capacity, the “latency” of I/O requests is very high, causing the performance issues.
Although these terms have subtle differences at each level, they all correspond to the same simple issue: a saturated storage system causes the application users to complain that the system is slow.
Latency is important for highly transactional (OLTP) applications which require fast response times. Examples include call centre systems, CRM, trading, e-Business etc. where real-time data is critical and the millisecond latency of spinning disk has a direct negative impact on revenue.
IOPS (I/Os Per Second) are required for scaling applications and increasing the workload, which most commonly means one of three things: in the OLTP space, increasing the number of concurrent users; in the data warehouse space, increasing the parallelism of batch processes; or in the consolidation / virtualisation space, increasing the number of databases located on a single platform.
Bandwidth / Throughput is a critical requirement for data warehouse-type workloads where massive amounts of data need to be processed in order to aggregate and report, or identify trends. Increased throughput allows batch processes to complete in reduced amounts of time.
Customer increased their SAP processes to 40k before the CPU and still hit the same wall of 100% utilisation.
We asked them to increase their CPUs again to 20 and once they did that the testing produced low latency of 600 microseconds through VMware and SVC, giving them an 8x acceleration in the database.
• 8x performance improvement overall
• Batch process: 3 hours, down from 24
• Only 3 hours of lost application time, down from 81 hours
• Random I/O latency down to 600 microseconds from 9ms
• Sequential I/O latency down to 2ms from 14ms
• Direct Path Reads latency down to 1ms from 13ms
• Doubled SAP processes from 20k to 40k
PEA software application used for spare chain supply management using SAP BC/BC-MID/PP/QM/Specifics. Oracle 11.2.0.2.0 running on HP-UX B.11.31 U ia64 / HP BL890c i2 / 24 cores / 256 GB RAM