2. Agenda
"The Mission"
Topology overview
Get stressed!
Tuning details
System test environment
3. WAS System Test had a mission
• Build an "Internet Scale" Liberty Collective topology
− 10,000 collective members
− Stress the system management layer
− Stress applications running on Liberty
− Sustain the load for 7+ days
• Monitor & watch it go!
4. History: Collective Scale in System Test
[Chart: Target vs. Actual collective size, 0 to 12,000 members, across fix packs 8.5.5.0 through 8.5.5.4. Annotations: 3 controllers, then 5 controllers; + MBean stress; + application workload.]
* Initial test was larger than Full Profile by 2,000 servers
6. Internet Scale Collective Topology
• 5 IHS Servers
− 5 Virtual Machines
• 5 Collective Controllers
− 5 Virtual Machines
• 10,000 Collective Members
− 225 Collective Members per Virtual Machine
− 2,000 per Collective Controller
• 5 clusters
− 2,000 members each
• 1 application (PingServlet) per member
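The sizing above is self-consistent; a quick sketch of the arithmetic (the VM count is derived from the stated 225-per-VM density, it is not given on the slide):

```python
import math

# Topology figures from the slide: 10,000 members, 5 controllers,
# 225 members per VM. The VM count is derived, not stated.
TOTAL_MEMBERS = 10_000
CONTROLLERS = 5
MEMBERS_PER_VM = 225

members_per_controller = TOTAL_MEMBERS // CONTROLLERS
member_vms = math.ceil(TOTAL_MEMBERS / MEMBERS_PER_VM)

print(members_per_controller)  # 2000
print(member_vms)              # 45
```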
7. Internet Scale Collective Topology
[Diagram: a replica set of 5 Collective Controllers (CC), one per machine, manages the collective; 5 IHS servers route traffic to clustered applications hosted on Liberty profile app servers, with machine boundaries marked.]
8. Topology – IHS Servers
• WebSphere Application Server 8.5.5 IHS
• Hosted on VMWare ESX
• 4 CPU with 16 GB of RAM
• Red Hat 6.5 x64
• Hosting merged plugin-cfg.xml for 2,000 Liberty servers
• Tuning Parameters
− Standard application workload tuning
9. Topology – Collective Controller
• WebSphere Liberty Profile 8.5.5.4
• Hosted on VMWare ESX
• 6 CPU with 32 GB of RAM
• Red Hat 6.5 x64
• Features used in server.xml
<feature>jsp-2.2</feature>
<feature>collectiveController-1.0</feature>
<feature>restConnector-1.0</feature>
<feature>monitor-1.0</feature>
<feature>adminCenter-1.0</feature>
• Tuning Parameters
− OS: ulimit file handles increased
− Java: heap size increased
− WLP: thread pool increased
10. Topology – Liberty Collective Member
• WebSphere Liberty Profile 8.5.5.4
• Hosted on VMWare ESX
• 8 CPU with 64 GB of RAM
• Red Hat 6.5 x64
• Hosting one application
• Features used in server.xml
<feature>jsp-2.2</feature>
<feature>collectiveMember-1.0</feature>
<feature>clusterMember-1.0</feature>
<feature>restConnector-1.0</feature>
<feature>monitor-1.0</feature>
• Tuning Parameters
− OS: ulimit file handles increased
− WLP: TCP configuration (for application workload)
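The TCP change above is not spelled out on the slide; a hedged sketch of what such a server.xml tweak can look like (element names follow the Liberty configuration schema, but the attribute values here are illustrative, not the settings used in the test):

```xml
<!-- Illustrative only: raise connection limits on the member's HTTP
     endpoint for the application workload. Values are examples,
     not the values used in the 10,000-member run. -->
<httpEndpoint id="defaultHttpEndpoint" host="*"
              httpPort="9080" httpsPort="9443">
    <tcpOptions soReuseAddr="true" maxOpenConnections="20000"/>
</httpEndpoint>
```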
12. Management & Monitoring Workload
• Apply stress at the system management layer
• Invocation of Liberty MBeans through REST connector
• ThreadPool – Display Active Threads and Pool Size
• JVM Statistics – Display UsedMemory, FreeMemory, and Heap Size
• File Transfer Operation – Transfer files of various sizes from Collective Controller to Collective members
• Continuously over a period of 7 days
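The MBean calls above go through Liberty's JMX REST connector, where the JMX ObjectName must be percent-encoded into the URL path. A minimal sketch of the URL shape (endpoint path assumed from the restConnector-1.0 feature; verify against your release before relying on it):

```python
from urllib.parse import quote

def mbean_attribute_url(host, port, object_name, attribute):
    """Build the REST connector URL for reading one MBean attribute.

    The ObjectName is percent-encoded, including ':' and '=', because
    it is embedded in the URL path.
    """
    encoded = quote(object_name, safe="")
    return (f"https://{host}:{port}/IBMJMXConnectorREST/mbeans/"
            f"{encoded}/attributes/{attribute}")

# Hypothetical controller host name, for illustration only.
url = mbean_attribute_url("controller1", 9443,
                          "java.lang:type=Memory", "HeapMemoryUsage")
print(url)
```

An HTTPS GET against that URL (with administrator credentials) returns the attribute as JSON.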
13. Application Workload
• Lightweight application workload: PingServlet
• Other Persona scenarios cover application workload
• Continuously over a period of 7 days
16. IHS – configuration & tuning
• No changes required to handle large scale collective
• Collective size does not impact application workload
– No application workload on controllers
• Modified httpd.conf to accommodate general application stress (followed standard practices for application load)
− MaxClients increased to 1600 (up from 600)
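In Apache/IHS the MaxClients directive lives in the MPM section of httpd.conf and must stay consistent with the process and thread limits. A sketch for the worker MPM (the slide states only the 600 to 1600 change; the other values here are illustrative):

```apache
# Illustrative worker-MPM sizing for the application load described
# above; only MaxClients is taken from the slide, and it must equal
# ServerLimit x ThreadsPerChild.
<IfModule worker.c>
    ServerLimit        64
    ThreadsPerChild    25
    MaxClients       1600
    StartServers        4
</IfModule>
```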
17. Collective Controller – configuration & tuning
• server.xml
<!-- Increase the operation timeouts to 10m, up from 1m, for the long-running "gen cluster plugin config" operation -->
<serverCommands startServerTimeout="600" stopServerTimeout="600" />
<executor name="LargeThreadPool" id="default" coreThreads="150"
maxThreads="400" keepAlive="120s" stealPolicy="STRICT"
rejectedWorkPolicy="CALLER_RUNS" />
• jvm.options
-Xms512m
-Xmx12288m
-verbose:gc
-Xdump:heap
-Xverbosegclog:logs/verbosegc.log
• OS tuning
ulimit max files 20,000
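The ulimit change above can be checked and made persistent as follows (the user name is a placeholder; the 20,000 figure is from the slide):

```shell
# Check the current open-file limit before raising it; the controller
# needs roughly one descriptor per connected member plus overhead.
current=$(ulimit -n)
echo "current nofile limit: ${current}"

# Persistent form, in /etc/security/limits.conf (user name is a
# placeholder for whichever account runs the controller):
#   wasadmin  soft  nofile  20000
#   wasadmin  hard  nofile  20000
```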
18. Collective Member – configuration & tuning
• Use the collective configuration defaults
− heartbeat interval (1m) & controller read timeout (5m)
• server.xml
<!-- Dictated by your application -->
• jvm.options
-Xms128m
-Xmx256m
-verbose:gc
-Xdump:heap
-Xverbosegclog:logs/verbosegc.log
• OS tuning
ulimit max files 8192
19. Key configuration & tuning takeaways
• No WLP tuning configuration is required to handle a large-scale collective
• Controller requires JVM and OS tuning to accommodate the large data set
• Modify timeouts if using long running operations
• Collective size does not impact application workload
• Best practice: no application workload on controllers
• Tune your servers as you would normally
21. Unlike Rome, collectives can be built in a day
• In-house scripts built on standard Unix operations
• Time to build: 5 to 6 hours
• Jython scripting for MBean invocation
− Generating plugin-cfg.xml for 2,000 cluster members takes time; set the Jython script timeout appropriately (20 min for a 2,000-member cluster)
• Not using DevOps tools (yet)
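The timeout guidance above generalizes to any long-running management operation: poll for completion against a deadline rather than sleeping a fixed interval. A minimal sketch (the check function is a stand-in for your own status query, e.g. probing whether the generated plugin-cfg.xml exists):

```python
import time

def wait_until(check, timeout_s, poll_s=1.0):
    """Poll check() until it returns True or timeout_s elapses.

    Returns True on success, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(poll_s)
    return False

# For a 2,000-member cluster the slide suggests ~20 minutes, so a real
# caller would pass timeout_s=1200. Here we use a toy check that
# becomes true after ~50ms.
done_at = time.monotonic() + 0.05
ok = wait_until(lambda: time.monotonic() >= done_at,
                timeout_s=2, poll_s=0.01)
print(ok)  # True
```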
22. Many paths to the same result...
[Diagram: many ways to execute commands, all yielding the same result. Manual scripts, DevOps tools (UrbanCode Deploy, Chef, Puppet), the Admin Center, and Liberty commands are all paths to the same Liberty collective.]
23. Use in Continuous Persona (CP)
• WAS Liberty is executing continuous persona (system test)
− New initiative
• CP uses collectives to run system-level tests at a mixed runtime level
− Does not include 10k scale
• CP uses mixed-version runtimes in the collective for continuous update and test
− Uses A/B testing practices to ensure newer versions do not regress behaviour or function
24. A work in progress
• Member failover to another controller
− Jan '15 Beta: small-scale failover tested with a 600-member collective (125 members per controller)
− Feb '15 Beta: failover tested with a 5,000-member collective (1,000 members per controller)
− Failover does not impact application workload
• Multiple concurrent server joins can result in incomplete requests
• Member registration time increases as we approach very large scale
25. In Summary
• Minimal tuning required to get to large scale
• Controller JVM and OS tuning required to accommodate the large data set
• Collective size does not impact application workload
• Best practice: no application workload on controllers
• Large-scale collective is stable for mixed management operations and application workload
• Ongoing improvements for management operation performance and failure scenarios
• Management failover does not impact application workload
26. Further Reference material
• Building a large scale WebSphere Application Server Liberty collective topology (white paper)
http://www.ibm.com/developerworks/websphere/library/techarticles/1309_yu/1309_yu.html
• Tuning the Liberty profile (Knowledge Center)
http://www-01.ibm.com/support/knowledgecenter/SSD28V_8.5.5/com.ibm.websphere.wlp.core.doc/ae/twlp_tun.html
• Best Practices for Large WebSphere Topologies
http://www.ibm.com/developerworks/websphere/library/techarticles/0710_largetopologies/0710_largetopologies.html
29. Related Sessions – Tuesday
AAI-3281 Smarter Production with WebSphere Application Server ND
Intelligent Management
Tues, 24-Feb 05:30 PM - 06:30 PM, Mandalay Bay - Surf Ballroom A
AAI-2827 Problem Determination Tools and Strategies for Liberty and
Full Profile WAS
Tues, 24-Feb 05:30 PM - 06:30 PM, Mandalay Bay - Mandalay Ballroom B
30. Related Sessions – Wednesday
AAI-1445 Managing Dynamic Workloads with WAS ND and in the Cloud
Wed, 25-Feb 09:30 AM - 10:30 AM, Mandalay Bay - Reef Ballroom E
AAI-3228 DevOps Tools and WebSphere Application Server
Wed, 25-Feb 09:30 AM - 10:30 AM, Mandalay Bay - Surf Ballroom A
AAI-3590 Best Practices for Configuring and Managing Large
WebSphere Topologies
Wed, 25-Feb 02:00 PM - 03:00 PM, Mandalay Bay - Reef Ballroom E
AAI-3218 Production Deployment Best Practices for the IBM WebSphere
Liberty Profile
Wed, 25-Feb 05:30 PM - 06:30 PM, Mandalay Bay - Surf Ballroom F
31. Related Customer Feedback Roundtables
AAI-3319 Shaping the Future of WebSphere Liberty Admin Center
Tue, 24-Feb 05:30 PM - 06:30 PM, Mandalay Bay - Coral A
Wed, 25-Feb 09:30 AM - 10:30 AM, Mandalay Bay - Coral A
Thu, 26-Feb 09:00 AM - 10:00 AM, Mandalay Bay - Tropics B
AAI-2810 Problem Determination and Troubleshooting Full Profile and
Liberty Servers
Wed, 25-Feb 09:30 AM - 10:30 AM, Mandalay Bay - Tropics B
Wed, 25-Feb 03:30 PM - 04:30 PM, Mandalay Bay - Tropics B
33. Notices and Disclaimers (cont'd)
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products in connection with this
publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those
products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party
products to interoperate with IBM’s products. IBM expressly disclaims all warranties, expressed or implied,
including but not limited to, the implied warranties of merchantability and fitness for a particular purpose.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any
IBM patents, copyrights, trademarks or other intellectual property right.
• IBM, the IBM logo, ibm.com, Bluemix, Blueworks Live, CICS, Clearcase, DOORS®, Enterprise Document
Management System™, Global Business Services ®, Global Technology Services ®, Information on Demand,
ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™,
PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®,
pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, SoDA, SPSS, StoredIQ, Tivoli®, Trusteer®,
urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of
International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and
service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on
the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.
34. Thank You
Your Feedback is Important!
Access the InterConnect 2015 Conference CONNECT Attendee Portal to complete your session surveys from your smartphone, laptop or conference kiosk.