Performance Evaluation of a Virtual Cluster at IVECO Engineering


  1. 2011 European HyperWorks Technology Conference. Vladi Nosenzo, Roberto Vadori. 20 November, 2010 2011
  2. ABSTRACT The work described below builds on an earlier project by Reply, developed in collaboration with Prompt Engineering. The technical constraints for IVECO were more stringent than those for Reply; a feasibility study was therefore needed to take into account IVECO's requirements for both the security and the accessibility of the infrastructure. The main request was to guarantee the security and inviolability of the data used during the simulations, and the infrastructure was designed and built to meet this need. We start by describing a simple configuration, which is gradually made more complex until it reaches the final one. Finally, we compare the performance of the simulations on the two configurations (the virtual and the real cluster). Bonn, 2011/11/08
  3. INTRODUCTION The term Cloud Computing refers to a distributed, remotely hosted infrastructure. The solution analyzed here uses the services offered by Amazon, in which the machines are virtual computing resources. Note that Amazon provides only the computing resources with a minimal configuration; the infrastructure and the application codes must be installed by the end user.
  4. VIRTUAL AND REAL INFRASTRUCTURE The initial implementation included a Front End (hereafter FE) and a set of cores N1, N2, … Ni, … Nn distributed across multiple nodes (a minimum of two cores for the FE, four to sixteen cores per compute node), connected via the Internet to a remote license server.

     This structure was considered non-compliant with IVECO's security standards. A new network infrastructure was therefore built that places the FE and the calculation nodes Ni inside an already existing Virtual Private Network (hereafter VPN). This solution made it possible to integrate the virtual Amazon resources with the IVECO network.

     While the FE and the calculation nodes are inside the VPN, the license servers are outside it, so a communication channel must be created between them and the virtual calculation structure. The channel is controlled by two firewalls: the first (an Open Point) checks the communication between the FE and the VPN, while the second (a Point to Point) checks the communication between the VPN and the network of license servers.
  5. VIRTUAL AND REAL INFRASTRUCTURE The services and infrastructure offered by Amazon are configured as different types of instances: an instance is, typically, a CPU with a number of cores (from two to sixteen) and a disk on which the operating system (Linux or Windows Server) and a few basic utilities are installed. The following table lists the instances that were activated for the Proof of Concept (hereafter POC).

     INSTANCE  OS       MEMORY  CPU  NOTES
     1         WS 2003  2 GB    -    Remote Graphics
     2         Linux    8 GB    2    Front End
     3         Linux    16 GB   4    Compute Node

     The first instance, a Windows Server 2003 machine, was used to display the calculation results with the HyperView software. An Expedat server, dedicated to data transfer, was also installed on the same machine.

     Two classes of software were installed on the FE. The first contains the management codes, in particular:
     • PBS (Portable Batch System): manages the queues and allocates the resources of the computation nodes.
     • NFS (Network File System): shares the disk storage.
     • MPI (Message Passing Interface): libraries for running a single calculation in parallel on multiple independent processors.
     • SSH/SSL (Secure Shell / Secure Sockets Layer): manages the encrypted communication between the FE and the nodes, and between the FE and the graphics display.

     The second contains the calculation codes:
     • Radioss: explicit solver for crash analysis.
     • Optistruct: implicit solver for static and optimization analysis.
  6. NETWORK INFRASTRUCTURE The Amazon infrastructure is essentially an internal virtual network that contains the compute nodes, the FE and the graphics workstation. The FE has several network interfaces: one on the internal network, one on the public network (Internet), which gives clients access via SSH, and one on the VPN.

     The Iveco infrastructure is more complex, so only its main lines are outlined here. The figure shows three main components:
     • a VPN server, connected to the public and internal networks, which controls the input/output data;
     • a client that communicates with the outside, through the firewall, and with the license server;
     • two license servers that communicate with external clients only through the firewall.
  7. NETWORK INFRASTRUCTURE The communication channel between Amazon and Iveco is an encrypted VPN. The VPN server sits inside the Iveco network, while the client, in this case, is the Amazon FE. All communication takes place over the SSH protocol, which creates a secure channel.

     The most critical point was finding the correct communication mechanism between the FE and the license server. Technically, the license server, which uses the FlexLM software, requires a two-way connection on two specific communication ports. It was therefore necessary to assign well-defined, non-arbitrary ports and to open access to them on the firewall. In the same way, the calculation nodes must be able to communicate with the license server and must therefore be inside the VPN.
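The fixed-port requirement described in slide 7 can be verified from the FE or from a compute node with a simple TCP reachability check. The sketch below is only an illustration under stated assumptions: the host name and the two port numbers in the usage comment are hypothetical placeholders, since the slides do not give the actual values, and a successful TCP connection shows only that the firewall lets the port through, not that a license can be checked out.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_license_ports(host: str, ports: list[int], timeout: float = 3.0) -> dict[int, bool]:
    """Check every fixed license port and report which ones are open."""
    return {port: is_reachable(host, port, timeout) for port in ports}


# Usage (hypothetical host and ports, not taken from the slides):
#   check_license_ports("license.example.internal", [27000, 27001])
```

Running the same check from every compute node would confirm that the nodes, and not only the FE, can reach the license server through the VPN.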
  8. COMPUTATION CODES AND DATA TRANSFER The software needed to set up the infrastructure is standard: apart from the management packages (NFS, PBS, SSH, MPI libraries) and some software for logging on to the license server, no additional packages had to be installed. All software is installed automatically by scripts; the whole process takes about 24 minutes to set up an FE and ten quad-core computing nodes.

     The computation codes were chosen with the cluster configuration in mind: an explicit solver (Radioss) and an implicit one (Optistruct). The choice covers two different calculation types: a crash analysis, performed with the explicit code, and a linear static analysis, solved with the implicit code. Since the two calculations are inherently different, the first was run on all available cores of the compute node, while the second was run on the same node but with only two of the four available cores assigned.

     A critical point is the data transfer. In general, the input file is an ASCII file, compressible and relatively small, whereas the output files are binary, poorly compressible and large. Transferred by traditional methods, they are a real bottleneck of the structure. The solution was as follows:
     • Transfer of the results from the compute nodes to the Windows server (green arrow); as the compute nodes and the server are on the same Amazon internal network, the transfer is very fast.
     • Opening of a graphics session using the RDS protocol (Remote Desktop System) through the encrypted VPN channel.
     • Download of the results by connecting to the Expedat server; here too the communication (red line) goes through the encrypted VPN channel.
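The core allocation described in slide 8 (all four cores for Radioss, two of four for Optistruct) can be expressed as a PBS resource request. The sketch below only builds the `qsub` command line and does not submit anything. It assumes the Torque-style `-l nodes=…:ppn=…` syntax; other PBS variants use a different resource syntax, and the slides do not say which one was installed. The job names and script file names are hypothetical.

```python
CORES_PER_NODE = 4  # the compute node in the POC is a quad-core instance


def qsub_command(job_name: str, script: str, cores: int, nodes: int = 1) -> list[str]:
    """Build a qsub argument list requesting `cores` cores on each of `nodes` nodes."""
    if not 1 <= cores <= CORES_PER_NODE:
        raise ValueError(f"cores must be between 1 and {CORES_PER_NODE}")
    return ["qsub", "-N", job_name, "-l", f"nodes={nodes}:ppn={cores}", script]


# Explicit crash analysis: all available cores of the node.
radioss_cmd = qsub_command("crash", "run_radioss.sh", cores=CORES_PER_NODE)

# Implicit linear static analysis: two of the four cores.
optistruct_cmd = qsub_command("static", "run_optistruct.sh", cores=2)

print(" ".join(radioss_cmd))
print(" ".join(optistruct_cmd))
```

Returning the command as a list rather than a single string makes it safe to pass to `subprocess.run` without shell quoting issues, should the sketch be extended to submit jobs.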
  9. COMPUTATION CODES AND DATA TRANSFER The following table shows the average time taken to transfer the input data to the Amazon network using SFTP (Secure File Transfer Protocol), without compressing the data.

     INPUT FILE   DIMENSION   TIME
     RADIOSS      95 MBytes   112 s
     OPTISTRUCT   70 MBytes   59 s

     A second table on the slide (not reproduced in this transcript) shows the time taken to download the results. The figures for an ADSL connection are comparable to those for the Iveco VPN, which has a bandwidth of 10 Mbps. In particular, over the Iveco VPN, a transfer of 1.5 GB, compressed to 46% by the Expedat protocol, takes about 480 s. Extrapolating linearly from this figure, a download of 10 GB over the Iveco VPN requires 3200 s, i.e. about 53 minutes.
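The arithmetic behind the figures in slide 9 can be reproduced in a few lines. The sketch below uses only numbers quoted in the slide; the function names are illustrative, and the extrapolation assumes that transfer time grows linearly with file size, as the slide does.

```python
def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Average upload throughput in MBytes per second."""
    return size_mb / seconds


def extrapolate_seconds(ref_size_gb: float, ref_seconds: float, target_size_gb: float) -> float:
    """Scale a measured transfer time linearly to a different file size."""
    return ref_seconds * target_size_gb / ref_size_gb


# Input upload via SFTP (uncompressed), from the table.
radioss_rate = throughput_mb_per_s(95, 112)
optistruct_rate = throughput_mb_per_s(70, 59)

# Result download over the Iveco VPN: 1.5 GB measured in 480 s, scaled to 10 GB.
download_10gb_s = extrapolate_seconds(1.5, 480, 10)

print(f"Radioss upload:    {radioss_rate:.2f} MB/s")
print(f"Optistruct upload: {optistruct_rate:.2f} MB/s")
print(f"10 GB download:    {download_10gb_s:.0f} s = {download_10gb_s / 60:.1f} min")
```

The two uploads work out to roughly 0.85 and 1.19 MBytes/s, and the 10 GB download to 3200 s, which matches the slide's figure of about 53 minutes.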
  10. RESULTS AND CONCLUSIONS The execution times are normalized with respect to the more powerful architecture. As the following table and figures show, the Cloud Solution is more efficient with explicit solvers.

     TEST CASE    CLOUD   HPC
     Optistruct   3.43    1
     Radioss      1.3     1

     (Figures: Optistruct model, Radioss model.)

     One important point is to have an appropriate scratch area on disk: the architecture used allocates a single disk area, mounted on the FE and exported via NFS to all compute nodes. The scratch area must be sized according to the codes used, in particular for implicit solvers, which sometimes require as much as 2 TB of disk space. In conclusion, the Cloud Solution is efficient in terms of performance provided that the infrastructure is correctly dimensioned from a technical point of view (i.e. in terms of CPU time, disk space and bandwidth for data transfer optimization), and it is very useful for managing workloads.
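The normalized times in slide 10 translate directly into extra run time on the cloud cluster. The short sketch below uses only the values from the table; the dictionary layout and function name are illustrative.

```python
# Normalized execution times from the table (HPC = 1).
NORMALIZED_TIMES = {
    "Optistruct": {"cloud": 3.43, "hpc": 1.0},
    "Radioss": {"cloud": 1.3, "hpc": 1.0},
}


def extra_time_percent(cloud: float, hpc: float) -> float:
    """Extra run time on the cloud relative to the HPC cluster, in percent."""
    return (cloud / hpc - 1.0) * 100.0


for test_case, times in NORMALIZED_TIMES.items():
    extra = extra_time_percent(times["cloud"], times["hpc"])
    print(f"{test_case}: {extra:.0f}% longer on the cloud")
```

On these figures the implicit Optistruct run takes about 243% longer on the cloud, while the explicit Radioss run takes about 30% longer, which is the gap behind the statement that the Cloud Solution suits explicit solvers better.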
  11. NEXT STEP The next step will be to integrate the two solutions (virtual and real), using the virtual solution to manage workloads on demand. We are currently evaluating the economic impact in order to build a "Proper Business Case".
  12. RESULTS AND CONCLUSIONS Thank you for your attention.
