Enhancing and Preparing TIMES for High Performance Computing

  1. Enhancing and preparing TIMES for High Performance Computing
     Evangelos Panos, Tarun Sharma (status report on an ETSAP-funded project)
     72nd ETSAP Meeting | 11 Dec 2017, ETH Zurich, Switzerland
  2. Contents
     • Premise
     • Transferring TIMES to Linux: modifications needed
     • Applying TIMES on Linux by running single or batch jobs of scenarios
     • Running multiple independent scenarios with TIMES on Linux
       • By manually creating scripts that control execution and result collection
       • By modifying the TIMES code to use the Grid Computing features of GAMS
  3. Premise
     Constraint matrix sizes for different TIMES-based models
     • Solution time on a typical workstation:
       • Irish TIMES: 11 minutes
       • ETSAP TIAM: 1.3 hours
       • JRC EU TIMES: 4.2 hours
       • SWISS TIMES: 2 hours
     • Uncertainty analysis is part of best practice (DeCarolis, Daly et al. 2017).
     • High Performance Computing (HPC): aggregating computing power to deliver better performance when solving large problems.
     • Most HPC systems run on Linux.
  4. TIMES on Linux: single scenario
     • Transferring TIMES-based models to Linux
       • The code is compatible after minor changes
     • Running TIMES models on a GAMS Linux installation
       • Scripts for running batches of model instances
     • Transferring results from Linux to Windows for post-processing with VEDA_BE
       • .gdx files are compatible between the platforms, so data transfer is easy to set up
       • Scripts for converting batches of GDX files to VEDA_BE files
  5. TIMES on Linux: multiple independent scenarios
     • Making GAMS and CPLEX/Barrier run on specific CPU cores
     • Simultaneously running model instances on selected CPU cores
     • This can be done either manually by the user or automatically with the GAMS Grid Computing features
     • When the user manually generates and runs model instances on selected CPU cores, the user:
       • Must generate the *.dd files for every model instance in VEDA-FE
       • Must write OS-dependent scripts to run each scenario and assign CPU cores to it
       • Must write OS-dependent scripts to check whether a scenario has finished and to collect its solution
     • When using the Grid Computing features of GAMS:
       • The user only has to write a GAMS solve loop, each iteration of which solves one scenario
       • GAMS handles the assignment to CPUs (the user can also control it if desired)
       • GAMS handles the polling that checks whether a scenario is ready, and collects the results
  6. Transferring TIMES based model to Linux: contents of the ETSAP Project Report
     1. Setting up a Linux computer
        1.1. Some basic Linux commands
        1.2. Terminal and File Manager
     2. Installing Gams on Linux
     3. Transferring a TIMES-based model to Linux
     4. Running a TIMES model on Linux
     5. Transferring results from Linux to Windows for Post-processing with VEDA_BE
     6. Scripts
        6.1. Running a TIMES scenario in Linux
        6.2. Batch job with multiple scenarios in Linux
        6.3. Batch job for creating VEDA_BE files in Windows
     7. Troubleshooting a TIMES run in Linux
        7.1. GAMS Error: Scratch directory does not exist
        7.2. Gams Error: Unable to open input file
        7.3. Linux error: File permission error
  7. Transferring TIMES based model to Linux: contents of the ETSAP Project Report (continued)
     8. Preparing TIMES for High Performance Computing
        8.1. Multiple parallel scenarios in TIMES using user-defined scripts that are assigned to multiple CPUs
             8.1.1. Making GAMS and CPLEX/Barrier to run in specific CPU(s)
             8.1.2. Run multiple independent scenarios in parallel
        8.2. Multiple parallel scenarios in TIMES using the GAMS grid features
             8.2.1. Introduction to Grid Computing by using the GAMS language
             8.2.2. A simple illustrative example of Grid Computing using GAMS language features
             8.2.3. Equipping TIMES with grid computing features
             8.2.4. Writing a GAMS file for a TIMES-based model to query the status of the submitted jobs to the grid
             8.2.5. Writing a collection script for TIMES
             8.2.6. Executing TIMES using the GAMS Grid Computing Features
        8.3. Issues for further investigation regarding running TIMES over a Grid of CPUs
  8. Transferring TIMES to Linux and running a single scenario or a batch job of scenarios
  9. Cross-platform compatibility of the TIMES source code
     1. Linux is case sensitive, so all filenames should be consistently lowercase or consistently uppercase
     2. The run file generated by VEDA should be named in lowercase
     3. Three major changes to the source code are required:
        • calling GAMS with the option FILECASE=2, which enforces lowercase filenames
        • converting all model names in the TIMES code to lowercase ("times" and "times_macro"), for compatibility with FILECASE=2 when producing the .gdx solution files
        • replacing all system calls issued from GAMS in the TIMES code: MS-DOS commands become POSIX (Portable Operating System Interface) commands that work under both Windows and Linux
  10. Cross-platform compatibility of the TIMES source code: the 3 major changes required
      1. File: *.run
         Current TIMES code (v400):
             $ SET MODEL_NAME 'TIMES'
         Modification needed for Linux:
             $ SET MODEL_NAME 'times'
      2. File: maindrv.mod
         Current TIMES code (v400):
             $ IF '%MACRO%' == YES $SET MODEL_NAME 'TIMES_MACRO' SET TIMESED '0' SETLOCAL SRC tm
         Modification needed for Linux:
             $ IF '%MACRO%' == YES $SET MODEL_NAME 'times_macro' SET TIMESED '0' SETLOCAL SRC tm
      3. File: eqlducs.vda
         Current TIMES code (v400):
             $ if exist cplex.opt execute "copy /Y cplex.opt+indic.txt cplex.op2 > nul";
             $ if exist xpress.opt execute "copy /Y xpress.opt+indic.txt xpress.op2 > nul";
         Modification needed for Linux:
             $ if exist cplex.opt execute "cat cplex.opt indic.txt > cplex.op2";
             $ if exist xpress.opt execute "cat xpress.opt indic.txt > xpress.op2";
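
Because Linux is case sensitive and FILECASE=2 enforces lowercase filenames, files copied over from Windows may first need renaming. The following is a minimal sketch, assuming all files to be renamed sit directly in one directory; the function name lowercase_files is hypothetical and not part of TIMES, VEDA or GAMS.

```shell
#!/bin/bash
# Hypothetical helper (not part of TIMES/VEDA): rename every regular file
# directly inside the given directory to an all-lowercase name.
lowercase_files() {
    local dir="$1" f base lower
    for f in "$dir"/*; do
        [ -f "$f" ] || continue                  # skip subdirectories and empty globs
        base=$(basename "$f")
        lower=$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')
        [ "$base" = "$lower" ] && continue       # already lowercase
        if [ -e "$dir/$lower" ]; then
            # never overwrite an existing file; report the clash instead
            echo "skipped $base: $lower already exists" >&2
        else
            mv -- "$f" "$dir/$lower"
        fi
    done
}
```

It could be called as `lowercase_files "$HOME/gams_wrktimes"` after copying the .dd and .run files to Linux. Subdirectories are deliberately left untouched.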
  11. Verification of execution
      • The .dd files generated on Windows with VEDA-FE should be transferred to Linux prior to execution
      • The GAMS call requires the paths to the TIMES source files and to the .dd files of the scenario run
      • Calling GAMS to solve a TIMES instance test.run on Linux from the gams_wrktimes directory, in a Linux terminal window:
            gams test.run idir=…/gams_srctimesv400 filecase=2 gdx=gamssave/test.gdx
      Post-processing of the .gdx result file for VEDA-BE
      • When the run completes, the test.gdx file generated on Linux is compatible with Windows
      • It can be transferred to Windows and processed with the gdx2veda utility (which comes with the GAMS installation) to generate the .vd* files required for post-processing with VEDA-BE
      • Calling gdx2veda from the Windows command prompt to generate the .vd* files for VEDA-BE:
            gdx2veda test.gdx C:\VEDA\VEDA_FE\GAMS_SrcTIMESv400\times2veda.vdd test
  12. Script in Linux for running a scenario
      Script name: run.sh
            #!/bin/bash
            # set the path to the TIMES source files
            times=$HOME/gams_srctimesv400
            # set the path to the model data definition (.dd) files
            ddfiles=$HOME/gams_wrktimes
            # execute GAMS to run the model
            gams "$1.run" idir="$times:$ddfiles" filecase=2 gdx="$ddfiles/gamssave/$1.gdx"
      Running the script from a Linux terminal:
            etsap-user@etsap-user:~/gams_wrktimes$ ./run.sh test
      • Why do we need a script?
        • to encapsulate the paths to the source code and to the input files
        • to encapsulate all GAMS options required to run TIMES and obtain the .gdx file with the results
      • Thus, we create a script in Linux similar to the VTRUN.cmd script that VEDA-FE generates when running under Windows
  13. Script for executing a batch of scenarios
      Script name: batch_exo.sh
            for s in $1
            do
                ./run.sh "$s"
            done
      Running the script from a Linux terminal to solve 3 scenarios in a batch:
            etsap-user@etsap-user:~/gams_wrktimes$ ./batch_exo.sh "test1 test2 test3"
      Script for processing a batch of .gdx files with gdx2veda
      Script name: batch_linux2veda.cmd
            for %%s in (%*) do (
                gdx2veda %%s.gdx C:\Veda\VEDA_FE\Gams_srcTIMESv400\times2veda.vdd %%s
            )
      Running the script from the Windows command prompt to collect the results of 3 scenarios that were run on Linux:
            C:\Users\etsap\Desktop\linux2veda> batch_linux2veda test1 test2 test3
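
Between the Linux batch run and the Windows conversion, the .gdx files have to be moved across. One possible way to prepare that transfer is sketched below, assuming the results sit in a gamssave directory as in run.sh; the function name bundle_gdx and the archive name are hypothetical, and the slides do not prescribe any particular transfer method.

```shell
#!/bin/bash
# Hypothetical helper: pack the .gdx results of the named scenarios into one
# compressed tar archive. Returns the number of scenarios whose .gdx is missing.
bundle_gdx() {
    local gdx_dir="$1" archive="$2"
    shift 2
    local s missing=0 files=()
    for s in "$@"; do
        if [ -f "$gdx_dir/$s.gdx" ]; then
            files+=("$s.gdx")
        else
            echo "missing result: $gdx_dir/$s.gdx" >&2
            missing=$(( missing + 1 ))
        fi
    done
    if [ "${#files[@]}" -gt 0 ]; then
        # -C stores the files without their directory prefix
        tar -czf "$archive" -C "$gdx_dir" "${files[@]}"
    fi
    return "$missing"
}
```

For the three scenarios above this might be called as `bundle_gdx "$HOME/gams_wrktimes/gamssave" results.tar.gz test1 test2 test3`; a non-zero return code signals that at least one scenario produced no .gdx file.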
  14. Single-CPU performance: Linux vs Windows
      • Initial tests on a typical desktop with 1 CPU with 8 logical cores, with Linux installed as a guest OS on top of Windows
      • Performance gain of Linux over Windows: Swiss TIMES model 20%, Irish TIMES model 50%
      • The improvement in performance seems to depend on the model size (needs further investigation)
      • Swiss TIMES solution time on Windows: 126 minutes
      • Swiss TIMES solution time on Linux: 100 minutes
  15. Running multiple independent scenarios on Linux on different CPU cores by manually creating execution and result collection scripts
      • All the required *.dd input files should have been created in advance in VEDA_FE
  16. Script to run scenarios on a set of specific cores
      Script name: p_run.sh
            #!/bin/bash
            # set the path to the TIMES source files
            times=$HOME/gams_srctimesv400
            # set the path to the model data definition (.dd) files
            ddfiles=$HOME/gams_wrktimes
            # execute GAMS to run the model, pinned to the cores selected by the mask in $2
            taskset "$2" gams "$1.run" idir="$times:$ddfiles" filecase=2 gdx="$ddfiles/gamssave/$1.gdx"
      Running the script from a Linux terminal:
            etsap-user@etsap-user:~/gams_wrktimes$ ./p_run.sh test 3
      • First argument $1: scenario to run
      • Second argument $2: hexadecimal mask selecting the group of CPU cores on which the solver runs (the mask 3 above is binary 11, i.e. cores 1 and 2)
      Second argument to the script: indexing CPU cores for taskset
            CPU core:        12  11  10   9   8   7   6   5   4   3   2   1   Decimal representation (indices of     Hexadecimal representation
            CPU core index:  11  10   9   8   7   6   5   4   3   2   1   0   the used CPU cores as exponents)       (second argument to the script)
            Cores 1 to 4:     0   0   0   0   0   0   0   0   1   1   1   1   2^3 + 2^2 + 2^1 + 2^0 = 15             f
            Cores 5 to 8:     0   0   0   0   1   1   1   1   0   0   0   0   2^7 + 2^6 + 2^5 + 2^4 = 240            f0
            Cores 9 to 12:    1   1   1   1   0   0   0   0   0   0   0   0   2^11 + 2^10 + 2^9 + 2^8 = 3840         f00
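
The mask arithmetic in the table can be automated. The sketch below sets one bit per 0-based core index and prints the result in hexadecimal, the form the slide passes as the second argument; the function name cores_to_mask is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: turn a list of 0-based CPU core indices into the
# hexadecimal affinity mask expected by taskset.
cores_to_mask() {
    local mask=0 core
    for core in "$@"; do
        # set the bit whose position equals the core index (adds 2^core)
        mask=$(( mask | (1 << core) ))
    done
    printf '%x\n' "$mask"
}
```

For example, `cores_to_mask 0 1 2 3` reproduces the first table row and `./p_run.sh test "$(cores_to_mask 4 5 6 7)"` would pin the run to cores 5 to 8. taskset also accepts a core list directly via its -c option (e.g. `taskset -c 0-3`), which avoids the mask altogether.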
  17. Execution of a scenario on one core
      • The second argument determines the cores available to the CPLEX/Barrier algorithm for parallelizing its linear algebra: 1 core in this illustration.
  18. Execution of a scenario on two cores
      • The second argument determines the cores available to the CPLEX/Barrier algorithm for parallelizing its linear algebra: 2 cores (cores 1 and 2) in this illustration.
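
Putting the pieces together, several independent scenarios can be launched at once, each pinned to its own group of cores. The following is a sketch only: it assumes the p_run.sh script from slide 16 is in the current directory, gives each scenario a contiguous block of cores, and does not check that the machine actually has that many cores. The names run_parallel and RUNNER are hypothetical.

```shell
#!/bin/bash
# Hypothetical driver: start every scenario in the background on its own block
# of cores, then wait for all of them. Returns the number of failed runs.
# RUNNER can override the per-scenario script (defaults to ./p_run.sh).
run_parallel() {
    local runner="${RUNNER:-./p_run.sh}"
    local cores_per_job="$1"
    shift
    local i=0 s mask pid failed=0 pids=()
    for s in "$@"; do
        # block i covers core indices i*cores_per_job .. (i+1)*cores_per_job - 1
        mask=$(printf '%x' $(( ((1 << cores_per_job) - 1) << (i * cores_per_job) )))
        "$runner" "$s" "$mask" &
        pids+=("$!")
        i=$(( i + 1 ))
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || failed=$(( failed + 1 ))
    done
    return "$failed"
}
```

With `run_parallel 4 test1 test2 test3` the three scenarios would receive the masks f, f0 and f00 from the table on slide 16. Because the function returns only after every background job has ended, the .gdx files can be collected immediately afterwards.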
  19. TIMES for High Performance Computing: application
      • TIMES on FIONN at Irish Centre for High Performance Computing
      • Scripts for simultaneous execution of multiple scenario batches: 1000 scenarios (5 batches of 200 scenarios each) solved in 3 hours
      • Workflow optimization
      • Experiments:
        • Solution time vs number of cores assigned
        • 48 virtual cores per node of FIONN
      • Such results + further investigation:
        • Optimal distribution of compute load
        • Maximum utilization of computing resources
        • Minimization of solution time
      [Figure: solution time (in seconds and in hours) vs number of virtual cores (6 to 48); series: JRC EU TIMES, TIAM 2D scenario]
  20. Running multiple independent scenarios on Linux on different CPU cores by using the Grid Computing language features of GAMS
      • This is an exploration of feasibility, not a proposed design
      • Better integration into TIMES could be designed and implemented in a separate project, after coordination with Antti, Amit and everyone else who is interested in it
  21. What is Grid Computing
      • It combines computers from multiple administrative domains
      • It is a form of distributed computing
      • A "super virtual computer" is composed of many networked, loosely coupled computers to achieve a goal
      • There is a submitting system where the user creates a large task
      • The task is partitioned into smaller tasks
      • These smaller tasks are solved by the computers in the grid
      • Advantages: saves money, increases efficiency, solves problems
      • Disadvantages: requires unique software, computers can drop off
      Source: https://www.electronicproducts.com/Computer_Systems/Servers/Cloud_computing_vs_grid_computing.aspx
  22. What is Cloud Computing
      • It enables users to employ resources from any web-connected device
      • Cloud computing provides 3 main services:
        • Infrastructure (IaaS): virtual server for storage
        • Platform (PaaS): software to create applications
        • Software (SaaS): infrastructure and products
      • Advantages: saves money, high performance, easy to use
      • Disadvantages: security and privacy
      Source: https://www.electronicproducts.com/Computer_Systems/Servers/Cloud_computing_vs_grid_computing.aspx
  23. What is the difference between Grid Computing and Cloud Computing
      • Cloud computing and grid computing have similar goals
      • With grid computing:
        • the user assigns one large task, which is divided into several smaller portions that are executed on different machines
      • With cloud computing:
        • the user enjoys a host of readily available web-based services (without investing in any underlying architecture); the services can be combined to provide a homogeneous and optimised experience
  24. Exploring Grid Computing with TIMES
      • GAMS normally operates in synchronous mode and waits until the solver terminates
      • When independent model solutions are required, asynchronous mode is an option to increase performance
      • GAMS provides asynchronous solving via its Grid Computing and multi-threading features
      • We need the following loops in the GAMS code:
        • Submission loop: in this phase we generate and submit for solution the models that can be solved independently
        • Inquiry loop: in this phase we check which solutions have finished
        • Collection loop: in this phase the previously submitted models are collected as soon as a solution is available
      • This implies that the scenarios of a TIMES-based model are solved via a GAMS loop instead of separate script files, one for each scenario
  25. An exploratory implementation in TIMES (1)
      • In the .run template file we create a GAMS set with the names of the scenarios we want to solve:
            SET GRID_SCENARIOS /scen1, scen2, scen3/;
      • In the .run template file we declare a parameter to hold the handle (i.e. the ID) of each submitted scenario:
            PARAMETER HANDLE(GRID_SCENARIOS) store the instance handle;
      • In the solve.mod file we make GAMS aware that the TIMES model will solve in asynchronous mode:
            %MODEL_NAME%.solvelink = %solvelink.AsyncGrid%;
      • In the solve.mod file we include a loop over the scenarios; in each iteration we update the required TIMES parameter(s) and then submit the scenario to GAMS to be solved
  26. The submission loop in a Grid Computing Facility
      • In each iteration we update the required TIMES parameter, then submit the scenario to GAMS and keep its handle (the internal ID) to identify this scenario in the collection phase
            * turn on the grid option
            %MODEL_NAME%.solvelink = %solvelink.AsyncGrid%;
            LOOP(GRID_SCENARIOS,
            * update model actual parameters with scenario data (here the CO2 tax)
              OBJ_COMNT(R,DATAYEAR,C,S,'TAX',CUR)$GS_COM_TAXNET(GRID_SCENARIOS,R,DATAYEAR,C,S,CUR) = GS_COM_TAXNET(GRID_SCENARIOS,R,DATAYEAR,C,S,CUR);
            * solve the model
              SOLVE %MODEL_NAME% MINIMIZING objZ USING %METHOD%;
            * keep the handle of the job for future reference
              HANDLE(GRID_SCENARIOS) = %MODEL_NAME%.handle;
            );
      • In this exploratory implementation, we additionally defined the input parameter over the scenario dimension and then had to update an internal parameter of TIMES, i.e. a parameter that is not visible to the user via VEDA_FE
      • More elegant designs and better integration into TIMES could be explored if the grid computing features are of interest to the ETSAP community, in coordination with Antti, Amit and everyone else who is interested in it
  27. Running TIMES in a Grid Computing Facility
      • To submit the TIMES model to a Grid Computing Facility, we must call GAMS as follows (Linux example):
            gams times_demo_co2.run idir=../times filecase=2 s=submit gdir=grid
      • During submission, three folders are created under the folder grid, holding the solutions of the 3 scenarios; the folder names correspond to the handles of the scenarios
      • During solving, each subfolder of the grid directory holds the execution scripts generated automatically by GAMS, as well as the model matrix and the solution of each scenario
  28. Running TIMES in a Grid Computing Facility
      • In the first iteration of the loop, GAMS submits scen1 and assigns it a handle that has the same name as the corresponding folder in the grid directory
  29. • While scen1 is running on CPU1, GAMS submits scen2 to CPU2
      • scen1 is at iteration 15 of CPLEX/Barrier by the time scen2 is submitted
      • When the generation of the model matrix of scen2 has finished, scen1 is already at iteration 20
  30. Collection of the results from the grid
      • We implement a GAMS source file, similar to the main report driver file of TIMES
      • It takes the name of the scenario as the call parameter %gams.user1%
            scalar h handle of each scenario;
            h = handlecollect(handle("%gams.user1%"));
            %MODEL_NAME%.handle = HANDLE("%gams.user1%");
            execute_loadhandle %MODEL_NAME%;
            *-----------------------------------------------------------------------------
            * produce the reports (THIS CODE IS THE SAME AS IN THE REPORT GENERATOR IN TIMES)
            *-----------------------------------------------------------------------------
            $ LABEL REPORT
            $ BATINCLUDE rptmain.%gams.user2% %gams.user2% NO_EMTY
            $ IF NOT %TIMESED%==0 $ IF NOT %TIMESED%==YES $BATINCLUDE wrtbprice.mod
            $ IF SET SPOINT $BATINCLUDE spoint.%gams.user2% 0
            *-----------------------------------------------------------------------------
            * do a check on compile/execute errors from reports
            *-----------------------------------------------------------------------------
            $ BATINCLUDE err_stat.mod '$IF NOT ERRORFREE' ABORT '*** ERRORS IN GAMS COMPILE ***'
            $ BATINCLUDE err_stat.mod ABORT EXECERROR '*** ERRORS IN GAMS EXECUTION ***'
            *-----------------------------------------------------------------------------
            * dump to the gdx file for VEDA_BE
            execute_unload "gamssave/%gams.user1%.gdx";
      • The collection loop can also be implemented in TIMES without changing the reporting routines, but this would require some design considerations that were not pursued at this stage.
  31. Collecting the results for processing in VEDA-BE
      • We call the previous script to collect the solution from the grid subdirectory corresponding to the scenario we want to process in VEDA-BE; this generates the .gdx file with the solution:
            gams gc_report.gms idir=../times r=submit gdir=grid filecase=2 user2=mod user1=scen1
      • The .gdx files of the scenarios can then be transferred to Windows for processing with VEDA-BE by using the gdx2veda utility
      [Diagram: Linux (collection of the .gdx files with the solutions of the 3 scenarios) to Windows (VEDA-BE)]
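
Since the collection call handles one scenario at a time, collecting all scenarios amounts to repeating it with a different user1 value. A minimal sketch, assuming the gc_report.gms file and the options shown on this slide, with idir=../times taken from the submission call on slide 27; the names collect_scenarios and GAMS_CMD are hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper: run the collection script once per scenario and count
# the failures. GAMS_CMD can override the gams executable (defaults to "gams").
collect_scenarios() {
    local gams_cmd="${GAMS_CMD:-gams}" s failed=0
    for s in "$@"; do
        # same options as on the slide; only user1 changes per scenario
        if ! "$gams_cmd" gc_report.gms idir=../times r=submit gdir=grid \
                filecase=2 user2=mod user1="$s"; then
            echo "collection failed for scenario $s" >&2
            failed=$(( failed + 1 ))
        fi
    done
    return "$failed"
}
```

For the three scenarios of the example this would be `collect_scenarios scen1 scen2 scen3`, leaving one .gdx file per scenario in gamssave for gdx2veda. The sketch does not check whether a scenario has actually finished solving; that remains the job of the inquiry step.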
  32. Conclusions
      • TIMES could be turned into a cross-platform modelling framework with minor code modifications
      • Single-CPU solution times seem to be 20–50% lower in Linux than in Windows (this depends on the model size/structure and needs further tests)
      • Running multiple independent scenarios can be done either by:
        • Manually generating and solving the scenarios through user-defined scripts
        • Making use of the advanced GAMS language features for Grid Computing (needs modification of the TIMES source code)
      • Using multiple nodes for running multiple scenarios results in optimum distribution of the computation load (maximum use of resources, minimum solution time)
  33. Conclusions
      • The suggestions from this report regarding the transfer of TIMES to Linux would be implemented in the next release of the TIMES source code (Antti kindly offered to do this)
      • Better integration of grid computing and multi-threading features into TIMES could be considered and implemented if there is general interest (in coordination with Antti, Amit and everyone else who is interested in this facility)
      • What is next? Exploring smart decompositions of a TIMES-based model in order to exploit multiple CPU cores for a single scenario run
        • Benefiting from the BEAM-ME project (PSI and UCC are on the advisory board)
        • Using new GAMS language features for model annotation to facilitate decomposition
  34. Thank you for your attention
      Our thanks go to:
      • Gary Goldstein
      • Antti Lehtilä
      • Maurizio Gargiulo
      • Joseph DeCarolis, Sonia Yeh
      • ... and to the many others who tried in the past to run TIMES under Linux and are not visible to the rest of the TIMES community
