PAT calibration project - performance and parallelization

asked 2022-10-26 05:44:55 -0500

updated 2023-06-07 09:41:00 -0500

Hi everyone,

I have been trying to use PAT to calibrate a building model to measured data. The building model was developed from the project specifications. I have measured data for almost a full year, at 10-minute granularity (timestep), for room temperatures and HVAC energy consumption. I also have an AMY weather file for that specific location.

I am currently running PAT in Algorithmic mode using OpenStudio-server on AWS, with node type "t2.large" and a maximum of 4 nodes.

My problem is the run time per simulation. Running the model locally with EnergyPlus takes around 30 s, while in PAT it takes almost 3 minutes.

I have done some testing and believe the problem is the reporting measure "Time Series Objective Function" that I am using to calculate the calibration metric (CVRMSE). The measure is applied once for each room with temperature readings (8 rooms), plus once for total HVAC energy consumption.
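For reference, CVRMSE itself is cheap to compute once the measured and simulated series are aligned on the same timestamps. The following is a minimal sketch, not the code of the "Time Series Objective Function" measure: the function name `cvrmse`, the `p` parameter (degrees-of-freedom adjustment, 1 by default), and the convention of skipping `None` gaps are all assumptions made for illustration.

```python
import math


def cvrmse(measured, simulated, p=1):
    """Return CV(RMSE) in percent for two aligned series.

    Assumption: pairs where either value is None (e.g. missing
    measurements) are skipped. `p` is the degrees-of-freedom
    adjustment in the denominator (n - p).
    """
    if len(measured) != len(simulated):
        raise ValueError("series must have the same length")

    # Keep only timesteps where both values are available.
    pairs = [(m, s) for m, s in zip(measured, simulated)
             if m is not None and s is not None]
    n = len(pairs)
    if n <= p:
        raise ValueError("not enough data points")

    mean_measured = sum(m for m, _ in pairs) / n
    if mean_measured == 0:
        raise ValueError("mean of measured data is zero")

    # Sum of squared errors between measurement and simulation.
    sse = sum((m - s) ** 2 for m, s in pairs)
    return 100.0 * math.sqrt(sse / (n - p)) / mean_measured
```

Timing a function like this on exported series can help separate the cost of the metric itself from the cost of extracting the data.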

I would like to know if there is a way to improve simulation time and also if it is possible to run simulations in parallel.

Thanks!


Comments

Sorry to be brief as I am traveling, but all the algorithms run in parallel. My recommendation is to choose a better instance type, with more cores, etc.

BrianLBall ( 2022-10-27 14:12:10 -0500 )

Thank you for your answer, and sorry for the late reply (I was on vacation). I have managed to run simulations in parallel by changing the eksctl cluster settings, increasing both the number of nodes and the minimum number of nodes (e.g. --nodes 2 \ --nodes-min 2).

Regarding the reporting measure, the problem still exists: it increases the simulation time a lot. I have also noticed that having more run periods (my data has some missing days) has a huge impact as well. Could this be related to the SQL queries being repeated for every run period?

Raul Teixeira ( 2022-11-03 03:15:46 -0500 )

Also regarding the measure: it has a lot of cool features (like the graphical representation), but could there be a way of just reporting the CVRMSE with less computation required? Thank you!

Raul Teixeira ( 2022-11-03 03:18:35 -0500 )
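One way to check whether repeated SQL queries per run period are the bottleneck is to pull a variable for all run periods in a single pass and time that outside the measure. The sketch below is only an illustration: it assumes the EnergyPlus SQLite output exposes `ReportData` and `ReportDataDictionary` tables with the columns used here (verify against your own eplusout.sql), and `load_series` is a hypothetical helper, not part of any measure.

```python
import sqlite3
from collections import defaultdict

# Assumed subset of the EnergyPlus SQLite output schema;
# check table and column names against your own eplusout.sql.
QUERY = """
SELECT d.KeyValue, r.Value
FROM ReportData AS r
JOIN ReportDataDictionary AS d
  ON d.ReportDataDictionaryIndex = r.ReportDataDictionaryIndex
WHERE d.Name = ?
ORDER BY d.KeyValue, r.TimeIndex
"""


def load_series(conn, variable_name):
    """Return {key_value: [values in time order]} for one variable.

    A single query covers every run period, instead of one query
    per run period and per key.
    """
    series = defaultdict(list)
    for key, value in conn.execute(QUERY, (variable_name,)):
        series[key].append(value)
    return dict(series)
```

Usage would look like `load_series(sqlite3.connect("eplusout.sql"), "Zone Mean Air Temperature")`, where the file path and variable name depend on the model's output requests.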

Ah, it's coming back to me. Sorry, it's been 6 years. There is an argument 'verbose_messages'. It is very useful for getting the output of the measure when debugging and getting the SQL query right, but it's a major performance hit. Make sure it is set to FALSE when doing production runs. I think there are a few other arguments that are not needed for production runs, like 'find_avail', which you can also set to FALSE.

BrianLBall ( 2022-11-03 07:55:05 -0500 )

Thanks again for your answer. I was already setting all these arguments to FALSE, but the reporting measures still take around 40% of the total simulation time. Is there something I am missing? Another question I have is about parallelization: is it possible to run several simulations on each node? I have managed to increase the number of parallel simulations to 6 with the following openstudio-server setting (set worker_hpa.minReplicas=6), but this just adds 6 instances (nodes), each with a CPU utilization of around 50%.

Raul Teixeira ( 2022-11-07 05:59:39 -0500 )
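To put the figures from this thread side by side, here is a back-of-the-envelope sketch of how per-simulation time and worker count combine into total wall-clock time. It assumes simulations run in equal-length waves and ignores queueing, data transfer, and algorithm overhead; the function name and the batch size of 600 simulations are hypothetical, while the 180 s, 30 s, and 6 workers come from the posts above.

```python
import math


def batch_wall_clock_seconds(n_simulations, seconds_per_simulation,
                             parallel_workers):
    """Estimate wall-clock time when simulations run in waves.

    Assumption: every simulation takes the same time and each
    worker runs exactly one simulation at a time.
    """
    if parallel_workers < 1:
        raise ValueError("need at least one worker")
    waves = math.ceil(n_simulations / parallel_workers)
    return waves * seconds_per_simulation


# Hypothetical batch of 600 simulations on 6 parallel workers.
at_pat_speed = batch_wall_clock_seconds(600, 180, 6)    # ~3 min each
at_local_speed = batch_wall_clock_seconds(600, 30, 6)   # ~30 s each
print(at_pat_speed / 3600, "hours vs", at_local_speed / 3600, "hours")
```

Under these assumptions, cutting per-simulation time has the same effect on the total as multiplying the number of workers by the same factor, which is why both the measure overhead and the number of simulations per node matter.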