Question-and-Answer Resource for the Building Energy Modeling Community
4

Concurrent EnergyPlus simulations cause processor bottleneck!?!

asked 2017-10-16 20:36:35 -0500

Stuy1974's avatar

updated 2017-10-17 07:30:28 -0500

Hi there.

I have a question regarding EnergyPlus simulation performance/speed and parallelisation.

We have created some software that allows us to split a simulation into multiple parts and run these simulations concurrently. This software currently runs on a single physical machine with a decent specification: AMD 8-core 3.00 GHz, 16GB RAM, SSD.

Unfortunately, we have noticed an interaction between instances of EnergyPlus when running multiple simulations concurrently, e.g.:

  • 1 concurrent simulation run-time: 245 seconds
  • 2 concurrent simulations run-time: 481 seconds
  • 3 concurrent simulations run-time: 627 seconds

We expected 2 concurrent simulations to run in the same time as 1 simulation. This was not the case; in fact, we pay a time penalty for each additional concurrent simulation.
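As a quick sanity check on these figures, the short sketch below computes the slowdown relative to a single run and the effective wall-clock cost per simulation. It uses only the three run times listed above; the function names are made up for illustration.

```python
# Run times reported above: number of concurrent simulations -> wall-clock seconds.
run_times = {1: 245, 2: 481, 3: 627}

baseline = run_times[1]


def slowdown(n):
    """Wall-clock time for n concurrent runs relative to a single run."""
    return run_times[n] / baseline


def seconds_per_simulation(n):
    """Effective wall-clock cost of each simulation when n run together."""
    return run_times[n] / n


for n in sorted(run_times):
    print(f"{n} concurrent: slowdown x{slowdown(n):.2f}, "
          f"{seconds_per_simulation(n):.1f} s per simulation")
```

With ideal scaling the slowdown would stay near 1.0. Here the cost per simulation with 2 concurrent runs (240.5 s) is almost the same as running them one after another, so the second core appears to contribute very little.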

We repeated this experiment using virtual machines (in the Azure cloud). There, our results were as originally expected, i.e. 1 simulation took the same time to run as 2 (and 3, 4, etc.).

This leads us to believe there is a bottleneck caused by concurrent EnergyPlus simulations. We believe we have ruled out RAM & Storage (SSD).

Any thoughts and/or potential solutions would be greatly appreciated.

Regards, Stuart


Comments

What version of EnergyPlus? What operating system?

shorowit (2017-10-17 08:18:14 -0500)
2

How many cores does one concurrent simulation use?

JasonGlazer (2017-10-17 08:24:54 -0500)
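One rough way to answer the cores question is to compare the CPU time a run consumes with its wall-clock time: a ratio near 1 suggests a single busy core, while a ratio near 8 would suggest all eight. The sketch below is an assumption-laden illustration, not a tested EnergyPlus recipe. It relies on Python's `resource` module, which is only available on Unix-like systems, and `average_cores_used` is a hypothetical helper name. The command in the usage section is a placeholder workload, to be replaced with your own EnergyPlus invocation.

```python
import resource
import subprocess
import sys
import time


def average_cores_used(command):
    """Run command to completion; return (CPU seconds) / (wall-clock seconds)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(command, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User + system CPU time consumed by child processes during this call.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return cpu / wall


if __name__ == "__main__":
    # Placeholder single-threaded workload (assumption): swap in the real simulation command.
    placeholder = [sys.executable, "-c", "sum(i * i for i in range(3_000_000))"]
    print(f"average cores used: {average_cores_used(placeholder):.2f}")
```

On Windows, Task Manager or Resource Monitor gives the same information without any code.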

Hi Stuy1974,

From what I've read, it sounds like you've figured out how to run EnergyPlus in Azure. Do you think you could tell me generally how you did that? I've been trying to run EnergyPlus simulations in Batch, but it's proving to be difficult. E+ seems to look for the .IDD file on the C drive, and this error occurs: " Missing C:\EnergyPlusV8-1-0\Energy+.idd *FATAL:ProcessInput: Energy+.idd missing. Program terminates. Fullname=C:\EnergyPlusV8-1-0\Energy+.idd "*

I'd really appreciate any hints as to how you got parallel simulations of E+ in Azure!

Thanks, Grant

__grant_payne__ (2019-01-21 18:09:51 -0500)

3 Answers

3

answered 2017-10-17 13:57:35 -0500

updated 2017-10-17 14:00:53 -0500

What is interesting is that you saw linear scaling on the Azure cloud instances (which likely use Intel processors) but not on the AMD machine (I assume a Ryzen 7 1700, based on the specs). The fact that you got linear scaling when running concurrent simulations in the cloud shows that EnergyPlus can run concurrently with minimal performance impact, which is consistent with my experience and that of many other people running numerous concurrent simulations.

One thing with many modern CPUs is that they scale their frequency based on the number of cores in use (the Ryzen 7 1700 has a 3.7 GHz max turbo speed vs. a 3.0 GHz base). Thus, a single simulation (or a few) will run faster than simulations running on all available cores.
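To put a rough number on how much frequency scaling could matter, the sketch below compares the turbo-to-base clock ratio with the slowdown reported in the question. It assumes the CPU really is a Ryzen 7 1700 and that run time is inversely proportional to clock speed, which is a simplification.

```python
# Clock speeds quoted above for the Ryzen 7 1700 (the exact CPU model is an assumption).
base_ghz = 3.0
max_turbo_ghz = 3.7

# Worst case under this simplification: a lone run gets full turbo,
# while concurrent runs all drop to the base clock.
max_clock_slowdown = max_turbo_ghz / base_ghz

# Run times from the question: 1 simulation vs. 2 concurrent simulations.
observed_slowdown = 481 / 245

print(f"largest slowdown from clock scaling alone: x{max_clock_slowdown:.2f}")
print(f"observed slowdown with 2 concurrent runs:  x{observed_slowdown:.2f}")
```

Under these assumptions, clock scaling would account for a slowdown of about 1.23x at most, while the observed figure is about 1.96x, so it is probably only part of the story.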

Next, Ryzen is one of AMD's newest processors. Depending on whether you are running on Linux or Windows, the compiler we use to build the releases may not take advantage of AMD's latest technology and hardware. The architecture of AMD's new processors is quite different, and the code might not be optimized for it. One thing I would suggest is compiling EnergyPlus on that computer with the newest version of Visual Studio or GCC (depending on operating system) and running your tests again. It may not solve your problem, but it at least eliminates a variable.

From searching Google, there seem to be potential Windows thread scheduler issues, BIOS bugs, Linux bugs, and a NUMA-like architecture that causes scheduling and caching issues. All of these can affect performance.

My general guess is that there are hardware oddities and software bugs causing thread and cache scheduling issues, especially given that Intel processors work as expected with concurrent EnergyPlus runs. So make sure all chipset drivers, BIOS, operating system, compilers, etc. are up to date.

1

answered 2017-10-17 08:17:08 -0500

A lot of EnergyPlus users, myself included, run concurrent EnergyPlus simulations all the time without seeing this behavior.

That said, there was one release of EnergyPlus where the development team tried to introduce parallelization inside EnergyPlus itself and it caused problems -- if I recall correctly, one simulation would use all available cores at 100%. I can't immediately dig up which version it was, maybe something like 7.0? But this was fixed in the subsequent release.

0

answered 2017-10-17 07:30:14 -0500

Could the bottleneck be shared cache size or off-chip memory bandwidth? See if the slowdown is still present for smaller models, i.e., few zones, few surfaces.
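A small timing harness makes this kind of comparison repeatable. The sketch below is a generic illustration, not part of EnergyPlus: `time_concurrent` and `scaling_table` are hypothetical helper names, and the command in the usage section is a placeholder CPU-bound workload. For a real test, substitute your own EnergyPlus invocation, giving each run its own working or output directory so the instances do not overwrite each other's files.

```python
import subprocess
import sys
import time


def time_concurrent(commands):
    """Start every command at once; return seconds until all have exited."""
    start = time.perf_counter()
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for cmd in commands
    ]
    for proc in procs:
        proc.wait()
    return time.perf_counter() - start


def scaling_table(make_command, max_n):
    """Time 1..max_n concurrent runs; make_command(i) builds the i-th argv list."""
    return {
        n: time_concurrent([make_command(i) for i in range(n)])
        for n in range(1, max_n + 1)
    }


if __name__ == "__main__":
    # Placeholder workload (assumption): replace with the real simulation command.
    def placeholder(i):
        return [sys.executable, "-c", "sum(i * i for i in range(2_000_000))"]

    for n, seconds in scaling_table(placeholder, 3).items():
        print(f"{n} concurrent: {seconds:.2f} s")
```

Running the same table for a large model and a small one should show whether the penalty shrinks when the working set is smaller.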



Stats

Asked: 2017-10-16 20:36:07 -0500

Seen: 387 times

Last updated: Oct 17 '17