Hi,
I have been experimenting with reinforcement learning and EnergyPlus using a wrapper called Sinergym. So far, however, the policy that saves the most energy is to always set the cooling setpoint to the lowest possible temperature. Doing so gives a significant reduction in IT power consumption, while the resulting increase in HVAC power consumption is much smaller. Is this a flaw in the EnergyPlus model? If so, are there any changes I could make to fix it?
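
To make the trade-off concrete, the comparison I am describing amounts to something like the sketch below. The function names and the wattages are hypothetical placeholders chosen only to show the pattern; they are not values from my runs and not part of the Sinergym or EnergyPlus APIs.

```python
def total_facility_power(it_power_w, hvac_power_w):
    """Sum IT and HVAC power (watts) for one timestep."""
    return it_power_w + hvac_power_w


def net_saving(baseline, candidate):
    """Positive result means the candidate setpoint uses less total power.

    Each argument is an (it_power_w, hvac_power_w) pair.
    """
    return total_facility_power(*baseline) - total_facility_power(*candidate)


# Placeholder numbers only, chosen to illustrate the pattern described above.
higher_setpoint = (1000.0, 300.0)  # hypothetical: warmer cooling setpoint
lowest_setpoint = (900.0, 350.0)   # hypothetical: IT drops 100 W, HVAC rises 50 W

print(net_saving(higher_setpoint, lowest_setpoint))
```

Whenever the drop in IT power is larger than the rise in HVAC power, the net saving is positive, so an agent rewarded on total power would be expected to pick the lowest setpoint every time.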