Connecting an EnergyPlus simulation to an OpenAI Gym environment through the Python API

I'm trying to implement an OpenAI Gym environment (for reinforcement learning training) around an EnergyPlus building simulation. I recently tried to do this by connecting Python to EnergyPlus through the Python API. My current understanding is that an E+ simulation is run with a few lines of code, as below:

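from pyenergyplus.api import EnergyPlusAPI

# weather_file_path, output_dir and idf_file_path are defined elsewhere
ARGS = [
    '--weather',
    weather_file_path,
    '--output-directory',
    output_dir,
    '--readvars',
    idf_file_path
]

api = EnergyPlusAPI()
state = api.state_manager.new_state()
# Blocks until the whole simulation has finished
api.runtime.run_energyplus(state, ARGS)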

and the only way I can interact with each timestep is through the callback functions provided by the Runtime API.

An OpenAI Gym environment requires us to implement a "step" method in a class, which is called repeatedly during RL model training. Pseudo-code for the flow I would want inside a step method using the EnergyPlus Python API is shown below:

def step(self, action):
    set_eplus_actuators_at_start_timestep(action)
    new_state = get_eplus_sensor_values_at_end_timestep(.....)
    reward = calculate_reward_or_penalty(new_state)
    return new_state, reward

However, since run_energyplus owns the simulation loop and any interaction with the running simulation can only happen inside callback functions, the step method above cannot be written as ordinary sequential, single-process code.

Any suggestions on how to solve this?
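
One direction that might work, sketched here under heavy assumptions: run the simulation in a background thread and let the callbacks and step() hand values to each other through blocking queues, so the callback-driven loop and the Gym-style loop take turns. Everything named below (fake_simulation, CallbackBridgeEnv, the queue names, the toy temperature dynamics, the reward) is hypothetical and only stands in for EnergyPlus. In a real setup the two callbacks would instead be registered through the Runtime API and would read sensors and set actuators through the API's data exchange functions, which this sketch does not attempt.

```python
import queue
import threading


def fake_simulation(on_observe, on_actuate, n_timesteps):
    """Hypothetical stand-in for a blocking run_energyplus() call: it owns
    the time loop and reaches our code only through the two callbacks."""
    temperature = 20.0
    for _ in range(n_timesteps):
        on_observe(temperature)   # where sensor values would be read
        setpoint = on_actuate()   # where actuators would be set
        temperature += 0.5 * (setpoint - temperature)  # toy zone dynamics


class CallbackBridgeEnv:
    """Gym-style wrapper (single episode) where step() and the callbacks
    meet through two blocking queues."""

    _FINISHED = object()  # sentinel placed on obs_queue when the run ends

    def __init__(self, n_timesteps=3, target=22.0):
        self.n_timesteps = n_timesteps
        self.target = target
        self.obs_queue = queue.Queue(maxsize=1)
        self.act_queue = queue.Queue(maxsize=1)
        self.last_obs = None

    def _on_observe(self, temperature):
        # Simulation thread: publish the observation for reset()/step().
        self.obs_queue.put(temperature)

    def _on_actuate(self):
        # Simulation thread: block until step() supplies an action.
        return self.act_queue.get()

    def _run(self):
        fake_simulation(self._on_observe, self._on_actuate, self.n_timesteps)
        self.obs_queue.put(self._FINISHED)

    def reset(self):
        # Start the run in the background and wait for its first observation.
        threading.Thread(target=self._run, daemon=True).start()
        self.last_obs = self.obs_queue.get()
        return self.last_obs

    def step(self, action):
        self.act_queue.put(action)   # releases the waiting actuate callback
        obs = self.obs_queue.get()   # next observation, or the end sentinel
        done = obs is self._FINISHED
        if not done:
            self.last_obs = obs
        reward = -abs(self.last_obs - self.target)  # placeholder reward
        return self.last_obs, reward, done


env = CallbackBridgeEnv(n_timesteps=3)
obs = env.reset()
done = False
trajectory = [obs]
while not done:
    obs, reward, done = env.step(24.0)  # constant setpoint as a dummy policy
    trajectory.append(obs)
print(trajectory)
```

With this layout the simulation thread sleeps inside the actuate callback until step() is called, and step() sleeps until the next observation arrives, so only one side runs at a time. When the run ends, step() repeats the last observation and sets done. A real environment would also need to handle warmup timesteps, repeated resets, and errors raised inside the simulation thread.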
