Whole MultiFamily Simulation in ResStock

I am trying to run a whole multifamily building simulation in ResStock by setting whole_sfa_or_mf_building_sim = true in measure.rb.

Initially, the simulation failed with this error:

pandas.errors.MergeError: Passing 'suffixes' which cause duplicate columns {'schedules_No Space Heating_x', 'schedules_hot_water_fixtures_x', 'schedules_Vacancy_x', 'schedules_hot_water_clothes_washer_x', 'schedules_plug_loads_other_x', 'schedules_lighting_interior_x', 'schedules_clothes_washer_x', 'schedules_ceiling_fan_x', 'schedules_occupants_x', 'schedules_plug_loads_tv_x', 'schedules_lighting_garage_x', 'schedules_No Space Cooling_x', 'schedules_Power Outage_x', 'schedules_cooking_range_x', 'schedules_clothes_dryer_x', 'schedules_hot_water_dishwasher_x', 'schedules_dishwasher_x'} is not allowed.

I traced this back to how schedules are merged in buildstockbatch/base.py. The original merging code was:

for schedules_filepath in schedules_filepaths:
    schedules = read_csv(schedules_filepath, dtype=np.float64)
    schedules.rename(columns=lambda x: f"schedules_{x}", inplace=True)
    schedules["TimeDST"] = tsdf["Time"]
    tsdf = tsdf.merge(schedules, how="left", on="TimeDST")

To fix the issue, I modified it as follows:

for schedules_filepath in schedules_filepaths:
    schedules = read_csv(schedules_filepath, dtype=np.float64)
    time_cols = {"TimeDST", "Time", "time", "TimeUTC"}
    schedules.rename(
        columns=lambda c: c if c in time_cols else f"schedules_{c}",
        inplace=True,
    )
    schedules["TimeDST"] = tsdf["TimeDST"]

    dup = [c for c in schedules.columns if c in tsdf.columns and c != "TimeDST"]

    if dup:
        logging.debug(f"Dropping duplicate schedule columns: {dup}")
        schedules.drop(columns=dup, inplace=True)

    tsdf = tsdf.merge(schedules, how="left", on="TimeDST")
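One caveat with the drop-duplicates approach above: any schedule column whose name already exists in tsdf is discarded, so only the first unit's schedules survive the loop. A minimal sketch of an alternative, assuming each schedules file corresponds to one dwelling unit, is to give every unit its own column prefix so nothing collides. The helper name and the "schedules_unit{N}_" prefix are hypothetical and not part of buildstockbatch; downstream code that expects the original "schedules_" names would need adjusting.

```python
from typing import Dict, Iterable

# Column names treated as timestamps and left unprefixed (same set as in the fix above).
TIME_COLS = frozenset({"TimeDST", "Time", "time", "TimeUTC"})


def unit_schedule_column_map(columns: Iterable[str], unit_index: int) -> Dict[str, str]:
    """Map each schedule column to a name that is unique to one unit.

    Hypothetical helper: the "schedules_unit{N}_" prefix is an illustrative
    naming choice, not something buildstockbatch defines.
    """
    return {
        col: col if col in TIME_COLS else f"schedules_unit{unit_index}_{col}"
        for col in columns
    }


# Two units whose schedule files share the same headers.
unit1 = unit_schedule_column_map(["occupants", "lighting_interior"], 1)
unit2 = unit_schedule_column_map(["occupants", "lighting_interior"], 2)

# The renamed columns of the two units do not intersect,
# so a merge would have no overlapping names to suffix.
overlap = set(unit1.values()) & set(unit2.values())
print(overlap)
```

Inside the loop, the mapping could be passed to schedules.rename(columns=...) with the unit index taken from enumerate(schedules_filepaths, start=1). This keeps every unit's schedules at the cost of more columns, which may add to the memory pressure described below.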

With this change, the simulation runs successfully when max_num_units_modeled in measure.rb is left at its default value of 5.

However, as soon as I increase max_num_units_modeled to any value greater than 5 (even 6), the simulation fails due to running out of RAM.

My questions:

1. Is there any straightforward way to run whole multifamily building simulations without modifying the base code?
2. If not, does my fix make sense? Or is there a better long-term fix for the issue?
3. Is there a known limitation with higher max_num_units_modeled values in ResStock (e.g., performance, memory constraints)?
4. Has anyone successfully run whole multifamily building simulations without hitting these memory problems?

Thank you!
