Question-and-Answer Resource for the Building Energy Modeling Community

I recommend using Python for this. It has many good tools for text processing. Because EPW is a quasi-CSV format, you can use some of the CSV tools to speed things up. You do have to split the header and the body into two datasets, though.

One option is to use the csv module to read the data and process it row by row.

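import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)  # each row is a list of strings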

You will have to determine which rows are header data and process them differently; in a standard EPW file the first eight lines are header records. You might choose to skip those rows in the loop. The hourly data can then be extracted by column and put into your preferred data type (e.g. numpy arrays).
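As a rough sketch of that header/body split, the snippet below reads all rows with the csv module and slices them in two. The helper name split_epw and the tiny in-memory sample are made up for illustration, and the code assumes the standard EPW layout: eight header lines, with dry bulb temperature in the seventh field of each hourly row. Check the Auxiliary Programs Documentation before relying on those positions.

```python
import csv
import io

EPW_HEADER_LINES = 8  # assumption: standard EPW layout


def split_epw(f, n_header=EPW_HEADER_LINES):
    """Return (header_rows, data_rows) from an open EPW-like file object."""
    rows = list(csv.reader(f))
    return rows[:n_header], rows[n_header:]


# Tiny stand-in for a real file: 8 fake header lines and 2 hourly rows.
sample = "\n".join(
    ["HEADER %d,x" % i for i in range(8)]
    + ["2020,1,1,1,0,?,5.0", "2020,1,1,2,0,?,6.5"]
) + "\n"

header, body = split_epw(io.StringIO(sample))

# Pull one column out of the hourly rows (index 6 assumed to be dry bulb).
dry_bulb = [float(row[6]) for row in body]
```

With a real file you would pass open('file.epw', newline='') instead of the StringIO object.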

Another option is to use NumPy's genfromtxt function.

import numpy as np
data = np.genfromtxt(s, names=['var1', 'var2', 'var3'], skip_header=num_to_skip, delimiter=",")  # s: file path or file-like object

genfromtxt is a more powerful tool that reads the data into a numpy array (a structured array when you pass in names). It lets you skip header and footer lines and pass in the column names and the delimiter. You can then access a column of the array by the name you passed in, which is very useful for processing.
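Here is a minimal sketch of the column-by-name access, using a small in-memory table rather than a real EPW file. The column names and the two skipped header lines are invented for the example; a real EPW file has far more columns and, in the standard layout, eight header lines.

```python
import io

import numpy as np

# Stand-in text: 2 header lines to skip, then 3 numeric rows.
text = (
    "some header line\n"
    "another header line\n"
    "2020,1,1.0\n"
    "2020,2,2.0\n"
    "2020,3,3.0\n"
)

data = np.genfromtxt(
    io.StringIO(text),
    names=["year", "month", "temp"],  # hypothetical column names
    skip_header=2,
    delimiter=",",
)

# Each column is available by the name passed in above.
temps = data["temp"]
mean_temp = float(temps.mean())
```

Note that genfromtxt converts every column to float by default, so text columns such as the EPW uncertainty flags would need dtype or usecols to be handled explicitly.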

The EnergyPlus EPW format is described in the Auxiliary Programs Documentation, so you can use that to figure out what the variables are for each column.

To write the data, it may be easiest to combine the whole data set into a single string variable with comma delimiters and newline characters, and then write that with the standard Python file I/O:

with open('file.epw', 'w') as f:
    f.write(data_string)
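One possible way to build data_string, sketched here with csv.writer writing into an in-memory buffer. The helper name rows_to_string and the sample rows are hypothetical; the rows are assumed to be lists of strings like those csv.reader produces.

```python
import csv
import io


def rows_to_string(header_rows, data_rows):
    """Join header and hourly rows into one comma-delimited string."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(header_rows)
    writer.writerows(data_rows)
    return buf.getvalue()


# Made-up rows standing in for a parsed EPW file.
header_rows = [["LOCATION", "Somewhere"]]
data_rows = [["2020", "1", "1", "1", "0", "?", "5.0"]]

data_string = rows_to_string(header_rows, data_rows)
```

The resulting data_string can then be passed to f.write as shown above.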