Question-and-Answer Resource for the Building Energy Modeling Community
6

Why does datapoint.zip exist?

asked 2018-08-27 15:22:20 -0600

Determinant

updated 2019-02-16 12:56:00 -0600

Is data_point.zip an exact copy of the run directory? If so, why are there both the run directory and a compressed version of the run directory?

Is there a way to disable the creation of data_point.zip? I'm not running anything on the cloud, but my large runs are taking forever to compress, adding 15+ minutes to run time.

Also, this issue may be exacerbated by this other issue, which does in my case affect Linux as well.


Comments

1

@Determinant while this is not a direct way to stop data_point.zip from being made, one option to make it smaller is a reporting measure that throws away many of the files from the run directory before the zip is made. If you are interested in that, I can add an answer with the relevant Ruby code. I don't think we currently have that measure published anywhere. We have used it on large analyses with lots of time series results.

David Goldwasser ( 2019-02-14 10:08:34 -0600 )

Thanks @David Goldwasser, yes, please post that. Maybe it can help out. Hi again @David Goldwasser, just pinging to see if that Reporting Measure can still be made available. Thanks.

Determinant ( 2019-02-14 15:37:25 -0600 )

@Determinant here is example code to put in a reporting measure to delete simulation files. You can repeat this code for any type of file you want to delete. Just make sure this is the last reporting measure in the workflow so you are not deleting files that other reporting measures might need.

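# Delete every .sql file one level above the current working directory,
# which is where this snippet expects the run directory to be.
Dir.glob("./../*.sql").each do |f|
  File.delete(f)
  # Report the deletion only if the file is really gone.
  runner.registerInfo("Deleted #{f} from the run directory.") unless File.exist?(f)
end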
David Goldwasser ( 2019-02-25 09:48:50 -0600 )
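Since the snippet above is meant to be repeated per file type, the repetition could be folded into one helper. This is only a sketch: `delete_run_files` is a hypothetical name, the run directory is passed in explicitly rather than assumed to be `./..`, and `runner` is assumed to be any object that responds to `registerInfo`, such as the runner handed to a measure's `run` method.

```ruby
# Hypothetical helper: delete files matching each glob pattern inside run_dir.
# `runner` only needs to respond to registerInfo.
# Returns the list of paths that were deleted.
def delete_run_files(run_dir, patterns, runner)
  deleted = []
  patterns.each do |pattern|
    Dir.glob(File.join(run_dir, pattern)).each do |path|
      next unless File.file?(path) # skip directories that match the pattern
      File.delete(path)
      runner.registerInfo("Deleted #{path} from the run directory.")
      deleted << path
    end
  end
  deleted
end
```

Inside a reporting measure this might be called as `delete_run_files("./..", ["*.sql", "*.eso"], runner)`. The patterns are examples only; which files are safe to delete depends on what your other measures and post-processing still need.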

Thanks @David Goldwasser, I'll work on converting this to a Reporting Measure.

Determinant ( 2019-02-27 12:35:28 -0600 )

2 Answers

5

answered 2019-02-16 11:22:30 -0600

data_point.zip supports the architecture of OpenStudio-Server, which is what PAT uses when running simulations both locally and on the cloud.

When you run OpenStudio-Server, you are actually running one "server" and one or more "workers." These may all physically be on one computer (like when you run PAT on your laptop) or split across many computers (like when you run on the cloud). The "server" has the brains and is what you interface with, usually via PAT. The "workers" simply accept jobs, run them, and send back the results, which live inside data_point.zip. See diagram below:

[Diagram: OpenStudio-Server architecture, one server and one or more workers]

When you are running on the cloud, this is obviously necessary: the results have to travel from the computer(s) running the workers to the computer running the server, and from there to your computer running PAT. When all of these processes run on your local computer, the extra copy seems redundant.

Unfortunately this is a foundational aspect of the design of OpenStudio-Server, so it can't be disabled. If you don't need all of the simulation results, you can use the approach suggested in a comment: delete the unnecessary files using a Reporting Measure (which is run before the data_point.zip is created). If you do need those simulation results, you are kind of out of luck.
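To decide which files are worth deleting, it may help to first see what dominates the run directory. The following is a minimal sketch in plain Ruby; `largest_files` is a hypothetical helper name, and nothing here is part of OpenStudio itself.

```ruby
# Hypothetical helper: list the largest files under run_dir, biggest first.
# Returns an array of [path, size_in_bytes] pairs, at most `limit` long.
def largest_files(run_dir, limit = 10)
  paths = Dir.glob(File.join(run_dir, "**", "*")).select { |p| File.file?(p) }
  paths.map { |p| [p, File.size(p)] }
       .sort_by { |_path, size| -size }
       .first(limit)
end
```

For example, `largest_files("path/to/run").each { |path, size| puts format("%10.1f MB  %s", size / 1024.0**2, path) }` prints a quick size report, where `path/to/run` is a placeholder for your own run directory.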


Comments

1

Thanks for the info. Does it make a difference that I use the CLI, not the PAT, for my parametrics?

Determinant ( 2019-02-16 12:52:46 -0600 )
1

answered 2018-09-08 02:49:45 -0600

Avi

I believe it has to do with cloud simulation. I found some clues here and here.


Comments

1

That's kind of what I suspected, but in that case it's using up disk space without serving a purpose, from what I can see. I run Radiance, so the file can be 300+ MB.

Determinant ( 2018-09-08 20:09:30 -0600 )

Hey Eric, you can also enable the "cleanup_data" option on the Radiance Measure to get rid of the individual window group matrices once the final merged results matrix has been created. This will reduce the size of the run directory and the data_point.zip files.

rpg777 ( 2019-02-19 13:38:23 -0600 )
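For CLI users, that option would typically be set in the workflow (OSW) file. Below is a rough sketch that builds such a step in Ruby and prints it as JSON. The measure directory name `radiance_measure` is a placeholder, and treating `cleanup_data` as a boolean argument is an assumption based on the comment above; check the measure's own argument list before relying on it.

```ruby
require "json"

# Build one OSW workflow step with the cleanup option switched on.
# "radiance_measure" is a placeholder directory name, and "cleanup_data" is
# assumed to be a boolean argument of the Radiance Measure.
def radiance_step_with_cleanup(measure_dir_name = "radiance_measure")
  {
    "measure_dir_name" => measure_dir_name,
    "arguments" => { "cleanup_data" => true }
  }
end

# Print a workflow fragment containing only this step.
workflow = { "steps" => [radiance_step_with_cleanup] }
puts JSON.pretty_generate(workflow)
```

The printed `steps` entry can be merged into an existing OSW file alongside the other measures in the workflow.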

Thanks Rob, I may have to write my own custom cleanup because I do a lot of post-processing with various files. But still, I think this issue is the biggest culprit. The SQLs and other big files being copied multiple times into a zip is a big bottleneck, even if I get the Reporting Measure talked about above.

Determinant ( 2019-02-24 15:10:48 -0600 )


Stats

Asked: 2018-08-27 15:22:20 -0600

Seen: 335 times

Last updated: Feb 16 '19