Question-and-Answer Resource for the Building Energy Modeling Community

How does one correct errors related to truncated images?

asked 9 years ago

updated 9 years ago

I was trying to understand the new photon mapping extension and tried rendering a bunch of views using the RAD program.

My settings were:

QUALITY=     H
DETAIL=      H
VARIABILITY= high
INDIRECT=    1
PENUMBRAS=   true

RESOLUTION=  1844 863
ZONE=        I 0   60        0   60   -1.5873   4.00685
EXPOSURE=    1

render=      -ad 10000

I ran into errors after the images finished rendering. A screenshot of my terminal is below. Error messages are in the lower part of the image:

[Image: screenshot of the terminal]

I tried running pfilt on the images and found that almost all the images that were unfinished were truncated in the vertical axis. Some of the images, all of which are supposed to be of the same dimensions, are below:

[Images: three of the truncated renderings]

I got "pfilt: warning - partial frame (70%)" on all of the affected images, with the percentage ranging from 50% to 96%. I googled 'pfilt partial error' but did not find anything except http://arch.xtr.jp/radiance/tips.htm. Is there anything I can do to avoid this error in the future?

The errors/warnings from the logfile are below:

rpict: 289913 rays, 6.90% after 0.001u 0.000s 15.973r hours on hammer12.hpc.rcc.psu.edu (PID 19734)
rpict: signal - Terminate
rpict: 1003391 rays, 31.03% after 0.004u 0.000s 15.976r hours on hammer12.hpc.rcc.psu.edu (PID 19649)
rpict: 464166 rays, 13.79% after 0.002u 0.000s 15.968r hours on hammer12.hpc.rcc.psu.edu (PID 19859)
rpict: signal - Hangup
rpict: 289913 rays, 6.90% after 0.001u 0.000s 15.973r hours on hammer12.hpc.rcc.psu.edu (PID 19734)
rpict: signal - Hangup
rpict: 1003391 rays, 31.03% after 0.004u 0.000s 15.976r hours on hammer12.hpc.rcc.psu.edu (PID 19649)
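
The progress and signal lines above follow a regular pattern, so they can be tallied with a short script. The following is a minimal sketch, assuming the line format is exactly as shown in this log (other rpict versions may report differently); the function name `summarize` and the field names are hypothetical, and reading `u`, `s` and `r` as user, system and real time is an assumption.

```python
import re
from collections import Counter

# Patterns written against the lines pasted in this log (assumption:
# other rpict versions may format their reports differently).
PROGRESS = re.compile(
    r"rpict: (?P<rays>\d+) rays, (?P<pct>[\d.]+)% after "
    r"(?P<user>[\d.]+)u (?P<sys>[\d.]+)s (?P<real>[\d.]+)r hours "
    r"on (?P<host>\S+) \(PID (?P<pid>\d+)\)"
)
SIGNAL = re.compile(r"rpict: signal - (?P<name>\w+)")


def summarize(log_text):
    """Return (latest progress per PID, count of each signal name)."""
    progress = {}
    signals = Counter()
    for line in log_text.splitlines():
        m = PROGRESS.search(line)
        if m:
            # Later reports for the same PID overwrite earlier ones.
            progress[int(m["pid"])] = {
                "percent": float(m["pct"]),
                "real_hours": float(m["real"]),
            }
            continue
        m = SIGNAL.search(line)
        if m:
            signals[m["name"]] += 1
    return progress, signals


# The log lines from this post, copied verbatim.
SAMPLE = """\
rpict: 289913 rays, 6.90% after 0.001u 0.000s 15.973r hours on hammer12.hpc.rcc.psu.edu (PID 19734)
rpict: signal - Terminate
rpict: 1003391 rays, 31.03% after 0.004u 0.000s 15.976r hours on hammer12.hpc.rcc.psu.edu (PID 19649)
rpict: 464166 rays, 13.79% after 0.002u 0.000s 15.968r hours on hammer12.hpc.rcc.psu.edu (PID 19859)
rpict: signal - Hangup
rpict: 289913 rays, 6.90% after 0.001u 0.000s 15.973r hours on hammer12.hpc.rcc.psu.edu (PID 19734)
rpict: signal - Hangup
rpict: 1003391 rays, 31.03% after 0.004u 0.000s 15.976r hours on hammer12.hpc.rcc.psu.edu (PID 19649)
"""

if __name__ == "__main__":
    progress, signals = summarize(SAMPLE)
    for pid, info in sorted(progress.items()):
        print(pid, info["percent"], info["real_hours"])
    print(dict(signals))
```

Grouping by PID makes it easier to see which processes stopped at which point and which signals appear in the log.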

Update (10/01): The issue, as everyone pointed out, was that my rpict renderings were getting killed. I ran the renderings with lower settings and everything worked out fine.


Comments

This problem stems from rpict terminating before finishing. Are there any errors in the log (logfiles/unamedscene.log)?

Andyrew ( 9 years ago )

Andy, I have updated my question with the error messages that I found. There were also a whole lot of warnings about non-planar vertices. I haven't pasted those above.

I had earlier rendered the same views with gensky and that worked out fine. Do the errors have anything to do with the fact that my luminaires were underneath a dielectric material? The surface normals of the dielectric were facing outwards, i.e. away from the luminaires.

Sarith ( 9 years ago )

Electric lights underwater will take longer to run than with a sky, but shouldn't be a problem. It looks like all the rpict processes were killed after 16 hours. You can continue the rendering where it left off with rpict's -ro option (see the man page).

It would be nice to figure out why everything stopped. What system are you running this on? Is it possible that there is a time limit on a running process?

Andyrew ( 9 years ago )

I wasn't aware of the -ro option. I have the directory structure intact, so I will try to continue the renderings from where they got truncated.

Radiance was compiled from source through the NREL repository on github. As far as I know there isn't a time limit on processes, but I could be wrong about that. I was asking about the materials because the error log shows that in one case only 6.9% of the rays were traced after nearly 16 hours.

Update: I was running multiple simulations at the time, so I got the systems mixed up in the last comment. The renderings were run on a Linux cluster; details are above.

Sarith ( 9 years ago )

1 Answer


answered 9 years ago

updated 9 years ago

Your rpict processes are being killed by the system, somehow. The only time rpict will report a signal is when it receives one, and it does what a process should do by politely exiting. Pfilt is then left to clean up after, and finds the images short of the length they claim. Simply re-running the rad program should finish up the renderings where they left off, provided the processes don't get killed again.

You need to find out why your processes are being terminated. One possibility is that the system is running low on memory, so it decides to kill its larger processes. Another possibility is that the file size limit is being exceeded during rendering, though this seems unlikely as your images are not that big. Check what the limits are with your shell's "ulimit" command ("limit" in C-shell). Also, make sure that huponexit is not set, and/or that you have run "disown -h PID" on the process, so that it won't get killed when the shell dies. (These are bash-only commands -- I don't think the C-shell kills its children so eagerly.)

Let us know what you find out!


Comments

Hi Greg, thanks for the insights. My ulimit is unlimited and huponexit is set to off. I have updated my post with the system configuration, shell options and file sizes. I don't have enough experience with Unix systems to decipher which option means what, so I have pasted everything in the post. I ran the renderings on a Linux cluster. The renderings that completed without errors finished within 6 or 7 hours. The remaining ones kept growing in size but did not complete. I think I might have gotten logged off once, but the processes kept running even after that.

Sarith ( 9 years ago )

I would suspect the cluster management in this case. Is it possible that a system administrator killed your rpict processes because they didn't recognize them or thought they were "hung?" The log file shows some were killed with a HANGUP signal and others with TERMINATE, which is difficult to explain with a single cause unless it was human.

GregWard ( 9 years ago )

It is possible that the processes were killed by a system admin. The cluster that I used is an interactive cluster that is meant for "short jobs", although I am not sure what time interval qualifies as a short job (there is no documentation). In the past I have never run into renderings that took more than 10 hours. I am going to make a clean break and try to render this again. Is there a way to debug or track a rendering process during runtime, i.e. to know if it will render at all? Secondly, is there a likelihood of something like total internal reflection happening and light rays not escaping?

Sarith ( 9 years ago )

The progress reports are the best way to make sure that the renderings are progressing, which they seemed to be doing until they got killed. Even in cases where total internal reflection prevents rays escaping, the tracing will hit some limit (either -lw or -lr; setting both to 0 will give an error) to prevent an infinite loop. Believe me, if there were any infinite loops in Radiance, people would be complaining about it!

GregWard ( 9 years ago )
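
A progress report can be turned into a rough time estimate by linear extrapolation. The sketch below assumes that the reported percentage grows in proportion to elapsed time, which is only an approximation because some parts of an image take longer to render than others; the function name `estimate_hours` is hypothetical.

```python
def estimate_hours(percent_done, elapsed_hours):
    """Extrapolate linearly from one progress report.

    Returns (total_hours, remaining_hours). Assumes a constant
    rendering rate, so treat the result as a rough guide only.
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    total = elapsed_hours * 100.0 / percent_done
    return total, total - elapsed_hours


if __name__ == "__main__":
    # Figures taken from the log in the question (PID 19734).
    total, remaining = estimate_hours(6.90, 15.973)
    print(f"total ~{total:.1f} h, remaining ~{remaining:.1f} h")
```

Under that assumption, the process that reported 6.90% after 15.973 hours would have needed on the order of 230 hours in total, far beyond the 6 or 7 hours of the renderings that completed.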

My experience is that interactive nodes on a cluster usually have hard limits on process running time. This means the system will automatically kill a process after X hours. Interactive nodes are really for file management and sometimes script debugging, so you really shouldn't run long simulations on an interactive node (assuming there are production nodes on your cluster). Your system administrator can probably help you to write a script to submit the simulation to the system scheduler if you don't know how.

Andyrew ( 9 years ago )


Stats

Asked: 9 years ago

Seen: 533 times

Last updated: Oct 01 '15