Question-and-Answer Resource for the Building Energy Modeling Community
3

How does one correct errors related to truncated images?

asked 2015-09-29 13:36:54 -0500

updated 2015-10-01 12:35:27 -0500

I was trying to understand the new photon mapping extension and tried rendering a bunch of views using the rad program.

My settings were:

QUALITY=     H
DETAIL=      H
VARIABILITY= high
INDIRECT=    1
PENUMBRAS=   true

RESOLUTION=  1844 863
ZONE=        I 0   60        0   60   -1.5873   4.00685
EXPOSURE=    1

render=      -ad 10000

I ran into errors after the images finished rendering. A screenshot of my terminal is below. Error messages are in the lower part of the image:

[Image: terminal screenshot]

I tried running pfilt on the images and found that almost all the images that were unfinished were truncated in the vertical axis. Some of the images, all of which are supposed to be of the same dimensions, are below:

[Images: three of the truncated renderings]

I got "pfilt: warning - partial frame (70%)" on all of these images, with the percentage ranging from 50% to 96%. I googled 'pfilt partial error' but did not find anything except http://arch.xtr.jp/radiance/tips.htm. Is there anything that I can do to avoid this error in the future?

The errors/warnings from the logfile are below:

rpict: 289913 rays, 6.90% after 0.001u 0.000s 15.973r hours on hammer12.hpc.rcc.psu.edu (PID 19734)
rpict: signal - Terminate
rpict: 1003391 rays, 31.03% after 0.004u 0.000s 15.976r hours on hammer12.hpc.rcc.psu.edu (PID 19649)
rpict: 464166 rays, 13.79% after 0.002u 0.000s 15.968r hours on hammer12.hpc.rcc.psu.edu (PID 19859)
rpict: signal - Hangup
rpict: 289913 rays, 6.90% after 0.001u 0.000s 15.973r hours on hammer12.hpc.rcc.psu.edu (PID 19734)
rpict: signal - Hangup
rpict: 1003391 rays, 31.03% after 0.004u 0.000s 15.976r hours on hammer12.hpc.rcc.psu.edu (PID 19649)

Update (10/01): The issue, as everyone pointed out, was with my rpict renderings getting killed. I ran the renderings with lower settings and everything worked out fine.


Comments

This problem stems from rpict terminating before finishing. Are there any errors in the log (logfiles/unamedscene.log)?

Andyrew (2015-09-29 18:51:01 -0500)

Andy, I have updated my question with the error messages that I found. There were a whole lot of warnings due to vertices being non-planar. I haven't pasted those above.

I had earlier rendered the same views with gensky and that worked out fine. Do the errors have anything to do with the fact that my luminaires were underneath a dielectric material? The surface normals of the dielectric were facing outwards, i.e. away from the luminaires.

Sarith (2015-09-29 19:52:06 -0500)
1

Electric lights underwater will take longer to run than with a sky, but shouldn't be a problem. It looks like all the rpict processes were killed after 16 hours. You can continue the rendering where it left off with rpict's -ro option (see the man page).

It would be nice to figure out why everything stopped. What system are you running this on? Is it possible that there is a time limit on a running process?

Andyrew (2015-09-29 20:08:44 -0500)

I wasn't aware of the -ro option. I have the directory structure intact, so I will try to resume the renderings from where they got truncated.

Radiance was compiled from source through the NREL repository on GitHub. As far as I know there isn't a time limit on processes, but I could be wrong about that. I was asking about the materials because the error log shows that in one case only 6.9% of the rays were traced after nearly 16 hours.

Update: I was running multiple simulations at the time, so I got the systems mixed up in the last comment. The renderings were run on a Linux cluster; details are above.

Sarith (2015-09-29 20:23:02 -0500)

1 Answer

4

answered 2015-09-30 11:20:08 -0500

updated 2015-09-30 11:21:14 -0500

Your rpict processes are being killed by the system, somehow. The only time rpict will report a signal is when it receives one, and it does what a process should do by politely exiting. Pfilt is then left to clean up after, and finds the images short of the length they claim. Simply re-running the rad program should finish up the renderings where they left off, provided the processes don't get killed again.

You need to find out why your processes are being terminated. One possibility is that the system is running low on memory and decides to kill its larger processes. Another possibility is that the file size limit is being exceeded during rendering, though this seems unlikely as your images are not that big. Check what the limits are with your shell's "ulimit" command ("limit" in C-shell). Also, make sure that huponexit is not set, and/or that you have run "disown -h PID" on the process, so that it won't get killed when the shell dies. (These are bash-only commands -- I don't think the C-shell kills its children so eagerly.)

Let us know what you find out!
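
For readers more comfortable in Python than in the shell, the limits that "ulimit" reports can also be inspected with the standard resource module (Unix only). This is a minimal sketch: the choice of which three limits to list and the names current_limits and describe are assumptions made for illustration. A process started from the same shell inherits that shell's limits, so the values should match what the shell reports.

```python
import resource

# Limits worth checking for a long-running renderer (an illustrative selection).
LIMITS = {
    "cpu_seconds": resource.RLIMIT_CPU,
    "file_size_bytes": resource.RLIMIT_FSIZE,
    "address_space_bytes": resource.RLIMIT_AS,
}


def describe(value):
    """Render a raw limit value the way a shell would ('unlimited' or a number)."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {name: (soft, hard)} for the limits listed above."""
    report = {}
    for name, which in LIMITS.items():
        soft, hard = resource.getrlimit(which)
        report[name] = (describe(soft), describe(hard))
    return report


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

If every entry prints as unlimited, per-process limits are probably not what is killing the renderings, and a policy enforced elsewhere on the machine becomes the more likely cause.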


Comments

Hi Greg, thanks for the insights. My ulimit is unlimited and huponexit is set to off. I have updated my post with system configuration, shell options as well as file sizes. I don't have enough experience with Unix systems to decipher which option means what, so I have pasted everything in the post. I ran the renderings on a Linux cluster. The renderings that finished without errors did so within 6 or 7 hours. The remaining ones kept growing in size but did not complete. I think I might have gotten logged off once, but the processes kept running even after I got logged off.

Sarith (2015-09-30 13:36:40 -0500)

I would suspect the cluster management in this case. Is it possible that a system administrator killed your rpict processes because they didn't recognize them or thought they were "hung?" The log file shows some were killed with a HANGUP signal and others with TERMINATE, which is difficult to explain with a single cause unless it was human.

GregWard (2015-09-30 13:59:23 -0500)
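
To illustrate why a signalled renderer leaves a short image behind, here is a toy sketch of a process that catches Hangup and Terminate, records a message, and stops after the current row. It is not Radiance's code: render, on_signal, install_handlers and the row callback are hypothetical names, and only Python's standard signal module is used (Unix only, since SIGHUP is not available on Windows; signal.raise_signal needs Python 3.8 or later).

```python
import signal

# Messages modelled on the log lines quoted in the question.
SIGNAL_NAMES = {signal.SIGHUP: "Hangup", signal.SIGTERM: "Terminate"}
received = []


def on_signal(signum, frame):
    """Record the signal instead of dying mid-row."""
    received.append("signal - " + SIGNAL_NAMES.get(signum, str(signum)))


def install_handlers():
    """Route Hangup and Terminate to on_signal."""
    for signum in SIGNAL_NAMES:
        signal.signal(signum, on_signal)


def render(total_rows, render_row):
    """Render rows until done or until a signal was received.

    Returns the number of rows actually finished, which may be fewer
    than total_rows (the "partial frame" case).
    """
    done = 0
    while done < total_rows and not received:
        render_row(done)
        done += 1
    return done


if __name__ == "__main__":
    install_handlers()

    def fake_row(index):
        # Stand-in for real work; simulates an outside kill during row 6.
        if index == 6:
            signal.raise_signal(signal.SIGTERM)

    finished = render(10, fake_row)
    print(received, f"{finished} of 10 rows written")
```

The output file of such a process still claims its full height in the header but holds fewer rows, which matches what pfilt reports as a partial frame.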

It is possible that the processes were killed by a system admin. The cluster that I used is an interactive cluster meant for "short jobs", although I am not sure what time interval qualifies as a short job (there is no documentation). In the past I have never run into renderings that took more than 10 hours. I am going to make a clean break and try to render this again. Is there a way to debug or track a rendering process during runtime, i.e. to know if it will render at all? Secondly, is there a likelihood of something like total internal reflection happening and light rays not escaping?

Sarith (2015-09-30 14:21:10 -0500)
1

The progress reports are the best way to make sure that the renderings are progressing, which they seemed to be doing until they got killed. Even in cases where total internal reflection prevents rays from escaping, the tracing will hit some limit (either -lw or -lr; setting both to 0 will give an error) to prevent an infinite loop. Believe me, if there were any infinite loops in Radiance, people would be complaining about it!

GregWard (2015-09-30 15:16:21 -0500)
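
As a rough way to read those progress reports, the sketch below parses the report lines quoted in the question and linearly extrapolates a total run time. The regular expression is an assumption based only on the log lines shown in this thread, the function names are hypothetical, and linear extrapolation is a crude guess because rendering cost varies across an image.

```python
import re

# Assumed format, taken only from the rpict report lines quoted in this thread.
PROGRESS = re.compile(
    r"rpict: (\d+) rays, ([\d.]+)% after "
    r"[\d.]+u [\d.]+s ([\d.]+)r hours .*\(PID (\d+)\)"
)


def parse_progress(line):
    """Return (pid, percent_done, real_hours), or None for non-report lines."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    _rays, percent, real_hours, pid = match.groups()
    return int(pid), float(percent), float(real_hours)


def estimate_total_hours(percent_done, real_hours):
    """Linear extrapolation of wall-clock hours to reach 100% (rough guess)."""
    if percent_done <= 0:
        return None
    return real_hours * 100.0 / percent_done


if __name__ == "__main__":
    log = [
        "rpict: 289913 rays, 6.90% after 0.001u 0.000s 15.973r hours "
        "on hammer12.hpc.rcc.psu.edu (PID 19734)",
        "rpict: signal - Terminate",
        "rpict: 1003391 rays, 31.03% after 0.004u 0.000s 15.976r hours "
        "on hammer12.hpc.rcc.psu.edu (PID 19649)",
    ]
    for line in log:
        report = parse_progress(line)
        if report is not None:
            pid, percent, hours = report
            total = estimate_total_hours(percent, hours)
            print(f"PID {pid}: {percent:.2f}% after {hours:.1f} h, ~{total:.0f} h total")
```

For the first report (6.90% after 15.973 hours) the extrapolation comes to about 231 hours, so under this crude model that view would not have finished in any reasonable time at the original settings even if it had not been killed.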
1

My experience is that interactive nodes on a cluster usually have hard limits on process running time. This means the system will automatically kill a process after X hours. Interactive nodes are really for file management and sometimes script debugging, so you really shouldn't run long simulations on an interactive node (assuming there are production nodes on your cluster). Your system administrator can probably help you write a script to submit the simulation to the system scheduler if you don't know how.

Andyrew (2015-09-30 15:22:27 -0500)

