Upgrade OpenStudio from 3.2.1 to 3.3.0
Hey everyone! I'm getting this error when I try to run a calibration simulation after upgrading from 3.2.1 to 3.3.0. Any ideas?
/opt/openstudio/server/app/lib/analysis_library/rgenoud.rb failed with voidEval failed: , /usr/local/lib/ruby/gems/2.7.0/gems/rserve-client-0.3.5/lib/rserve/connection.rb:178:in `void_eval'
complete error log:
/opt/openstudio/server/app/lib/analysis_library/rgenoud.rb failed with voidEval failed: , /usr/local/lib/ruby/gems/2.7.0/gems/rserve-client-0.3.5/lib/rserve/connection.rb:178:in `void_eval'
/usr/local/lib/ruby/gems/2.7.0/gems/rserve-simpler-0.0.6/lib/rserve/simpler.rb:74:in `command'
/opt/openstudio/server/app/lib/analysis_library/r/cluster.rb:83:in `start'
/opt/openstudio/server/app/lib/analysis_library/rgenoud.rb:224:in `perform'
/opt/openstudio/server/app/jobs/resque_jobs/run_analysis.rb:43:in `perform'
/usr/local/lib/ruby/gems/2.7.0/gems/resque-2.0.0/lib/resque/job.rb:168:in `perform'
/usr/local/lib/ruby/gems/2.7.0/gems/resque-2.0.0/lib/resque/worker.rb:308:in `perform'
/usr/local/lib/ruby/gems/2.7.0/gems/resque-2.0.0/lib/resque/worker.rb:897:in `block in perform_with_fork'
/usr/local/lib/ruby/gems/2.7.0/gems/resque-2.0.0/lib/resque/worker.rb:895:in `fork'
/usr/local/lib/ruby/gems/2.7.0/gems/resque-2.0.0/lib/resque/worker.rb:895:in `perform_with_fork'
/usr/local/lib/ruby/gems/2.7.0/gems/resque-2.0.0/lib/resque/worker.rb:264:in `work_one_job'
/usr/local/lib/ruby/gems/2.7.0/gems/resque-2.0.0/lib/resque/worker.rb:238:in `block in work'
/usr/local/lib/ruby/gems/2.7.0/gems/resque-2.0.0/lib/resque/worker.rb:235:in `loop'
/usr/local/lib/ruby/gems/2.7.0/gems/resque-2.0.0/lib/resque/worker.rb:235:in `work'
/usr/local/lib/ruby/gems/2.7.0/gems/resque-2.0.0/lib/resque/tasks.rb:20:in `block (2 levels) in <top (required)>'
/usr/local/lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `block in execute'
/usr/local/lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `each'
/usr/local/lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `execute'
/usr/local/lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:219:in `block in invoke_with_call_chain'
/usr/local/lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:199:in `synchronize'
/usr/local/lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:199:in `invoke_with_call_chain'
/usr/local/lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:188:in `invoke'
/usr/local/lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:160 ...
Are you using the meta-CLI? If so, did you upgrade to the meta-CLI included in PAT 3.3.0, which just came out yesterday? https://github.com/NREL/OpenStudio-PA...
hey David! so, I'm using openstudio_meta from Docker
@Julio Betta I was able to run the rgenoud algorithm on our 3.3.0, so it isn't a global issue with that algorithm. Try restarting the server and see if this goes away. If not, can you let me know if it is a local or AWS deployment, and can you share the failed analysis log?
we're using the latest version of openstudio-server-helm (v3.3.0) on azure. here's the link to the logs (https://www.dropbox.com/s/vvxs6zsgftd...) and osa (https://www.dropbox.com/s/drem25s9i89...)
thanks ;)
if restarting doesn't work, post the other logs (simulate_datapoint log, resque log from admin page).
the R log looks like the R cluster never starts up. are there worker nodes starting up? what does your helm configuration look like?
hey Brian! I recorded a quick video for you guys. This time I re-installed openstudio on a different server (GCP), but I got the same error... I forgot to mention that this simulation was working in v3.2.1
https://www.dropbox.com/s/gk7vvfgb56p...
this is what a successful R cluster start up looks like from the logs:
[1] "Current working directory is /mnt/openstudio"
max_queued_jobs: 42
[1] "Starting cluster..."
[1] "Number of Workers: 100"
[1] "max timeout is: 180"
[1] "R cluster startup time: 24.0866537094116"
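To compare a failing log against the sample above, a small script can pull out the same fields. This is only a sketch: the function names are made up for illustration, and treating the presence of the "R cluster startup time" line as the sign of a successful start is an assumption based on this one sample, not on the server code.

```python
import re

# Patterns are taken from the sample log lines above, nothing else.
FIELDS = {
    "max_queued_jobs": (r"max_queued_jobs:\s*(\d+)", int),
    "workers": (r"Number of Workers:\s*(\d+)", int),
    "startup_seconds": (r"R cluster startup time:\s*([\d.]+)", float),
}


def parse_cluster_log(text):
    """Return the fields found in the R log; missing fields are None."""
    result = {}
    for name, (pattern, convert) in FIELDS.items():
        match = re.search(pattern, text)
        result[name] = convert(match.group(1)) if match else None
    return result


def cluster_started(text):
    """Assumption: a startup-time line means the cluster came up."""
    return parse_cluster_log(text)["startup_seconds"] is not None
```

If `max_queued_jobs` and `workers` come back with different values, or `startup_seconds` is None, that is worth mentioning when posting the log.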
max_queued_jobs gets set to an ENV here https://github.com/NREL/OpenStudio-se...
and that ENV gets set here https://github.com/NREL/OpenStudio-se...
so why is your ENV for OS_SERVER_NUMBER_OF_WORKERS not being set? Are you running this on local hardware? If so, set that ENV to the number of workers you want and retry. If it's not running locally, who is hosting the server?
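A quick way to confirm whether the variable is actually visible inside the container is to read it from the process environment. The sketch below is not the server's code; `configured_workers` is a hypothetical helper, and the only name taken from this thread is OS_SERVER_NUMBER_OF_WORKERS.

```python
import os

ENV_NAME = "OS_SERVER_NUMBER_OF_WORKERS"  # name as quoted in this thread


def configured_workers(env=None):
    """Return the worker count from the environment, or None if unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get(ENV_NAME, "").strip()
    # Accept only positive whole numbers; anything else counts as "not set".
    if not raw.isdigit() or int(raw) < 1:
        return None
    return int(raw)


if __name__ == "__main__":
    workers = configured_workers()
    if workers is None:
        print(f"{ENV_NAME} is not set to a positive integer")
    else:
        print(f"{ENV_NAME}={workers}")
```

Running it in the same pod or container as the server shows what that process sees, which may differ from what the shell on the host reports.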
I'm running OS in a remote server (google cloud), and I followed the instructions from openstudio-server-helm (https://github.com/NREL/openstudio-se...). I didn't change any values... I did notice that OS_SERVER_NUMBER_OF_WORKERS isn't defined by values.yml. https://github.com/NREL/openstudio-se...
if you can, see if this PR from Tim helps https://github.com/NREL/openstudio-se...
I pulled the latest version of openstudio-server-helm and now OS_SERVER_NUMBER_OF_WORKERS is defined correctly. it's still throwing the same error though. this is what the R cluster startup looks like now.
... max_queued_jobs looks strange. 7 is the number of workers... any idea what 7[1] means?
so 7 is the default number of workers when it doesn't get set. you can set the size of the R cluster in the OSA like here: https://github.com/NREL/OpenStudio-se...
it should be the same size as the number of workers.
what error are you seeing now? can you post the logs?
it's the same error from the original message... about the osa: this is my "algorithm" attribute: https://pastebin.com/a9kpjsWL
I set "max_queued_jobs: 100", which is associated with "[1] Number of Workers: 100" in the R cluster log. so you're saying that OS_SERVER_NUMBER_OF_WORKERS should also be 100, right?
that was it! I set "max_queued_jobs: 7" in the osa to match the 7 workers and it works! thanks Brian and David, you guys rock!!!
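For anyone hitting the same thing: the fix came down to making max_queued_jobs in the OSA equal to the number of workers. A rough pre-flight check could look like the sketch below. The `check_queue_size` helper is hypothetical, and the key path analysis → problem → algorithm is an assumption about the OSA layout, so adjust it to wherever the "algorithm" attribute sits in your file.

```python
import json


def check_queue_size(osa_text, workers):
    """Compare max_queued_jobs in an OSA JSON string with the worker count.

    Returns (matches, max_queued_jobs). The key path below is assumed,
    not taken from the OpenStudio Server source.
    """
    osa = json.loads(osa_text)
    algorithm = osa.get("analysis", {}).get("problem", {}).get("algorithm", {})
    queued = algorithm.get("max_queued_jobs")
    return queued == workers, queued


if __name__ == "__main__":
    # Made-up minimal OSA fragment, for illustration only.
    example = '{"analysis": {"problem": {"algorithm": {"max_queued_jobs": 100}}}}'
    ok, queued = check_queue_size(example, workers=7)
    print(f"max_queued_jobs={queued}, matches workers: {ok}")
```

In the case from this thread, the OSA asked for 100 queued jobs while the cluster had 7 workers, so a check like this would have flagged the mismatch before submitting the analysis.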