| Author | Message |
|
|
|
We are still in the process of fixing all the bugs. This version should show a large improvement. |
|
|
|
|
|
All 1.90 with errors like
1570856
1570836
ERROR: Unable to open weights. Neither ./dslf_weights.wts nor dslf_weights.wts nor minirosetta_database\scoring/weights/dslf_weights.wts exist
ERROR:: Exit from: ..\..\src\core\scoring\ScoreFunctionFactory.cc line: 177
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish |
|
|
|
|
All 1.90 with errors like
1570856
1570836
ERROR: Unable to open weights. Neither ./dslf_weights.wts nor dslf_weights.wts nor minirosetta_database\scoring/weights/dslf_weights.wts exist
ERROR:: Exit from: ..\..\src\core\scoring\ScoreFunctionFactory.cc line: 177
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
No improvement that I can see. I have 3 version 1.87 WUs and 5 version 1.90 WUs, all with the same error as quoted above ("Unable to open weights. Neither ...").
This 1.87 WU
This 1.87 WU
And this 1.87 WU
Also this 1.90 WU
This 1.90 WU
This 1.90 WU
This 1.90 WU
And this 1.90 WU
So Yifan, looks like a bit more bug searching to do.
Conan.
 |
|
|
|
|
|
Same here: lr13_seq_score12_ss2.5_rlbd_1py9_IGNORE_THE_REST_DECOY_12034_3_1
ERROR: Unable to open weights. Neither ./dslf_weights.wts nor dslf_weights.wts nor minirosetta_database/scoring/weights/dslf_weights.wts exist
ERROR:: Exit from: src/core/scoring/ScoreFunctionFactory.cc line: 177
BOINC:: Error reading and gzipping output datafile: default.out
|
|
|
|
|
|
Same error:
lr10_seq_score12_ss2.5_rlbd_1ptq_IGNORE_THE_REST_DECOY_12033_3_1
ERROR: Unable to open weights. Neither ./dslf_weights.wts nor dslf_weights.wts nor minirosetta_database\scoring/weights/dslf_weights.wts exist
ERROR:: Exit from: ..\..\src\core\scoring\ScoreFunctionFactory.cc line: 177
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
|
|
|
|
|
|
All recent workunits are failing quickly on Mac on file errors
1572293
1572221
(plus some others)
Sample output:
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1a2p_A_30_S_ddg_predictions_0.3_1_8_local_min_test_sc_min_input_73009__12047_1_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
|
|
|
|
|
All recent workunits are failing quickly on Mac on file errors
1572293
1572221
(plus some others)
Sample output:
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1a2p_A_30_S_ddg_predictions_0.3_1_8_local_min_test_sc_min_input_73009__12047_1_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
I have the same error in these two tasks (Linux):
1572053
1572278
All these tasks finish after exactly 1201 seconds!
reached end of minirosetta::main()
======================================================
DONE :: 1 starting structures 1201 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1ten_L_61_A_ddg_predictions_0.2_10_8_local_min_test_sc_min_input_73009__12046_1_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
AdeB |
|
|
|
|
|
I apologize for posting this here as it is not an app error but I was afraid anywhere else it would languish perhaps past the point where relevant log files are available.
I was the second to receive this workunit. The website lists the outcome as "client detached" at 10:56:25 UTC.
Here are the messages from my client. Times are UTC -4 and a few lines about other projects have been left out.
Sat Aug 1 05:56:19 2009|ralph@home|Sending scheduler request: To fetch work. Requesting 3298 seconds of work, reporting 0 completed tasks
Sat Aug 1 06:01:35 2009|ralph@home|Scheduler request failed: Timeout was reached
Sat Aug 1 06:02:35 2009|ralph@home|Sending scheduler request: To fetch work. Requesting 3299 seconds of work, reporting 0 completed tasks
Sat Aug 1 06:02:40 2009|ralph@home|Scheduler request succeeded: got 1 new tasks
Sat Aug 1 06:02:42 2009|ralph@home|Started download of 0.5_0.5_10.relaxed_input.run3_73109.5azu.pdb
Sat Aug 1 06:02:42 2009|ralph@home|Started download of 0.5_0.5_10.relaxed_input.run3_73109.5azu_V_60_G.mutfile
Sat Aug 1 06:02:44 2009|ralph@home|Finished download of 0.5_0.5_10.relaxed_input.run3_73109.5azu_V_60_G.mutfile
Sat Aug 1 06:02:44 2009|ralph@home|Started download of 0.5_0.5_10.relaxed_input.run3_73109.5azu_V_60_G.cst
Sat Aug 1 06:02:47 2009|ralph@home|Finished download of 0.5_0.5_10.relaxed_input.run3_73109.5azu.pdb
Sat Aug 1 06:02:47 2009|ralph@home|Finished download of 0.5_0.5_10.relaxed_input.run3_73109.5azu_V_60_G.cst
Sat Aug 1 06:02:47 2009|ralph@home|Started download of soft_rep_design_mod.wts
Sat Aug 1 06:02:49 2009|ralph@home|Finished download of soft_rep_design_mod.wts
Sat Aug 1 06:55:37 2009|ralph@home|Starting 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1
Sat Aug 1 06:55:39 2009|ralph@home|Starting task 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1 using minirosetta version 190
Sat Aug 1 06:55:40 2009|boincsimap|Started upload of 9080101.093552_1_0
Sat Aug 1 06:55:48 2009|ralph@home|Computation for task 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1 finished
Sat Aug 1 06:55:48 2009|ralph@home|Output file 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1_0 for task 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1 absent
Sat Aug 1 06:56:49 2009|ralph@home|Sending scheduler request: To fetch work. Requesting 3305 seconds of work, reporting 1 completed tasks
Sat Aug 1 06:56:54 2009|ralph@home|Scheduler request succeeded: got 0 new tasks
You can see there is no entry matching the time the server marked the client detached, and it was obviously not detached, as my client subsequently attempted to report this result and requested more work, eventually receiving another WU.
I am attached to and regularly crunch WUs for nine projects. I have had a number of these spontaneous detachments on one other project. It appears that these detachments are related to hanging scheduler requests as described in this thread at SIMAP. I was hoping you could check your logs and confirm whether the same has occurred here.
Snags
Mac OS 10.5.7, BOINC 6.2.18
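For anyone who wants to run the same check on their own client messages, here is a rough sketch in Python. It assumes the `timestamp|project|message` layout shown in the log above and the UTC -4 offset stated in the post; the function names and the 10-second window are made up for illustration and are not part of BOINC.

```python
from datetime import datetime, timedelta

# Assumption: client clock is UTC-4, as stated in the post.
LOCAL_UTC_OFFSET = timedelta(hours=-4)

def parse_line(line):
    """Split 'Sat Aug 1 06:02:40 2009|project|message' into (utc_time, project, message)."""
    stamp, project, message = line.split("|", 2)
    local = datetime.strptime(stamp.strip(), "%a %b %d %H:%M:%S %Y")
    # Local time minus a negative offset gives UTC.
    return local - LOCAL_UTC_OFFSET, project, message

def entries_near(lines, target_utc, window_seconds=10):
    """Return the parsed entries within window_seconds of target_utc."""
    hits = []
    for line in lines:
        utc, project, message = parse_line(line)
        if abs((utc - target_utc).total_seconds()) <= window_seconds:
            hits.append((utc, project, message))
    return hits

# Two lines copied from the log above, either side of the server's timestamp.
sample = [
    "Sat Aug 1 06:55:48 2009|ralph@home|Computation for task 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1 finished",
    "Sat Aug 1 06:56:49 2009|ralph@home|Sending scheduler request: To fetch work. Requesting 3305 seconds of work, reporting 1 completed tasks",
]

# Time the website gave for "client detached".
detached_at = datetime(2009, 8, 1, 10, 56, 25)
matches = entries_near(sample, detached_at)
```

With these two lines, the nearest client entries fall at 10:55:48 and 10:56:49 UTC, so nothing lands within 10 seconds of 10:56:25 and `matches` comes back empty. That is consistent with the observation that the client log has no entry at the detach time.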
|
|
|
|
|
|
WU received after the detachment referenced above lr8_rama_map_all_ss_iter08_rlbn_1acf_IGNORE_THE_REST_NATIVE_12056_1_1
Both crunchers failed with:
process exited with code 1 (0x1, -255)
Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: src/protocols/relax/ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
Snags |
|
|
|
|
|
1573375
lr8_rama_map_all_ss_iter08_rlbn_1bkr_IGNORE_THE_REST_NATIVE_12056_1_1
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
...
...
...
Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: ..\..\src\protocols\relax\ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
</stderr_txt>
]]>
|
|
|
|
|
|
Add this one: 1573370 to the list.
1573375
lr8_rama_map_all_ss_iter08_rlbn_1bkr_IGNORE_THE_REST_NATIVE_12056_1_1
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
...
...
...
Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: ..\..\src\protocols\relax\ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
</stderr_txt>
]]>
|
|
|
|
|
WU received after the detachment referenced above lr8_rama_map_all_ss_iter08_rlbn_1acf_IGNORE_THE_REST_NATIVE_12056_1_1
Both crunchers failed with:
process exited with code 1 (0x1, -255)
Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: src/protocols/relax/ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
Snags
Same error on 1573362
1573269
1573030
Thanks
Conan
 |
|
|
|
|
Same error:
lr10_seq_score12_ss2.5_rlbd_1ptq_IGNORE_THE_REST_DECOY_12033_3_1
ERROR: Unable to open weights. Neither ./dslf_weights.wts nor dslf_weights.wts nor minirosetta_database\scoring/weights/dslf_weights.wts exist
ERROR:: Exit from: ..\..\src\core\scoring\ScoreFunctionFactory.cc line: 177
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
Another one of these errors 1573348
Thanks
Conan
 |
|
|
|
|
|
Yet another
ERROR: Unable to open weights. Neither ./dslf_weights.wts nor dslf_weights.wts nor minirosetta_database\scoring/weights/dslf_weights.wts exist
ERROR:: Exit from: ..\..\src\core\scoring\ScoreFunctionFactory.cc line: 177
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
Same for me and the other crunchers of these WUs.
Task 1
Task 2
Task 3
Others are OK. |
|
|
|
|
|
Another one of the line 544 errors.
1573827
lr8_rama_map_all_ss_iter08_rlbn_1tif_IGNORE_THE_REST_NATIVE_12056_1_1
Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: ..\..\src\protocols\relax\ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
</stderr_txt>
|
|
|
|
|
|
1574093
1u5p_I_164_A_ddg_predictions_8709_run6_WT.1u5p_I_164_A_.out_12063_1_0
This WU was stuck in the initialising phase for 1 hour 50 minutes and when BOINC switched to another task the Ralph process retained half of my available memory as committed (I use the option to NOT keep suspended tasks in memory). I restarted the system and let it run overnight.
Looking at it this morning I have had a massive memory leak somewhere and Ralph is stuck with the message "waiting for memory". I tried restarting to clear the memory again, but the Ralph WU retained the "waiting for memory" status and refused to start even when I suspended all other tasks.
I aborted the Ralph task as I don't believe it would have reported back on its own. |
|
|
|
|
|
Have had two WUs so far that completed successfully but only took 13 seconds for 1 Decoy.
Also had this one error out WU 1574265 with this error
ERROR:: Exit from: src/core/pack/task/ResfileReader.cc line: 987
BOINC:: Error reading and gzipping output datafile: default.out
Although we received no notification from the Ralph project or the Rosetta project about why this project was down for 4 days and also nothing about it when it came back up, I am glad to see it running again.
Conan.
 |
|
|
|
|
|
The last two work units, and one I am currently running, show 0% progress in the BOINC progress column, even after running for an hour. Judging by the finished work units, they are not listed as compute errors, but I'm not sure what the problem is. I guess that is something to fix on Ralph's end. |
|
|
|
|
|
4lyz_I_55_A_ddg_predictions_lm_rel0.5_2_8_8709_run12_MUT.4lyz_I_55_A_.out_12067_1_0 was still initializing when I opened the graphics window after 32 minutes. I quit and restarted BOINC. When I checked BOINC again several hours later this WU had completed successfully and I'd picked up another WU:
5azu_V_95_A_ddg_predictions_lm_0.5_0.5_10_8709_run9_MUT.5azu_V_95_A_.out_12065_1 which failed in less than 5 seconds with:
Exit status 1 (0x1)
Setting up graphics native ...
ERROR:: Exit from: src/core/pack/task/ResfileReader.cc line: 987
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
|
|
|
|
|
|
5azu_I_20_T_ddg_predictions_lm_0.5_0.8_10_8709_run10_MUT.5azu_I_20_T_.out_12066_1_1
...
Setting up checkpointing ...
Setting up graphics native ...
ERROR:: Exit from: src/core/pack/task/ResfileReader.cc line: 987
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
AdeB
|
|
|
|
|
The last two work units, and one I am currently running, show 0% progress in the BOINC progress column, even after running for an hour. Judging by the finished work units, they are not listed as compute errors, but I'm not sure what the problem is. I guess that is something to fix on Ralph's end.
I also have a current v1.90 WU (type 5azu_V_95_A_ddg) that has run for 45 minutes with no progress showing.
As I have no graphics, I have no idea if it is actually doing anything.
 |
|
|
|
|
|
Big task?
1574065
<message>
Maximum memory exceeded
</message>
AdeB |
|
|
|
|
Have had two WUs so far that completed successfully but only took 13 seconds for 1 Decoy.
Also had this one error out WU 1574265 with this error
ERROR:: Exit from: src/core/pack/task/ResfileReader.cc line: 987
BOINC:: Error reading and gzipping output datafile: default.out
Although we received no notification from the Ralph project or the Rosetta project about why this project was down for 4 days and also nothing about it when it came back up, I am glad to see it running again.
Conan.
Received this same error on latest two results, again running for seconds only before crashing
WU 1574605
WU 1574606
Thanks
Conan.
 |
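Since several different failures are mixed together in this thread, here is a small Python sketch for sorting pasted stderr output by failure type. The search strings are copied from the reports above; the labels and function names are made up for illustration, and the list only covers what has been posted here, so treat it as a starting point rather than a complete catalogue.

```python
from collections import Counter

# Search strings copied from the reports in this thread; labels are hypothetical.
SIGNATURES = {
    "missing weights file": "Unable to open weights",
    "native pose required": "Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords",
    "resfile reader exit": "ResfileReader.cc line: 987",
    "output file transfer error": "<file_xfer_error>",
    "memory limit": "Maximum memory exceeded",
}

def classify(stderr_text):
    """Return the labels whose search string appears in the text."""
    labels = [label for label, needle in SIGNATURES.items() if needle in stderr_text]
    return labels or ["unrecognised"]

def tally(reports):
    """Count how often each failure type appears across several reports."""
    counts = Counter()
    for text in reports:
        counts.update(classify(text))
    return counts

# Two excerpts taken from posts above.
reports = [
    "ERROR:: Exit from: src/core/pack/task/ResfileReader.cc line: 987\n"
    "BOINC:: Error reading and gzipping output datafile: default.out",
    "<message>\nMaximum memory exceeded\n</message>",
]
summary = tally(reports)
```

Matching on plain substrings keeps it independent of the path separators, which differ between the Windows and Linux/Mac reports.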
|
|