Posts by Snagletooth

61) Message boards : RALPH@home bug list : Bug reports for 5.96 (Message 3871)
Posted 9 Apr 2008 by Snagletooth
Post:
Another 161 error for CAPRI_15_t036_1gh10_1_IGNORE_THE_RESTt036_1_t035.template.pdb_3453_3



stderr out

<core_client_version>5.10.34</core_client_version>
<![CDATA[
<stderr_txt>
Rosetta@home Macintosh Stack Size checker.
Original size: 0.
Maximum size: 8388608.
RLIM_INFINITY 0
# cpu_run_time_pref: 7200
# random seed: 1418100
# cpu_run_time_pref: 14400
======================================================
DONE :: 1 starting structures 29526.5 cpu seconds
This process generated 1 decoys from 1 attempts
0 starting pdbs were skipped
======================================================


BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...

</stderr_txt>
<message>
<file_xfer_error>
<file_name>CAPRI_15_t036_1gh10_1_IGNORE_THE_RESTt036_1_t035.template.pdb_3453_3_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
]]>

You should also know that it ran straight through, which I take to mean there is no checkpoint. Is this right? Eight hours is a very long time to go without checkpointing.
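
For anyone comparing several of these reports, the relevant numbers can be pulled out of the stderr text with a short script. This is only a sketch in Python: the function name `summarize_stderr` and the trimmed sample string are made up for illustration, and the patterns are based solely on the output pasted above, not on any documented log format.

```python
import re

def summarize_stderr(text):
    """Extract run-time preferences, CPU seconds, and the transfer error code.

    Hypothetical helper; the patterns only match lines shaped like the
    stderr output quoted in this post.
    """
    prefs = [int(m) for m in re.findall(r"# cpu_run_time_pref:\s*(\d+)", text)]
    cpu = re.search(r"([\d.]+) cpu seconds", text)
    err = re.search(r"<error_code>(-?\d+)</error_code>", text)
    return {
        "run_time_prefs": prefs,
        "cpu_seconds": float(cpu.group(1)) if cpu else None,
        "error_code": int(err.group(1)) if err else None,
    }

# Trimmed copy of the lines that matter from the report above.
sample = """# cpu_run_time_pref: 7200
# cpu_run_time_pref: 14400
DONE :: 1 starting structures 29526.5 cpu seconds
<error_code>-161</error_code>
"""

info = summarize_stderr(sample)
hours = info["cpu_seconds"] / 3600
print(info, round(hours, 1))
```

For this report that works out to 29526.5 cpu seconds, about 8.2 hours, against a final preference of 14400 seconds (4 hours).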
62) Message boards : RALPH@home bug list : Bug reports for 5.96 (Message 3851)
Posted 8 Apr 2008 by Snagletooth
Post:
wu CAPRI_15_t036_1gh10_1_IGNORE_THE_RESTt036_1_t035.template.pdb_3453_7_1 failed twice with the same error for both crunchers

stderr out

<core_client_version>5.10.34</core_client_version>
<![CDATA[
<stderr_txt>
Rosetta@home Macintosh Stack Size checker.
Original size: 0.
Maximum size: 8388608.
RLIM_INFINITY 0
# cpu_run_time_pref: 7200
# random seed: 1418096
======================================================
DONE :: 1 starting structures 26657.5 cpu seconds
This process generated 1 decoys from 1 attempts
0 starting pdbs were skipped
======================================================


BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...

</stderr_txt>
<message>
<file_xfer_error>
<file_name>CAPRI_15_t036_1gh10_1_IGNORE_THE_RESTt036_1_t035.template.pdb_3453_7_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
]]>


Edit: I'm working on another one of these right now, which failed for the previous cruncher with the same error. Is there any need to attempt this one, or should I abort it?

Snags
63) Message boards : Feedback : Run time defaults (Message 3739)
Posted 14 Feb 2008 by Snagletooth
Post:
I had the impression on one occasion that a second model actually took less time than the first, but I didn't jot down any numbers at the time and I might well have been mistaken. If not, it might point to why the project would like to see more than one model completed for each wu here on ralph. Only someone from the project can say for certain whether there is any value in running more than one model.

What prompted this thread, though, was the observation that more of the wus are taking longer to complete that first model. Once a wu goes past your runtime preference, BOINC no longer has any idea how long it will take and the "time to completion" no longer updates, at which point the wu appears stuck. I've noticed a good many reports, both here and on rosetta@home, of runs ended by the watchdog and of folks aborting stuck wus. In some cases folks have given enough information to indicate that they are giving up on the wu before the watchdog would have, which seems a waste both here and on r@h.

Snags
64) Message boards : Feedback : Run time defaults (Message 3704)
Posted 10 Feb 2008 by Snagletooth
Post:
I started to put this in the bug thread since I know that thread gets read, but since this is a concern rather than a bug report I'll post here and hope someone notices. BAF5__BOINC_SYMM_FOLD_AND_DOCK_RELAX_ONLY-BAF5_-lowres_dock_-dock_2983__3222_1 took over 26 hours to complete a single model. I opened the graphics window every now and then to take note of the step # and make sure it was making progress, and when it looked like it might not be done in 24 hours (4x my runtime pref of 6 hours) I increased my runtime preference to 8 hours and updated in hopes of actually completing a model. That is not something one could expect to happen on Rosetta. I know my older Mac takes a good bit longer than most, but I was also under the impression that most folks have a shorter runtime preference. The concerns, then, are: one, folks looking but not looking closely enough may abort tasks they mistakenly think are stuck; and two, if most folks can't complete a single model after many hours of crunching, they are also unable to contribute to the science. I would think this would be frustrating both for them and the project, and a great waste of resources. Even if they manage to complete a model by going way over their runtime pref, they may be in danger of missing the deadline, especially if they run other projects. Not detrimental to the science perhaps, but inefficient.
When these tasks are sent out by the main project, does the scheduler limit them to faster computers with certain minimum runtimes, or is it the luck of the draw? If it is the luck of the draw, would an increase in the default and minimum selectable runtimes be in order? I realize most tasks probably aren't hurt by one or two hour runtimes, but what are the cons if they are made to run longer?
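
The arithmetic behind the runtime change can be written out as a small Python sketch. The 4x multiple is simply the figure I used above, not a confirmed BOINC or Rosetta setting, and the names `WATCHDOG_MULTIPLE`, `watchdog_limit_hours`, and `finishes_before_cutoff` are hypothetical.

```python
# Assumption taken from the post: the cutoff is 4x the runtime preference.
WATCHDOG_MULTIPLE = 4

def watchdog_limit_hours(run_time_pref_hours, multiple=WATCHDOG_MULTIPLE):
    """Hours of crunching allowed before the assumed cutoff."""
    return run_time_pref_hours * multiple

def finishes_before_cutoff(model_hours, run_time_pref_hours):
    """True if a model of the given length fits inside the assumed cutoff."""
    return model_hours <= watchdog_limit_hours(run_time_pref_hours)

# A 26-hour model against the 6-hour and 8-hour preferences from the post.
for pref in (6, 8):
    print(pref, watchdog_limit_hours(pref), finishes_before_cutoff(26, pref))
```

Under that assumption a 26-hour model overruns the 24-hour limit of a 6-hour preference but fits within the 32-hour limit of an 8-hour preference.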

I realize my perception may be warped by my older Mac and by lack of data, and thus may not be worth two cents, but hey, I'm offering it for free :)

Snags
65) Message boards : RALPH@home bug list : Bug reports for version 5.93 (Message 3612)
Posted 14 Jan 2008 by Snagletooth
Post:
Another "161" error for trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_53_1

workunit 649520 has now been sent to a third cruncher


<core_client_version>5.10.20</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 36000
# random seed: 1553839
# cpu_run_time_pref: 36000
======================================================
DONE :: 1 starting structures 35646.5 cpu seconds
This process generated 6 decoys from 6 attempts
======================================================


BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...

</stderr_txt>
<message>
<file_xfer_error>
<file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_53_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
]]>


66) Message boards : RALPH@home bug list : Bug reports for 5.78 (Message 3346)
Posted 7 Sep 2007 by Snagletooth
Post:
We just got the ability to queue stuff to ralph so you'll start seeing some jobs. By the way, the first round that I sent out had a silly error (which I just fixed); sorry about that. Looking forward to getting ralph and rosetta running smoothly again!



Does the "silly error" have anything to do with this message:

WARNING! Not sure non-ideal rotamers are compatible with symmetry yet...

from 556979?

Snags
67) Message boards : RALPH@home bug list : Bug reports for 5.65 (Message 3125)
Posted 22 May 2007 by Snagletooth
Post:
unrecoverable error

522925





©2024 University of Washington
http://www.bakerlab.org