Bug reports for Ralph 5.02

Message boards : RALPH@home bug list : Bug reports for Ralph 5.02

Rhiju
Volunteer moderator
Project developer
Project scientist

Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 1289 - Posted: 22 Apr 2006, 5:20:20 UTC

We're testing some new features (see news on main page).
Please pay special attention to jobs that appear stuck or appear to be taking too long! We're hoping a new watchdog thread will catch them.
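
For readers wondering what a watchdog thread does in general: the sketch below is a rough illustration only, not the Rosetta code. It assumes the worker reports a score from time to time and that the watchdog flags the run once the score has not changed for a set period, which matches the "Rosetta score stayed the same too long" message seen in this thread. The class name, method names and timings are all made up for the example.

```python
import threading
import time


class ScoreWatchdog:
    """Flags a run as stuck when its score stops changing for too long."""

    def __init__(self, max_stall_seconds: float, poll_seconds: float = 0.01):
        self.max_stall_seconds = max_stall_seconds
        self.poll_seconds = poll_seconds
        self.killed = threading.Event()
        self._done = threading.Event()
        self._lock = threading.Lock()
        self._score = None
        self._last_change = time.monotonic()
        self._thread = threading.Thread(target=self._watch, daemon=True)

    def start(self) -> None:
        self._last_change = time.monotonic()
        self._thread.start()

    def report_score(self, score: float) -> None:
        """Called by the worker; only a changed score resets the stall timer."""
        with self._lock:
            if score != self._score:
                self._score = score
                self._last_change = time.monotonic()

    def stop(self) -> None:
        self._done.set()
        self._thread.join()

    def _watch(self) -> None:
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._done.wait(self.poll_seconds):
            with self._lock:
                stalled_for = time.monotonic() - self._last_change
            if stalled_for > self.max_stall_seconds:
                self.killed.set()  # a real watchdog would end the process here
                return


def run_demo() -> bool:
    """Simulate a worker whose score never improves; True if the watchdog fired."""
    dog = ScoreWatchdog(max_stall_seconds=0.05)
    dog.start()
    for _ in range(30):
        dog.report_score(-42.0)  # same score every time
        if dog.killed.is_set():
            break
        time.sleep(0.01)
    dog.stop()
    return dog.killed.is_set()


print(run_demo())
```

The stall limit is the knob that decides how aggressive such a watchdog is: too short and it kills slow but healthy runs, too long and stuck jobs waste CPU time.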
ID: 1289
Nikolay A. Saharov
Joined: 17 Feb 06
Posts: 6
Credit: 25,102
RAC: 0
Message 1293 - Posted: 22 Apr 2006, 7:15:38 UTC
Last modified: 22 Apr 2006, 7:27:13 UTC

I have 3 errored results:
1. 92417 and 92455 finished with the message

<message>Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
ERROR:: Exit at: .fragments.cc line:687

</stderr_txt>

2. 91907 finished with the text:

<stderr_txt>
# random seed: 3886793
# cpu_run_time_pref: 7200
**********************************************************************
Rosetta score stayed the same too long. Watchdog is killing the run!
**********************************************************************

</stderr_txt>
<message><file_xfer_error>
<file_name>FACONTACTS_RECENTER_NOFILTERS_1dhn__399_6_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>

ID: 1293
Yeti
Joined: 19 Feb 06
Posts: 32
Credit: 316,371
RAC: 853
Message 1294 - Posted: 22 Apr 2006, 8:09:35 UTC

So far, all 5.02-WUs have crashed:

https://ralph.bakerlab.org/result.php?resultid=91805

https://ralph.bakerlab.org/result.php?resultid=91808

https://ralph.bakerlab.org/result.php?resultid=91965

Several more in progress; let's see what's going on.



Supporting BOINC, a great concept !
ID: 1294
Yeti
Joined: 19 Feb 06
Posts: 32
Credit: 316,371
RAC: 853
Message 1295 - Posted: 22 Apr 2006, 11:21:30 UTC

Here is one with a large crash-dump:

https://ralph.bakerlab.org/result.php?resultid=92882



Supporting BOINC, a great concept !
ID: 1295
tralala
Joined: 12 Apr 06
Posts: 52
Credit: 15,257
RAC: 0
Message 1296 - Posted: 22 Apr 2006, 11:34:06 UTC

Six out of seven have crashed:

https://ralph.bakerlab.org/results.php?userid=1266

Although I have 5.4.3 installed, I didn't get a large crash dump.
ID: 1296
Pieface
Joined: 16 Feb 06
Posts: 64
Credit: 203,513
RAC: 0
Message 1297 - Posted: 22 Apr 2006, 13:40:18 UTC

Had one die this morning with 0xc0000005, result: resultid

Looks like the old "died while swapping" problem.

4/22/2006 5:49:41 AM|ralph@home|Restarting task FACONTACTS_RECENTER_NOFILTERS_1a68__399_7_0 using rosetta_beta version 502
4/22/2006 5:49:41 AM|ralph@home|Restarting task FACONTACTS_RECENTER_NOFILTERS_1ew4A_399_2_0 using rosetta_beta version 502
4/22/2006 5:49:41 AM|SETI@home Beta Test|Pausing task 01jn01aa.27448.448.572166.3.124_1 (removed from memory)
4/22/2006 5:49:41 AM|SETI@home Beta Test|Pausing task 01jn01aa.27448.448.572166.3.132_3 (removed from memory)
4/22/2006 6:49:41 AM|ralph@home|Pausing task FACONTACTS_RECENTER_NOFILTERS_1ew4A_399_2_0 (removed from memory)
4/22/2006 6:49:41 AM|SETI@home Beta Test|Restarting task 01jn01aa.27448.448.572166.3.124_1 using setiathome_enhanced version 511
4/22/2006 6:49:43 AM|ralph@home|Unrecoverable error for result FACONTACTS_RECENTER_NOFILTERS_1a68__399_7_0 ( - exit code -1073741819 (0xc0000005))
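
A side note on the last log line: the two numbers are the same 32-bit exit code, printed once as a signed integer and once in hex. 0xC0000005 is the Windows access violation status code. A minimal sketch of the conversion (the helper name is made up for illustration):

```python
def exit_code_to_hex(code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    return f"0x{code & 0xFFFFFFFF:08x}"


print(exit_code_to_hex(-1073741819))  # 0xc0000005
print(exit_code_to_hex(1))            # 0x00000001
```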

ID: 1297
Pieface
Joined: 16 Feb 06
Posts: 64
Credit: 203,513
RAC: 0
Message 1299 - Posted: 22 Apr 2006, 15:47:34 UTC

Oops, my bad: that points to the one from yesterday. The one from this morning that the log entries go with is here: 91953
ID: 1299
Snake Doctor
Joined: 16 Feb 06
Posts: 37
Credit: 998,880
RAC: 0
Message 1300 - Posted: 22 Apr 2006, 16:00:19 UTC

Just got this one here.

This was on a dual G4 Mac running Mac OS 10.4.6, BOINC 5.3.28

WU - NO_CHECK_7486h002_dec123_1.pdb_407_19_0

Looks like a file problem from this error message -

<message><file_xfer_error>
<file_name>NO_CHECK_7486h002_dec123_1.pdb_407_19_0_0</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>

</message>
ID: 1300
[B^S] Doug Worrall
Joined: 16 Feb 06
Posts: 10
Credit: 1,515
RAC: 0
Message 1301 - Posted: 22 Apr 2006, 16:45:48 UTC - in response to Message 1300.  

Hello,
I thought all the WUs that got fubarred were down to my PC crashing yesterday.
Of the last 8 WUs, only 1 worked well. I am still having difficulties tweaking this new system. The same WU as above stuck at 1.47% after 1 hour; some I have let go 2 to 3 hours
before aborting. I just saw in Tam thread to let the new feature handle these
WUs. Will edit this with the correct information.
Sincerely
Sluger



ID: 1301
Daxl
Joined: 1 Mar 06
Posts: 2
Credit: 55,301
RAC: 0
Message 1302 - Posted: 22 Apr 2006, 17:02:48 UTC

All 6 WUs have crashed on my laptop: P4-M 2.2 GHz, 512 MB memory (XP SP2)

WU-83346 Error -161
WU-83301 Error -161
WU-83275 Watchdog kill
WU-83276 Error -161
WU-83302 Error -161
WU-83282 Error -161

-----------------------------------------------------------------------------
<core_client_version>5.4.4</core_client_version>
<stderr_txt>
# random seed: 3885665
# cpu_run_time_pref: 3600
**********************************************************************
Rosetta score stayed the same too long. Watchdog is killing the run!
**********************************************************************

</stderr_txt>
<message><file_xfer_error>
<file_name>NO_CHECK_7486h002_dec124_1.pdb_407_9_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
----------------------------------------------------------------------------

<core_client_version>5.4.4</core_client_version>
<stderr_txt>
# random seed: 3885638
# cpu_run_time_pref: 3600
# DONE :: 1 starting structures built 5 (nstruct) times
# This process generated 1 decoys from 1 attempts
# 0 starting pdbs were skipped

</stderr_txt>
<message><file_xfer_error>
<file_name>NO_CHECK_7486h002_dec129_1.pdb_407_16_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
-----------------------------------------------------------------------------

greetz DAXL
ID: 1302
Daxl
Joined: 1 Mar 06
Posts: 2
Credit: 55,301
RAC: 0
Message 1303 - Posted: 22 Apr 2006, 17:22:58 UTC

6 out of 12 WU's have crashed - 6 aborted
On my Athlon 64-3000 1GB Memory (XP SP2)

WU-83315 Error -161
WU-83216 Error -161
WU-83217 Error -161
WU-83218 Error -161
WU-83219 Error -161
WU-83222 Error -161

---------------------------------------------------------------------

<core_client_version>5.4.4</core_client_version>
<stderr_txt>
# random seed: 3885631
# cpu_run_time_pref: 3600
# DONE :: 1 starting structures built 5 (nstruct) times
# This process generated 3 decoys from 3 attempts
# 0 starting pdbs were skipped

</stderr_txt>
<message><file_xfer_error>
<file_name>NO_CHECK_7486h002_dec184_1.pdb_407_3_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
---------------------------------------------------------------------
greetz DAXL
ID: 1303
suguruhirahara
Joined: 5 Mar 06
Posts: 40
Credit: 11,320
RAC: 0
Message 1304 - Posted: 22 Apr 2006, 17:33:31 UTC

On Windows XP 64-bit:

https://ralph.bakerlab.org/result.php?resultid=91614

<core_client_version>5.2.13</core_client_version>
<stderr_txt>
# random seed: 3886628
# cpu_run_time_pref: 3600
**********************************************************************
Rosetta score stayed the same too long. Watchdog is killing the run!
**********************************************************************

</stderr_txt>
<message><file_xfer_error>
<file_name>FACONTACTS_RECENTER_NOFILTERS_1pgx__399_1_0_0</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>

</message>

Anyway, what is the watchdog?
ID: 1304
Psycodad
Joined: 16 Feb 06
Posts: 14
Credit: 2,157
RAC: 0
Message 1305 - Posted: 22 Apr 2006, 17:38:08 UTC
Last modified: 22 Apr 2006, 17:38:48 UTC

22.04.2006 17:50:28|ralph@home|Unrecoverable error for result NO_CHECK_7486h002_dec123_1.pdb_407_8_1 (<file_xfer_error> <file_name>NO_CHECK_7486h002_dec123_1.pdb_407_8_1_0</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error>)


WU
Result





<core_client_version>5.2.13</core_client_version>
<stderr_txt>
# random seed: 3885686
# cpu_run_time_pref: 3600
**********************************************************************
Rosetta score stayed the same too long. Watchdog is killing the run!
**********************************************************************

</stderr_txt>
<message><file_xfer_error>
<file_name>NO_CHECK_7486h002_dec123_1.pdb_407_8_1_0</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>

</message>
ID: 1305
casio7131
Joined: 20 Mar 06
Posts: 15
Credit: 12,660
RAC: 0
Message 1307 - Posted: 23 Apr 2006, 2:15:07 UTC

8 results where the watchdog killed the run. I think it might be killing runs a bit too early, because this machine doesn't usually get stuck or error out very often.

22/04/2006 6:41:39 PM|ralph@home|Unrecoverable error for result FACONTACTS_RECENTER_NOFILTERS_1ail__399_7_0 (Incorrect function. (0x1) - exit code 1 (0x1))
https://ralph.bakerlab.org/result.php?resultid=91955
22/04/2006 6:43:50 PM|ralph@home|Unrecoverable error for result FACONTACTS_RECENTER_NOFILTERS_1a32__399_8_0 (<file_xfer_error> <file_name>FACONTACTS_RECENTER_NOFILTERS_1a32__399_8_0_0</file_name> <error_code>-161</error_code></file_xfer_error>)
https://ralph.bakerlab.org/result.php?resultid=92014
22/04/2006 11:15:56 PM|ralph@home|Unrecoverable error for result FACONTACTS_RECENTER_NOFILTERS_1ubi__399_8_0 (Incorrect function. (0x1) - exit code 1 (0x1))
https://ralph.bakerlab.org/result.php?resultid=92058
22/04/2006 11:15:59 PM|ralph@home|Unrecoverable error for result FACONTACTS_RECENTER_NOFILTERS_1who__399_6_0 (<file_xfer_error> <file_name>FACONTACTS_RECENTER_NOFILTERS_1who__399_6_0_0</file_name> <error_code>-161</error_code></file_xfer_error>)
https://ralph.bakerlab.org/result.php?resultid=91941
23/04/2006 2:52:01 AM|ralph@home|Unrecoverable error for result HOMO_7486_h002_1_LOOPRLX_7486h002_dec184_1.pdb_406_9_2 (Incorrect function. (0x1) - exit code 1 (0x1))
https://ralph.bakerlab.org/result.php?resultid=93253
23/04/2006 2:52:06 AM|ralph@home|Unrecoverable error for result NO_CHECK_7486h002_dec124_1.pdb_407_3_0 (<file_xfer_error> <file_name>NO_CHECK_7486h002_dec124_1.pdb_407_3_0_0</file_name> <error_code>-161</error_code></file_xfer_error>)
https://ralph.bakerlab.org/result.php?resultid=92833
23/04/2006 7:35:36 AM|ralph@home|Unrecoverable error for result NO_CHECK_7486h002_dec123_1.pdb_407_12_1 (Incorrect function. (0x1) - exit code 1 (0x1))
https://ralph.bakerlab.org/result.php?resultid=93254
23/04/2006 7:35:42 AM|ralph@home|Unrecoverable error for result HOMO_7486_h002_1_LOOPRLX_7486h002_dec08_1.pdb_406_15_1 (<file_xfer_error> <file_name>HOMO_7486_h002_1_LOOPRLX_7486h002_dec08_1.pdb_406_15_1_0</file_name> <error_code>-161</error_code></file_xfer_error>)
https://ralph.bakerlab.org/result.php?resultid=93255
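
For anyone who wants to count the error types in a long message log instead of reading it line by line, here is a small sketch. It assumes the log lines look like the ones pasted above; the function name and regular expressions are my own and not part of BOINC.

```python
import re
from collections import Counter

# Matches the tail of a client log line: result name, then the bracketed reason.
LINE_RE = re.compile(r"Unrecoverable error for result (\S+) \((.*)\)\s*$")
# A reason carries either a file transfer error code or a process exit code.
CODE_RE = re.compile(r"<error_code>(-?\d+)</error_code>|exit code (-?\d+)")


def tally_errors(log_lines):
    """Count unrecoverable errors by kind: file transfer code or exit code."""
    counts = Counter()
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not an "Unrecoverable error" line
        code = CODE_RE.search(match.group(2))
        if code is None:
            counts["unknown"] += 1
        elif code.group(1) is not None:
            counts[f"file_xfer_error {code.group(1)}"] += 1
        else:
            counts[f"exit code {code.group(2)}"] += 1
    return counts


sample = [
    "22/04/2006 6:41:39 PM|ralph@home|Unrecoverable error for result "
    "FACONTACTS_RECENTER_NOFILTERS_1ail__399_7_0 "
    "(Incorrect function. (0x1) - exit code 1 (0x1))",
    "22/04/2006 6:43:50 PM|ralph@home|Unrecoverable error for result "
    "FACONTACTS_RECENTER_NOFILTERS_1a32__399_8_0 "
    "(<file_xfer_error> <file_name>FACONTACTS_RECENTER_NOFILTERS_1a32__399_8_0_0"
    "</file_name> <error_code>-161</error_code></file_xfer_error>)",
]
print(tally_errors(sample))
```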

ID: 1307
Rhiju
Volunteer moderator
Project developer
Project scientist
Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 1308 - Posted: 23 Apr 2006, 3:10:04 UTC - in response to Message 1307.  

Thanks for the posts. We think we've tracked down the
two most common errors. The watchdog does seem to be
a little too aggressive... we'll see how things
go for ralph 5.03!

ID: 1308
Rhiju
Volunteer moderator
Project developer
Project scientist
Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 1312 - Posted: 23 Apr 2006, 8:33:15 UTC - in response to Message 1308.  

Is anyone out there running with a Mac? Are your jobs from 5.02 or 5.03 running?

ID: 1312
Leffe
Joined: 19 Feb 06
Posts: 10
Credit: 3,683
RAC: 0
Message 1315 - Posted: 23 Apr 2006, 10:31:21 UTC

win xp pro sp2
boinc 5.2.13
Ralph 5.02


23/04/2006 12:50:50|ralph@home|Unrecoverable error for result NO_CHECK_7486h002_dec08_1.pdb_407_3_1 (<file_xfer_error> <file_name>NO_CHECK_7486h002_dec08_1.pdb_407_3_1_0</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error>)

ID: 1315
Robert Everly
Joined: 16 Feb 06
Posts: 10
Credit: 2,333
RAC: 0
Message 1317 - Posted: 23 Apr 2006, 13:26:36 UTC
Last modified: 23 Apr 2006, 13:29:27 UTC

All three of my 5.02 WUs were killed by the watchdog thread.

resultid=91985
resultid=91973
resultid=91972

I still have my settings to leave the app in memory when switching. Is it possible that the watchdog thread is taking that time into consideration? I have my systems set to switch projects every hour. All of mine aborted very very close to the one hour mark.
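
To make the question concrete, here is a small simulation of that hypothesis. It is only a guess at the mechanism and not the actual watchdog code. It compares a stall check based on wall-clock time with one based on CPU time when a task sits paused in memory for an hour. The 30-minute limit, the class names and the timings are all assumptions for the example.

```python
class StallDetector:
    """Reports a stall when no progress is seen for `limit` seconds of `clock`."""

    def __init__(self, limit, clock):
        self.limit = limit
        self.clock = clock
        self.last_progress = clock()

    def progress(self):
        self.last_progress = self.clock()

    def stalled(self):
        return self.clock() - self.last_progress > self.limit


class FakeTimes:
    """Simulated wall-clock and CPU time for one task, in seconds."""

    def __init__(self):
        self.wall = 0.0
        self.cpu = 0.0

    def run(self, seconds):      # task is computing: both clocks advance
        self.wall += seconds
        self.cpu += seconds

    def suspend(self, seconds):  # task is paused in memory: only wall time advances
        self.wall += seconds


def simulate(use_wall_clock):
    times = FakeTimes()
    clock = (lambda: times.wall) if use_wall_clock else (lambda: times.cpu)
    detector = StallDetector(limit=1800, clock=clock)
    times.run(3600)       # one hour of work...
    detector.progress()   # ...ending with a score update
    times.suspend(3600)   # the client switches to another project for an hour
    times.run(1)          # the task resumes
    return detector.stalled()


print(simulate(use_wall_clock=True))   # True: the pause is counted as a stall
print(simulate(use_wall_clock=False))  # False: only 1 s of CPU time has passed
```

In real Python code the two clocks would be something like time.monotonic() for wall time and time.process_time() for CPU time.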
ID: 1317
paul and kirsty yates
Joined: 16 Feb 06
Posts: 11
Credit: 949
RAC: 0
Message 1318 - Posted: 23 Apr 2006, 13:51:24 UTC
Last modified: 23 Apr 2006, 13:53:42 UTC

i also got a watchdog killing :(

on this one this one


ID: 1318
anders n
Joined: 16 Feb 06
Posts: 166
Credit: 131,419
RAC: 0
Message 1321 - Posted: 23 Apr 2006, 16:34:10 UTC

The dog is barking bad :)

https://ralph.bakerlab.org/results.php?hostid=2049

Anders n
ID: 1321

©2024 University of Washington
http://www.bakerlab.org