| Author | Message |
|
|
|
Please post bugs/issues regarding version 5.98 here.
This version includes a BOINC API fix that should return status 0 (success) when a task completes the Rosetta working thread normally but then produces an error after boinc_finish(0) is called. The fix specifically targets the t405-type jobs that have been giving access violations after the tasks finished.
Please look out for tasks that appear to be stuck and report them here.
____________
|
|
|
|
|
|
FRA_t449_CASP8_MANUAL_1_IGNORE_THE_RESTt449_1_ttxxxxT0449_1CHIM_0001_0001_0001_4584_6_0
<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 86400
# random seed: 1173943
ERROR:: Exit from: loop_relax.cc line: 1745
</stderr_txt>
]]>
http://ralph.bakerlab.org/result.php?resultid=1048265 |
|
|
|
|
|
My G5 Mac got an error on FRA_t451_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4596_9, error code -161:
Wed 2 Jul 23:51:37 2008|ralph@home|Computation for task FRA_t451_CASP8_[…]_1_axmin1_0001_4596_9_1 finished
Wed 2 Jul 23:51:37 2008|ralph@home|Output file FRA_t451_CASP8_[…]_1_axmin1_0001_4596_9_1_0 for task FRA_t451_CASP8_[…]_1_axmin1_0001_4596_9_1 absent
It would be nice if this forum could be updated to support [code] tags, especially with these long WU names … |
|
|
|
|
|
Task FRA_t451_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4594_10_0 exited with code 1: "ERROR:: Exit from: barcode_classes.cc line: 657" after 73.5 seconds of runtime. The same happened to my wingman (both machines run Windows).
Peter
|
|
|
|
|
|
FRA_t451_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4599_7_0
<core_client_version>6.2.7</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 86400
# random seed: 1173207
======================================================
DONE :: 1 starting structures 75928.6 cpu seconds
This process generated 5 decoys from 5 attempts
0 starting pdbs were skipped
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
called boinc_finish
</stderr_txt>
<message>
<file_xfer_error>
<file_name>FRA_t451_CASP8_HYBRID_MANUAL_1_IGNORE_THE_RESTt451_1_axmin1_0001_4599_7_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
|
|
|
|
|
|
Failure |
|
|
|
|
|
I had to abort a unit which became a CPU hog:
07/07/2008 08:40:35|ralph@home|Starting task n002__BOINC_DIMER_SYMM_FOLD_AND_DOCK-n002_-t484__4630_48_0 using rosetta_beta version 598
07/07/2008 08:40:37|ralph@home|Started upload of n001__BOINC_MONOMER_ABRELAX-n001_-t484__4633_47_0_0
07/07/2008 08:40:40|ralph@home|Finished upload of n001__BOINC_MONOMER_ABRELAX-n001_-t484__4633_47_0_0
07/07/2008 08:40:43|ralph@home|Sending scheduler request: To report completed tasks. Requesting 0 seconds of work, reporting 1 completed tasks
07/07/2008 08:40:48|ralph@home|Scheduler request succeeded: got 0 new tasks
07/07/2008 08:54:19|ralph@home|Starting n006__BOINC_MONOMER_ABRELAX-n006_-t484__4632_46_0
07/07/2008 08:54:19|ralph@home|Starting task n006__BOINC_MONOMER_ABRELAX-n006_-t484__4632_46_0 using rosetta_beta version 598
07/07/2008 08:54:22|ralph@home|Computation for task n002__BOINC_DIMER_SYMM_FOLD_AND_DOCK-n002_-t484__4630_48_0 finished
07/07/2008 08:55:20|ralph@home|Sending scheduler request: To report completed tasks. Requesting 0 seconds of work, reporting 1 completed tasks
07/07/2008 08:55:25|ralph@home|Scheduler request succeeded: got 0 new tasks
Task n002__BOINC_DIMER_SYMM_FOLD_AND_DOCK-n002_-t484__4630_48_0 using rosetta_beta version 598
If I can provide any further information, please let me know.
Regards
Bob
|
|
|
|
|
|
segmentation violation |
|
|
|
|
|
Both computers attempting this one failed with the same error:
MFR_ABRELAX_PICKED_TR5d__4731_6_2
stderr out
<core_client_version>5.10.45</core_client_version>
<![CDATA[
<stderr_txt>
Rosetta@home Macintosh Stack Size checker.
Original size: 0.
Maximum size: 8388608.
RLIM_INFINITY 0
# cpu_run_time_pref: 14400
# random seed: 1052640
======================================================
DONE :: 1 starting structures 15594.9 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>MFR_ABRELAX_PICKED_TR5d__4731_6_2_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
|
|
|
|
|
|
Snagletooth, a file transfer error, but only after you completed a model. And given the file name, it looks like your machine finally gave up on trying to upload the result file. Have you had any other upload problems?
____________
|
|
|
|
|
Snagletooth, a file transfer error, but only after you completed a model. And given the file name, it looks like your machine finally gave up on trying to upload the result file. Have you had any other upload problems?
No, but since the other cruncher who attempted this WU before me reported the same error, I assumed the problem is with the WU. I've noticed other folks have occasionally reported this error for different types of WUs, and I have become curious about the cause. Can you offer any enlightenment?
Snags |
|
|
|
|
|
Well, I assume that one way to cause this to happen would be to abort the transfer from the transfers tab in the advanced view. But I doubt you did that, especially since both of you had the same problem.
Have there been any BOINC problems? I understand there is some handshaking and authorization that takes place to permit your upload while thwarting spammers. Perhaps there is a problem with that process? Or, one of the things I think they exchange is the file size. Perhaps the output file for this task grew unexpectedly large? Hopefully the Ralph team can shed some light.
____________
|
|
|
|
|
Well, I assume that one way to cause this to happen would be to abort the transfer from the transfers tab in the advanced view. But I doubt you did that, especially since both of you had the same problem.
Have there been any BOINC problems? I understand there is some handshaking and authorization that takes place to permit your upload while thwarting spammers. Perhaps there is a problem with that process? Or, one of the things I think they exchange is the file size. Perhaps the output file for this task grew unexpectedly large? Hopefully the Ralph team can shed some light.
There have been no other BOINC problems. I haven't gotten another Ralph WU yet, but I have successfully finished, uploaded, and reported results from other projects: WUs I received before, during, and after the Ralph WU crunched.
I was away from the computer while this one was finishing up and returned to find its status listed as "computation error" instead of "ready to report", as I would have expected if everything had finished and uploaded normally. I just went back through the message logs and found this:
Wed Aug 13 03:56:13 2008|ralph@home|Computation for task MFR_ABRELAX_PICKED_TR5d__4731_6_2 finished
Wed Aug 13 03:56:13 2008|ralph@home|Output file MFR_ABRELAX_PICKED_TR5d__4731_6_2_0 for task MFR_ABRELAX_PICKED_TR5d__4731_6_2 absent
I think -161 may be a bit of a catchall error code, as it seems to cover problems with creating the file to be transferred as well as problems with the transfer itself. Obviously some info is reported, as the stderr out for both my co-cruncher and me shows one decoy completed. My co-cruncher has since successfully completed another Ralph WU, so I'm inclined to think the problem is with the WU rather than something on our ends. Of course, by now the project may well have discovered the problem, fixed it, and moved on. Do you know how to search for this particular protein/app combo as crunched by others, here or on Rosetta, to find out if this is the case?
Snags |
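For anyone who wants to check their own message logs for the same symptom, here is a minimal Python sketch. It assumes the log lines look like the ones quoted in this thread (timestamp, project name, and message separated by `|`); the function name `find_absent_outputs` and the regular expression are illustrative only and are not part of BOINC.

```python
import re

# Pattern modelled on the "Output file ... absent" lines quoted in this thread.
# The field layout (timestamp|project|message) is an assumption based on those lines.
ABSENT_RE = re.compile(
    r"^(?P<when>[^|]+)\|(?P<project>[^|]+)\|"
    r"Output file (?P<file>\S+)\s+for task (?P<task>\S+) absent$"
)

def find_absent_outputs(log_lines):
    """Return (timestamp, task, file) tuples for every 'absent' message."""
    hits = []
    for line in log_lines:
        m = ABSENT_RE.match(line.strip())
        if m:
            hits.append((m.group("when").strip(), m.group("task"), m.group("file")))
    return hits

sample = [
    "Wed Aug 13 03:56:13 2008|ralph@home|Computation for task MFR_ABRELAX_PICKED_TR5d__4731_6_2 finished",
    "Wed Aug 13 03:56:13 2008|ralph@home|Output file MFR_ABRELAX_PICKED_TR5d__4731_6_2_0 for task MFR_ABRELAX_PICKED_TR5d__4731_6_2 absent",
]
absent = find_absent_outputs(sample)
print(absent)
```

A hit here only shows that the client could not find the output file when the task ended. It does not say why the file was missing.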
|
|
|
|
Do you know how to search for this particular protein/app combo as crunched by others, here or on Rosetta, to find out if this is the case?
I believe the project team would have to run a query to answer a question like that.
____________
|
|
|
|
|
|
This task had an error at startup.
stderr out:
<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
ERROR:: Exit from: options.cc line: 525
</stderr_txt>
]]>
|
|
|
|
|
|
Hello,
Had the same error:
ERROR:: Exit from: options.cc line: 525
Task details
Have a nice day,
Path7.
|
|
|
|
|
Had the same error:
ERROR:: Exit from: options.cc line: 525
Me too: t042_1_NMRREF_1_t015_1_S_00001_0000163IGNORE_THE_REST_040000_5065_48_0.
Peter |
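Since several reports in this thread share the same "ERROR:: Exit from:" line, a small Python sketch like the following could help tally which source locations come up most often. It assumes the message format seen in the stderr output quoted above; `exit_locations` is a hypothetical helper name, not something from Rosetta or BOINC.

```python
import re
from collections import Counter

# Matches lines such as "ERROR:: Exit from: options.cc line: 525".
EXIT_RE = re.compile(r"ERROR:: Exit from: (?P<src>\S+) line: (?P<line>\d+)")

def exit_locations(stderr_text):
    """Return (source_file, line_number) for each 'Exit from' message found."""
    return [(m.group("src"), int(m.group("line"))) for m in EXIT_RE.finditer(stderr_text)]

# Error lines copied from reports earlier in this thread.
reports = [
    "ERROR:: Exit from: loop_relax.cc line: 1745",
    "ERROR:: Exit from: barcode_classes.cc line: 657",
    "ERROR:: Exit from: options.cc line: 525",
    "ERROR:: Exit from: options.cc line: 525",
]
counts = Counter(loc for text in reports for loc in exit_locations(text))
print(counts.most_common(1))  # [(('options.cc', 525), 2)]
```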
|
|
|
|
|
2dlb__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2dlb_-native_frag__7418_3_0
CPU time 6.359375
stderr out
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 3363599
ERROR:: Exit from: .\fragments.cc line: 769
</stderr_txt>
]]>
|
|
|
|
|
|
2mlt__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2mlt_-native_frag__7418_1_0
big error dump
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 3363541
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x008BB7E2 read attempt to address 0x122B0000
Engaging BOINC Windows Runtime Debugger...
Dump Timestamp : 02/06/09 22:07:25
LoadLibraryA( symsrv.dll ): GetLastError = 126
LoadLibraryA( srcsrv.dll ): GetLastError = 126
Debugger Engine : 4.0.5.0
Symbol Search Path: E:\boinc\projects\slots\3;E:\boinc\projects\projects\ralph.bakerlab.org;srv*C:\DOCUME~1\Me\LOCALS~1\Temp\symbols*http://msdl.microsoft.com/download/symbols;srv*C:\DOCUME~1\Me\LOCALS~1\Temp\symbols*http://boinc.bakerlab.org/rosetta/symstore
SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)
these last two lines repeat more than 10 times. |
|
|
|
|
|
2dlb__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2dlb_-native_frag__7418_1_0
5.9375
stderr out
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 3363601
ERROR:: Exit from: .\fragments.cc line: 769
</stderr_txt>
]]> |
|
|
|
|
|
1irg__BOINC_SYMM_FOLD_AND_DOCK_RELAX-1irg_-native_frag__7418_1_0
Exit status -1073741819 (0xffffffffc0000005)
CPU time 271.375
stderr out
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 3363701
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x008BB9A5 read attempt to address 0x12166000
Engaging BOINC Windows Runtime Debugger...
********************
BOINC Windows Runtime Debugger Version 6.1.5
Dump Timestamp : 02/06/09 14:25:57
LoadLibraryA( symsrv.dll ): GetLastError = 126
LoadLibraryA( srcsrv.dll ): GetLastError = 126
Debugger Engine : 4.0.5.0
Symbol Search Path: E:\boinc\projects\slots\3;E:\boinc\projects\projects\ralph.bakerlab.org;srv*C:\DOCUME~1\Me\LOCALS~1\Temp\symbols*http://msdl.microsoft.com/download/symbols;srv*C:\DOCUME~1\Me\LOCALS~1\Temp\symbols*http://boinc.bakerlab.org/rosetta/symstore
SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded
The above two lines repeat more than 10 times. |
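The reports above show the same failure written three ways: -1073741819, 0xc0000005, and 0xffffffffc0000005. These are one 32-bit value viewed as signed, unsigned, and sign-extended to 64 bits. A short Python sketch of the conversion follows; the helper names are made up for illustration.

```python
def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF

def to_signed32(status: int) -> int:
    """Reinterpret the low 32 bits of a status value as a signed integer."""
    status &= 0xFFFFFFFF
    return status - 0x100000000 if status & 0x80000000 else status

print(hex(to_unsigned32(-1073741819)))   # 0xc0000005
print(to_signed32(0xFFFFFFFFC0000005))   # -1073741819
```

0xC0000005 is the Windows status code for an access violation, which matches the "Reason: Access Violation (0xc0000005)" line in the dumps.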
|
|
|
|
|
1irg__BOINC_SYMM_FOLD_AND_DOCK_RELAX-1irg_-native_frag__7418_2_0
symbols error again
ran 184 seconds
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 3363700
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x008BB955 read attempt to address 0x11FF1000
Engaging BOINC Windows Runtime Debugger...
|
|
|
|
|
|
2dlb__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2dlb_-native_frag__7418_2_0
CPU time 4.953125
stderr out
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 3363600
ERROR:: Exit from: .\fragments.cc line: 769
</stderr_txt>
|
|
|
|
|
|
1irg__BOINC_SYMM_FOLD_AND_DOCK_RELAX-1irg_-native_frag__7418_3_0
Client state Compute error
Exit status -1073741819 (0xffffffffc0000005)
CPU time 284.4531
stderr out
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 3363699
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x008BB9A5 read attempt to address 0x137DC000
Engaging BOINC Windows Runtime Debugger...
-------
2mlt__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2mlt_-native_frag__7418_2_0
same as above
run time: 177.8281 seconds |
|
|
|
|
|
This one is running over 5 hours per model. It is called r120g_BOINC_SYMM_FOLD_AND_DOCK_RELAX-r120g-_9223_30_0
____________
|
|
|
|
|
|
My G4 Mac has had errors on the last few tasks it tried. The most recent result says:
<core_client_version>5.10.20</core_client_version>
<![CDATA[
<stderr_txt>
Rosetta@home Macintosh Stack Size checker.
Original size: 0.
Maximum size: 8388608.
RLIM_INFINITY 0
# cpu_run_time_pref: 7200
ERROR:: Unable to determine sequence length from pdb file
======================================================
DONE :: 1 starting structures 38.8 cpu seconds
This process generated 0 decoys from 0 attempts
1 starting pdbs were skipped
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>Rossmann2x3_f001_relax_12429_2_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
My G5 hasn’t managed to get any work from here since May. |
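To spot runs like this one, where the task ends "normally" but produced no decoys, the summary block in the stderr output can be parsed with a few regular expressions. This is only a sketch in Python: the patterns are modelled on the summary lines quoted in this thread, and `summarize` is a hypothetical helper, not part of Rosetta or BOINC.

```python
import re

# Patterns modelled on the end-of-run summary lines quoted in this thread.
DONE_RE = re.compile(r"DONE ::\s+(\d+) starting structures\s+([\d.]+) cpu seconds")
DECOY_RE = re.compile(r"This process generated\s+(\d+) decoys from\s+(\d+) attempts")
SKIP_RE = re.compile(r"(\d+) starting pdbs were skipped")

def summarize(stderr_text):
    """Extract the end-of-run counters; fields that are not found are omitted."""
    summary = {}
    m = DONE_RE.search(stderr_text)
    if m:
        summary["structures"] = int(m.group(1))
        summary["cpu_seconds"] = float(m.group(2))
    m = DECOY_RE.search(stderr_text)
    if m:
        summary["decoys"] = int(m.group(1))
        summary["attempts"] = int(m.group(2))
    m = SKIP_RE.search(stderr_text)
    if m:
        summary["skipped"] = int(m.group(1))
    return summary

sample = """\
ERROR:: Unable to determine sequence length from pdb file
======================================================
DONE :: 1 starting structures 38.8 cpu seconds
This process generated 0 decoys from 0 attempts
1 starting pdbs were skipped
======================================================
"""
summary = summarize(sample)
if summary.get("decoys") == 0:
    print("no decoys were produced:", summary)
```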
|
|