Message boards : RALPH@home bug list : Bug reports for version 5.93
Author | Message |
---|---|
Ingemar Volunteer moderator Project developer Project scientist Send message Joined: 7 Mar 07 Posts: 9 Credit: 76 RAC: 0 |
Please report any weird behavior of rosetta version 5.93! |
Dr Who Fan Send message Joined: 2 Sep 06 Posts: 76 Credit: 107,857 RAC: 0 |
This Work Unit 649494 exited with a "161" error: Outcome Client error Client state Compute error Exit status 0 (0x0) Computer ID 10396 CPU time 6315.28125 stderr out <core_client_version>6.1.0</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 7200 # random seed: 1553865 ====================================================== DONE :: 1 starting structures 6315.06 cpu seconds This process generated 5 decoys from 5 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> <message> <file_xfer_error> <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_27_0_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Dr Who Fan Send message Joined: 2 Sep 06 Posts: 76 Credit: 107,857 RAC: 0 |
This Work Unit 649484 exited with a "161" error for me and my wingman. Details below from my result id: Outcome Client error Client state Compute error Exit status 0 (0x0) Computer ID 4500 Report deadline 16 Jan 2008 1:09:00 UTC CPU time 7364.203125 stderr out <core_client_version>6.1.0</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 7200 # random seed: 1553875 ====================================================== DONE :: 1 starting structures 7363.47 cpu seconds This process generated 3 decoys from 3 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> <message> <file_xfer_error> <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_17_1_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Dr Who Fan Send message Joined: 2 Sep 06 Posts: 76 Credit: 107,857 RAC: 0 |
Another 161 error to report: https://ralph.bakerlab.org/result.php?resultid=732877 Exit status 0 (0x0) Computer ID 4500 Report deadline 16 Jan 2008 1:09:00 UTC CPU time 7364.203125 stderr out <core_client_version>6.1.0</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 7200 # random seed: 1553875 ====================================================== DONE :: 1 starting structures 7363.47 cpu seconds This process generated 3 decoys from 3 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> <message> <file_xfer_error> <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_17_1_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> Validate state Invalid Claimed credit 18.0472218006911 |
Dr Who Fan Send message Joined: 2 Sep 06 Posts: 76 Credit: 107,857 RAC: 0 |
Another 161 error to report: https://ralph.bakerlab.org/result.php?resultid=732906 Outcome Client error Client state Compute error Exit status 0 (0x0) Computer ID 10396 Report deadline 16 Jan 2008 3:04:44 UTC CPU time 6187.515625 stderr out <core_client_version>6.1.0</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 7200 # random seed: 1553752 ====================================================== DONE :: 1 starting structures 6186.97 cpu seconds This process generated 5 decoys from 5 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> <message> <file_xfer_error> <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2892_40_1_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> Validate state Invalid Claimed credit 16.9298421717622 Granted credit 0 application version 5.93 |
Snagletooth Send message Joined: 4 May 07 Posts: 67 Credit: 134,427 RAC: 0 |
Another "161" error for trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_53_1 workunit 649520 has now been sent to a third cruncher <core_client_version>5.10.20</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 36000 # random seed: 1553839 # cpu_run_time_pref: 36000 ====================================================== DONE :: 1 starting structures 35646.5 cpu seconds This process generated 6 decoys from 6 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> <message> <file_xfer_error> <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_53_1_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
BigMike Send message Joined: 23 Feb 06 Posts: 63 Credit: 58,730 RAC: 0 |
Wow ... that didn't take long... <core_client_version>5.10.30</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # cpu_run_time_pref: 3600 ERROR:: Unable to determine sequence length from pdb file ERROR:: Exit from: .pose.cc line: 1983 </stderr_txt> ]]> Don't believe everything you think. |
Eric Ogletree Send message Joined: 27 Aug 07 Posts: 1 Credit: 24,361 RAC: 0 |
Got four of them here. Hope it helps. :) 16/01/2008 1:27:23 AM|ralph@home|Reason: Unrecoverable error for result mini_-1a32_-test_2898_200_0 (<file_xfer_error> <file_name>mini_-1a32_-test_2898_200_0_0</file_name> <error_code>-161</error_code></file_xfer_error>) 16/01/2008 5:34:01 AM|ralph@home|Task mini_-1a32_-test_2898_193_0 exited with zero status but no 'finished' file 16/01/2008 5:57:55 AM|ralph@home|Task mini_-1a32_-test_2898_206_0 exited with zero status but no 'finished' file 16/01/2008 8:36:34 AM|ralph@home|Reason: Unrecoverable error for result mini_-1a32_-test_2898_193_0 (<file_xfer_error> <file_name>mini_-1a32_-test_2898_193_0_0</file_name> <error_code>-161</error_code></file_xfer_error>) |
ramostol Send message Joined: 29 Mar 07 Posts: 24 Credit: 31,121 RAC: 0 |
You probably know this now, but anyhow: (Some?) trunc_solit-wus seem unable to create proper output files. 3 invalid results for trunc_solit_BOINC_ABRELAX_-trunc_solit-_2934_25 |
Conan Send message Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
Getting the same error as some others here with WU type 'trunc_solit' WU 732726 WU 732758 WU 732769 WU 732802 WU 732803 WU 733276 WU 736191 <core_client_version>5.10.21</core_client_version> <![CDATA[ <stderr_txt> Graphics are disabled due to configuration... # cpu_run_time_pref: 21600 # random seed: 1553847 ====================================================== DONE :: 1 starting structures 21007.6 cpu seconds This process generated 30 decoys from 30 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> <message> <file_xfer_error> <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_45_0_0</file_name> <error_code>-161</error_code> </file_xfer_error> |
Conan Send message Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
WU 736639 <core_client_version>5.10.21</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> <stderr_txt> Graphics are disabled due to configuration... ERROR:: Unable to obtain total_residue & sequence. start pdb file must be provided. ERROR:: Exit from: input_pdb.cc line: 2968 # cpu_run_time_pref: 21600 |
Conan Send message Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
I have a WU running at the moment (1 of 4, but I am not sure exactly which one) that is behaving very strangely. I noticed this morning that a Ralph WU had reached 100% after 17:29:51 but was still showing as running at High Priority. Suspending and resuming made no difference, so I stopped BOINC Manager and restarted it. The WU appeared to have gone, but on checking further I found that it had gone back to a process time of 4 hours 12 minutes and was running normally again, though still at High Priority. Is this normal for these 2h4o_BOINC_TWIST type work units? |
Conan Send message Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
Have a WU running at the moment (1 of 4 but not sure exactly which one), that is behaving very strangely. I suspect that the WU reset and started again, so I may have lost up to 17 hours of processing time. Of the 4 I received, 1 has now completed normally with no indication of problems. 2 more are now at 15 and 16 hours, at 98.5% with 9 minutes 56 seconds left on both. One has switched out to let another project run, but the other is running at High Priority. |
Conan Send message Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
Have a WU running at the moment (1 of 4 but not sure exactly which one), that is behaving very strangely. OK, WU 736936 finished without error and within the normal 6 hour preference range. I believe this is the WU that got to 17:29:51 and then went back to normal after I restarted BM, but I can't prove that; it could have been one of the following WUs. WU 736937 ran for 16:24:27 (59067.94 seconds) and then returned a computation error. That was a lot of wasted effort. Here is the error output: 59067.941307 stderr out <core_client_version>5.10.21</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1, -63) </message> <stderr_txt> Graphics are disabled due to configuration... # cpu_run_time_pref: 21600 # random seed: 1551605 ********************************************************************** Rosetta score is stuck or going too long. Watchdog is ending the run! Stuck at score 16.1773 for 900 seconds ********************************************************************** GZIP SILENT FILE: ./xx2h4o.out *** glibc detected *** corrupted double-linked list: 0xae7e1098 *** SIGABRT: abort called Stack trace (14 frames): [0x8da3037] [0x8d9de2c] [0xb7f8c420] [0x8e0e444] [0x8e2330f] [0x8e28532] [0x8e28653] [0x8e0e9b4] [0x8d9fab7] [0x8d9ff27] [0x8d2023d] [0x8d20f35] [0x8d9a0c5] [0x8e3aa1a] Exiting... SIGSEGV: segmentation violation Stack trace (18 frames): [0x8da3037] [0x8d9de2c] [0xb7f8c420] [0x8cad54d] [0x8c11820] [0x8c14e33] [0x804c7c2] [0x8a835ed] [0x8a8586f] [0x89363de] [0x89380e3] [0x893ba27] [0x898ad7a] [0x85e96d6] [0x87289d2] [0x8728af2] [0x8e07384] [0x8048111] Exiting... FILE_LOCK::unlock(): close failed.: Bad file descriptor Graphics are disabled due to configuration...
# cpu_run_time_pref: 21600 SIGSEGV: segmentation violation Stack trace (18 frames): [0x8da3037] [0x8d9de2c] [0xb7f00420] [0x8cad54d] [0x8c11820] [0x8c14e33] [0x804c7c2] [0x8a835ed] [0x8a8586f] [0x89363de] [0x8938119] [0x893ba27] [0x898ad7a] [0x85e96d6] [0x87289d2] [0x8728af2] [0x8e07384] [0x8048111] Exiting... WU 736938 ran for 21:48:59 (78,539.06 seconds) and was validated, but returned a very poor credit amount for such a long process time. Both of the last two WUs were stopped by the Watchdog for being stuck. |
RAD-Poland Send message Joined: 6 Apr 07 Posts: 6 Credit: 100,029 RAC: 0 |
Workunit 652259 <core_client_version>5.10.10</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1, -63) </message> <stderr_txt> Graphics are disabled due to configuration... # cpu_run_time_pref: 3600 # random seed: 1551090 ********************************************************************** Rosetta score is stuck or going too long. Watchdog is ending the run! CPU time: 17762 seconds. Greater than 4X preferred time: 3600 seconds ********************************************************************** GZIP SILENT FILE: ./xxgp04.out SIGSEGV: segmentation violation Stack trace (25 frames): [0x8da3037] ... Validate state Invalid |
Basilaris Send message Joined: 16 Feb 06 Posts: 2 Credit: 10,006 RAC: 0 |
2h4o_Boinc_Twist_Angle_Symm_Fold_and_Dock-2h4o_-native__2970_18_0 stopped progressing at Model 2, Step: 34817, RMSDE 1.187E+004, Energy: -68.98463. Time and percent complete kept advancing, but nothing happened. After restarting it was the same: it went up to step 34817 and stopped. The graphics were faulty too. |
Keith T. Send message Joined: 4 May 07 Posts: 13 Credit: 10,923 RAC: 0 |
https://ralph.bakerlab.org/workunit.php?wuid=651754 2h4o__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK-2h4o_-native__2970_79 got stuck on the 9th decoy for over an hour at least twice. I eventually changed the CPU run time down to 4 hours from 8 to get the WU to finish before its deadline. I did try exiting BOINC a few times as well. The WU was stuck on the 9th decoy and restarted the same one at least twice. Keith |
Conan Send message Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
Just finished this one. It took over 24 hours before the watchdog stopped it. It should have claimed 300 credits but was granted 80, which is less than 3.5 cr/h. Pretty miserable. So work units are still not adhering to preferences. Bug not fixed. |
[B^S] JoeB@Ky Send message Joined: 11 Oct 06 Posts: 8 Credit: 39,098 RAC: 0 |
I had 2 WUs load on my 2.13 GHz C2D with a run time of about 1 hr. Both were stuck at ~84.3/84.4% after running 1:42/1:46 hrs. I let them stay that way for an additional ~2.25 hours before aborting them yesterday PM. No such problems on my 3.4 GHz P4 w/HT; the 2 WUs on it now loaded at ~2.0 hr run time, and after 1:07:04 run time the 1st one is at 86.7% done with no freeze-up. I just downloaded the code file listed in the news bulletin on the BOINC Synergy web site and put it in the Ralph PROJECT folder on the C2D box. I noticed at that time that there was a similar file named "minirosetta_1.03_windows_intelx86", dated 1-15-08, but it didn't have the .pbd file extension on the end of it. My P4 box's RALPH directory already has the current 1.07 code file with the .pbd extension. That might be why it wasn't working right on the C2D box! |
quimillo Send message Joined: 14 Feb 08 Posts: 4 Credit: 10,604 RAC: 0 |
Task tol5__BOINC_SYMM_FOLD_AND_DOCK_RELAX_ONLY-tol5_-lowres_dock_-dock_3218__3305_1_0 using rosetta_beta version 593. CPU time stopped at: 04:38:41. Progress: 100%. Status: Running, high priority. BOINC client version 5.10.28 for i686-pc-linux-gnu. What should I do? |
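
The stderr excerpts posted in this thread share a few recurring signatures: a `<file_xfer_error>` block with error code -161, and a watchdog banner reading "Watchdog is ending the run!". For anyone sorting through their own results, here is a minimal Python sketch that pulls those signatures out of a pasted excerpt. The helper names (`summarize`, `exceeds_4x`, `hms_to_seconds`) are hypothetical and not part of BOINC or Rosetta, and the 4X check is only inferred from the log line "Greater than 4X preferred time" quoted above, so treat it as an assumption about how the watchdog decides rather than its actual logic.

```python
import re

# Patterns are copied from the log excerpts quoted in this thread.
# The helper names are hypothetical and not part of BOINC or Rosetta.
XFER_ERROR = re.compile(
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>"
)
RUN_TIME_PREF = re.compile(r"cpu_run_time_pref:\s*(\d+)")
WATCHDOG_BANNER = "Watchdog is ending the run!"


def summarize(stderr_text):
    """Collect the recurring signatures found in one stderr excerpt."""
    xfer = XFER_ERROR.search(stderr_text)
    pref = RUN_TIME_PREF.search(stderr_text)
    return {
        "xfer_file": xfer.group("name") if xfer else None,
        "xfer_error_code": int(xfer.group("code")) if xfer else None,
        "run_time_pref": int(pref.group(1)) if pref else None,
        "watchdog_stop": WATCHDOG_BANNER in stderr_text,
    }


def exceeds_4x(cpu_seconds, preferred_seconds):
    """Assumed check: CPU time greater than 4 times the preferred run time."""
    return cpu_seconds > 4 * preferred_seconds


def hms_to_seconds(hms):
    """Convert an 'H:MM:SS' run time, as shown in BOINC Manager, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


if __name__ == "__main__":
    sample = (
        "# cpu_run_time_pref: 7200 # random seed: 1553865 "
        "<file_xfer_error> <file_name>trunc_solit_BOINC_ABRELAX_"
        "-trunc_solit-_2891_27_0_0</file_name> "
        "<error_code>-161</error_code> </file_xfer_error>"
    )
    print(summarize(sample))
    print(exceeds_4x(17762, 3600))
    print(hms_to_seconds("16:24:27"))
```

Regular expressions are used instead of an XML parser because the excerpts pasted here are often truncated and would not parse as well-formed XML.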
©2024 University of Washington
http://www.bakerlab.org