Message boards : RALPH@home bug list : MiniRosetta Beta 3.26
Author | Message |
---|---|
Conan Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
The 6 Work Unit Limit is a bit of a pain. If the project sends out faulty work, then I can't get any more for the day to test whether some work units actually work. I suppose this spreads the work around, but it slows down getting the work returned. Conan |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 905 Credit: 1,892,541 RAC: 294 |
Same here on Windows XP: 2641583 2641588 2641590 ERROR: [ERROR] Error opening symmetry file '/work/dimaio/projects/casp9/T0524/run_12/symmdef/3imhA_101_C4.symm' ERROR:: Exit from: ..\..\..\src\core\conformation\symmetry\SymmData.cc line: 535 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish |
Snagletooth Joined: 4 May 07 Posts: 67 Credit: 134,427 RAC: 0 |
CASP9_bv_benchmark_hybridization_run48_T0518_0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_17843_2_0 ERROR: [ERROR] Error opening symmetry file '/work/dimaio/projects/casp9/T0518/run_12/symmdef/3h3lA_201_C2.symm' ERROR:: Exit from: src/core/conformation/symmetry/SymmData.cc line: 535 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish CASP9_bv_benchmark_hybridization_run48_T0563_0_C3_SAVE_ALL_OUT_IGNORE_THE_REST_17886_5_0 ERROR: [ERROR] Error opening symmetry file '/work/dimaio/projects/casp9/T0563/run_12/symmdef/1unbA_301_C3.symm' ERROR:: Exit from: src/core/conformation/symmetry/SymmData.cc line: 535 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish CASP9_bv_benchmark_hybridization_run48_T0521_0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_17845_6_0 ERROR: [ERROR] Error opening symmetry file '/work/dimaio/projects/casp9/T0521/run_12/symmdef/3l19B_102_C2.symm' ERROR:: Exit from: src/core/conformation/symmetry/SymmData.cc line: 535 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish The C1 units appear to be fine on my Mac. CASP9_bv_benchmark_hybridization_run48_T0561_2_C1_SAVE_ALL_OUT_IGNORE_THE_REST_17884_5_0 Currently crunching another C1 so we'll see if it holds up. It's about 45 minutes in with a cpu preferred runtime of 4 hours. Best, Snags |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 905 Credit: 1,892,541 RAC: 294 |
2642639 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Out Of Memory (C++ Exception) (0xe06d7363) at address 0x7C812AFB Engaging BOINC Windows Runtime Debugger... - Registers - eax=0222bb88 ebx=004536c0 ecx=00000000 edx=015f6a28 esi=0222bc10 edi=015f65e0 eip=7c812afb esp=0222bb84 ebp=0222bbd8 cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206 - Callstack - ChildEBP RetAddr Args to Child 0222bbd8 0041262e e06d7363 00000001 00000003 0222bc04 kernel32!_RaiseException@16+0x0 0222bc10 004125e5 0222bc20 01431280 012bd338 012c1ab0 minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::atomic_counter+0x0 0222bc30 008c7a3e 04536c00 0222d21c 0222d21c 00000000 minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::atomic_counter+0x0 0222bc48 008c80cf 004536c0 0222bd74 00a2bb72 e3d4513a minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::get+0x0 0222bce8 00a2c15d 0222bd74 0222d21c e3d453e6 0000008d minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::get+0x0 0222be34 008dc042 0222c05c 0222d21c 0222c104 0222d2cc minirosetta_beta_3.26_windows_i!cppdb::mutex::~mutex+0x0 0222d4ec 008e5985 069a0490 00000000 069a0490 00000001 minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::get+0x0 0222d554 00e97fc4 00000001 1a3dd8c0 069a0490 00000000 minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::get+0x0 0222d5ac 004fd499 1a3dd8c0 00000001 069a0490 1a3d5510 minirosetta_beta_3.26_windows_i!cppdb::backend::static_driver::in_use+0x0 0222de3c 005025b2 069a0490 e3d4337a 060a54f8 1a270d48 minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::atomic_counter+0x0 0222dea8 00a34e37 069a0490 1a270d48 01199e1b 00000000 minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::atomic_counter+0x0 0222dec0 00a3543b 0222eb48 e3d4333a 0222eb48 000000dd minirosetta_beta_3.26_windows_i!cppdb::mutex::~mutex+0x0 0222dee8 00c71ebf 0222eb48 1a3d5510 00000000 06a15688 minirosetta_beta_3.26_windows_i!cppdb::mutex::~mutex+0x0 0222e51c 00c5ffea 0222eb48 e3d4052a 063a62d8 
06a9aaf8 minirosetta_beta_3.26_windows_i!cppdb::mutex::~mutex+0x0 0222e8f8 0060b673 0222eb48 e3d404be 063a62d8 06a9aaf8 minirosetta_beta_3.26_windows_i!cppdb::mutex::~mutex+0x0 0222e96c 0060b84c 0222eb48 06a9aaf8 063a6354 00000008 minirosetta_beta_3.26_windows_i!cppdb::backend::driver::connect+0x0 0222e984 0060c36c 0222eb48 06a9aaf8 e3d40786 00000000 minirosetta_beta_3.26_windows_i!cppdb::backend::driver::connect+0x0 0222ea54 005f4e1d 0222eb48 e3d40136 02ac4610 02ac4610 minirosetta_beta_3.26_windows_i!cppdb::backend::driver::connect+0x0 0222ece4 00612a72 00000000 e3d400da 02ac4620 0222ecec minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::atomic_counter+0x0 0222ed08 006097df 00000000 e3d40092 00000000 0000000d minirosetta_beta_3.26_windows_i!cppdb::backend::driver::connect+0x0 0222ed40 00405450 00000000 e3d402ca 00000000 00000000 minirosetta_beta_3.26_windows_i!cppdb::backend::driver::connect+0x0 0222ef18 004056fd 0000001b 0222ef30 00052310 0222ef30 minirosetta_beta_3.26_windows_i!+0x0 0222ff30 0041814e 00400000 00000000 00052357 0000000a minirosetta_beta_3.26_windows_i!+0x0 0222ffc0 7c817077 00000000 00000000 7ffd5000 e06d7363 minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::atomic_counter+0x0 0222fff0 00000000 004181a1 00000000 00000000 00000000 kernel32!_BaseProcessStart@4+0x0 *** Dump of thread ID 3832 (state: Waiting): *** - Information - Status: Wait Reason: ExecutionDelay, , Kernel Time: 701008.000000, User Time: 300432.000000, Wait Time: 2061862.000000 - Registers - eax=00000000 ebx=00000000 ecx=0410f898 edx=00000304 esi=00000000 edi=0410ff64 eip=7c91e514 esp=0410ff34 ebp=0410ff8c cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202 - Callstack - ChildEBP RetAddr Args to Child 0410ff30 7c91d21a 7c8023f1 00000000 0410ff64 7c801e1a ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0] 0410ff34 7c8023f1 00000000 0410ff64 7c801e1a 00000002 ntdll!_NtDelayExecution@8+0x0 FPO: [2,0,0] 0410ff8c 7c802455 00000064 00000000 0410ffb4 004080a8 
kernel32!_SleepEx@8+0x0 0410ff9c 004080a8 00000064 0000000c 19b4dcec 404e91a7 kernel32!_Sleep@4+0x0 0410ffb4 7c80b729 00000000 0000000c 00000002 00000000 minirosetta_beta_3.26_windows_i!+0x0 0410ffec 00000000 00408090 00000000 00000000 eb832e98 kernel32!_BaseThreadStart@8+0x0 *** Dump of thread ID 1272 (state: Waiting): *** - Information - Status: Wait Reason: ExecutionDelay, , Kernel Time: 500720.000000, User Time: 0.000000, Wait Time: 2061778.000000 - Registers - eax=00000000 ebx=05c59600 ecx=7c802413 edx=ffffffff esi=00000000 edi=075dfe48 eip=7c91e514 esp=075dfe18 ebp=075dfe70 cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206 - Callstack - ChildEBP RetAddr Args to Child 075dfe14 7c91d21a 7c8023f1 00000000 075dfe48 00000031 ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0] 075dfe18 7c8023f1 00000000 075dfe48 00000031 00000000 ntdll!_NtDelayExecution@8+0x0 FPO: [2,0,0] 075dfe70 7c802455 000007d0 00000000 075dff68 00619853 kernel32!_SleepEx@8+0x0 075dfe80 00619853 000007d0 e6ab12ba 00000050 05c596f0 kernel32!_Sleep@4+0x0 075dff68 00619a47 00000000 004150c3 00000000 e6ab127a minirosetta_beta_3.26_windows_i!cppdb::backend::driver::connect+0x0 075dffa8 0041514d 060afd50 075dffec 7c80b729 05c596f0 minirosetta_beta_3.26_windows_i!cppdb::backend::driver::connect+0x0 075dffb4 7c80b729 05c596f0 00000050 060afd50 05c596f0 minirosetta_beta_3.26_windows_i!cppdb::atomic_counter::atomic_counter+0x0 075dffec 00000000 004150e9 05c596f0 00000000 08560000 kernel32!_BaseThreadStart@8+0x0 *** Debug Message Dump **** *** Foreground Window Data *** Window Name : Window Class : Window Process ID: 0 Window Thread ID : 0 Exiting... |
TPCBF Joined: 20 Jun 11 Posts: 30 Credit: 27,776 RAC: 0 |
"The 6 Work Unit Limit is a bit of a pain." What 6 WU Limit? Over the last couple of days I had up to 20, if I counted right, of those quickly failing 3.24 ones, and right now I have 9 of the 3.26 Beta WUs in the queue (one currently running)... OK, make that 8 in the queue and one just finished successfully... Ralf |
Conan Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
"The 6 Work Unit Limit is a bit of a pain." "What 6 WU Limit?" When you get a lot of errors, the project limits how many work units you can get so that a lot of work is not 'trashed'. However, if the project itself sends out faulty work units, you get the same result and are then limited in how many work units, PER MACHINE, you can get. A number of others and I hit this limit on some of our computers; we could then only get 6 WUs for the whole day. Once a few successful WUs go through, the limit is lifted and we can get as many as we can handle. Conan |
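The throttling described above can be sketched as a toy model. This is only an illustration of the behaviour reported in this thread, not BOINC's actual scheduler logic: the class name `HostQuota`, the error threshold, the number of successes needed to recover, and the normal limit are all hypothetical; only the reduced limit of 6 per day comes from the posts.

```python
from dataclasses import dataclass


@dataclass
class HostQuota:
    """Toy per-host daily quota. All thresholds are illustrative assumptions."""

    normal_limit: int = 100       # hypothetical ceiling for a healthy host
    reduced_limit: int = 6        # the reduced daily limit reported in this thread
    error_threshold: int = 10     # hypothetical: consecutive errors before throttling
    recovery_successes: int = 3   # hypothetical: successes needed to lift the limit
    consecutive_errors: int = 0
    successes_since_throttle: int = 0
    throttled: bool = False

    def report(self, success: bool) -> None:
        """Record the outcome of one returned work unit."""
        if success:
            self.consecutive_errors = 0
            if self.throttled:
                self.successes_since_throttle += 1
                if self.successes_since_throttle >= self.recovery_successes:
                    self.throttled = False
                    self.successes_since_throttle = 0
        else:
            self.consecutive_errors += 1
            self.successes_since_throttle = 0
            if self.consecutive_errors >= self.error_threshold:
                self.throttled = True

    @property
    def daily_limit(self) -> int:
        """Work units this host may fetch per day in its current state."""
        return self.reduced_limit if self.throttled else self.normal_limit


if __name__ == "__main__":
    quota = HostQuota()
    for _ in range(10):          # a batch of faulty work units all fail
        quota.report(False)
    print(quota.daily_limit)     # host is now throttled
    for _ in range(3):           # a few good work units go through
        quota.report(True)
    print(quota.daily_limit)     # limit is lifted again
```

The point of the sketch is that the host cannot tell a faulty batch from a faulty machine: errors caused by bad work units throttle the host just the same, and it only recovers once valid work arrives.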
[VENETO] boboviz Joined: 9 Apr 08 Posts: 905 Credit: 1,892,541 RAC: 294 |
2647170 ERROR: [ERROR] Error opening symmetry file '/work/dimaio/projects/casp9/T0555/run_12/symmdef/1yc9A_201_C3.symm' ERROR:: Exit from: ..\..\..\src\core\conformation\symmetry\SymmData.cc line: 535 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish |
Saenger Joined: 28 Feb 06 Posts: 13 Credit: 67,395 RAC: 0 |
18,800 seconds for 1.95 credits, that's incredible: 2646029 21363 5 Apr 2012 23:19:34 UTC 9 Apr 2012 21:11:20 UTC Over Success Done 18,800.09 187.68 1.95 Any idea what went so terribly wrong? https://ralph.bakerlab.org/result.php?resultid=2646029 Greetings from Sänger |
Rocco Moretti Volunteer moderator Project developer Project scientist Joined: 18 May 10 Posts: 11 Credit: 30,188 RAC: 0 |
"18,800 seconds for 1.95 credits, that's incredible:" From the stderr output, it looks like your BOINC client actually ran the executable twice: once for 99 decoys, and a second time for just a single decoy, whose output file likely overwrote the output file from the first run. This means that although you crunched for 100 decoys' worth of time, you only sent back (and got credit for) one decoy. Why BOINC re-ran the minirosetta application, I don't know; it might have something to do with the "No heartbeat from core client for 30 sec - exiting" line. If I had to guess, your BOINC Manager was not running or was unresponsive when the minirosetta application finished, so it didn't recognize that the run was done, restarted it, and overwrote the results. |
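The overwrite effect guessed at above can be reproduced with a small simulation. This is a sketch under stated assumptions, not how minirosetta actually writes its results: `run_once`, `count_decoys`, and `simulate` are hypothetical helpers, and the only names taken from the thread are the 99 + 1 decoy split and the `default.out` file name that appears in the logs.

```python
import os
import tempfile


def run_once(path: str, decoys: int, overwrite: bool) -> None:
    """Write one line per decoy; 'w' truncates the file, 'a' appends to it."""
    mode = "w" if overwrite else "a"
    with open(path, mode) as out:
        for i in range(decoys):
            out.write(f"decoy {i}\n")


def count_decoys(path: str) -> int:
    """Count the decoy lines that survive in the output file."""
    with open(path) as handle:
        return sum(1 for _ in handle)


def simulate(overwrite: bool) -> int:
    """Run the application twice (99 decoys, then 1) against the same file."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "default.out")
        run_once(path, 99, overwrite)  # first run
        run_once(path, 1, overwrite)   # restarted run
        return count_decoys(path)


if __name__ == "__main__":
    print("restart overwrites:", simulate(overwrite=True))
    print("restart appends:   ", simulate(overwrite=False))
```

If the restarted run truncates the file, only the single decoy from the second run is left to report, which would match roughly 100 decoys' worth of CPU time being credited as one.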
[VENETO] boboviz Joined: 9 Apr 08 Posts: 905 Credit: 1,892,541 RAC: 294 |
2650831 Unpacking zip data: ../../projects/ralph.bakerlab.org/minirosetta_database_rev48292.zip Unpacking WU data ... Unpacking data: ../../projects/ralph.bakerlab.org/input_CASP9_bz_benchmark_hybridization_run52_T0596_0_C1_yfsong.zip Setting database description ... Setting up checkpointing ... Setting up graphics native ... BOINC:: Worker startup. Starting watchdog... Watchdog active. </stderr_txt> ]]> Validate state Invalid |
rilian Joined: 7 Sep 07 Posts: 35 Credit: 107,666 RAC: 725 |
ERROR: Cannot open PDB file "/work/brunette/experiments/alignment_challenge/raptor_difficult_cases/T0540/native//T0540.pdb" ERROR:: Exit from: src/core/import_pose/import_pose.cc line: 184 BOINC:: Error reading and gzipping output datafile: default.out Some brunette poses are invalid :D https://ralph.bakerlab.org/result.php?resultid=2658197 On a side note, I think the error is in the file path format, with the double slashes here: //T0540.pdb -- I crunch for Ukraine |
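As a quick check on the double-slash theory, the reported path can be normalized with Python's standard `posixpath.normpath`, which collapses repeated separators inside a path. On POSIX systems, repeated slashes in the middle of a path are normally treated the same as a single slash, so the doubled separator may not be what prevented the file from opening. This sketch only cleans up the string and does not establish the actual cause of the error; the helper name `normalize_reported_path` is hypothetical.

```python
import posixpath

# Path copied from the error message above.
REPORTED = (
    "/work/brunette/experiments/alignment_challenge/"
    "raptor_difficult_cases/T0540/native//T0540.pdb"
)


def normalize_reported_path(path: str) -> str:
    """Collapse redundant separators using POSIX path rules."""
    return posixpath.normpath(path)


if __name__ == "__main__":
    print(normalize_reported_path(REPORTED))
```

Using `posixpath` rather than `os.path` keeps the result the same on Windows hosts, where `os.path` would apply Windows path rules instead.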
[VENETO] boboviz Joined: 9 Apr 08 Posts: 905 Credit: 1,892,541 RAC: 294 |
I'm running 3.26 version. Why not 3.31?? |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 905 Credit: 1,892,541 RAC: 294 |
2659979 ====================================================== DONE :: 1 starting structures 4896.75 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== BOINC :: WS_max 4.53599e+008 BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down cleanly ... called boinc_finish </stderr_txt> ]]> Validate state Invalid |
Conan Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
On this Work Unit 2666791 I ran into this error <message> couldn't start Can't write init file: -108: -108 </message> Conan |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 905 Credit: 1,892,541 RAC: 294 |
A LOT of errors on my Win7 32-bit: 2670362 2670356 2670348 2670333 <core_client_version>7.0.25</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> [2012- 6-20 9:15:33:] :: BOINC:: Initializing ... ok. [2012- 6-20 9:15:33:] :: BOINC :: boinc_init() BOINC:: Setting up shared resources ... ok. BOINC:: Setting up semaphores ... ok. BOINC:: Updating status ... ok. BOINC:: Registering timer callback... ok. BOINC:: Worker initialized successfully. Registering options.. Registered extra options. Initializing broker options ... Registered extra options. Initializing core... Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached Loaded options.... ok Processed options.... ok Initializing random generators... ok Initialization complete. Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached Loaded options.... ok Processed options.... ok Initializing random generators... ok Initialization complete. Setting WU description ... Unpacking zip data: ../../projects/ralph.bakerlab.org/minirosetta_database_rev48292.zip Setting database description ... Setting up checkpointing ... Setting up graphics native ... BOINC:: Worker startup. Starting watchdog... Watchdog active. </stderr_txt> ]]> Validate state Invalid |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 905 Credit: 1,892,541 RAC: 294 |
Again, errors 2676163 CPU time 7071.837 stderr out <core_client_version>7.0.25</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> [2012- 6-23 6:27:19:] :: BOINC:: Initializing ... ok. [2012- 6-23 6:27:19:] :: BOINC :: boinc_init() BOINC:: Setting up shared resources ... ok. BOINC:: Setting up semaphores ... ok. BOINC:: Updating status ... ok. BOINC:: Registering timer callback... ok. BOINC:: Worker initialized successfully. Registering options.. Registered extra options. Initializing broker options ... Registered extra options. Initializing core... Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached Loaded options.... ok Processed options.... ok Initializing random generators... ok Initialization complete. Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached Loaded options.... ok Processed options.... ok Initializing random generators... ok Initialization complete. Setting WU description ... Unpacking zip data: ../../projects/ralph.bakerlab.org/minirosetta_database_rev48292.zip Setting database description ... Setting up checkpointing ... Setting up graphics native ... BOINC:: Worker startup. Starting watchdog... Watchdog active. # cpu_run_time_pref: 7200 </stderr_txt> ]]> Validate state Invalid |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 905 Credit: 1,892,541 RAC: 294 |
Usual error 2693765 |
TPCBF Joined: 20 Jun 11 Posts: 30 Credit: 27,776 RAC: 0 |
Since the latest batch started the other day, I get roughly one compute error for each dozen or so WUs that go through just fine... Ralf |
Snagletooth Joined: 4 May 07 Posts: 67 Credit: 134,427 RAC: 0 |
"I'm running 3.26 version. Why not 3.31??" I have this question as well. The 3.30 version included the fix for the Mac slowdown problem, which affected every type of work unit. The fix will presumably be included in every new version going forward, so why would it not be used on Ralph? |
©2024 University of Washington
http://www.bakerlab.org