Message boards : RALPH@home bug list : Rosetta mini beta and/or android 3.61-3.83
Author | Message |
---|---|
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
I aborted the ones I had at 99%. After 150 h I quit. I think it's a bug in these WUs. No problems with the "ab_12_01_...." WUs. |
Admpicard999 Send message Joined: 4 Sep 16 Posts: 2 Credit: 1,257 RAC: 0 |
I have a WU that has been running 215+ hours, now at 99.7%. It's been at 99+ for at least a week. The deadline passed several days ago. At what point do I give up and cancel the WU? |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"I have a WU that has been running 215+ hours, now at 99.7%. It's been at 99+ for at least a week. The deadline passed several days ago. At what point do I give up and cancel the WU?" Kill it. |
BQL_FFM Send message Joined: 7 Oct 16 Posts: 1 Credit: 53,648 RAC: 0 |
4051342 Name ab_12_01__vall_2011_1rnbA_vall_2011_9mers_3mers_20564_374_0 Workunit 3565236; 4051267 Name ab_12_01__vall_2011_1lisA_vall_2011_9mers_3mers_20564_373_0 Workunit 3565167. Tasks aborted, then started again and again after at most ca. 60 seconds. |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
Rosetta, Rosetta beta, Rosetta mini, Rosetta mini beta... I'm a little bit confused... :-) |
dekim Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
The latest BOINC client release, 7.8.2, is buggy. The BOINC developers are working on a fix, but not knowing how long it may take, I will be testing application-side fixes for the minirosetta app. The main issue is that work may run in dirty slot directories, and when this happens our application crashes. It's causing a 10-20% increase in failures. |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"The latest BOINC client release, 7.8.2, is buggy. The BOINC developers are working on a fix, but not knowing how long it may take, I will be testing application-side fixes for the minirosetta app." I know about this problem, but I thought you were abandoning the "old" 3.x versions.... |
dekim Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
We are still actively using 3.73 for forward folding and for Robetta structure prediction. |
Roadranner Send message Joined: 15 Oct 13 Posts: 4 Credit: 528,888 RAC: 24 |
One task (#4084226) stalled at 83.144%. I'm going to stop it now. |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"We are still actively using 3.73 for forward folding and for Robetta structure prediction." I noticed. We are still crunching this version on Ralph.... |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"We are still actively using 3.73 for forward folding and for Robetta structure prediction." My mistake. I'm crunching the 3.78 app. I hope we start testing the 4.x version more widely. |
TPCBF Send message Joined: 20 Jun 11 Posts: 30 Credit: 27,776 RAC: 0 |
"One task (#4084226) stalled at 83.144%." I got about a dozen or so of these in the last couple of days. They all either ended in a compute error or stalled at various percentages in the 70-90% range, blocking any other useful WU from running. I noticed that pretty much all WUs checkpoint after about 13 secs, then don't show another checkpoint for hours until they crap out. Also, I had to manually remove a dozen or so dead tasks in the task manager to get my machine responsive again... I have the last two running right now, and they show slightly different behaviour (at least while I can watch them). Started about 15 min ago, they show 12% done, 12 min of CPU time vs 15 min of clock time, and the checkpoint time increases by about 10 secs each time I check, but it is still a fraction of the indicated CPU time used. I am using BOINC agent 7.8.3 on an 8GB/i3/Windows 8.1 host... |
Dr Who Fan Send message Joined: 2 Sep 06 Posts: 76 Credit: 107,857 RAC: 0 |
Same error on 4 tasks (4191210, 4191171, 4189917, 4189960): ERROR: Warning: can't open file t000_.fasta! ERROR:: Exit from: ......srccoresequenceutil.cc line: 148 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"ERROR: Warning: can't open file t000_.fasta! ERROR:: Exit from: ......srccoresequenceutil.cc line: 148 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish" The errors above occur after a few seconds; others occur after 2 or 3 hours (4193187, 4193217, etc.): ====================================================== DONE :: 1 starting structures 14521.4 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== BOINC :: WS_max 3.40202e+008 BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down cleanly ... called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>e8bf7_SOL_jumping_all_pairings_SAVE_ALL_OUT_20700_2_1_r875276919_0</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> </message> |
Mad_Max Send message Joined: 15 Nov 12 Posts: 15 Credit: 404,700 RAC: 0 |
ALL of my WUs from the latest batch errored out. Some failed very soon (just 1-2 min) after starting, with similar errors: ERROR: Warning: can't open file t000_.fasta! ERROR:: Exit from: ......srccoresequenceutil.cc line: 148 BOINC:: Error reading and gzipping output datafile: default.out Others failed near the end of the computation with different errors: <message> upload failure: <file_xfer_error> <file_name>40a6c_SOL_jumping_all_pairings_SAVE_ALL_OUT_20700_2_1_r1951005696_0</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> |
Dr Who Fan Send message Joined: 2 Sep 06 Posts: 76 Credit: 107,857 RAC: 0 |
SAME HERE: Application version Rosetta Mini v3.78 windows_intelx86 TASK 4193784 ====================================================== DONE :: 1 starting structures 25656.8 cpu seconds This process generated 2 decoys from 2 attempts ====================================================== BOINC :: WS_max 4.0303e+008 BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down cleanly ... called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>84897_SOL_jumping_all_pairings_SAVE_ALL_OUT_20700_10_1_r1828484885_0</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> </message> ]]> |
PDW Send message Joined: 30 Aug 14 Posts: 6 Credit: 1,832,794 RAC: 0 |
https://ralph.bakerlab.org/workunit.php?wuid=3786787 bdff5_MEM_cplu_180319_jumping_all_pairings_SAVE_ALL_OUT_IGNORE_THE_REST_20754_3 Errors out with... Unpacking zip data: ../../projects/ralph.bakerlab.org/minirosetta_database_d0bf94b.zip Unpacking WU data ... Unpacking data: ../../projects/ralph.bakerlab.org/bdff5_MEM_cplu_180319.zip Setting database description ... Setting up checkpointing ... Setting up graphics native ... Setting up folding (abrelax) ... ERROR: [ERROR] Unable to open constraints file: bdff5_MEM_t000.cst ERROR:: Exit from: ......srccorescoringconstraintsConstraintIO.cc line: 461 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish |
PDW Send message Joined: 30 Aug 14 Posts: 6 Credit: 1,832,794 RAC: 0 |
These two: https://ralph.bakerlab.org/workunit.php?wuid=3786735 https://ralph.bakerlab.org/workunit.php?wuid=3786774 had this error... Starting work on structure: _00001 ERROR: p>=1 && p<= (int) pairings_.size() ERROR:: Exit from: ......srcprotocolsjumpingSheetBuilder.cc line: 157 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish |
©2024 University of Washington
http://www.bakerlab.org