21)
Message boards :
RALPH@home bug list :
Bug report for Rosetta version 5.97
(Message 4132)
Posted 21 Jun 2008 by Pepo Post: Linux Boinc 6.2.4, t419N_autoalign_IGNORE_THE_REST_renumbered_4388_2_1: "Maximum disk usage exceeded". And a lot of "Unrecognized XML in parse_init_data_file: fraction_done_update_period" in the stderr_txt. Peter |
22)
Message boards :
RALPH@home bug list :
Bug reports for 5.96
(Message 4124)
Posted 20 Jun 2008 by Pepo Post: Linux Boinc 6.2.4, t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0: task got stuck at 100% and around 3 1/2 hours CPU time for half a day; after restarting the client, progress jumped to 48% and two hours CPU time. Preference is set to 2 hours. The task is again idle at 100%, 4:16 hours, and marked as running. Restarted the client again - task jumped to 59.9% and 2:24 hours... bye bye. (And the aborted one got replaced with a coil2_* task - hopefully not another beast from the same family.) Peter |
23)
Message boards :
RALPH@home bug list :
Bug reports for 5.96
(Message 4123)
Posted 20 Jun 2008 by Pepo Post: Linux Boinc 6.2.4, t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0: task got stuck at 100% and around 3 1/2 hours CPU time for half a day; after restarting the client, progress jumped to 48% and two hours CPU time. Preference is set to 2 hours. In the logs I've found the following:
4:27:44 ralph@home [cpu_sched] Resuming t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0
4:27:44 ralph@home [task_debug] task_state=EXECUTING for t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 from unsuspend
4:27:44 ralph@home Resuming task t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 using rosetta_beta version 596
4:30:45 --- Restarting t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 - message timeout
4:30:45 ralph@home [task_debug] task_state=UNINITIALIZED for t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 from kill_task
4:30:45 ralph@home [cpu_sched] Starting t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0(resume)
4:30:45 --- [task_debug] ACTIVE_TASK::start(): forked process: pid 11473
4:30:45 ralph@home [task_debug] task_state=EXECUTING for t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 from start
4:30:45 ralph@home Restarting task t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 using rosetta_beta version 596
4:30:46 --- [error] Process 735 not found
4:34:23 ralph@home [task_debug] result t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 checkpointed
4:37:58 ralph@home [task_debug] result t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 checkpointed
.....
5:26:16 ralph@home [task_debug] result t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 checkpointed
5:29:51 ralph@home [task_debug] result t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 checkpointed
That's all, idle machine since.
I think it is possible that the sudden exit at 4:30:45 and the lockup have something in common? The slot's stdout.txt contains 52591 lines with "res 13 and var 1 at position 1 is not a proper Nterm variant". That is still just 96% of the 3.2 MB file :-) I'll try to let it finish. Peter |
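To see which message dominates a runaway stdout.txt like the one described above, identical lines can simply be tallied. The following is a minimal Python sketch, not anything from the Rosetta or BOINC code; the function name `top_repeated_lines` and the in-memory sample are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, List, Tuple

def top_repeated_lines(lines: Iterable[str], n: int = 5) -> List[Tuple[str, int]]:
    """Return the n most frequent non-empty lines together with their counts."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return counts.most_common(n)

# Small made-up sample standing in for a slot's stdout.txt.
sample = [
    "res 13 and var 1 at position 1 is not a proper Nterm variant",
    "res 13 and var 1 at position 1 is not a proper Nterm variant",
    "some other output line",
]
for text, count in top_repeated_lines(sample):
    print(f"{count:>8}  {text}")
```

For a real file, an open file object can be passed in place of the list, so the file is read line by line instead of being loaded at once.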
24)
Message boards :
RALPH@home bug list :
Bug reports for 5.96
(Message 4121)
Posted 20 Jun 2008 by Pepo Post: ... "Maximum disk usage exceeded + ERROR:: Exit from: .refold.cc line: 338", etc. Few debugging outputs. No wonder, when the stdout.txt files in two Rosetta Beta 5.96 slot directories, sized 1 GB, contain I have no idea how many hundred thousand trailing lines of the form "WARNING: refold called with uninit torsions". And the slots are not being cleaned up upon aborting and reporting the task (Win Boinc 6.2.4). Peter |
25)
Message boards :
RALPH@home bug list :
Bug reports for 5.96
(Message 4119)
Posted 20 Jun 2008 by Pepo Post: Same here. All my 't419N' WU's are failing. The same here today. Lots of "Report to Microsoft" confirmation windows, with plenty of apparently unrelated reasons: "Maximum disk usage exceeded + ERROR:: Unable to determine sequence length from pdb file", "Unhandled Exception Record, Reason: Breakpoint Encountered (0x80000003) at address 0x7C90120E", "Incorrect function. (0x1) - exit code 1 (0x1) + ERROR:: Exit from: .refold.cc line: 338", "Maximum disk usage exceeded + ERROR:: Exit from: .refold.cc line: 338", etc. Few debugging outputs. Once I noticed in the logs "Aborting task t419N_autoalign_IGNORE_THE_REST_renumbered_4290_6_0: exceeded disk limit: 531.22MB > 476.84MB" - the disk space issue seems to be my fault, as the free disk space went down to approx. my BOINC "leave xxxx free" preference. But still, the "Report to Microsoft" dialogs should not appear. Peter |
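The abort message quoted above compares the slot's usage with a bound, both in MB. As a rough Python sketch (the regular expression is inferred from that single quoted message and may not match other client versions; the helper name `parse_disk_limit` is made up), the two figures can be extracted and the overage computed. The last line checks a plausible reading of the bound: if 1 MB means 2**20 bytes, a limit of 500,000,000 bytes would be displayed as 476.84 MB, although the post itself does not state the limit in bytes.

```python
import re
from typing import Optional, Tuple

# Pattern assumed from the one message quoted in the post.
_DISK_LIMIT = re.compile(r"exceeded disk limit: ([\d.]+)MB > ([\d.]+)MB")

def parse_disk_limit(line: str) -> Optional[Tuple[float, float, float]]:
    """Return (used_mb, limit_mb, overage_mb), or None if the line does not match."""
    m = _DISK_LIMIT.search(line)
    if m is None:
        return None
    used, limit = float(m.group(1)), float(m.group(2))
    return used, limit, round(used - limit, 2)

msg = ("Aborting task t419N_autoalign_IGNORE_THE_REST_renumbered_4290_6_0: "
       "exceeded disk limit: 531.22MB > 476.84MB")
print(parse_disk_limit(msg))

# Assumption: 1 MB = 2**20 bytes; 500,000,000 bytes then rounds to 476.84 MB.
print(round(500_000_000 / 2**20, 2))
```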
26)
Message boards :
RALPH@home bug list :
Bug reports for 5.96
(Message 4114)
Posted 17 Jun 2008 by Pepo Post: t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_54_0: <core_client_version>6.2.4</core_client_version> Peter |
27)
Message boards :
RALPH@home bug list :
minirosetta v1.27 bug thread
(Message 4112)
Posted 16 Jun 2008 by Pepo Post: A slight improvement on this one. This is the first version with any graphics, I can see the text now but still no picture of the atom. I haven't got any 1.27 nor a 1.28 on Ralph, but a 1.28 on Rosetta (I suppose its code is identical). STILL no graphics (with BOINC 6.2.x, service/secure install). Are the graphics already supposed to work on such a configuration? Peter |
28)
Message boards :
RALPH@home bug list :
minirosetta v1.27 bug thread
(Message 4110)
Posted 11 Jun 2008 by Pepo Post: A slight improvement on this one. This is the first version with any graphics, I can see the text now but still no picture of the atom. I could already see the minirosetta's texts many versions ago (with BOINC 6.2.x), but similarly with no pictures (which were OTOH available while using BOINC 5.10.45). Peter |
29)
Message boards :
RALPH@home bug list :
minirosetta v1.26 bug thread
(Message 4092)
Posted 30 May 2008 by Pepo Post: Unfortunately, my first 1.26 WU failed. The same outcome here: process exited with code 1 (0x1, -255) fragment reading error -- pdb, chain and seqpos mismatch position: 66; neighbor: 187; line: 2 ERROR: Cannot open file aaT0396_1JR8A_1_0001.pdb ERROR:: Exit from: src/core/io/pdb/pose_io.cc line: 157 Peter |
30)
Message boards :
Number crunching :
CPU run time
(Message 4034)
Posted 18 May 2008 by Pepo Post: In light of the recent difficulties with memory leaks in Rosetta Mini 1.1x, especially those leaks that took upwards of 20 hours to cause a failure, would it be appropriate to increase the default runtime? This is possibly not the case, but IMO the events prove the necessity of rather broad variance of used runtimes - not only the short ones (for fast turnaround of tasks and apps as possibly the devs' main goal) but also the mid and long(est) ones to test the tasks' and apps' stability. Peter |
31)
Message boards :
RALPH@home bug list :
minirosetta v1.20 bug thread
(Message 4019)
Posted 13 May 2008 by Pepo Post: Since my BOINC upgrade to 6.2.1 (protected service install), done in the middle of the short v1.19 era, I can see no protein-related graphical stuff displayed in minirosetta's graphics window, just progress%, CPU time and user info, which are displayed and being updated correctly. Is it just a mini v1.20 issue, or is minirosetta not yet supposed to correctly display graphics in BOINC v6.x protected service installation (although I could see no reason for it, as the texts are fine)? Peter |
32)
Message boards :
RALPH@home bug list :
minirosetta 1.19 bug thread
(Message 4008)
Posted 8 May 2008 by Pepo Post: Here's a new one: The same here, twice: mtlr_test2_S.00000001.440_3671_1_1 and mtlr_test2_S.00000001.84_3671_1_1: process exited with code 1 (0x1, -255) Peter |
33)
Message boards :
RALPH@home bug list :
minirosetta 1.19 bug thread
(Message 3997)
Posted 6 May 2008 by Pepo Post: Another compute error to report, although this one needed only 20 seconds to error out. My mtlr_test1_3612_2_0 was apparently a sibling of the mentioned mtlr_test1_3612_3_0 and exited nearly the same way: <message> Peter |
34)
Message boards :
Feedback :
Run time defaults
(Message 3958)
Posted 23 Apr 2008 by Pepo Post: ...is it better to prefer testing larger number of workunits (producing less decoys for each one, 1-5), or rather somewhat more decoys (5-15-...) from each WU, at the expense of the number of tested WUs? There could indeed be a reason to also test single WUs for a longer time. feet1st wrote: This one mini_abinitio-1bk2_-test_2008-2-6_3310_73_0 shows peak memory of 867MB! It's 17hrs in to a 24hr runtime preference on WinXP. Peter |
35)
Message boards :
RALPH@home bug list :
minirosetta 1.12 bug thread
(Message 3916)
Posted 16 Apr 2008 by Pepo Post: The new version (v1.12) is now released. Post bugs here! Sorry, me toooooo. Tried 3 WUs, still the same. Peter |
36)
Message boards :
Number crunching :
CPU run time
(Message 3868)
Posted 9 Apr 2008 by Pepo Post: So I was just wondering... ...and not only you... Would it be better for RALPH to have a bigger cpu run time per WU? Is the default 1 hour run time enough for testing or would it be better for me to raise it up by few hours? We were discussing this question two months ago in the Run time defaults thread, unluckily without any agreement or an official response from the devs' side. The only known note is a one-year-old comment by David Kim: We would prefer lower run times so that results are returned quicker. Peter |
37)
Message boards :
RALPH@home bug list :
Bug Reports for Rosetta Mini Versions 1.+
(Message 3866)
Posted 9 Apr 2008 by Pepo Post: My Linux host did get two of yesterday's failing 1.10 tasks. Today the scheduler suddenly and incorrectly claimed that the host lacks enough free disk space: Message from server: No work sent (there was work but you don't have enough disk space allocated) The host's projects occupy 466 MB of the 1 GB reserved for Boinc (on the otherwise nearly empty disk), of which Ralph took a bit less than 40 MB. (After writing this message: the complete BOINC/ tree disk usage is 599 MB.) I've enlarged the limit to 1.6 GB - the host suddenly received one new 1.10 task consisting of three downloaded files - altogether some 5 MB. Ralph's disk usage rose to a bit more than 40 MB. I'll keep an eye on the project's and the task's slot disk usage, but I do not really believe that the intermediate files would grow to (much) more than 394 MB........ The requested limit must have changed during the last few hours. Peter |
38)
Message boards :
RALPH@home bug list :
Bug Reports for Rosetta Mini Versions 1.+
(Message 3855)
Posted 8 Apr 2008 by Pepo Post: unzip: cannot find zipfile directory in one of minirosetta_database_rev20940.zip or minirosetta_database_rev20940.zip.zip, and cannot find minirosetta_database_rev20940.zip.ZIP, period. Stay cool, Edith ;-) The stupid remarks about "*.zip.zip" and "*.zip.ZIP" belong to the unzip library (its guesswork for possible typos). Example:
$ unzip -v archive
unzip: cannot find or open archive, archive.zip or archive.ZIP.
Peter |
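The guesswork described above can be imitated in a few lines. This Python sketch only mirrors the wording of the quoted error messages (the given name, then the name with ".zip" and ".ZIP" appended); it is not taken from the unzip sources, and the function name `candidate_archive_names` is hypothetical.

```python
from typing import List

def candidate_archive_names(name: str) -> List[str]:
    """Names implied by the error text, in order: name, name.zip, name.ZIP."""
    return [name, name + ".zip", name + ".ZIP"]

print(candidate_archive_names("archive"))
print(candidate_archive_names("minirosetta_database_rev20940.zip"))
```

With a name that already ends in ".zip", the appended variants become "*.zip.zip" and "*.zip.ZIP", which is where the odd-looking names in the task's error output come from.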
39)
Message boards :
RALPH@home bug list :
Bug Reports for Rosetta Mini Versions 1.+
(Message 3852)
Posted 8 Apr 2008 by Pepo Post: My Linux host has also got both types of task errors ("Unique best command line context option match not found for -weights" and "End-of-central-directory signature not found...") Peter |
40)
Message boards :
Current tests :
Help us debug minirosetta.
(Message 3828)
Posted 12 Mar 2008 by Pepo Post: When the new pdb symbols file like http://ralph.bakerlab.org/download/minirosetta_1.09_windows_intelx86.pdb will become available... Thanks, Mike. I've noticed it too; it has already been waiting (compressed to 18MB) for two days in my projects' folders, to be paired with any buggy apps. I've just got the first 3 pieces today (but they finished healthy, again nothing to report about)-; Peter |