Posts by Pepo

21) Message boards : RALPH@home bug list : Bug report for Rosetta version 5.97 (Message 4132)
Posted 21 Jun 2008 by Pepo
Post:
Linux Boinc 6.2.4, t419N_autoalign_IGNORE_THE_REST_renumbered_4388_2_1: "Maximum disk usage exceeded". And a lot of
Unrecognized XML in parse_init_data_file: fraction_done_update_period
Skipping: 1.000000
Skipping: /fraction_done_update_period

in the stderr_txt.

Peter
22) Message boards : RALPH@home bug list : Bug reports for 5.96 (Message 4124)
Posted 20 Jun 2008 by Pepo
Post:
Linux Boinc 6.2.4, t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0: the task got stuck at 100% and around 3 1/2 hours of CPU time for half a day; after restarting the client, progress jumped to 48% and two hours of CPU time. The run time preference is set to 2 hours.

I'll try to let it finish.

The task is again idle at 100%, 4:16 hours and marked as running. Restarted the client again - task jumped to 59.9% and 2:24 hours... bye bye. (And the aborted one got replaced with a coil2_* task - hopefully not another beast from the same family.)

Peter
23) Message boards : RALPH@home bug list : Bug reports for 5.96 (Message 4123)
Posted 20 Jun 2008 by Pepo
Post:
Linux Boinc 6.2.4, t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0: the task got stuck at 100% and around 3 1/2 hours of CPU time for half a day; after restarting the client, progress jumped to 48% and two hours of CPU time. The run time preference is set to 2 hours.

In the logs I've found the following:

4:27:44 ralph@home [cpu_sched] Resuming t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0
4:27:44 ralph@home [task_debug] task_state=EXECUTING for t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 from unsuspend
4:27:44 ralph@home Resuming task t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 using rosetta_beta version 596
4:30:45 --- Restarting t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 - message timeout
4:30:45 ralph@home [task_debug] task_state=UNINITIALIZED for t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 from kill_task
4:30:45 ralph@home [cpu_sched] Starting t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0(resume)
4:30:45 --- [task_debug] ACTIVE_TASK::start(): forked process: pid 11473
4:30:45 ralph@home [task_debug] task_state=EXECUTING for t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 from start
4:30:45 ralph@home Restarting task t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 using rosetta_beta version 596
4:30:46 --- [error] Process 735 not found
4:34:23 ralph@home [task_debug] result t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 checkpointed
4:37:58 ralph@home [task_debug] result t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 checkpointed
.....
5:26:16 ralph@home [task_debug] result t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 checkpointed
5:29:51 ralph@home [task_debug] result t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_55_0 checkpointed

that's all, idle machine since.

Could the sudden exit at 4:30:45 and the lockup have something in common?
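
To put numbers on the timing in the log excerpt above, the gaps between the quoted timestamps can be computed. This is only a minimal Python sketch: it assumes the H:MM:SS timestamps all fall on the same day, and the helper names `to_seconds` and `gap` are made up for illustration, not part of BOINC.

```python
def to_seconds(stamp):
    """Convert an 'H:MM:SS' log timestamp to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def gap(earlier, later):
    """Seconds elapsed between two timestamps (assumed to be on the same day)."""
    return to_seconds(later) - to_seconds(earlier)


# Timestamps copied from the log excerpt above.
print(gap("4:27:44", "4:30:45"))  # resume -> "message timeout" restart: 181
print(gap("4:34:23", "4:37:58"))  # first two checkpoints shown: 215
print(gap("5:26:16", "5:29:51"))  # last two checkpoints shown: 215
```

So the "message timeout" restart came about three minutes after the resume, and the checkpoint pairs visible in the excerpt are 3 min 35 s apart.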

The slot's stdout.txt contains 52591 lines with "res 13 and var 1 at position 1 is not a proper Nterm variant". That is still just 96% of a 3.2 MB file :-)
I'll try to let it finish.
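
Counting such lines does not need the whole file in memory, which matters once stdout.txt grows large. A minimal Python sketch of how it could be done; the function name `count_matching` is hypothetical and the sample lines in the demonstration are made up:

```python
def count_matching(lines, needle):
    """Return (matching, total) line counts for any iterable of strings."""
    matching = 0
    total = 0
    for line in lines:  # works on a list or on an open file, one line at a time
        total += 1
        if needle in line:
            matching += 1
    return matching, total


# Small in-memory demonstration with made-up sample lines.
sample = [
    "res 13 and var 1 at position 1 is not a proper Nterm variant\n",
    "some other output\n",
    "res 13 and var 1 at position 1 is not a proper Nterm variant\n",
]
print(count_matching(sample, "is not a proper Nterm variant"))  # (2, 3)
```

For a real slot file, pass the open file object instead of a list, e.g. `with open("stdout.txt", errors="replace") as fh: print(count_matching(fh, "Nterm variant"))`.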

Peter
24) Message boards : RALPH@home bug list : Bug reports for 5.96 (Message 4121)
Posted 20 Jun 2008 by Pepo
Post:
... "Maximum disk usage exceeded + ERROR:: Exit from: .refold.cc line: 338", etc. Few debugging outputs.

Once I noticed in the logs "Aborting task t419N_autoalign_IGNORE_THE_REST_renumbered_4290_6_0: exceeded disk limit: 531.22MB > 476.84MB" - the disk space issue seems to be my fault, the free disk space went down to approximately my BOINC "leave xxxx free" preference. But still, the "Report to Microsoft" dialogs should not appear.


No wonder, when the stdout.txt files in two Rosetta Beta 5.96 slot directories, sized 1 GB, contain who knows how many hundreds of thousands of trailing lines of the form
WARNING: refold called with uninit torsions
I still copied your coordinates into misc::, hope that was the right thing to do!
WARNING: refold called with uninit torsions
I still copied your coordinates into misc::, hope that was the right thing to do!
WARNING: refold called with uninit torsions
I still copied your coordinates into misc::, hope that was the right thing to do!
...


And the slots are not being cleaned upon aborting and reporting the task (Win Boinc 6.2.4).

Peter
25) Message boards : RALPH@home bug list : Bug reports for 5.96 (Message 4119)
Posted 20 Jun 2008 by Pepo
Post:
Same here. All my 't419N' WU's are failing.

The same here today. A lot of "Report to Microsoft" confirmation windows, with plenty of apparently unrelated reasons: "Maximum disk usage exceeded + ERROR:: Unable to determine sequence length from pdb file", "Unhandled Exception Record, Reason: Breakpoint Encountered (0x80000003) at address 0x7C90120E", "Incorrect function. (0x1) - exit code 1 (0x1) + ERROR:: Exit from: .refold.cc line: 338", "Maximum disk usage exceeded + ERROR:: Exit from: .refold.cc line: 338", etc. Few debugging outputs.

Once I noticed in the logs "Aborting task t419N_autoalign_IGNORE_THE_REST_renumbered_4290_6_0: exceeded disk limit: 531.22MB > 476.84MB" - the disk space issue seems to be my fault, the free disk space went down to approximately my BOINC "leave xxxx free" preference. But still, the "Report to Microsoft" dialogs should not appear.
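
As a side note on the odd-looking 476.84MB figure: it is what a round limit of 500,000,000 bytes would look like if "MB" in the message means 2^20 bytes. Both of those are assumptions on my side, not something the log states. A quick Python check of the arithmetic:

```python
BYTES_PER_MB = 2 ** 20  # assumption: the message counts 1 MB as 1,048,576 bytes


def bytes_to_mb(n_bytes):
    """Convert a byte count to MB under the 2**20 assumption above."""
    return n_bytes / BYTES_PER_MB


# Hypothetical limit of 500,000,000 bytes, chosen because it matches the log.
print(f"{bytes_to_mb(500_000_000):.2f}MB")  # 476.84MB
```

Under the same assumption, the reported usage of 531.22MB is roughly 54 MB over that limit.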

Peter
26) Message boards : RALPH@home bug list : Bug reports for 5.96 (Message 4114)
Posted 17 Jun 2008 by Pepo
Post:
t405__CASP8_JUMPAB_TYPE2_RES81to192_SAVE_ALL_OUT_BARCODE__4233_54_0:

<core_client_version>6.2.4</core_client_version>
<![CDATA[ - exit code -1073741819 (0xc0000005)
<stderr_txt>
# cpu_run_time_pref: 7200
# random seed: 1181096

Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C911669 read attempt to address 0x00000000
Engaging BOINC Windows Runtime Debugger...

</stderr_txt>
]]>
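
The two numbers in the exit code line are the same 32-bit value, once read as a signed integer and once as unsigned hex. A small Python sketch of the conversion (the helper names are made up for illustration):

```python
def to_unsigned32(code):
    """Reinterpret a (possibly negative) exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


print(hex(to_unsigned32(-1073741819)))  # 0xc0000005
print(to_signed32(0xC0000005))          # -1073741819
```

0xc0000005 is the code reported as "Access Violation" in the stderr output above.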


Peter
27) Message boards : RALPH@home bug list : minirosetta v1.27 bug thread (Message 4112)
Posted 16 Jun 2008 by Pepo
Post:
A slight improvement on this one. This is the first version with any graphics, I can see the text now but still no picture of the atom.

I could already see the minirosetta's texts many versions ago (with BOINC 6.2.x), but similarly with no pictures (which were OTOH available while using BOINC 5.10.45).

I haven't got any 1.27 or 1.28 on Ralph, but I did get a 1.28 on Rosetta (I suppose its code is identical). STILL no graphics (with BOINC 6.2.x, service/secure install).

Are the graphics already supposed to work in such a configuration?

Peter
28) Message boards : RALPH@home bug list : minirosetta v1.27 bug thread (Message 4110)
Posted 11 Jun 2008 by Pepo
Post:
A slight improvement on this one. This is the first version with any graphics, I can see the text now but still no picture of the atom.

I could already see the minirosetta's texts many versions ago (with BOINC 6.2.x), but similarly with no pictures (which were OTOH available while using BOINC 5.10.45).

Peter
29) Message boards : RALPH@home bug list : minirosetta v1.26 bug thread (Message 4092)
Posted 30 May 2008 by Pepo
Post:
Unfortunately, my first 1.26 WU failed.

Incorrect function. (0x1) - exit code 1 (0x1)

fragment reading error -- pdb, chain and seqpos mismatch
position: 66; neighbor: 187; line: 2

ERROR: Cannot open file aaT0396_1JR8A_1_0001.pdb
ERROR:: Exit from: ..\..\src\core\io\pdb\pose_io.cc line: 157

The same outcome here:
process exited with code 1 (0x1, -255)
fragment reading error -- pdb, chain and seqpos mismatch
position: 66; neighbor: 187; line: 2

ERROR: Cannot open file aaT0396_1JR8A_1_0001.pdb
ERROR:: Exit from: src/core/io/pdb/pose_io.cc line: 157


Peter
30) Message boards : Number crunching : CPU run time (Message 4034)
Posted 18 May 2008 by Pepo
Post:
In light of the recent difficulties with memory leaks in Rosetta Mini 1.1x, especially those leaks that took upwards of 20 hours to cause a failure, would it be appropriate to increase the default runtime?

This is possibly not the case, but IMO the events prove the need for a rather broad range of run times - not only the short ones (for fast turnaround of tasks and apps, possibly the devs' main goal) but also the mid and long(est) ones, to test the stability of tasks and apps.

Peter
31) Message boards : RALPH@home bug list : minirosetta v1.20 bug thread (Message 4019)
Posted 13 May 2008 by Pepo
Post:
Since my BOINC upgrade to 6.2.1 (protected service install), done in the middle of the short v1.19 era, I can see no protein-related graphical stuff displayed in minirosetta's graphics window, just progress%, CPU time and user info, which are displayed and being updated correctly.

Is it just a mini v1.20 issue, or is minirosetta not yet supposed to correctly display graphics in BOINC v6.x protected service installation (although I could see no reason for it, as the texts are fine)?

Peter
32) Message boards : RALPH@home bug list : minirosetta 1.19 bug thread (Message 4008)
Posted 8 May 2008 by Pepo
Post:
Here's a new one:

ERROR: Cannot open 9-mer fragment library file: 9merfile

The same here, twice: mtlr_test2_S.00000001.440_3671_1_1 and mtlr_test2_S.00000001.84_3671_1_1:
process exited with code 1 (0x1, -255)
ERROR: Cannot open 9-mer fragment library file: 9merfile
ERROR:: Exit from: src/protocols/frags/TorsionFragment.cc line: 70


Peter
33) Message boards : RALPH@home bug list : minirosetta 1.19 bug thread (Message 3997)
Posted 6 May 2008 by Pepo
Post:
Another compute error to report, although this one needed only 20 seconds to error out.

My mtlr_test1_3612_2_0 was apparently a sibling of the mentioned mtlr_test1_3612_3_0 and exited nearly the same way:
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
WARNING: Override of option -out:nstruct sets a different value

</stderr_txt>


Peter
34) Message boards : Feedback : Run time defaults (Message 3958)
Posted 23 Apr 2008 by Pepo
Post:
...is it better to prefer testing larger number of workunits (producing less decoys for each one, 1-5), or rather somewhat more decoys (5-15-...) from each WU, at the expense of the number of tested WUs?


There could indeed be a reason to also test single WUs for a longer time.
feet1st wrote:
This one mini_abinitio-1bk2_-test_2008-2-6_3310_73_0 shows peak memory of 867MB! It's 17hrs in to a 24hr runtime preference on WinXP.

Memory does not seem to be growing without bounds (i.e. no memory leak), just seems to use a lot, then free it again at various times as it runs.


Peter
35) Message boards : RALPH@home bug list : minirosetta 1.12 bug thread (Message 3916)
Posted 16 Apr 2008 by Pepo
Post:
The new version (v1.12) is now released. Post bugs here!

The 2 v1.12 had the same error I had before on v1.10 & 1.11, running Linux.

Same problem for me also.

Sorry, me toooooo. Tried 3 WUs, still the same.

Peter
36) Message boards : Number crunching : CPU run time (Message 3868)
Posted 9 Apr 2008 by Pepo
Post:
So I was just wondering...

...and not only you...

Would it be better for RALPH to have a bigger cpu run time per WU? Is the default 1 hour run time enough for testing or would it be better for me to raise it up by few hours?
I was just figuring that since this project is focusing on error detection etc., is it really useful to run the WU's longer than the default 1 hour?

We discussed this question two months ago in the Run time defaults thread, unfortunately without reaching any agreement or getting an official response from the devs' side.

The only known note is a one-year-old comment by David Kim:
We would prefer lower run times so that results are returned quicker.


Peter
37) Message boards : RALPH@home bug list : Bug Reports for Rosetta Mini Versions 1.+ (Message 3866)
Posted 9 Apr 2008 by Pepo
Post:
My Linux host did get two of yesterday's failing 1.10 tasks. Today the scheduler suddenly incorrectly claimed the host lacks enough free disk space:
Message from server: No work sent (there was work but you don't have enough disk space allocated)
Message from server: Not enough disk space (only 393.9 MB free for BOINC). Review preferences for maximum disk space used.

The host's projects occupy 466 MB of the 1 GB reserved for Boinc (on the otherwise nearly empty disk); Ralph takes a bit less than 40 MB. (After writing this message: the complete BOINC/ tree disk usage is 599 MB).

I've enlarged the limit to 1.6 GB - the host suddenly received one new 1.10 task consisting of three downloaded files - altogether some 5 MB. Ralph's disk usage rose to a bit more than 40 MB.

I'll keep an eye on the project's and the task slot's disk usage, but I do not really believe that the intermediate files would grow to (much) more than 394 MB... The requested limit must have changed during the last few hours.

Peter
38) Message boards : RALPH@home bug list : Bug Reports for Rosetta Mini Versions 1.+ (Message 3855)
Posted 8 Apr 2008 by Pepo
Post:
unzip:  cannot find zipfile directory in one of minirosetta_database_rev20940.zip or
        minirosetta_database_rev20940.zip.zip, and cannot find minirosetta_database_rev20940.zip.ZIP, period.


Edith says:
I just looked in the projects directory and found the missing .zip, only a second .ZIP was missing ;). In the described thread in there was the residue_types.txt, but if the stupid minirosetta looks for a double-zipped file, it can't find any of course.
Opening the .zip was no problem

Stay cool, Edith ;-) the stupid remarks about "*.zip.zip" and "*.zip.ZIP" belong to the unzip library (its guesswork for possible typos). Example:
$ unzip -v archive
unzip:  cannot find or open archive, archive.zip or archive.ZIP.
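
The guesswork can be pictured like this. It is only a Python sketch derived from the wording of the error messages above, not taken from the unzip sources, and `candidate_names` is a made-up name:

```python
def candidate_names(name):
    """File names apparently tried in turn, judging by the unzip error text."""
    return [name, name + ".zip", name + ".ZIP"]


# Applied to the database archive from the failing task.
for candidate in candidate_names("minirosetta_database_rev20940.zip"):
    print(candidate)
```

With a name that already ends in .zip, the second and third guesses come out as *.zip.zip and *.zip.ZIP, which is exactly what the message lists.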


Peter
39) Message boards : RALPH@home bug list : Bug Reports for Rosetta Mini Versions 1.+ (Message 3852)
Posted 8 Apr 2008 by Pepo
Post:
My Linux host has also got both types of task errors ("Unique best command line context option match not found for -weights" and "End-of-central-directory signature not found...")

Peter
40) Message boards : Current tests : Help us debug minirosetta. (Message 3828)
Posted 12 Mar 2008 by Pepo
Post:
When the new pdb symbols file like http://ralph.bakerlab.org/download/minirosetta_1.09_windows_intelx86.pdb will become available...

The 1.09 version is available now...48MB!

Thanks, Mike. I've noticed it too; it has already been waiting (compressed to 18MB) for two days in my projects' folder, to be paired with any buggy apps. I've just got the first 3 pieces today (but they finished healthy, so again nothing to report) ;-)

Peter





©2024 University of Washington
http://www.bakerlab.org