Posts by Hermes

1) Message boards : RALPH@home bug list : application not staying in memory (Message 780)
Posted 2 Mar 2006 by Hermes
Post:
28/02/2006 23:43:04|ralph@home|Result HOMSdi_homDB018_1di2__228_10_0 exited with zero status but no 'finished' file
28/02/2006 23:43:04|ralph@home|If this happens repeatedly you may need to reset the project.


I had this problem frequently with rosetta@home on my WindowsXP machine. This seems to be a timing issue: the application doesn't receive a heartbeat from the Boinc CC in time and exits because it thinks the CC has been stopped. This happens even when the application is suspended (left in memory).
I finally discovered that a certain program doing many hard drive accesses causes this. Maybe the Windows multitasking system is not up to the job of giving CPU time to the processes that need to run. The same program on a slower computer under Linux lets Boinc/rosetta work flawlessly.

The PC was only running SETI@home at the time above; no user activity, backup, antivirus, etc. was running. The PC has 1GB of RAM, so there is no issue with physical memory availability.


Hmmm, perhaps Windows decided it was time to run one of those findfast-Utilities that scan your harddisks?
2) Message boards : RALPH@home bug list : Maximum disk usage exceeded (Message 744)
Posted 28 Feb 2006 by Hermes
Post:
This definitely is a limit of the Ralph work units:
<workunit>
<name>HOMSb7_homDB015_1b72__226_2</name>
<app_name>rosetta_beta</app_name>
<version_num>489</version_num>
<rsc_fpops_est>40000000000000.000000</rsc_fpops_est>
<rsc_fpops_bound>50000000000000000.000000</rsc_fpops_bound>
<rsc_memory_bound>100000000.000000</rsc_memory_bound>
<rsc_disk_bound>200000000.000000</rsc_disk_bound>
<command_line>
...

The workunit is limited to 200000000.000000 bytes. It has nothing to do with my disk-usage parameters. The offending file is probably stdout.txt, which grows by approximately 1 MB per CPU minute on my PC. So I expect it to fail after 3.5h.


If you are correct, we need to know. Could you set your time parameter to 4 hours and see if that fixes the problem? If not, then try 2 hours.

The WU failed after 3.25h (see the Result here). The next WU failed after just over 5h (see the Result here).
I'll try a "Target CPU run time" of 4 hours the next time I get a WU.
3) Message boards : RALPH@home bug list : Maximum disk usage exceeded (Message 710)
Posted 28 Feb 2006 by Hermes
Post:
This definitely is a limit of the Ralph work units:
<workunit>
<name>HOMSb7_homDB015_1b72__226_2</name>
<app_name>rosetta_beta</app_name>
<version_num>489</version_num>
<rsc_fpops_est>40000000000000.000000</rsc_fpops_est>
<rsc_fpops_bound>50000000000000000.000000</rsc_fpops_bound>
<rsc_memory_bound>100000000.000000</rsc_memory_bound>
<rsc_disk_bound>200000000.000000</rsc_disk_bound>
<command_line>
...

The workunit is limited to 200000000.000000 bytes. It has nothing to do with my disk-usage parameters. The offending file is probably stdout.txt, which grows by approximately 1 MB per CPU minute on my PC. So I expect it to fail after 3.5h.
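
For anyone who wants to redo the estimate with their own numbers, here is a rough sketch of the arithmetic. It assumes that stdout.txt is the only file that grows, that it grows at a constant rate, and that 1 MB means 10^6 bytes; the function name is made up for illustration and is not part of Boinc.

```python
def minutes_until_disk_bound(bound_bytes: float, growth_bytes_per_minute: float) -> float:
    """Estimate CPU minutes until a steadily growing file reaches the disk bound."""
    if growth_bytes_per_minute <= 0:
        raise ValueError("growth rate must be positive")
    return bound_bytes / growth_bytes_per_minute


# Value of <rsc_disk_bound> from the workunit shown above.
RSC_DISK_BOUND = 200_000_000.0
# Assumption: roughly 1 MB (10^6 bytes) of stdout.txt per CPU minute.
GROWTH_PER_MINUTE = 1_000_000.0

minutes = minutes_until_disk_bound(RSC_DISK_BOUND, GROWTH_PER_MINUTE)
hours = minutes / 60
print(f"{minutes:.0f} CPU minutes, about {hours:.1f} hours")
```

Under those assumptions the bound is reached after 200 CPU minutes, a little over 3.3 hours, which is in the same ballpark as the 3.5h guess. A different growth rate on another machine shifts the result proportionally.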
4) Message boards : RALPH@home bug list : Maximum disk usage exceeded (Message 590)
Posted 24 Feb 2006 by Hermes
Post:
This work unit / result was aborted with:
2006-02-24 10:57:25 [ralph@home] Aborting result BARCODE_30_1a32__219_4_0: exceeded disk limit: 200020295.000000 > 200000000.000000
2006-02-24 10:57:25 [ralph@home] Unrecoverable error for result BARCODE_30_1a32__219_4_0 (Maximum disk usage exceeded)

The limit must be part of the work unit, as I still have several GB of free disk space that is available to boinc.


You can set your disk usage parameters in your prefs. It probably exceeded the "use less than xxx %" setting or something like that.


Preferences (Disk and memory usage):
Use no more than: 40 GB disk space
Leave at least: 1 GB disk space free
Use no more than: 50% of total disk space

The Boinc directory currently occupies just under 2 GiB of a 55.8 GiB partition with 23.7 GiB free.
So the Boinc CC is nowhere near its limit on this machine.
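
To put numbers on that, here is a small sketch of the check. It assumes the client applies the most restrictive of the three preferences, and it treats GB and GiB as the same unit for simplicity; the function name and the way the limits are combined are my assumptions, not taken from the Boinc source.

```python
def allowed_disk(max_used: float, min_free: float, max_pct: float,
                 total: float, free: float, used_by_boinc: float) -> float:
    """Return the disk space Boinc may occupy, taking the tightest of three limits."""
    by_absolute = max_used                        # "Use no more than 40 GB"
    by_free = used_by_boinc + free - min_free     # "Leave at least 1 GB free"
    by_percent = total * max_pct / 100.0          # "Use no more than 50% of total"
    return min(by_absolute, by_free, by_percent)


# Figures from the preferences and partition described above.
allowed = allowed_disk(max_used=40.0, min_free=1.0, max_pct=50.0,
                       total=55.8, free=23.7, used_by_boinc=2.0)
print(f"allowed: {allowed:.1f}, currently used: 2.0")
```

With these figures the tightest limit is the "leave 1 GB free" rule, which still allows about 24.7 against roughly 2 in use, so none of the preferences explains a cutoff at 200000000 bytes.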

5) Message boards : RALPH@home bug list : Maximum disk usage exceeded (Message 566)
Posted 24 Feb 2006 by Hermes
Post:
This work unit / result was aborted with:
2006-02-24 10:57:25 [ralph@home] Aborting result BARCODE_30_1a32__219_4_0: exceeded disk limit: 200020295.000000 > 200000000.000000
2006-02-24 10:57:25 [ralph@home] Unrecoverable error for result BARCODE_30_1a32__219_4_0 (Maximum disk usage exceeded)

The limit must be part of the work unit, as I still have several GB of free disk space that is available to boinc.
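
In case it helps when going through longer logs, here is a minimal sketch that pulls the two numbers out of an abort line like the one above and shows how far over the limit the result was. The regular expression is only matched against the message format seen in this log; other client versions may word it differently, and the function name is hypothetical.

```python
import re

# Matches the "exceeded disk limit: <used> > <limit>" part of the message above.
ABORT_RE = re.compile(r"exceeded disk limit: (?P<used>[\d.]+) > (?P<limit>[\d.]+)")


def parse_disk_abort(line: str):
    """Return (used, limit, overshoot) in bytes, or None if the line does not match."""
    match = ABORT_RE.search(line)
    if match is None:
        return None
    used = float(match.group("used"))
    limit = float(match.group("limit"))
    return used, limit, used - limit


line = ("2006-02-24 10:57:25 [ralph@home] Aborting result BARCODE_30_1a32__219_4_0: "
        "exceeded disk limit: 200020295.000000 > 200000000.000000")
used, limit, over = parse_disk_abort(line)
print(f"over the limit by {over:.0f} bytes")
```

Here the overshoot is only 20295 bytes, which fits a file creeping steadily past a fixed 200000000 byte bound rather than a sudden large write.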






©2024 University of Washington
http://www.bakerlab.org