Posts by David@home

11) Message boards : RALPH@home bug list : Rosetta does not give up CPU time to cleanmgr.exe (Message 560)
Posted 24 Feb 2006 by Profile David@home
Post:
This runs cleanmgr exactly the same way. The first thing cleanmgr does is check how much disk space can be saved by compressing old files. This check can be disabled using the information in the Microsoft article.

Try creating a backup of this registry key, then delete it, and then check RALPH@home.

After the test you can always restore the registry key using the backup you made.

This is all explained in the Microsoft article.
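For anyone who prefers the command line to regedit, here is a rough sketch of the backup, delete and restore steps above. It only prints the `reg` commands and does not run them. The key path is my recollection of the one named in the Microsoft article and is not stated in this thread, so please check it against the article first. The backup file name is made up for the example.

```python
import subprocess

# Assumption: key path as recalled from the Microsoft article; verify before use.
KEY = (r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion"
       r"\Explorer\VolumeCaches\Compress old files")
BACKUP = "compress_old_files_backup.reg"  # hypothetical backup file name


def registry_steps(key=KEY, backup=BACKUP):
    """Build the three reg.exe commands: back up, delete, restore."""
    return [
        ["reg", "export", key, backup],  # save the key to a .reg file
        ["reg", "delete", key, "/f"],    # delete the key without prompting
        ["reg", "import", backup],       # put the key back after the test
    ]


STEPS = registry_steps()
for label, cmd in zip(("backup", "delete", "restore"), STEPS):
    # list2cmdline adds the quoting Windows needs for the spaces in the key name
    print(f"{label}: {subprocess.list2cmdline(cmd)}")
```

Run the backup and delete commands, test cleanmgr with RALPH@home running, then run the restore command if you want the check back.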

12) Message boards : RALPH@home bug list : Rosetta does not give up CPU time to cleanmgr.exe (Message 543)
Posted 23 Feb 2006 by Profile David@home
Post:
cleanmgr.exe runs at normal priority. I have not seen this on my XP Pro SP2 system. Cleanmgr only uses a very small amount of CPU when it is checking for space that can be saved by compressing old files.

If you are happy using regedit, there is a registry key that you can set to stop XP from running this compress space check, which is very slow. I have done this on my system.

This URL has more info:

http://support.microsoft.com/?id=812248


13) Message boards : RALPH@home bug list : application not staying in memory (Message 541)
Posted 23 Feb 2006 by Profile David@home
Post:
I have noticed that RALPH WUs regularly fall out of memory when the client is in paused state.

e.g. from the log file:

23/02/2006 17:17:06|ralph@home|Restarting result BARCODE_30_1cc8A_215_22_0 using rosetta_beta version 4.86
23/02/2006 17:17:06|SETI@home|Pausing result 23dc00aa.19627.28578.1009650.1.195_0 (left in memory)
23/02/2006 18:17:06|ralph@home|Pausing result BARCODE_30_1cc8A_215_22_0 (left in memory)
23/02/2006 18:17:06|SETI@home|Resuming result 23dc00aa.19627.28578.1009650.1.195_0 using setiathome version 4.11
23/02/2006 18:36:56|ralph@home|Result BARCODE_30_1cc8A_215_22_0 exited with zero status but no 'finished' file
23/02/2006 18:36:56|ralph@home|If this happens repeatedly you may need to reset the project.
23/02/2006 18:36:56||request_reschedule_cpus: process exited


As a project reset will delete all files associated with RALPH it would not make sense to do this if this failure to remain in memory is something to do with the new client under test.

Using Windows XP Pro SP2, BOINC v 4.45, Intel P4 single core, no hyperthreading. Client preferences set to leave in memory. Sharing two applications: RALPH@home and SETI@home.





When you say "Client is in paused state", are you saying:
1) That the Rosetta client application has been swapped out by BOINC to run another project application
2) You have paused the workunit from the work tab
3) You have suspended BOINC client activities from the BOINC menu
4) You have suspended the Ralph project in the projects tab in BOINC Manager

If you are talking about 1, 2 or 3 then there is a problem; if you are talking about 4 then it might be normal.



I am referring to 1, i.e. the normal swapping of applications by the BOINC core. I have noticed it happen several times. E.g. in the sample log RALPH had been paused to allow SETI to run, but in this case, 19 minutes after being paused, the RALPH application just dropped out of memory for no reason. It was no longer visible in Windows Task Manager. I am using a one hour switch time.
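If anyone wants to check their own logs for the same pattern, here is a minimal sketch that reads lines in the `date time|project|message` layout shown in the log above and reports how long a result had been paused when it exited without a 'finished' file. The function names are my own, and the matching is based only on the message wording seen in this thread, so other client versions may need different strings.

```python
from datetime import datetime

LOG = """\
23/02/2006 17:17:06|ralph@home|Restarting result BARCODE_30_1cc8A_215_22_0 using rosetta_beta version 4.86
23/02/2006 17:17:06|SETI@home|Pausing result 23dc00aa.19627.28578.1009650.1.195_0 (left in memory)
23/02/2006 18:17:06|ralph@home|Pausing result BARCODE_30_1cc8A_215_22_0 (left in memory)
23/02/2006 18:17:06|SETI@home|Resuming result 23dc00aa.19627.28578.1009650.1.195_0 using setiathome version 4.11
23/02/2006 18:36:56|ralph@home|Result BARCODE_30_1cc8A_215_22_0 exited with zero status but no 'finished' file
23/02/2006 18:36:56||request_reschedule_cpus: process exited
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message


def dropped_results(log_text):
    """Find results that exited while paused and 'left in memory'."""
    paused = {}  # result name -> time it was paused
    drops = []   # (project, result name, time spent paused before exit)
    for line in log_text.splitlines():
        if line.count("|") < 2:
            continue  # not a log line in the expected layout
        when, project, message = parse_line(line)
        words = message.split()
        if message.startswith("Pausing result ") and message.endswith("(left in memory)"):
            paused[words[2]] = when
        elif message.startswith(("Resuming result ", "Restarting result ")):
            paused.pop(words[2], None)  # running again, so no longer paused
        elif message.startswith("Result ") and "exited with zero status" in message:
            name = words[1]
            if name in paused:
                drops.append((project, name, when - paused.pop(name)))
    return drops


DROPS = dropped_results(LOG)
for project, name, gap in DROPS:
    print(f"{project}: {name} exited {gap} after being paused")
```

On the sample log this reports one drop for BARCODE_30_1cc8A_215_22_0 with a gap of 0:19:50, which is the 19 minutes mentioned above.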

14) Message boards : RALPH@home bug list : application not staying in memory (Message 538)
Posted 23 Feb 2006 by Profile David@home
Post:
I have noticed that RALPH WUs regularly fall out of memory when the client is in paused state.

e.g. from the log file:

23/02/2006 17:17:06|ralph@home|Restarting result BARCODE_30_1cc8A_215_22_0 using rosetta_beta version 4.86
23/02/2006 17:17:06|SETI@home|Pausing result 23dc00aa.19627.28578.1009650.1.195_0 (left in memory)
23/02/2006 18:17:06|ralph@home|Pausing result BARCODE_30_1cc8A_215_22_0 (left in memory)
23/02/2006 18:17:06|SETI@home|Resuming result 23dc00aa.19627.28578.1009650.1.195_0 using setiathome version 4.11
23/02/2006 18:36:56|ralph@home|Result BARCODE_30_1cc8A_215_22_0 exited with zero status but no 'finished' file
23/02/2006 18:36:56|ralph@home|If this happens repeatedly you may need to reset the project.
23/02/2006 18:36:56||request_reschedule_cpus: process exited


As a project reset will delete all files associated with RALPH it would not make sense to do this if this failure to remain in memory is something to do with the new client under test.

Using Windows XP Pro SP2, BOINC v 4.45, Intel P4 single core, no hyperthreading. Client preferences set to leave in memory. Sharing two applications: RALPH@home and SETI@home.





15) Message boards : RALPH@home bug list : Discussion of the "1% Hang" issue (Message 335)
Posted 19 Feb 2006 by Profile David@home
Post:
I was out for about 1 hour to do love.
I'm back, now.
WU still undisturbed, suspended into RAM.
Anything to do ?


Hi,

There is a request from dekim a few posts down in this thread:

http://ralph.bakerlab.org/forum_thread.php?id=1#328p

My best guess is that this is for both of us.


16) Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here (Message 332)
Posted 19 Feb 2006 by Profile David@home
Post:
Can you restart boinc and see if it continues on?



Restarted BOINC. The WU appears to have gone back to the start: according to the graphic it is at Model 1 step 78 (and incrementing). If it gets stuck again, is there anything we can do to help pin this down?

From the log:

19/02/2006 19:42:20||request_reschedule_cpus: project op
19/02/2006 19:42:40|ralph@home|Restarting result BARCODE_30_256bA_NATIVE_210_24_0 using rosetta_beta version 4.84
19/02/2006 19:42:40|SETI@home|Pausing result 14au00aa.7506.496.234660.1.92_2 (left in memory)





OK, this time it went through to completion and credit was granted:

http://ralph.bakerlab.org/workunit.php?wuid=3325

Interestingly, this is about 30 minutes of CPU time; it had already done 30 minutes before it hung. These test WUs typically take just under an hour on my PC. It is as if it has only claimed credit for the second 30 minutes of CPU time but carried on the calculations from where it got stuck. Should it have claimed the credit for both periods of CPU activity?

Any comments on the credit and the fact that it did not hang the second time?

It would be interesting to hear any ideas. Thanks.



17) Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here (Message 329)
Posted 19 Feb 2006 by Profile David@home
Post:
Can you restart boinc and see if it continues on?



Restarted BOINC. The WU appears to have gone back to the start: according to the graphic it is at Model 1 step 78 (and incrementing). If it gets stuck again, is there anything we can do to help pin this down?

From the log:

19/02/2006 19:42:20||request_reschedule_cpus: project op
19/02/2006 19:42:40|ralph@home|Restarting result BARCODE_30_256bA_NATIVE_210_24_0 using rosetta_beta version 4.84
19/02/2006 19:42:40|SETI@home|Pausing result 14au00aa.7506.496.234660.1.92_2 (left in memory)


18) Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here (Message 306)
Posted 19 Feb 2006 by Profile David@home
Post:
The people with hung work units in memory waiting for instructions:

I will send a note to David Kim to draw his attention to this thread and provide you with further instructions on what to do. As I write this it is 7:00 am Sunday on the West coast, so assuming he checks his mail on Sunday mornings he should get back to you soon. The information you can provide him is valuable, so please hang in there till he gets back to you.



Many thanks for the update. I just checked and for some reason the WU has dropped out of memory. Even though the project was suspended and BOINC Manager still shows the work unit as preempted, it is no longer in Windows Task Manager and the BOINC Manager log has this info:

19/02/2006 10:59:19|ralph@home|Result BARCODE_30_256bA_NATIVE_210_24_0 exited with zero status but no 'finished' file
19/02/2006 10:59:19|ralph@home|If this happens repeatedly you may need to reset the project.
19/02/2006 10:59:19||request_reschedule_cpus: process exited


Why it did this after being happily suspended for several hours is not clear. The other project was not doing anything other than crunching its work unit at the time, so it was not a side effect of the other project.

My understanding is that the CC will retry this WU once again when I unsuspend the client. I will wait to hear from the devs before doing this.

Edit >> Hmmm, interesting, I just checked something... The last antispyware scan I ran was at around 11:00. Maybe Windows Defender kicked the binary in memory, which caused it to fail as above.


19) Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here (Message 303)
Posted 19 Feb 2006 by Profile David@home
Post:
Sorry, I have just seen your request for the step number etc.

There is a screenshot of the graphic at:

http://mercury.walagata.com/w/appetiser/ralph.gif
20) Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here (Message 302)
Posted 19 Feb 2006 by Profile David@home
Post:
I'd be curious to know the model # and step number it's frozen at, but don't want you to lose the possibility of them asking you to do something first. This data is on the graphic.
Is it a 4.83, 4.85??
What's your switch between projects time?
Are you doing more than one project?
Is this a Hyperthreading host?
CPU type?


WU is BARCODE_30_256bA_NATIVE_210_24_0
Application is rosetta_beta 4.84
3 projects: RALPH and SETI active (+ Rosetta suspended)
Switch interval 60 minutes
No hyperthreading unfortunately
CPU: Pentium 4 2.5GHz
OS is Windows XP Pro SP2


Full Proc specs:

Intel(R) Processor Frequency ID Utility
Version: 5.5.20030402
Time Stamp: 2006/02/19 07:58:58
Number of processors in system: 1
Current processor: #1
Processor Name: Intel(R) Pentium(R) 4 CPU 2.53GHz
Type: 0
Family: F
Model: 2
Stepping: 4
Revision: 1E
L1 Trace Cache: 12 Kµops
L1 Data Cache: 8 KB
L2 Cache: 512 KB
Packaging: FC-PGA2
MMX(TM): Yes
SIMD: Yes
SIMD2: Yes
NetBurst(TM) Microarchitecture: Yes
Expected Processor Frequency: 2.53 GHz
Reported Processor Frequency: 2.53 GHz
Expected System Bus Frequency: 533 MHz
Reported System Bus Frequency: 533 MHz

21) Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here (Message 300)
Posted 19 Feb 2006 by Profile David@home
Post:
I have a RALPH WU stuck at 1% after 37 minutes of CPU time. I have currently suspended the project so it is left in memory.

Is there anything the devs would like me to do to check this out further or should I just abort it and post the link to the result file?



22) Message boards : Current tests : CPU Run Time preference (Message 208)
Posted 18 Feb 2006 by Profile David@home
Post:
Have I understood this discussion correctly? If I download WUs as before, they will run a prediction, end after a variable amount of time and return the results. If I set the CPU run time preference, the WU will run as many predictions as fit into that amount of time (Einstein runs different calculations on the same data, but does not have a time limit). If so, what happens to the partial result? I.e. if I set it to 2 hours, then I assume the work that is cut off at 2 hours will only be a partial result. Is this prediction ignored?

I believe the leave resident in memory issue is fixed (from the other thread), so is this the only item currently worth testing on RALPH? I.e. should we all set such a CPU limit?
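To make the question about the partial result concrete, here is a small sketch of the arithmetic under my reading of the preference. It is not how Rosetta is known to work. The 55 minute prediction time and the function name are made up for the illustration, and whether the leftover time is spent on a prediction that is then discarded is exactly what is being asked.

```python
def run_until_target(model_minutes, target_minutes):
    """Count whole predictions that fit in the target CPU time.

    Returns (completed, elapsed, leftover): the number of finished
    predictions, the minutes they used, and the minutes left before the
    target that a further, partial prediction would occupy.
    """
    elapsed = 0.0
    completed = 0
    for minutes in model_minutes:
        if elapsed + minutes > target_minutes:
            # The next prediction would overrun the target time.
            return completed, elapsed, target_minutes - elapsed
        elapsed += minutes
        completed += 1
    return completed, elapsed, 0.0


# Hypothetical numbers: each prediction takes 55 minutes, target is 2 hours.
RESULT = run_until_target([55, 55, 55], 120)
print(RESULT)
```

With these numbers two predictions finish in 110 minutes and 10 minutes remain, so the open point is whether those 10 minutes are used at all.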

23) Message boards : RALPH@home bug list : Preferences across projects (Message 198)
Posted 18 Feb 2006 by Profile David@home
Post:
Ralph started out with only the defaults and no separate venues, so if you edit those, they go everywhere.



Understood, but my other projects were assigned to a set of general preferences for the home location and not the defaults. My general home location settings were erased on the other projects when I joined RALPH. Is that due to the item you refer to above, i.e. that Ralph started out with only the defaults and no separate venues?

No major issue as I figured it out in time before my cache emptied.
24) Message boards : RALPH@home bug list : Preferences across projects (Message 162)
Posted 17 Feb 2006 by Profile David@home
Post:
Not sure where this fits, but I observed this when I joined RALPH@home.

The general preferences from RALPH@home were applied to all projects. My setting for "connect to network about every" got reset to 0.1 days (the default in RALPH) and my cache almost ran out of work, as RALPH had no work available and the other projects drained the cache to get down to the 0.1 days setting. Also, I think my separate preferences for home vanished. Memory fades, so I am not 100% sure, but I thought I had a set of preferences for location home and that my default location was set to home.

More a BOINC issue than a Rosetta issue but I thought it worth mentioning.






©2024 University of Washington
http://www.bakerlab.org