removed from memory by benchmark

Message boards : RALPH@home bug list : removed from memory by benchmark

David@home
Joined: 16 Feb 06
Posts: 24
Credit: 409
RAC: 0
Message 809 - Posted: 3 Mar 2006, 19:47:04 UTC

More an observation, but one of concern after reading the FAQ about checkpoint times and that it is best to keep Rosetta in memory etc. When the benchmark ran it forced RALPH out of memory. Not sure how this can be managed better.


2006-03-03 09:29:51 [ralph@home] Resuming computation for result BARCODE_30_2ci2I_237_4_0 using rosetta_beta version 4.92
2006-03-03 09:31:38 [---] Suspending computation and network activity - running CPU benchmarks
2006-03-03 09:31:38 [ralph@home] Pausing result BARCODE_30_2ci2I_237_4_0 (removed from memory)
2006-03-03 09:31:39 [---] request_reschedule_cpus: process exited
2006-03-03 09:31:40 [---] Running CPU benchmarks
2006-03-03 09:32:37 [---] Benchmark results:
2006-03-03 09:32:37 [---] Number of CPUs: 1
2006-03-03 09:32:37 [---] 1369 double precision MIPS (Whetstone) per CPU
2006-03-03 09:32:37 [---] 2854 integer MIPS (Dhrystone) per CPU
2006-03-03 09:32:37 [---] Finished CPU benchmarks
2006-03-03 09:32:37 [---] Resuming computation and network activity
2006-03-03 09:32:37 [---] schedule_cpus: must schedule
2006-03-03 09:32:37 [ralph@home] Restarting result BARCODE_30_2ci2I_237_4_0 using rosetta_beta version 4.92
Moderator9
Volunteer moderator

Joined: 16 Feb 06
Posts: 251
Credit: 0
RAC: 0
Message 810 - Posted: 3 Mar 2006, 20:13:09 UTC - in response to Message 809.  

The WU should only have lost the work since the last checkpoint.
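
To put rough numbers on this, here is a minimal Python sketch (not anything BOINC or Rosetta ships) that reads the timestamps from the log lines quoted in this thread. The helper names `parse_time`, `find_time` and `lost_seconds` are made up for illustration, and the "last checkpoint" time is a pure assumption, since the log does not show when rosetta_beta last checkpointed.

```python
from datetime import datetime

# Three lines copied from the message log posted above.
LOG = """\
2006-03-03 09:29:51 [ralph@home] Resuming computation for result BARCODE_30_2ci2I_237_4_0 using rosetta_beta version 4.92
2006-03-03 09:31:38 [ralph@home] Pausing result BARCODE_30_2ci2I_237_4_0 (removed from memory)
2006-03-03 09:32:37 [ralph@home] Restarting result BARCODE_30_2ci2I_237_4_0 using rosetta_beta version 4.92
"""


def parse_time(line):
    """Return the timestamp at the start of a message-log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def find_time(log, keyword):
    """Timestamp of the first log line whose text contains `keyword`."""
    for line in log.splitlines():
        if keyword in line:
            return parse_time(line)
    raise ValueError(f"no log line contains {keyword!r}")


def lost_seconds(last_checkpoint, removed_at):
    """Wall-clock seconds between the last checkpoint and removal from memory."""
    return max(0.0, (removed_at - last_checkpoint).total_seconds())


resumed = find_time(LOG, "Resuming computation")
paused = find_time(LOG, "Pausing result")
restarted = find_time(LOG, "Restarting result")

# Time between the resume and the pause, and length of the interruption.
ran_for = (paused - resumed).total_seconds()
interrupted_for = (restarted - paused).total_seconds()

# ASSUMPTION: pretend the last checkpoint was written at 09:30:38.
# The real checkpoint time does not appear in the log.
assumed_checkpoint = datetime(2006, 3, 3, 9, 30, 38)
lost = lost_seconds(assumed_checkpoint, paused)

print(f"ran {ran_for:.0f}s, interrupted {interrupted_for:.0f}s, lost {lost:.0f}s")
```

With a long checkpoint interval the real loss could be much larger than in this made-up case, which is the concern raised in the first post.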

UBT - Halifax--lad
Joined: 15 Feb 06
Posts: 29
Credit: 2,723
RAC: 0
Message 813 - Posted: 4 Mar 2006, 21:50:11 UTC

The way to solve it is to manually run a benchmark when no RALPH or Rosetta WU is in the cache, if at all possible.
Angus
Joined: 17 Feb 06
Posts: 10
Credit: 1,007
RAC: 0
Message 815 - Posted: 5 Mar 2006, 7:30:20 UTC - in response to Message 813.  
Last modified: 5 Mar 2006, 7:31:05 UTC

The way to solve it is to manually do a benchmark when a RALPH or Rosetta WU is not in the cache if at all possible

That is NOT a solution - that is a work-around.

A solution would be to fix the Rosetta client application so that it does not lose the work when suspended and removed from memory for the scheduled benchmark calibration.

David@home
Joined: 16 Feb 06
Posts: 24
Credit: 409
RAC: 0
Message 816 - Posted: 5 Mar 2006, 10:57:00 UTC - in response to Message 815.  
Last modified: 5 Mar 2006, 10:59:11 UTC

Exactly. For test projects it is to be expected that you spend time doing things to help, but for production experiments you just want to let BOINC do its stuff. You should not have to micromanage BOINC. Look at the success of the BBC climate project. It must be the fastest-growing project, which in part will be due to the ease of running it: download, install, enter an email address to register, and that's it.

BOINC was developed initially alongside SETI@home, which has a 60 second checkpoint cycle. Newer projects that use longer checkpoint intervals do not fit well. E.g. a user who only runs it as a screen saver, or sets it to run when idle, could lose a significant amount of elapsed time redoing the CPU time between checkpoints. The BOINC infrastructure needs to manage running the benchmark differently. BOINC should not need to remove clients from memory; e.g. could it check for free RAM before running the benchmark and just suspend clients? Can Rosetta use a different checkpoint algorithm?
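
The "check for free RAM first" idea could look something like the following Python sketch. This is only an illustration of the suggestion, not how the BOINC client actually decides anything: the `Task` class, the `plan_benchmark_pause` function and all the memory figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    resident_mb: int              # memory the task keeps holding while suspended
    secs_since_checkpoint: float  # work that is lost if the task is removed


def plan_benchmark_pause(tasks, free_mb, benchmark_needs_mb):
    """Return {task name: 'suspend' or 'remove'} for a benchmark run.

    Every task stays suspended in memory if there is already enough free
    RAM for the benchmark. Otherwise tasks are removed one at a time,
    starting with the one that has the least unsaved work, until the
    shortfall is covered.
    """
    plan = {task.name: "suspend" for task in tasks}
    shortfall = benchmark_needs_mb - free_mb
    for task in sorted(tasks, key=lambda t: t.secs_since_checkpoint):
        if shortfall <= 0:
            break
        plan[task.name] = "remove"
        shortfall -= task.resident_mb
    return plan


# Hypothetical numbers: one task that last checkpointed 15 minutes ago.
tasks = [Task("rosetta_beta", resident_mb=150, secs_since_checkpoint=900.0)]

plenty = plan_benchmark_pause(tasks, free_mb=512, benchmark_needs_mb=64)
tight = plan_benchmark_pause(tasks, free_mb=32, benchmark_needs_mb=64)
print(plenty, tight)
```

Removing the task with the least unsaved work first is just one possible choice; the point is only that nothing has to be removed when there is enough free RAM.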




Brotherbard
Joined: 16 Feb 06
Posts: 15
Credit: 76,109
RAC: 0
Message 817 - Posted: 5 Mar 2006, 15:01:10 UTC - in response to Message 809.  

When the benchmark ran it forced RALPH out of memory. Not sure how this can be managed better.


This was a problem with BOINC, not with the science apps. I'm not sure which version fixed it (it might be in the development version), but try updating to the current version for your computer.

--Nathan
David@home
Joined: 16 Feb 06
Posts: 24
Credit: 409
RAC: 0
Message 819 - Posted: 5 Mar 2006, 22:25:08 UTC - in response to Message 817.  

Cool, if a newer dev version of BOINC handles this better then that is good. Part of the alpha test should be to feed back on the BOINC infrastructure as well if it highlights an issue, but it sounds like this is already covered.

UBT - Halifax--lad
Joined: 15 Feb 06
Posts: 29
Credit: 2,723
RAC: 0
Message 849 - Posted: 11 Mar 2006, 13:33:42 UTC

Should be all fixed soon, seeing as Rom is on the case.




©2024 University of Washington
http://www.bakerlab.org