Message boards : RALPH@home bug list : Bug reports for rosetta_beta_5.77 and rosetta_5.69
Author | Message |
---|---|
dekim Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
Please post any bugs here regarding version rosetta_beta 5.77 and/or rosetta 5.69. The same bug was resolved for both versions (the stable and development versions). |
m.mitch Send message Joined: 12 May 06 Posts: 16 Credit: 159,478 RAC: 357 |
Work unit 550011 is stuck at about 94% complete, with 10 minutes left to run. The BOINC Manager says it's running, but there is no CPU use. It's on a Linux box. The work unit was suspended at one point and the box was also rebooted, but I didn't notice whether either was the direct cause. Should it be aborted? |
mdettweiler Send message Joined: 4 Apr 07 Posts: 11 Credit: 1,010 RAC: 0 |
"The work unit 550011 is stuck at about 94% complete, with 10 minutes left to run. The BOINC Manager says it's running but there is no CPU use." No, don't abort it. Rosetta bases its progress bar and time-to-completion estimates on your preferred run time. For some of the larger workunits that seem to be very common nowadays, that is less (sometimes drastically less) than the time actually required to complete one model, which is the minimum needed to complete a WU. Thus, if the workunit goes over your preferred run time, it will stick at about 10 minutes left, then reduce that and raise the % done very slowly, because it really has no idea how long the workunit is going to take. The % done and time left to completion, at least for Rosetta/RALPH workunits, are just rough estimates. With the new, bigger workunits, if you have a lower run time set (which is recommended for RALPH anyway), most of your workunits will probably go over unless you have a very fast, modern CPU. Long story short, this is normal, so don't abort the workunit; let it run. Some workunits can take up to 4 hours (a couple close to 5, even) per model on my P4 3.2 GHz HT, so in my case they'll take at the very least that amount of time, no matter what time preferences are set. Rosetta doesn't know ahead of time how much time they'll take, so once it goes over your preferred run time, all it can do is make underestimates so people don't freak out if it goes over 100%. :-) |
m.mitch Send message Joined: 12 May 06 Posts: 16 Credit: 159,478 RAC: 357 |
"The work unit 550011 is stuck at about 94% complete, with 10 minutes left to run. The BOINC Manager says it's running but there is no CPU use." I don't think it's normal for the BOINC Manager to report the work unit as running while the CPU is inactive. |
anders n Send message Joined: 16 Feb 06 Posts: 166 Credit: 131,419 RAC: 0 |
Restarting BOINC is the first thing to try when a WU seems stuck. |
mdettweiler Send message Joined: 4 Apr 07 Posts: 11 Credit: 1,010 RAC: 0 |
"The work unit 550011 is stuck at about 94% complete, with 10 minutes left to run. The BOINC Manager says it's running but there is no CPU use." Oh! Sorry, I made a mistake: I didn't notice that you said the CPU was not active at all. If the CPU were being used and the progress and time to completion were as you described, then what I said would be correct, but it doesn't apply when the WU is using no CPU time at all. In that case, I would recommend that you abort the WU. Sorry! :-( |
m.mitch Send message Joined: 12 May 06 Posts: 16 Credit: 159,478 RAC: 357 |
"The work unit 550011 is stuck at about 94% complete, with 10 minutes left to run. The BOINC Manager says it's running but there is no CPU use." No probs Anonymous, I have duly blown it out of the water. Just as well too: I'd left it unsuspended and have no idea how much crunching time it wasted. Cheers |
ramostol Send message Joined: 29 Mar 07 Posts: 24 Credit: 31,121 RAC: 0 |
I commented on a similar problem on a Rosetta message board some time ago (39305). My experience is that if a BOINC project is running but using no CPU (more correctly: using so little CPU time that it is practically unnoticeable), it is because other programs hog the CPU in such a way that the Rosetta crunching is performed not in the Rosetta process but in the kernel_task process. To bring the situation back to normal, examine the active processes on your computer. If you observe a quite active kernel_task process, this would confirm the theory. Then look through all processes to find a program/process that is using lots of CPU while doing nothing sensible, and quit it. You should then see kernel_task shrink and the Rosetta process use CPU as normal. What I did not mention in my original message is that this is probably also the cause of the occasionally reported problem of Rosetta processes running for days and days without stopping. Since BOINC/Rosetta uses the CPU time recorded for the Rosetta process to determine when to terminate it in accordance with your default settings, it knows nothing of the computing going on inside the kernel_task process and will let the process continue for a looooong time. |
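
For anyone who wants to follow ramostol's suggestion of looking through the active processes, here is a rough Python sketch that ranks processes by CPU use. It assumes a POSIX-style `ps` that accepts `-A -o pid,pcpu,comm`; the function names (`parse_ps`, `top_cpu`, `snapshot`) are made up for this example and are not part of BOINC or Rosetta. It only shows which processes are busy right now; it does not by itself confirm or rule out the kernel_task theory.

```python
import subprocess


def parse_ps(output):
    """Parse `ps -A -o pid,pcpu,comm` text into (pid, cpu_percent, command) tuples."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # first line is the column header
        parts = line.split(None, 2)  # command names may contain spaces
        if len(parts) < 3:
            continue
        pid, pcpu, comm = parts
        try:
            rows.append((int(pid), float(pcpu), comm.strip()))
        except ValueError:
            continue  # skip any line that does not look like "pid pcpu command"
    return rows


def top_cpu(rows, n=5):
    """Return the n rows with the highest CPU percentage, busiest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


def snapshot():
    """Run ps once and return the parsed process list (assumes a POSIX ps)."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid,pcpu,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout)
```

Calling `top_cpu(snapshot())` returns the five busiest processes. If the Rosetta process sits near 0% while something else is near the top of the list, that other process is the one to look at before deciding whether to abort the WU.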
©2024 University of Washington
http://www.bakerlab.org