Bug reports for Ralph 5.15

Rhiju
Volunteer moderator
Project developer
Project scientist

Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 1616 - Posted: 13 May 2006, 20:57:50 UTC
Last modified: 14 May 2006, 21:21:51 UTC

Sorry for the batch of bad work units yesterday -- we're testing a new scientific mode for Rosetta@home. We're really glad we have the Ralph test system to fix these problems before they go out to Rosetta@home on a grand scale! In Ralph 5.15, we've put a fix into the file input/output that should hopefully allow us to run this new mode.
ID: 1616
Moderator9
Volunteer moderator

Joined: 16 Feb 06
Posts: 251
Credit: 0
RAC: 0
Message 1617 - Posted: 14 May 2006, 6:09:13 UTC
Last modified: 14 May 2006, 16:06:12 UTC

I am seeing a very high error rate on Mac systems for the "MAPRELAX_TEST..." work unit type. The failure rate is at or near 100% for Macs. I suspect that there is another batch of bad work units for version 5.15 passing through the system at this time. If I am correct, the storm should pass very quickly, as these seem to error out before any significant processing is done. If you see this on your system, Rhiju has been notified, and I am certain he will take steps to minimize the problem.

All of you should remember that you are saving the larger Rosetta community from a lot of discomfort by helping in the Ralph test program. I know the project staff is very appreciative of your contributions. Thank you for your help.

Moderator9
ID: 1617
suguruhirahara

Joined: 5 Mar 06
Posts: 40
Credit: 11,320
RAC: 0
Message 1618 - Posted: 14 May 2006, 12:25:19 UTC

My computer hasn't faced an error so far. Graphics and work tasks both work well.
ID: 1618
[B^S] sTrey

Joined: 15 Feb 06
Posts: 58
Credit: 15,430
RAC: 0
Message 1619 - Posted: 14 May 2006, 19:54:32 UTC
Last modified: 14 May 2006, 19:55:42 UTC

Running a MAPRELAX_TEST unit now, and either this is a bad unit or the work towards reducing memory usage is going backwards. Currently on my XP box, Task Manager gives the following values for this work unit: memory usage 233MB, peak memory usage 351MB, VM size 391MB. Ouch, this is significantly worse than previous units/Ralph versions I've run. It's taken the memory-hog prize away from FightAIDS@Home (30,113,277MB) which was the undisputed front runner for months.
It's pretty painful for a 1GB machine; pagefile usage is about 1.5GB now and 40% of that is Ralph!

p.s.
PF is about 400 every 2-4 seconds, which is much lower than 5.12 units in insanity mode (2-4K/second). It's about halfway through my 4-hour preference setting and just got pre-empted, will see what happens on the last half.

Graphics: BOINC screensaver set to go to blank screen after 5 minutes. I also inspected the graphics on this WU manually, and I do not know what the memory usage was before that.
ID: 1619
Honza

Joined: 16 Feb 06
Posts: 9
Credit: 1,962
RAC: 0
Message 1620 - Posted: 14 May 2006, 20:16:20 UTC

@ sTrey - just wanted to address the same issue: quite high memory usage.
I believe it is not easy to *know*, before computing a particular WU, what its peak memory usage will be, but some users may experience problems with it.
(This is the reason why some users still stick with the *outdated* Predictor, even with very little user support.)

Does it make sense to introduce a WU memory limit, like "Target CPU run time", in the user's profile?
Or to send selected WUs according to host specification? I know this was addressed before (and I was critical of how the scheduler works), and I'm used to high memory demands from CPDN/SAP, but this is something many users will not expect...
ID: 1620
Rhiju
Volunteer moderator
Project developer
Project scientist

Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 1621 - Posted: 14 May 2006, 20:37:08 UTC - in response to Message 1617.  

Hi all: a few replies to some really useful posts.

I found the problem with the Mac app. It was an unfortunate slipup in version control where the Mac app was built with slightly older code. I'm fixing it now for 5.16. The other embarrassing thing is that the database didn't process my command to cancel these WUs last night ... I should have checked. My apologies to Mac users! My own PowerBook was a "victim".

sTrey and others: thanks much for the posts on memory. I'm going to queue up a few more MAPRELAX_TEST jobs tonight. I'll put in a limit so that only users with 1024MB free can crunch those jobs. Can users report how their memory usage for these jobs differs from earlier jobs (or from the workunits being sent on Rosetta@home)? This is a new style of workunit with potentially very important scientific implications -- but we need to know if it can't be sent to low-memory clients.
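
As a rough illustration of the kind of memory gate described above (a sketch only, not the actual RALPH or BOINC scheduler code; the `Host` class, its `free_memory_mb` field, and the `eligible_hosts` function are hypothetical names), a job could be offered only to hosts reporting at least 1024MB free:

```python
from dataclasses import dataclass

# Threshold taken from the post above; everything else here is hypothetical.
MIN_FREE_MB = 1024


@dataclass
class Host:
    name: str
    free_memory_mb: int  # free memory the host reports, in MB (assumed field)


def eligible_hosts(hosts, min_free_mb=MIN_FREE_MB):
    """Return only the hosts that report at least min_free_mb of free memory."""
    return [h for h in hosts if h.free_memory_mb >= min_free_mb]


hosts = [
    Host("xp-box", 512),
    Host("dual-cpu", 2048),
    Host("powerbook", 1024),
]

# Hosts below the threshold are simply skipped for these jobs.
print([h.name for h in eligible_hosts(hosts)])
```

A host sitting exactly at the threshold is included here; whether the real limit is inclusive is not stated in the thread.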

Finally, thanks for posting to this thread even though the link from the main page was incorrect! We really appreciate the feedback from the ralph community.

ID: 1621
wizzszz

Joined: 28 Apr 06
Posts: 17
Credit: 1,128
RAC: 0
Message 1623 - Posted: 15 May 2006, 0:13:27 UTC

I'm crunching WU MAPRELAX_TEST_hom021_1fna__514_125_0 using rosetta_beta version 515 right now!

The "phantom chain" and "broken chain" phenomenon is still NOT fixed, and I experienced another problem: in Model 2, exactly at Step 38000, RMSD jumped suddenly to 0 (!), while Accepted Energy was still changing...

IMHO, if an RMSD of zero is reached (no difference from the native), there should be no change in Accepted Energy... Is this right?
ID: 1623
Moderator9
Volunteer moderator

Joined: 16 Feb 06
Posts: 251
Credit: 0
RAC: 0
Message 1624 - Posted: 15 May 2006, 1:53:27 UTC - in response to Message 1623.  
Last modified: 15 May 2006, 1:57:58 UTC

I believe that the MAPRELAX workunits are running a CASP target. Since the RMSD is unknown, there is no graph for it, so it will always appear to be zero. The accepted energy will always rise and fall depending on the value for the current accepted shape.
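
To illustrate why the display could read zero, here is a minimal sketch. It assumes the graphics code falls back to 0 when no native structure is available; that fallback is an assumption based on the explanation above, not confirmed Rosetta behaviour. The `rmsd` function and its arguments are hypothetical, and the coordinates are assumed to be already superimposed:

```python
import math


def rmsd(model, native):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes both structures are already superimposed. If no native structure
    is known (native is None), returns 0.0 as a placeholder, mirroring the
    'always appears to be zero' display described above (assumed behaviour).
    """
    if native is None:
        return 0.0  # placeholder, not a real measurement
    if len(model) != len(native):
        raise ValueError("model and native must have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, native)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
native = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

print(rmsd(model, native))  # each atom is 1 unit from its native position
print(rmsd(model, None))    # no native known: placeholder value
```

Under that assumption, a displayed RMSD of zero says nothing about the model, which is why the accepted energy can keep changing.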

I am not seeing the broken chain issue on my Windows machine (not to imply you aren't), but I will watch for it. Sorry I did not check before responding, but are you running Linux? That may matter. If anyone else is seeing the broken chain, please let us know.

You should see an improvement in the graphic text overrun in the description field soon; there is a fix for that.

Moderator9
ID: 1624
BigMike

Joined: 23 Feb 06
Posts: 63
Credit: 58,730
RAC: 0
Message 1625 - Posted: 15 May 2006, 2:04:08 UTC

I'm running 5.15 on WinXP Pro SP1 (Intel) and it's one very memory hungry beast. Currently crunching MAPRELAX_TEST_hom003_1fna__512_5_0.

It was cruising at just under 300M, and my system was paging itself to death. Finally had to reboot since my system was so sluggish. Now the same WU is continuing, but cruising at about 90M, which makes me think there is some kind of memory release issue.

Something's amiss.

--Mike
Don't believe everything you think.
ID: 1625
feet1st

Joined: 7 Mar 06
Posts: 313
Credit: 116,623
RAC: 0
Message 1626 - Posted: 15 May 2006, 3:24:09 UTC
Last modified: 15 May 2006, 3:27:00 UTC

Found these two running tonight when I got home. These are each using roughly double my normal memory, showing 300+MB each on a dual-CPU machine. One later dropped to 250MB, which is still quite high.

MAPRELAX_TEST_hom004_1fna
MAPRELAX_TEST_hom005_1fna
http://www.geocities.com/feet1st/Ralph515.html
ID: 1626
Snake Doctor

Joined: 16 Feb 06
Posts: 37
Credit: 998,880
RAC: 0
Message 1627 - Posted: 15 May 2006, 5:48:38 UTC - in response to Message 1626.  

I am not seeing this kind of memory use on the Mac. Mine usually use about 20MB of real memory and about 169MB virtual. If I turn on the graphic display, the virtual jumps up to a little over 200MB and stays there until the WU I displayed is completed, and then the system gets it back.
ID: 1627




©2024 University of Washington
http://www.bakerlab.org