Posts by Viromancy

1) Message boards : RALPH@home bug list : Bug reports for 5.49-5.51 (Message 2860)
Posted 12 Mar 2007 by Viromancy
Post:
Very short runtime in 5.51 for an abinitio RNA WU that generated 0 decoys from 0 attempts, followed by a validation error

http://ralph.bakerlab.org/result.php?resultid=456095

2) Message boards : RALPH@home bug list : Bug reports for 5.49-5.51 (Message 2843)
Posted 7 Mar 2007 by Viromancy
Post:
Another very rapid access violation in 5.50, this time without any attempt to view the graphics when the WU started: 451729

Half the WUs my machine has processed under 5.50 have now failed within seconds of starting. The "incorrect function" errors with the first two ab-initio RNA folding WUs seem to have stopped, but both of the access violation errors today have been with DOCKING_1rhj_SYMM_11rhj_1_d.s036_bigrun.out units.

3) Message boards : RALPH@home bug list : Bug reports for 5.49-5.51 (Message 2840)
Posted 7 Mar 2007 by Viromancy
Post:
Out of seven WUs run under ver 5.50 so far, I've had three rapid failures within seconds of the run starting:

Two "Incorrect function. (0x1) - exit code 1 (0x1)" - 445054 and 446305

One access violation, which may have occurred when I tried to show graphics - 450818


4) Message boards : RALPH@home bug list : bug report for rosetta 5.47 & 5.48 (Message 2809)
Posted 27 Feb 2007 by Viromancy
Post:
Another watchdog termination:

http://ralph.bakerlab.org/result.php?resultid=437809

5) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2761)
Posted 3 Feb 2007 by Viromancy
Post:
Well, after all the head scratching in the thread above, it seems I've finally managed to crack the Rosetta Weirdness on my machine. And in some respects it's obvious, while in others it's baffling. It seems I managed to pick a totally borderline overclock setting for my 2.66 GHz C2D. Every other application and BOINC client program ran at 3.46 GHz without any problem, and that included all the overclock stress-test applications I ran. Apparently, though, Rosetta from mid 5.43 onwards doesn't.

So after going mad when 5.45 didn't work, I tried dropping the effective clock to 3.40 GHz and Vcore down to 5.125V. Rosetta ver 5.45 now appears to be totally stable for a 1.7% reduction in overclocked processor speed, right up to a 24 hour WU timing. I think I can live with that :-) Bloody peculiar, though. Maybe Rosetta should get prepared for being used as an OC stability check, because nothing else showed any effect; though admittedly I didn't try computing prime numbers for 12 hours...

6) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2742)
Posted 29 Jan 2007 by Viromancy
Post:
The error message you got is certainly one of the symptoms related to graphics, but definitely not limited to that. May I ask if you have experienced any stability issue with your machine in general?


Hi Chu. Apologies for the long post.

No, I've never had any stability issue with my machine for any applications I run on it, with the sole exception that it doesn't like running the BOINC manager at the same time as I'm ripping DVDs. Other than that, it's rock solid. It's fairly well overclocked - I'm running a Core2Duo E6700 at 3.46 GHz, and my PC6400-rated RAM is actually running as PC8200 - but it's tested completely stable and several months of running both cores at 100% capacity 24/7 has never generated a single error for any BOINC application WU except Rosetta. Rosetta, though, became very touchy about running. It would inevitably fail a WU that was pre-empted and swapped out to allow something else to run. I had to leave it running all the time on one core.

We certainly do not want to lose users because of application stability and that is why we are trying to work on improving it. Maybe you can check whether this is improved in 5.45 and if the failure rate goes down significantly, you may consider attaching back to Rosetta@Home.


I was quite puzzled and a bit disturbed at how the failure rate on Rosetta got more and more pronounced over time without any change to my machine's configuration or any other evidence of instability. I kept going for as long as possible because I liked crunching Rosetta and I'd accumulated a very respectable number of WUs. But the failure rate was becoming alarming, and on the 15th-16th January this year some 75-80% of all WUs aborted prematurely. That's when I regretfully had to call a halt. I joined RALPH to see whether the newer versions were more stable with an eye to going back to Rosetta when they're implemented. It's hard to tell, since the fairly irregular availability of work means I don't have a large WU base to draw conclusions from, but both 5.45 and 5.44 before it seem more stable than 5.43 on my machine; for one thing, they can both be swapped in and out to allow other BOINC applications to run without causing problems.

Out of curiosity, since the beta versions seemed more stable, I allowed my BOINC manager to download some new Rosetta workunits under 5.43 on Jan 27th. Sure enough, the first three it tried to run all failed with access violations, here, here and here. The fourth WU succeeded. By that stage, though, I'd had enough again and shut it down.

I have no idea why this is happening, and the 10% failure rate you mention would have been, if anything, an overestimate of the situation during the first few months I was crunching. The problems really seem to stem from the introduction of 5.43; which is puzzling since I don't use the graphics. I'll certainly try Rosetta again when 5.43 is upgraded, but I'd be a lot happier if I knew what was going wrong.


7) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2735)
Posted 28 Jan 2007 by Viromancy
Post:
Failed WU here.

Same type of error that forced me to stop crunching Rosetta altogether after decreasing stability for ver 5.43 resulted in around 75% of WUs aborting prematurely. Never had this problem at all with any WUs from other BOINC applications I run (World Community Grid/Malaria Control) and very rare with Rosetta before version 5.43. Had one instance of the same with version 5.44 here. Also, along with others, saw three odd, unrelated WU failures with ver 5.44 just before 5.45 was introduced here, here and here. I know these latter aren't ver 5.45, but for the sake of completeness I thought it was worth mentioning.

I don't use graphics, at all. All these errors, and almost all of the constant errors being thrown up by Rosetta ver 5.43, occurred while the application was running in the background and the machine was otherwise idle.






©2024 University of Washington
http://www.bakerlab.org