Message boards : RALPH@home bug list : Bug Reports for 5.45
Author | Message |
---|---|
Chu Volunteer moderator Project developer Project scientist Joined: 26 Sep 06 Posts: 61 Credit: 12,545 RAC: 0 |
If there is any apology to make, it should come from us. Thank you for your time and effort in helping us. I see: you were having a problem when a Rosetta job was pre-empted and swapped in and out with other BOINC applications. This is consistent with my previous speculation that your problem is probably not graphics-related. Honestly speaking, I don't know exactly what has gone wrong either, but it could be somehow related to the BOINC API we were using for Rosetta 5.43 (though that does not explain why the problem did not happen universally on all other clients' machines). The current 5.45 being tested on Ralph has been built with the newest version of the BOINC API, and that might help solve your problem. The plan is to put it on Rosetta@Home either later today or tomorrow. So please give it a try when it is upgraded and see if things improve on your side. Again, thank you for your generous contribution to our project. The error message you got is certainly one of the symptoms related to graphics, but definitely not limited to that. May I ask if you have experienced any stability issues with your machine in general? |
feet1st Joined: 7 Mar 06 Posts: 313 Credit: 116,623 RAC: 0 |
My previously problematic machine just went 18 hrs, screensaver active, without a burp. It successfully completed 3 WUs and is still crunching on a fourth. While these WUs were starting up, I had enabled my screensaver, went to take a shower, and forgot I had left Rosetta active too; by the time I got back to this machine it was hung already. ...the Rosetta WU, not Ralph! ...so I'd say things are looking great on Windows. |
Billy Joined: 29 Jan 07 Posts: 14 Credit: 7,865 RAC: 0 |
It isn't possible to test this update on my Mac as there are no work units. I did get 2 work units one day, but they ran without my noticing them, so I couldn't turn on the graphics. |
Chu Volunteer moderator Project developer Project scientist Joined: 26 Sep 06 Posts: 61 Credit: 12,545 RAC: 0 |
Now it is updated on Rosetta@Home and you will get plenty of WUs to crunch. Just be aware that there is still a minor unsolved problem on Mac platforms. See here > It isn't possible to test this update on my Mac as there are no work units. I did get 2 work units one day, but they ran without my noticing them, so I couldn't turn on the graphics. |
Rhiju Volunteer moderator Project developer Project scientist Joined: 14 Feb 06 Posts: 161 Credit: 3,725 RAC: 0 |
Work units of the form s018__CASP7_ASSEMBLE_SAVE_ALL_OUT_hom001__IGNORE_THE_REST_s018__BOINC_LOOP_RELAX__1446_0.clean.out.2 are acting a little wacky -- I'm working on the fix! > Now it is updated on Rosetta@Home and you will get plenty of WUs to crunch. Just be aware that there is still a minor unsolved problem on Mac platforms. See here |
Conan Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
Had this one fail. I was not at the computer, so I did not operate the BOINC screensaver; I am still using the standard Windows one. All others have progressed with no trouble so far. http://ralph.bakerlab.org/result.php?resultid=411601 <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> # random seed: 2755617 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00681A55 read attempt to address 0x7BCFB090 Engaging BOINC Windows Runtime Debugger... |
Bober [B@P] Joined: 18 Jun 06 Posts: 6 Credit: 15,427 RAC: 0 |
|
tallguy-13088 Joined: 17 Feb 06 Posts: 10 Credit: 121,701 RAC: 0 |
Hello, I just aborted two RALPH work units. They were: s018__CASP7_ASSEMBLE_SAVE_ALL_OUT_hom001__IGNORE_THE_REST_s018__BOINC_LOOP_RELAX__1446_0.clean.out.1_1670_3 - and - s018__CASP7_ASSEMBLE_SAVE_ALL_OUT_hom001__IGNORE_THE_REST_s018__BOINC_LOOP_RELAX__1446_0.clean.out.2_1670_3 Both were at 100%, PRE-EMPTED, and still accumulating time while other projects were active. Earlier this evening, both had accumulated 10+ hours apiece. Upon restarting BOINC Manager (v5.4.9), unit #1 dropped back to 5.442% completion (at 49m 16s accumulated time) and the second went back to 10.442% completion (at 45m 09s accumulated time). The graphics stated the second was in "stage assembly" for the process. I am running W2K Build 2195 Service Pack 4 on dual Xeon 2.8 GHz cores. Ralph@Home was at 5.45. If there is any more info you need, please reply to this post. Thanks! |
=Lupus= Joined: 23 Sep 06 Posts: 4 Credit: 35,610 RAC: 0 |
Result 412972: same 0xc0000005 error. I was not even near the "show grafx" button! Good luck in bug-hunting, =Lupus= |
Conan Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
Got a different one this time. It had got to 100.00% but the BOINC Manager said it was still running. So I checked in my System Monitor (I am using Linux on this Opteron 275 machine) and it said that 1 of my 4 CPUs was idle and the other 3 were at 100%. This then changed, with the idle CPU moving around until all 4 were swapping the idle job from core to core. It also still held 166 MB of memory. I had to abort it; then all CPUs ran at 100% again. This workunit: https://ralph.bakerlab.org/workunit.php?wuid=364578 |
genes Joined: 16 Feb 06 Posts: 45 Credit: 43,706 RAC: 20 |
I am still having problems when I display the graphics, notably when I enable the screensaver on [url=https://ralph.bakerlab.org/show_host_detail.php?hostid=2016]this computer{/url]. I currently have an ATI x850x graphics card installed, and the installed driver is 7-1_xp_dd_ccc_wdm_enu_40211 (catalyst version). Here is what I saw happen: the BOINC screensaver was running, and over time I saw the CPDN graphics, the QMC graphics, and either Ralph or Rosetta (both of which are 5.45). The last graphics I saw were from Ralph or Rosetta, then I came back and saw that the "VPU recover" feature had activated (display driver resets instead of hanging, and prepares a crash report for ATI). I allowed it to submit the report, and the Rosetta/Ralph WU did not crash, but finished normally, so I can't point you to the bad WU. Later today I will put back the NVidia card that I also use with this machine (a GeForce FX5950), and see how that behaves. |
genes Joined: 16 Feb 06 Posts: 45 Credit: 43,706 RAC: 20 |
Rats. I typo'ed the link, and I can't edit it. I'll try again. It's this computer |
genes Joined: 16 Feb 06 Posts: 45 Credit: 43,706 RAC: 20 |
I have the NVidia card installed (it's a GeForce FX5950), and I haven't seen any graphics problems since, either with Ralph or Rosetta. I'm using driver version 93.71. So much for ATI. |
Viromancy Joined: 20 Jan 07 Posts: 7 Credit: 1,425 RAC: 0 |
Well, after all the head scratching in the thread above, it seems I've finally managed to crack the Rosetta Weirdness on my machine. And in some respects it's obvious, while in others it's baffling. It seems I managed to pick a totally borderline overclock setting for my 2.66 GHz C2D. Every other application and BOINC client program ran at 3.46 GHz without any problem, and that included all the overclock stress-test applications I ran. Apparently, though, Rosetta from mid 5.43 onwards doesn't. So after going mad when 5.45 didn't work, I tried dropping the effective clock to 3.40 GHz and Vcore down to 5.125V. Rosetta ver 5.45 now appears to be totally stable for a 1.7% reduction in overclocked processor speed, right up to a 24 hour WU timing. I think I can live with that :-) Bloody peculiar, though. Maybe Rosetta should get prepared for being used as an OC stability check, because nothing else showed any effect; though admittedly I didn't try computing prime numbers for 12 hours... |
zombie67 [MM] Joined: 8 Aug 06 Posts: 75 Credit: 2,396,363 RAC: 6,299 |
|
Conan Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
What happened to the crediting system? It is back to "what you get is what you claim". I checked one person's work units and he is getting up to 50 credits an hour (398 credits on an 8 hour WU with not that many decoys done) on the latest batch. Sure beats the 14 to low 20s that I get for my 6 hours of processing per WU. |
feet1st Joined: 7 Mar 06 Posts: 313 Credit: 116,623 RAC: 0 |
I just got one of these WUs: 1who__BOINC_ABINITIO_CONTROL2__1749_26_0, using rosetta_beta version 545. The graphic doesn't show the sidechains. |
Conan Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
Just had 4 work units fail, all at 1 hour of processing time, and I am expecting the other 2 to fail as well. All the work units got 'stuck' and the Watchdog says it ended the run, but this is not the case. All 4 work units in the BOINC Manager said that they were still running with NO CPU usage but still using up to 308 MB of RAM for each WU. All 4 got to 1 hour (my preferences are for 6 hours) and then said they were 100% complete, but the WU did not release the CPU to go to another task. http://ralph.bakerlab.org/result.php?resultid=420621 http://ralph.bakerlab.org/result.php?resultid=420709 http://ralph.bakerlab.org/result.php?resultid=420761 http://ralph.bakerlab.org/result.php?resultid=420767 Thanks |
Chu Volunteer moderator Project developer Project scientist Joined: 26 Sep 06 Posts: 61 Credit: 12,545 RAC: 0 |
Sounds like a problem interfacing with the BOINC Manager. Those WUs themselves are fine, and several of the ones you killed actually showed that they were stuck at score 0, which means this did not happen in the middle of a simulation. Next time, could you please close the BOINC Manager and re-open it to see if any of these WUs will be finished and reported? If that does not help, then go ahead and kill them. In addition, it seems to be specific to your Linux hosts, but not Windows, right? > Just had 4 Work Units fail, all at 1 hour processing time, I am expecting the other 2 to fail as well. |
Chu Volunteer moderator Project developer Project scientist Joined: 26 Sep 06 Posts: 61 Credit: 12,545 RAC: 0 |
In the early stage of some simulations, we carry out a low-resolution search, and thus sidechains will not be shown. Usually the first box will show either "search backbone" (no sidechains) or "search_full_atom" (with sidechains). > I just got one of these WUs: |
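
For anyone comparing the two numbers in the failed results reported above: the decimal exit code -1073741819 and the hex value 0xc0000005 are the same 32-bit value, which the log itself labels an access violation. Below is a minimal Python sketch of that conversion, assuming the exit code is reported as a signed 32-bit integer. The helper name `to_ntstatus` is made up for illustration and is not part of BOINC or Rosetta.

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex string.

    Hypothetical helper for illustration only; not a BOINC or Rosetta API.
    """
    # Masking with 0xFFFFFFFF maps a negative value onto its
    # two's-complement unsigned 32-bit representation.
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


# The exit code quoted in the failed result above.
print(to_ntstatus(-1073741819))  # prints 0xc0000005
```

The same mask works for any other negative exit code that shows up in a result's stderr output, so the decimal and hex forms can be matched without guessing.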
©2024 University of Washington
http://www.bakerlab.org