Posts by genes

1) Message boards : RALPH@home bug list : Bug reports for 5.55 (Message 2934)
Posted 29 Mar 2007 by genes
Post:
Had these errors overnight on machines at work, so I didn't see what they did:

resultid=471512
resultid=472465

One's a -161, other's an "incorrect function". I've got one running here right now that has the 100000000000000000000.... problem, resultid=471927, but it looks like it otherwise is operating normally, so I'll let it finish.
2) Message boards : RALPH@home bug list : Bug reports for Ralph 5.52-5.54 (Message 2896)
Posted 21 Mar 2007 by genes
Post:
No problems so far... :) returned 12 valid 5.54, 1 valid 5.53, 50 valid 5.52 WU's all on Windows boxes. No errors yet.
3) Message boards : RALPH@home bug list : Bug reports for 5.49-5.51 (Message 2850)
Posted 10 Mar 2007 by genes
Post:
Some more access violations...

453604
453706
453803
454246

none of them ran longer than 2 minutes.
4) Message boards : RALPH@home bug list : Bug reports for 5.49-5.51 (Message 2835)
Posted 6 Mar 2007 by genes
Post:
Wow! The graphics are awesome! (not a bug) :-)
5) Message boards : RALPH@home bug list : Bug reports for 5.49-5.51 (Message 2827)
Posted 5 Mar 2007 by genes
Post:
Couple more errors, this time not Access Violations, but code 1 "Incorrect function".

resultid=445073
resultid=445076

I did have two finish successfully, though.

Note: All of these errors occurred on machines that were sitting undisturbed, so no graphics were running. I have graphics enabled, but only for an hour, then blank.
6) Message boards : RALPH@home bug list : Bug reports for 5.49-5.51 (Message 2822)
Posted 4 Mar 2007 by genes
Post:
Got a bunch of errors with 5.50, but I was away yesterday, so I haven't seen them run yet...

resultid=444352
resultid=444488
resultid=444516
resultid=444519
resultid=444547
resultid=444580
resultid=444591

All of them are access violations, and they only ran for a few seconds.
7) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2760)
Posted 3 Feb 2007 by genes
Post:
I have the NVidia card installed (it's a GeForce FX5950), and I haven't seen any graphics problems since, either with Ralph or Rosetta. I'm using driver version 93.71. So much for ATI.
8) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2758)
Posted 1 Feb 2007 by genes
Post:
Rats. I typo'ed the link, and I can't edit it. I'll try again. It's this computer
9) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2757)
Posted 1 Feb 2007 by genes
Post:
I am still having problems when I display the graphics, notably when I enable the screensaver on [url=http://ralph.bakerlab.org/show_host_detail.php?hostid=2016]this computer[/url].

I currently have an ATI x850x graphics card installed, and the installed driver is 7-1_xp_dd_ccc_wdm_enu_40211 (catalyst version). Here is what I saw happen: the BOINC screensaver was running, and over time I saw the CPDN graphics, the QMC graphics, and either Ralph or Rosetta (both of which are 5.45). The last graphics I saw were from Ralph or Rosetta, then I came back and saw that the "VPU recover" feature had activated (display driver resets instead of hanging, and prepares a crash report for ATI). I allowed it to submit the report, and the Rosetta/Ralph WU did not crash, but finished normally, so I can't point you to the bad WU.

Later today I will put back the NVidia card that I also use with this machine (a GeForce FX5950), and see how that behaves.
10) Message boards : RALPH@home bug list : Bug Reports for 5.44 (Message 2690)
Posted 21 Jan 2007 by genes
Post:
I saw this WU (which is a 5.44) running this morning, and thought "Ooh, maybe they fixed the graphics", so I clicked on "show graphics". Well, the graphics ran for a few seconds, then locked up. Not only that, but the whole machine locked up. After about 20 seconds or so, a feature of this new ATI driver I'm using (7.1) kicked in: it's called VPU recover. The display driver basically reset itself, and everything came back. The WU in question is still running, it didn't error out. I guess I can safely assume, though, that the graphics bug is not fixed yet.
11) Message boards : RALPH@home bug list : Bug reports for Ralph 5.42 and 5.43 (Message 2647)
Posted 20 Dec 2006 by genes
Post:
Got an error on this WU today: resultid=375110

It was an 0xC0000005.

Running with a Quad Xeon (Sossaman), XPSP2 and Boinc core 5.8.0. Graphics were enabled. No biggie, just sayin'.

Errors seem to be somewhat less with 5.8.0 and 5.43, but they still happen. Not shown, of course, are the ones that I save by stopping and restarting Boinc instead of aborting them. Locking/mutexes are the way to go.
12) Message boards : RALPH@home bug list : Bug reports for Ralph 5.42 and 5.43 (Message 2632)
Posted 16 Dec 2006 by genes
Post:
Yes, that is correct, it did not freeze the computer.

Edit: I can usually prevent frozen WU's from being terminated by pressing ctrl-alt-del to get the task manager to be displayed. I will also usually get the taskbar, so I can then close the BOINC manager on the taskbar, then the BOINC CC in the systray. If I do that (in that order), BOINC shuts down all the projects in an orderly manner, and restarting BOINC will then restart all the current WU's from their last checkpoint, even the frozen Rosetta. It will then usually pass the point at which it froze and complete normally. This kind of freezing is usually due to the graphics (running in screensaver mode).

Edit: I've had this one fail recently. resultid=363974 It produced a lot of nice debug output.

Hi gene, that job just crashed and did not freeze your computer, right? From users' reports and my local tests, it looks like a frozen WU that is forcibly terminated reports its error code as exit code 1073807364 (0x40010004). If a WU crashes on its own without freezing the host computer, it reports its error code as -1073741819 (0xc0000005).
I had a WU fail today, this message was in the log:

12/14/2006 9:14:07 PM|ralph@home|Unrecoverable error for result 1ten__BOINC_POSE_ABRELAX_VARY_ALL_BOND_ANGLES_VARY_ALL_BOND_DISTANCES_NEWRELAXFLAGS_frags83__1561_15_0 ( - exit code -1073741819 (0xc0000005))

This result: resultid=362757

I came back to the computer and had a Windows error message on the screen "Please tell Microsoft about this problem..." . I don't know if graphics were involved, since I was out, however I do have graphics enabled on this machine, and it is a multiprocessor machine. hostid=2016
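For what it's worth, the two codes in the explanation quoted above can be told apart mechanically. Here is a minimal Python sketch, assuming the exit code arrives as a signed decimal the way the log line shows it. The function name and the label strings are made up for illustration, and the "frozen" versus "crashed" reading of each code comes only from the quoted report, not from any official table.

```python
def describe_exit_code(code: int) -> str:
    """Render a logged exit code as 32-bit hex plus a rough label."""
    # Masking maps a signed decimal such as -1073741819 to its
    # unsigned 32-bit form, 0xc0000005.
    status = code & 0xFFFFFFFF
    # Labels are assumptions based on the quoted report above.
    labels = {
        0xC0000005: "crashed on its own (access violation)",
        0x40010004: "frozen WU that was forcibly terminated",
    }
    label = labels.get(status, "not covered by the quoted report")
    return f"0x{status:08x}: {label}"


print(describe_exit_code(-1073741819))
print(describe_exit_code(1073807364))
```

Anything other than those two values falls through to the "not covered" label rather than being guessed at.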


13) Message boards : RALPH@home bug list : Bug reports for Ralph 5.42 and 5.43 (Message 2625)
Posted 15 Dec 2006 by genes
Post:
I had a WU fail today, this message was in the log:

12/14/2006 9:14:07 PM|ralph@home|Unrecoverable error for result 1ten__BOINC_POSE_ABRELAX_VARY_ALL_BOND_ANGLES_VARY_ALL_BOND_DISTANCES_NEWRELAXFLAGS_frags83__1561_15_0 ( - exit code -1073741819 (0xc0000005))

This result: resultid=362757

I came back to the computer and had a Windows error message on the screen "Please tell Microsoft about this problem..." . I don't know if graphics were involved, since I was out, however I do have graphics enabled on this machine, and it is a multiprocessor machine. hostid=2016
14) Message boards : RALPH@home bug list : Bug reports for Ralph 5.42 and 5.43 (Message 2597)
Posted 12 Dec 2006 by genes
Post:
I got a few WU's on my home machine this morning, and it didn't have any Rosetta, and the Ralph WU's appear to be trouble-free so far, screensaver-wise. When I get home later, I'll see if it crashed.

Here's the machine: hostid=2016
15) Message boards : RALPH@home bug list : Bug reports for Ralph 5.37 through 5.40 (Message 2514)
Posted 9 Nov 2006 by genes
Post:
Got this watchdog timeout: resultid=323419 with debugging info.

16) Message boards : RALPH@home bug list : Bug reports for Ralph 5.37 through 5.40 (Message 2492)
Posted 7 Nov 2006 by genes
Post:
Had some -161's:
resultid=317218
resultid=317217
resultid=315313

a couple of "incorrect function" results:
resultid=315015
resultid=315012

Here's a couple with a big dump:
resultid=315014
resultid=315010

and a "downloading" one:
resultid=314488

At least a few of these died while screensaver graphics were running. In general, I come back to the machine and see a different project's screensaver frozen on the screen, but it isn't running. The screensaver cannot be exited, but I can get the machine back with ctrl-alt-del. I look at task manager and see a Ralph WU's process listed, but using no CPU time. If I kill it, the screensaver graphics disappear and I get control of the machine back. The Ralph WU's status changes to "Computation error" and that's that.

That being said, the very latest one (resultid=317218, a -161 error) happened while I was using the machine and no graphics were being displayed.

Still have a couple of 5.38's left, no 5.39's yet.
17) Message boards : RALPH@home bug list : Bug reports for Ralph 5.36 (Message 2441)
Posted 1 Nov 2006 by genes
Post:
Had this one happen a few minutes ago --

http://ralph.bakerlab.org/result.php?resultid=307179

I clicked "show graphics", and the graphics came up, then froze a few seconds later. The Ralph App was still running, and together with the graphics was using up 2 CPU's of my quad CPU (2 virtual) system. I couldn't close the graphics; when I tried, I got a Windows message box (...not responding, end now?), which I canceled a few times to wait (to no avail). When I chose "end now", the Ralph app terminated with a computation error.

I do use the screensaver, which mostly works, but occasionally errors out or needs to be killed like this.
18) Message boards : RALPH@home bug list : Bug Report for Ralph 5.28 (Message 2324)
Posted 6 Oct 2006 by genes
Post:
I tried to google what the error code (0x0000005) means when it happens. Some people suggest a hardware issue, but there are no firm answers. Does anybody have an idea?


I looked at some of the error results that were posted, and the error code I saw was 0xffffffffc0000005. If you look at just the lower 32 bits, that is 0xc0000005, which is an access violation. You know, like using a bad pointer, or trying to read or write memory you don't have.

Hope that helps.
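The "lower 32 bits" step can be checked in a couple of lines. This is only a sketch in Python with made-up helper names. It assumes the long value is simply the 32-bit code sign-extended to 64 bits, which is what the run of leading f's suggests.

```python
def lower_32_bits(value: int) -> int:
    """Keep only the low 32 bits of a wider value."""
    return value & 0xFFFFFFFF


def sign_extend_32_to_64(value32: int) -> int:
    """Widen a 32-bit value to 64 bits, copying the top bit upward."""
    if value32 & 0x80000000:  # top bit set, so fill the upper half with 1s
        return value32 | 0xFFFFFFFF00000000
    return value32


reported = 0xFFFFFFFFC0000005
print(hex(lower_32_bits(reported)))           # 0xc0000005
print(hex(sign_extend_32_to_64(0xC0000005)))  # 0xffffffffc0000005
```

Going both directions shows the two spellings are the same code, just printed at different widths.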
19) Message boards : Current tests : Continue crunching 5.06 Ralph?? (Message 1433)
Posted 29 Apr 2006 by genes
Post:
Well the watchdog finally killed it after almost 17 hours. I had my run-time preference set to 4 hours. I see that the watchdog waits for greater than 4X your preferred time. I guess that answers it...

Resetting my run-time preference to (not selected) so it runs whatever the default is.
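Here is the arithmetic behind that, as a small Python sketch. The factor of 4 is only what I observed above, not something taken from the Ralph source, and the function names are hypothetical.

```python
# Assumption: the watchdog fires once run time exceeds 4x the preferred
# run time, as observed in this post. Not taken from the application source.
WATCHDOG_FACTOR = 4


def watchdog_threshold_hours(preferred_hours: float) -> float:
    """Run time, in hours, beyond which the watchdog would step in."""
    return WATCHDOG_FACTOR * preferred_hours


def would_be_killed(elapsed_hours: float, preferred_hours: float) -> bool:
    """True once the elapsed run time is past the watchdog threshold."""
    return elapsed_hours > watchdog_threshold_hours(preferred_hours)


print(watchdog_threshold_hours(4))  # 16
print(would_be_killed(14, 4))       # False
print(would_be_killed(17, 4))       # True
```

With a 4 hour preference the threshold works out to 16 hours, which fits a WU that was still alive at 14 hours and gone a little before 17.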
20) Message boards : Current tests : Continue crunching 5.06 Ralph?? (Message 1432)
Posted 29 Apr 2006 by genes
Post:
I have this 5.06 WU:

WATCHDOG_KILL_VERY_LONG_JOBS_447_2_1

Currently at 1.03% after 14 hours. It's not stuck, it's moving very slowly, at step 6 million and change. [edit] that's 62 million! [/edit]

When is the watchdog supposed to kill it?


©2024 University of Washington
http://www.bakerlab.org