Posts by genes

21) Message boards : RALPH@home bug list : Debugger Stuff (Message 1279)
Posted 20 Apr 2006 by genes
Post:
Carlos,

Yes, I agree it was confusing. But what I think I understand is that all I have to do to have the new debugger installed is to use the newer 5.4.x versions of the BOINC CC.
I usually try to run with the latest, so no problems there. Running with 5.4.4 everywhere now. I saw the new files in the Boinc folder, so hopefully everything's OK.
22) Message boards : Number crunching : Max time (Message 1209)
Posted 18 Apr 2006 by genes
Post:
4.99 seems to have a problem in that multiple models will finish (you can see the red dots in the graphics), yet it will still say "model 1". I had them running over 48 hours before I finally gave up and aborted them. (2 running and 1 waiting)

(yes they were also the "FACONTACTS" WU's.)
23) Message boards : RALPH@home bug list : RALPH Version News! - Version 4.99 (Windows) released! (Message 1194)
Posted 16 Apr 2006 by genes
Post:
Carlos,

Thanks. I did end up aborting them, since they had obvious bugs. I had a lot of red dots showing, but they still claimed they were on the first model.

->links. I was trying to use the angle brackets, and didn't have the patience to fool with it after it didn't work. It would be nice if there was a little button in the message editor that said "make a link".

I'll go check out your debugger link!
24) Message boards : RALPH@home bug list : RALPH Version News! - Version 4.99 (Windows) released! (Message 1180)
Posted 15 Apr 2006 by genes
Post:
Have a couple of 4.99 units that will NOT end. They are not crashing, are advancing and processing multiple models, despite saying "model 1" in the graphics (there are multiple red dots now). Admittedly I had rebooted the machine several times doing various things before realizing that they are saving NO checkpoints, and will restart from zero each time. That got me up to 24 hours on one WU, around 19 hours on the other. Now they are at 45-1/2 hours for one, 40-1/2 for the other, and showing 3.02% and 2.09% complete, respectively, in the BOINC manager.

These are the WU's:

http://ralph.bakerlab.org/workunit.php?wuid=78685
http://ralph.bakerlab.org/workunit.php?wuid=78717

There is also this one, which hasn't even started yet:

http://ralph.bakerlab.org/workunit.php?wuid=78686

This is not an old P2-400 machine here. It is a dual Xeon, 3.06GHz, with 2GB of memory.

(I tried to make these into links, with no success.)

So the question here is: should these be aborted, especially given that 5.00 is out?

[another edit]
In the meantime, I cannot reboot this machine or stop BOINC or these WU's will restart from zero AGAIN.
[/another edit]
25) Message boards : RALPH@home bug list : RALPH Version News! - Version 4.97 (Win/Lin/Mac) released! (Message 1043)
Posted 8 Apr 2006 by genes
Post:
Had my very first one fail --

4/7/2006 9:46:24 PM|ralph@home|Unrecoverable error for result HBLR_1.0_2tif_375_54_0 ( - exit code -1073741819 (0xc0000005))

Result:
http://ralph.bakerlab.org/result.php?resultid=79821
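
The log line above shows the exit code twice: once as a signed decimal and once in hex. A minimal sketch of how the two forms relate is below. The lookup table is an illustrative addition (it uses the standard Windows NTSTATUS names, which the log itself does not print), and the function names are hypothetical.

```python
# Reinterpret a signed 32-bit Windows exit code as its unsigned hex form.
# KNOWN_STATUS is an illustrative lookup table, not something BOINC provides.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000000D: "STATUS_INVALID_PARAMETER",
}


def to_ntstatus(exit_code: int) -> int:
    """Mask to 32 bits so a negative exit code becomes its unsigned value."""
    return exit_code & 0xFFFFFFFF


def describe(exit_code: int) -> str:
    """Format the code as hex and attach a name if the table knows it."""
    status = to_ntstatus(exit_code)
    name = KNOWN_STATUS.get(status, "unknown")
    return f"0x{status:08x} ({name})"


print(describe(-1073741819))  # prints: 0xc0000005 (STATUS_ACCESS_VIOLATION)
```

So -1073741819 and 0xc0000005 are the same value, which Windows reports for an access violation.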
26) Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here (Message 779)
Posted 2 Mar 2006 by genes
Post:

ANY WU THAT IS RESTARTED FOR ANY REASON BEFORE IT REACHES THE FIRST CHECKPOINT WILL START OVER FROM 0%. (the first checkpoint occurs when the percent complete reaches any value GREATER than 1% complete)

Anything that removes the WU from memory before it reaches the first checkpoint is considered to be a restart. (Application swaps with keep in memory set to no, Turning off the computer, Restarting the computer, restarting BOINC, and suspending and restarting the project are all events that remove the WU from memory).



Well, the machine I had set up to test "leave in memory = NO" has restarted a bunch of times, basically every time that the apps switch. I just changed that to "leave in memory = YES".

I would guess that we can't do that test anymore while 4.90 WU's are being sent out.

[edit]
BTW, I'm now running BOINC ver. 5.3.22, since it has the ability to use a "global_prefs_override.xml" file to quickly change preferences like Leave Apps In Memory without worrying what venue a machine belongs to or what other machines the change might affect. FINALLY!
[/edit]
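
For anyone wanting to try the override file mentioned above, here is a rough sketch that builds the contents of a "global_prefs_override.xml" with only the Leave Apps In Memory setting. The element names `global_preferences` and `leave_apps_in_memory` are assumptions based on BOINC's usual preference naming, so check them against the documentation for your client version before relying on this. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_override(leave_in_memory: bool) -> str:
    """Return XML text for an override file with a single preference.

    Element names are assumed, not confirmed by the post above.
    """
    root = ET.Element("global_preferences")
    setting = ET.SubElement(root, "leave_apps_in_memory")
    setting.text = "1" if leave_in_memory else "0"
    return ET.tostring(root, encoding="unicode")


# Print the XML rather than writing it anywhere, so nothing is overwritten.
print(build_override(True))
```

The output would be saved as "global_prefs_override.xml" where the client looks for it, and the client told to re-read its preferences. Because the file is local to one machine, it does not touch venues or other hosts.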

27) Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here (Message 730)
Posted 28 Feb 2006 by genes
Post:
Question: Should we set our run-time preference higher for these 4.90 WU's? Since they seem to be running slowly (due to debugging code maybe?) they aren't going to get much done in the recommended 2 hours. I have mine set at 4 hours for my P3 machines and even they aren't getting much done.
28) Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here (Message 718)
Posted 28 Feb 2006 by genes
Post:
I don't know how long it's supposed to spend "initializing", but I got a new 4.90 WU which has been initializing (at 1%, with the dots blinking) now for over 30 minutes. There is a molecule in the "native" box and one in the "searching" box, but the other boxes are empty. The lines defining the edges of the boxes are also oddly shaped, on the empty boxes the upper right corners are folded down and to the left.

This machine has "leave in memory" set to YES. I'll let it keep running, we'll see what happens.




I saw one do that the other day, but it started and while it was initializing it did an application swap. This left the display just as you described it until the WU started up again.

See if this is what is happening on your system.


It is swapped out right now. We'll see what happens when it comes back. The machine is a Dual P3, 1GHz. Run-time prefs set to 4 hours.

[edit]
It's back. It has gotten past the "initializing" stage, and is on step 25000 or so. Still at 1%, but running (steps counting, graphics moving). Verrry slowwwly. I suspect debugging code has been put into it, much like when Seti Boinc first started out (and was crashing constantly).
[/edit]

29) Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here (Message 712)
Posted 28 Feb 2006 by genes
Post:
I don't know how long it's supposed to spend "initializing", but I got a new 4.90 WU which has been initializing (at 1%, with the dots blinking) now for over 30 minutes. There is a molecule in the "native" box and one in the "searching" box, but the other boxes are empty. The lines defining the edges of the boxes are also oddly shaped, on the empty boxes the upper right corners are folded down and to the left.

This machine has "leave in memory" set to YES. I'll let it keep running, we'll see what happens.

30) Message boards : RALPH@home bug list : Report - Previously Unclassified Work Unit Errors (Message 554)
Posted 24 Feb 2006 by genes
Post:
I just posted this over at Rosetta:

http://boinc.bakerlab.org/rosetta/forum_thread.php?id=1143

The machine in question has been failing Rosetta WU's, and just failed a Ralph WU the same way. I will try setting the run time lower to see if that helps.

They all failed with (0xc000000d) errors.
[edit]
setting run time to 4 hours for now. (2 hours won't do too much on a P3.)
[/edit]
[edit]
the failed WU:
http://ralph.bakerlab.org/result.php?resultid=6153
[/edit]
31) Message boards : RALPH@home bug list : Report "failure when switching projects without keeping applications in memory" bugs here (Message 349)
Posted 20 Feb 2006 by genes
Post:
My machine that has been failing WU's with "Leave in Memory = NO" has completed a WU successfully with it set to YES. I believe it has demonstrated that it can complete a WU without crashing.

The WU:

http://ralph.bakerlab.org/result.php?resultid=5292

The machine:

http://ralph.bakerlab.org/show_host_detail.php?hostid=76

It's currently half finished with a Rosetta WU, so I'll leave it set to YES until the Rosetta finishes, then I'll switch it back. It's a Dual P3 1GHz, also processing CPDN, Einstein, S@H, and S@H Beta. None of those projects seem to be affected by the "Leave in Memory" setting so far.
32) Message boards : Current tests : February 18th, NEW APPLICATION VERSIONS (Message 285)
Posted 19 Feb 2006 by genes
Post:
Thanks for the info. That 4.85 WU is at 92% and 7.5 hours (roughly). Looks like it will come in pretty close to the 8 hour spec.
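
The "pretty close to 8 hours" estimate can be checked with a quick extrapolation. This is only a sketch and assumes progress is linear in CPU time, which may not hold for these WUs. The function name is hypothetical.

```python
def projected_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Extrapolate total run time, assuming progress is linear in time."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


# Figures from the post: roughly 7.5 hours elapsed at 92% complete.
total = projected_total_hours(7.5, 0.92)
print(f"{total:.2f} hours")  # prints: 8.15 hours
```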
33) Message boards : Current tests : Switching between projects with applications removed from memory (Message 284)
Posted 19 Feb 2006 by genes
Post:
Thanks for the info. :-)
34) Message boards : Current tests : Switching between projects with applications removed from memory (Message 247)
Posted 18 Feb 2006 by genes
Post:
I have a 4.85 WU now. Are these new changes for the "Leave In Memory = NO" bug?
35) Message boards : Current tests : February 18th, NEW APPLICATION VERSIONS (Message 246)
Posted 18 Feb 2006 by genes
Post:
Have a 4.85 now. They're changing pretty quickly.
36) Message boards : RALPH@home bug list : Preferences across projects (Message 186)
Posted 18 Feb 2006 by genes
Post:
Anytime you edit General Prefs, the latest ones you edit get propagated to all projects. So, better to edit them on a project where they are close to how you like them already. Then, update from *that* project instead of from the new one (Ralph). Ralph started out with only the defaults and no separate venues, so if you edit those, they go everywhere.

OR, if you want instant gratification, create all the separate venues in Ralph, copy the prefs in from another project yourself, then update from Ralph.

Oops, we both responded at the same time. Same idea.
37) Message boards : RALPH@home bug list : Report "failure when switching projects without keeping applications in memory" bugs here (Message 183)
Posted 18 Feb 2006 by genes
Post:
OK, I have just set that machine's "Leave in Memory" to YES. It has had 2/2 failures. Hopefully it'll get some more work soon.

I have another machine, which has *always* had Leave in Memory set to YES, that returned a good result. This one:

http://ralph.bakerlab.org/show_host_detail.php?hostid=81


38) Message boards : Number crunching : Resource share: RALPH instead of Rosetta@home? (Message 180)
Posted 18 Feb 2006 by genes
Post:
genes, thx for info, now that I had more time, I played with RALPH settings to find how to do separate configs, without having settings "spill over".

I've set the 1 PC which joined RALPH to "work" and "work"'s general settings include the "Leave app in memory when preempted"=NO. I'm not going to run R@H on this one for the time being (as long as I want to test if R v4.84 solved the issue we have with R v4.81)

Apparently a host (PC) can be in location "work" for project X and in location "home" for project Y (had to look in account_*.xml files, field "host_venue")


Dimitris, I just had a 4.84 crash, so the outlook is not good.

This "work for project X and home for project Y" business has caused me no end of confusion since I am now running 8 PC's and a good assortment of projects. Managing all of their settings and deciding which project to update from is a *real pain*. I don't think Boinc was supposed to work that way, but it can end up that way, depending on the projects involved, and possibly each one's version of the server software. The projects are supposed to communicate with each other and propagate settings from the last one you changed. This assumes that you have joined *each one* with the same exact email address. (At least that's my understanding of it.)

BTW, I thoroughly enjoyed your web page with details on all the projects. Thanks!
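
To see which venue each project has assigned to a host, as described above with the account_*.xml files and their "host_venue" field, something like the following sketch could be run against the BOINC data directory. Only the file pattern and the field name come from the post. The rest of the file layout is not assumed, so the code simply searches each file for a `host_venue` element. The function name and the directory argument are hypothetical.

```python
import glob
import os
import xml.etree.ElementTree as ET


def venues_by_file(data_dir: str) -> dict:
    """Map each account_*.xml file name to its host_venue text.

    Files with no host_venue element, or an empty one, map to "".
    """
    result = {}
    for path in sorted(glob.glob(os.path.join(data_dir, "account_*.xml"))):
        root = ET.parse(path).getroot()
        node = root.find(".//host_venue")
        venue = node.text if node is not None and node.text else ""
        result[os.path.basename(path)] = venue.strip()
    return result


if __name__ == "__main__":
    # "." is a placeholder; point this at the BOINC data directory.
    for name, venue in venues_by_file(".").items():
        print(f"{name}: {venue or '(none)'}")
```

A host that shows "work" for one project file and "home" for another is in exactly the mixed state discussed in this thread.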
39) Message boards : Current tests : Switching between projects with applications removed from memory (Message 179)
Posted 18 Feb 2006 by genes
Post:
Had another WU crash, report here:

http://ralph.bakerlab.org/forum_thread.php?id=2#178
40) Message boards : RALPH@home bug list : Report "failure when switching projects without keeping applications in memory" bugs here (Message 178)
Posted 18 Feb 2006 by genes
Post:
Oh, well, just had a 4.84 WU crash:

2/17/2006 8:39:37 PM|ralph@home|Unrecoverable error for result BARCODE_30_1tig__NATIVE_210_3_0 ( - exit code -1073741819 (0xc0000005))

This one:

http://ralph.bakerlab.org/result.php?resultid=2785

This computer:

http://ralph.bakerlab.org/show_host_detail.php?hostid=76

So far both WU's this machine has had have crashed. It has "leave in Memory" set to "NO".





©2024 University of Washington
http://www.bakerlab.org