Posts by Pepo

41) Message boards : RALPH@home bug list : Bug Reports for Rosetta Mini Versions 1.+ (Message 3822)
Posted 12 Mar 2008 by Pepo
Post:
I can't stop Rosetta Mini. It keeps calculating. When I suspend the project, it calculates and calculates and calculates...

It does the same when Boinc automatically switches to another project...

This happens sometimes, mostly to apps run under a wrapper (although this is not Rosetta Mini's case). It is best to leave such a task running; it will usually finish correctly. The only side effect is that (nCPUs + 1) tasks will be running in parallel.

Peter
42) Message boards : Current tests : Help us debug minirosetta. (Message 3805)
Posted 27 Feb 2008 by Pepo
Post:
When the time comes to update to 1.09, what is the procedure?

When the new pdb symbols file, such as http://ralph.bakerlab.org/download/minirosetta_1.09_windows_intelx86.pdb, becomes available, just copy it into your Ralph project folder next to minirosetta_1.08_windows_intelx86.pdb and minirosetta_1.09_windows_intelx86.exe, as described in Message 3695.

Will a download of the .pdb file for 1.09 overwrite the 1.08 version?

No. You can delete it manually after finishing the last 1.08 task. (To be on the safe side, maybe wait until you notice that the 1.08 exe has automagically disappeared from your Ralph folder.)

I have plenty of disk space, so I don't mind having excess files hanging around, but will the 1.08 symbols interfere with the 1.09 debugging?

No, it will not. You can also happily leave it there.

Peter
43) Message boards : RALPH@home bug list : Bug Reports for Rosetta Mini Versions 1.+ (Message 3801)
Posted 26 Feb 2008 by Pepo
Post:
jobs may appear stuck because the current version of mini only updates the percent complete after a checkpoint is made and checkpoints aren't made that often during the full-atom refinement stage.

That may be so, but in the mini 1.08 cases I reported the cpu usage was running at about 1%. In my book that means stuck!

There seem to be two definitions of "stuck" in broad use, which should be distinguished:

- tasks which do consume CPU time but do not progress (%-wise) for maybe hours (because the percentage is not being updated); these might or might not checkpoint, and will or will not be preempted after their timeslot accordingly, and
- those which do not consume any CPU time (and do not progress %-wise accordingly :-) but probably still exchange heartbeat messages with the client, so the client leaves them in their latent state, hoping they might checkpoint soon. These "sleepy" tasks block one CPU from Boinc computations and might stay in such a state until they are manually suspended and/or the client is restarted.
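
The distinction can be written down as a small Python sketch. It only compares two snapshots of a task; the names Sample and classify, the labels and the 1-second threshold are made up for illustration and are not anything the Boinc client provides.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation of a task (hypothetical structure)."""
    cpu_seconds: float   # CPU time accumulated by the task so far
    percent_done: float  # progress as reported by the task


def classify(before: Sample, after: Sample, min_cpu_delta: float = 1.0) -> str:
    """Classify a task from two samples taken some time apart.

    min_cpu_delta is an arbitrary threshold (in CPU seconds) below which
    the task is considered not to be consuming CPU time at all.
    """
    if after.percent_done > before.percent_done:
        return "progressing"
    cpu_delta = after.cpu_seconds - before.cpu_seconds
    if cpu_delta >= min_cpu_delta:
        # First definition: burns CPU, but the percentage does not move.
        return "busy-no-progress"
    # Second definition: no CPU time and no progress, the "sleepy" task.
    return "sleepy"
```

Two samples taken an hour apart with the same percentage and 3000 more CPU seconds would fall under the first definition; the same percentage with unchanged CPU time would be a "sleepy" task.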

Peter
44) Message boards : RALPH@home bug list : Bug Reports for Rosetta Mini Versions 1.+ (Message 3795)
Posted 25 Feb 2008 by Pepo
Post:
This Mini 1.08 task ran for 69 min on WinXP and now shows "waiting to run", but 100% complete with "---" time to complete. I suspect it will go through normally once Ralph gets resource share back, I've just never seen such an issue on Windows before.

Yes, it happens, I can see this occasionally, on various projects.

It is enough that the app code checkpoints at 100% (sort of "YES I've got it!!") after finishing some last functional block, just before exit; the varying combination of different projects, their STDs (short-term debts) and task lengths will then take the opportunity to punish the application.

Peter
45) Message boards : RALPH@home bug list : Bug Reports for Rosetta Mini Versions 1.+ (Message 3775)
Posted 20 Feb 2008 by Pepo
Post:
I have been getting this kind of report continuously over the last few days!

Could you point to any of your failed results? (Hidden computers.)

To be more specific, the URL links to the mentioned results score13_hb_envtest62_A_1tig__3299_3942_0 and score13_hb_envtest62_A_1a19A_3299_3939_0.

I'm occasionally getting the "exited with zero status" messages too. Last time, I suppose (I'm just making assumptions from the logs and the task's stdout), it was because exiting Boinc did not notify the preempted Ralph task (although it did notify 3 other running/preempted tasks), so the lock file was not removed. 37 seconds after the next start a few hours later, the task said "Can't acquire lockfile - exiting" and the client said "Task .... exited with zero status but no 'finished' file". (The task was then correctly restarted and crunched until a successful end.)
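
To illustrate why a leftover lock file makes the next start give up, here is a minimal Python sketch of an exclusive-create lock. This is an assumption for illustration only: it is not the actual locking code of Boinc or Rosetta Mini, and acquire_lock and release_lock are hypothetical helper names.

```python
import os


def acquire_lock(path: str) -> bool:
    """Try to create the lock file; fail if it already exists."""
    try:
        # O_EXCL makes the creation fail when the file is already there,
        # e.g. because a previous run never got the chance to remove it.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return True


def release_lock(path: str) -> None:
    """Remove the lock file; a missing file is not an error."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

If the task is never told to quit, release_lock is never called, and the next acquire_lock on the same path returns False, which would match a "Can't acquire lockfile" exit.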

Peter
46) Message boards : RALPH@home bug list : Bug Reports for Rosetta Mini Versions 1.+ (Message 3772)
Posted 20 Feb 2008 by Pepo
Post:
I have been getting this kind of report continuously over the last few days!

Could you point to any of your failed results? (Hidden computers.)

Peter
47) Message boards : RALPH@home bug list : Rosetta mini 1.07 (Message 3771)
Posted 20 Feb 2008 by Pepo
Post:
I had noticed a few days ago that an Einstein and a Rosetta Mini task were running together. Both were listed as running, but only the Rosetta task was accruing CPU time. Top confirmed that the E@H task wasn't getting any CPU time. Stopping/starting the daemon got things running properly again and I didn't think anything else of it.

I saw the same thing again this evening. I took some screen shots before and after restarting the daemon.


Happened again.

It's only happening with Einstein. Other projects are running fine with Rosetta Mini.


I am in no position (and totally unqualified) to check this out, but if this were to happen on my Mac my first question would be: What is the behavior of the kernel_task process? (See this message: http://ralph.bakerlab.org//forum_thread.php?id=326&nowrap=true#3338.)


Can you put that in English for the technically challenged among us (/me points to self)?

My reaction to the whole thread 326 is that it contains a very confusing combination of posts, especially those concerning the kernel_task process and those from Anonymous. Maybe the only relevant thing I'd agree with was what anders_n suggested.

I've also had such sleepy tasks in the past, and again a few days ago. Not only Rosetta tasks, but maybe mostly. No idea whether to blame the application or the Boinc client for this. Either the client forgets to really wake up the app, or the app misses the wake-up, or it is somehow stuck internally and really can't wake up? (Although it is still responding to the heartbeat.) Maybe this is the reason why only restarting the client cures the situation (although restarting the single sleepy application would be enough).

Maybe it would help to debug such cases if the wake-up event from the client required a response from the app, and the client logged any missing response?
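
The idea could look roughly like the Python sketch below. It is only a model of the suggestion, not a description of how the Boinc client and the apps actually talk to each other; resume_with_ack, send_resume and the log callback are hypothetical names.

```python
import threading
from typing import Callable


def resume_with_ack(
    send_resume: Callable[[], None],
    ack: threading.Event,
    timeout: float,
    log: Callable[[str], None],
) -> bool:
    """Send a wake-up and wait for the app to acknowledge it.

    ack is expected to be set by whatever code receives the app's reply.
    Returns True if the acknowledgement arrived within timeout seconds.
    """
    ack.clear()      # forget any acknowledgement from an earlier wake-up
    send_resume()    # deliver the wake-up event to the app
    if ack.wait(timeout):
        return True
    # This is the log line that would point at a "sleepy" task.
    log(f"no wake-up acknowledgement within {timeout} s")
    return False
```

A missing acknowledgement would at least tell whether the app never got (or never handled) the wake-up, as opposed to waking up and then getting stuck later.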

Peter
48) Message boards : RALPH@home bug list : Rosetta mini 1.07 (Message 3748)
Posted 17 Feb 2008 by Pepo
Post:
I had noticed a few days ago that an Einstein and a Rosetta Mini task were running together. Both were listed as running, but only the Rosetta task was accruing CPU time. Top confirmed that the E@H task wasn't getting any CPU time. Stopping/starting the daemon got things running properly again and I didn't think anything else of it.

I saw the same thing again this evening. I took some screen shots before and after restarting the daemon.

I've also got one sleepy Rosetta Mini 1.07 task from Rosetta, on Windows Boinc 5.10.41. It was also able to continue after a restart and finally validated.

Peter
49) Message boards : Current tests : Help us debug minirosetta. (Message 3736)
Posted 14 Feb 2008 by Pepo
Post:
I am hesitant to add this to R@h. I'd rather point out the positive than negative for the good of the project. I wonder what other users think.

I think it is good to have at least a feeling for how well or badly it goes,
especially here, where the bug hunting happens.

But I understand your feeling concerning R@h.

Peter
50) Message boards : Feedback : Run time defaults (Message 3735)
Posted 14 Feb 2008 by Pepo
Post:
If it is really true that just the applications are tested, and not also the data, then you might be right.

But still just might, because you have to test the apps' behavior with some (real) data, and the more data you feed the apps, the better they are tested.
51) Message boards : Feedback : Run time defaults (Message 3730)
Posted 13 Feb 2008 by Pepo
Post:
...run time preferences: larger number of workunits or more decoys?

I don't know. Here I leave the run time at "not selected". I figure that way the developers can change the time if needed.

The prefs page says: "Target CPU run time (not selected defaults to 1 hour)" and I was thinking of enlarging my target time from 2 hours :-)


Good that you wrote "that way the developers can change the time if needed"; I was looking around for the exact meaning and found dekim's comment:
We would prefer lower run times so that results are returned quicker.

So I'd rather stick to my 2 hours (a trade-off between still somewhat faster turn-around and not just one decoy).

Peter
52) Message boards : Feedback : Run time defaults (Message 3728)
Posted 13 Feb 2008 by Pepo
Post:
An additional question for the devs regarding run time preferences: here at RALPH, is it better to test a larger number of workunits (producing fewer decoys for each one, 1-5), or rather somewhat more decoys (5-15-...) from each WU, at the expense of the number of tested WUs?

Or does it really not matter?

Peter
53) Message boards : Current tests : Help us debug minirosetta. (Message 3711)
Posted 12 Feb 2008 by Pepo
Post:
Ralph (or the mini app) does not appear to honor the runtime preference set in Ralph Preferences. [... After] download new preferences. I still don't know if it has the correct run-time preference set. I configured it for 1 hour, and the "To Completion" in the Boinc Manager shows 5:24:14 for all tasks.

Is it just the "To Completion" estimate in Boinc Manager, or also the real runtime, that does not honour your prefs? What's your Duration Correction Factor (DCF)? It seems that the run time estimate is overestimated and will only become correct once the DCF settles down.

For instance, my Pentium III has a target run time of 4 hours, DCF=0.15 and a real runtime of ~2:46 (average of 8 results, 2:18-3:31, 1-3 decoys). Another machine, a C2D T7200, has a target runtime of 2 hours, DCF=0.29 and a real runtime of ~1:35 (average of 11 results, 1:06-2:16, 1-6 decoys).
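
As a rough back-of-the-envelope check in Python, the quoted 5:24:14 can be compared with the 1 hour target. The sketch rests on two assumptions that are not confirmed anywhere in this thread: that the shown estimate scales linearly with the DCF, and that the real runtime will end up equal to the target. The helper names are made up.

```python
def hms_to_seconds(text: str) -> int:
    """Convert an 'H:MM:SS' string to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def dcf_change_factor(shown_estimate_s: float, expected_runtime_s: float) -> float:
    """Factor by which the DCF would have to change.

    Assumption: shown estimate = raw estimate * DCF, so scaling the DCF by
    this factor makes the shown estimate equal the expected runtime.
    """
    return expected_runtime_s / shown_estimate_s


shown = hms_to_seconds("5:24:14")  # "To Completion" from the quoted report
target = 1 * 3600                  # the 1 hour target from the same report
factor = dcf_change_factor(shown, target)
print(f"DCF would have to drop to about {factor:.2f} of its current value")
```

Under those assumptions the estimate only becomes realistic after the DCF has fallen to less than a fifth of its current value, which takes a number of completed tasks.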

Peter
54) Message boards : RALPH@home bug list : Rosetta Mini 1.06 (Message 3694)
Posted 7 Feb 2008 by Pepo
Post:
score13_hb_envtest62_A_1a68__3157_1

http://score13_hb_envtest62_A_1a68__3157_1

run with the same app was valid

Link should be score13_hb_envtest62_A_1a68__3157_1.

Peter
55) Message boards : RALPH@home bug list : Rosetta Mini 1.05 identified as a possible virus (Message 3687)
Posted 6 Feb 2008 by Pepo
Post:
I use NOD32 as a virus scanner and it has identified Mini 1.05 as a possible variant of the Win32/Statik virus. Has anyone else had any trouble with this?

Apparently alexpon had, with NOD32.

Peter
56) Message boards : RALPH@home bug list : Bug Reports for Rosetta Mini Versions 1.+ (Message 3680)
Posted 31 Jan 2008 by Pepo
Post:
all the "score" 1.05 WU's are failing after 5 seconds with this error
I have had about 4 so far and 4 more to go which I expect will fail as well.

Example this WU

Yours is on a Linux host.

My score_13_hb_env_test62_A_1louA_3083_1_0 WU on Windows has already been running for 45 minutes and does not complain. 1 more to go.

And I see 2 more are waiting on my Linux host. To check for such a fast death - I can do it immediately now...

...the score_13_hb_env_test62_A_1bkrA_3084_5_0 WU on Linux has been running for 5 minutes without a crash. Hope to see it continue this way. 1 more to go.

Peter
57) Message boards : RALPH@home bug list : Rosetta min 1.03 (Message 3675)
Posted 27 Jan 2008 by Pepo
Post:
Thanks for the info! Was this ever a problem for the Mac app? Or only windows?

According to Message 3651, apparently for Mac OS too.

Peter
58) Message boards : RALPH@home bug list : Rosetta min 1.03 (Message 3646)
Posted 18 Jan 2008 by Pepo
Post:
Removed again and downloaded 5.10.38, amazingly this worked and all files kept going and are still running.

I will check those other suggestions above to get rid of ralph mini.

With 5.10.38 you are done.

Peter
59) Message boards : Current tests : Rosetta Mimi 1.03 locking up Boinc on windows (Message 3641)
Posted 17 Jan 2008 by Pepo
Post:
Is this really a windows-only problem? If we are talking about read-only files unable to be deleted, doesn't that hit all OS?

I think it should pertain to all Boinc versions. Except it did not :-)

What does your Mac say to Rosetta mini?

Possibly the reason is the rather small number of non-Windows hosts that receive AND notice such a problematic WU?

Peter
60) Message boards : RALPH@home bug list : Bug Reports for Rosetta Mini Versions 1.+ (Message 3639)
Posted 16 Jan 2008 by Pepo
Post:
Peter, do you mean it could be ported to, e.g., the PS3 in the future?

How the heck should I know this? Ported - probably yes, for sure. Let it effectively use some/all of the SPEs - would be nice. Get an SPE SDK, plenty of dedicated time, some SPE-experienced programmer(s), cook it long enough and...

Peter




