41)
Message boards :
RALPH@home bug list :
Bug Reports for Rosetta Mini Versions 1.+
(Message 3822)
Posted 12 Mar 2008 by Pepo Post: i cant stop rosetta mini. He ever calculate. When i give supsend project he calculate and calculate and calculate... Happens sometimes, mostly to apps run under a wrapper (although this is not Rosetta Mini's case). Best to leave such a task running; it will usually finish correctly. The only side effect is that (nCPUs + 1) tasks will be running in parallel. Peter |
42)
Message boards :
Current tests :
Help us debug minirosetta.
(Message 3805)
Posted 27 Feb 2008 by Pepo Post: When the time comes to update to 1.09, what is the procedure? When the new pdb symbols file like http://ralph.bakerlab.org/download/minirosetta_1.09_windows_intelx86.pdb becomes available, just copy it into your Ralph project folder next to the minirosetta_1.08_windows_intelx86.pdb and minirosetta_1.09_windows_intelx86.exe, just as it was said in Message 3695. Will a download of the .pdb file for 1.09 overwrite the 1.08 version? No. You can delete it manually after finishing the last 1.08 task. (Maybe, to be on the safe side, better after you notice that the 1.08 exe has automagically disappeared from your Ralph folder?) I have plenty of disk space, so I don't mind having excess files hanging around, but will the 1.08 symbols interfere with the 1.09 debugging? No, it will not. You can also happily leave it there. Peter |
43)
Message boards :
RALPH@home bug list :
Bug Reports for Rosetta Mini Versions 1.+
(Message 3801)
Posted 26 Feb 2008 by Pepo Post: jobs may appear stuck because the current version of mini only updates the percent complete after a checkpoint is made and checkpoints aren't made that often during the full-atom refinement stage. There seem to be two definitions of "stuck" in broad use, which should be distinguished: - tasks which do consume CPU time, but do not progress (%-wise) for maybe hours (because of not updating it); these might or might not checkpoint and will or will not be preempted after their timeslot accordingly, and - those which do not consume any CPU time (and do not progress %-wise accordingly :-) but probably still exchange heartbeat messages with the client, so the client still leaves them in their latent state, hoping they might checkpoint soon. These "sleepy" tasks block one CPU from Boinc computations and might stay in such a state until they are manually suspended (and/)or the client is restarted. Peter |
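The two meanings of "stuck" above can be told apart from two observations of the same task taken some time apart: how much CPU time it accrued in between, and whether the percentage moved. A minimal Python sketch of that distinction; `TaskSample` and `classify_stuck` are hypothetical names, the samples are assumed to be read off by hand (for example from the Boinc Manager), and nothing here is part of Boinc or Rosetta:

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    """One observation of a task (hypothetical structure, not a Boinc type)."""
    cpu_seconds: float    # CPU time accrued so far
    percent_done: float   # progress as displayed, 0-100


def classify_stuck(earlier: TaskSample, later: TaskSample) -> str:
    """Classify a task from two samples taken some time apart."""
    cpu_used = later.cpu_seconds - earlier.cpu_seconds
    progressed = later.percent_done > earlier.percent_done
    if progressed:
        return "progressing"
    if cpu_used > 0:
        # Consumes CPU time but the percentage does not move (first definition).
        return "busy-no-progress"
    # No CPU time and no progress: the "sleepy" task (second definition).
    return "sleepy"
```

A task of the first kind may simply be between checkpoints, so only the "sleepy" result suggests that a manual suspend or a client restart is needed.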
44)
Message boards :
RALPH@home bug list :
Bug Reports for Rosetta Mini Versions 1.+
(Message 3795)
Posted 25 Feb 2008 by Pepo Post: This Mini 1.08 task ran for 69 min on WinXP and now shows "waiting to run", but 100% complete with "---" time to complete. I suspect it will go through normally once Ralph gets resource share back, I've just never seen such an issue on Windows before. Yes, it happens, I can see this occasionally, on various projects. It is enough that the app code checkpoints at 100% (sort of "YES I've got it!!") after finishing some last functional block, just before exit; the varying combination of different projects, their STDs and task lengths will then take the opportunity to punish the application. Peter |
45)
Message boards :
RALPH@home bug list :
Bug Reports for Rosetta Mini Versions 1.+
(Message 3775)
Posted 20 Feb 2008 by Pepo Post: I have been getting this kind of report continuously over the last few days! To be more specific, the URL links to the mentioned results score13_hb_envtest62_A_1tig__3299_3942_0 and score13_hb_envtest62_A_1a19A_3299_3939_0. I'm occasionally getting the "exited with zero status" messages too. Last time, I suppose (I'm just making assumptions from the logs and the task's stdout), it was because exiting Boinc did not notify the preempted Ralph task (although it did notify 3 other running/preempted tasks). The task therefore did not remove the lock file, and 37 seconds after the new start a few hours later, the task said "Can't acquire lockfile - exiting" and the client said "Task .... exited with zero status but no 'finished' file". (The task was then correctly restarted and crunched until successful end.) Peter |
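To illustrate the assumption above (a lock file left behind by a task that was never told to exit), here is a minimal Python sketch using create-exclusive lock files. The file name `demo_lockfile` and all three functions are hypothetical, and Boinc's real lock handling may well work differently:

```python
import os
import tempfile


def acquire_lock(path: str) -> bool:
    """Try to create the lock file; fail if it already exists."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def release_lock(path: str) -> None:
    """Remove the lock file on a clean exit."""
    os.remove(path)


def demo() -> list:
    """Return the lock outcomes of a clean run, an un-notified run, and a restart."""
    with tempfile.TemporaryDirectory() as slot:
        lock = os.path.join(slot, "demo_lockfile")
        outcomes = []
        outcomes.append(acquire_lock(lock))  # first start
        release_lock(lock)                   # task was notified and cleaned up
        outcomes.append(acquire_lock(lock))  # second start
        # This time the task is never notified, so release_lock() is skipped.
        outcomes.append(acquire_lock(lock))  # restart finds the stale lock
        return outcomes
```

Under this assumption the third attempt fails, which would correspond to the "Can't acquire lockfile - exiting" message.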
46)
Message boards :
RALPH@home bug list :
Bug Reports for Rosetta Mini Versions 1.+
(Message 3772)
Posted 20 Feb 2008 by Pepo Post: I have been getting this kind of report continuously over the last few days! Could you point to any of your failed results? (Hidden computers.) Peter |
47)
Message boards :
RALPH@home bug list :
Rosetta mini 1.07
(Message 3771)
Posted 20 Feb 2008 by Pepo Post: I had noticed a few days ago that an Einstein and a Rosetta Mini task were running together. Both were listed as running, but only the Rosetta task was accruing CPU time. Top confirmed that the E@H task wasn't getting any CPU time. Stopping/starting the daemon got things running properly again and I didn't think anything else of it. My reaction to the whole thread 326 is that it contains a very confusing combination of posts, especially those concerning the kernel_task process and those from Anonymous. Maybe the only relevant thing I'd agree with was what anders_n suggested. I've also had such sleepy tasks in the past and also a few days ago. Not only Rosettas, but maybe mostly. No idea whether to blame the application or the Boinc client for this. Either the client forgets to really wake up the app, or the app misses the wakeup, or it is somehow stuck internally and really can't wake up? (Although it is still responding to the heartbeat.) Maybe this is then the reason why only restarting the client cures the situation (although restarting the single sleepy application would be enough). Maybe it could help to debug such cases if the wake-up event from the client needed a response from the app and the client logged a missing response? Peter |
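The suggestion at the end of this post (the wake-up event needs a response from the app, and the client logs a missing one) could look roughly like the following. This is a minimal Python sketch in which two in-process queues stand in for the client/app message channel; `resume_with_ack`, the message strings and the timeouts are all hypothetical and not Boinc's actual API:

```python
import queue
import threading


def resume_with_ack(to_app, from_app, task_name, log, timeout=1.0):
    """Send a resume request and wait for the app to acknowledge it."""
    to_app.put("resume")
    try:
        reply = from_app.get(timeout=timeout)
    except queue.Empty:
        # The missing response is what the client would log.
        log.append(f"{task_name}: no response to resume within {timeout}s")
        return False
    return reply == "resumed"


def healthy_app(to_app, from_app):
    """An app that wakes up and confirms it."""
    if to_app.get() == "resume":
        from_app.put("resumed")


def run_demo():
    log = []
    # Healthy task: a worker thread answers the resume request.
    to_app, from_app = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=healthy_app, args=(to_app, from_app))
    worker.start()
    ok = resume_with_ack(to_app, from_app, "task_a", log)
    worker.join()
    # Sleepy task: nobody reads the request, so the acknowledgement never arrives.
    sleepy = resume_with_ack(queue.Queue(), queue.Queue(), "task_b", log, timeout=0.1)
    return ok, sleepy, log
```

With such a log entry, a sleepy task would show up in the client's messages instead of only being noticed through missing CPU time.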
48)
Message boards :
RALPH@home bug list :
Rosetta mini 1.07
(Message 3748)
Posted 17 Feb 2008 by Pepo Post: I had noticed a few days ago that an Einstein and a Rosetta Mini task were running together. Both were listed as running, but only the Rosetta task was accruing CPU time. Top confirmed that the E@H task wasn't getting any CPU time. Stopping/starting the daemon got things running properly again and I didn't think anything else of it. I've also got one sleepy Rosetta Mini 1.07 task from Rosetta, on Windows Boinc 5.10.41. It was also able to continue after a restart and finally validated. Peter |
49)
Message boards :
Current tests :
Help us debug minirosetta.
(Message 3736)
Posted 14 Feb 2008 by Pepo Post: I am hesitant to add this to R@h. I'd rather point out the positive than negative for the good of the project. I wonder what other users think. I think it is good to have at least a feeling for how well or badly it goes. Especially here, where the bug hunting happens. But I understand your feeling concerning R@h. Peter |
50)
Message boards :
Feedback :
Run time defaults
(Message 3735)
Posted 14 Feb 2008 by Pepo Post: If it is really true that just the applications are tested, and not also the data, then you might be right. But still just might, because you have to test the apps' behavior with some (real) data, and the more data you feed the apps with, the better they are tested. |
51)
Message boards :
Feedback :
Run time defaults
(Message 3730)
Posted 13 Feb 2008 by Pepo Post: ...run time preferences: larger number of workunits or more decoys? The prefs page says: "Target CPU run time (not selected defaults to 1 hour)" and I was thinking of enlarging my target time from 2 hours :-) Good that you wrote "that way the developers can change the time if needed", I was looking around for the exact meaning and found dekim's comment: We would prefer lower run times so that results are returned quicker. So I'll rather stick to my 2 hours (trade-off between still somewhat faster turn-around and not just one decoy). Peter |
52)
Message boards :
Feedback :
Run time defaults
(Message 3728)
Posted 13 Feb 2008 by Pepo Post: Additional question for the devs regarding run time preferences: here at RALPH, is it better to prefer testing a larger number of workunits (producing fewer decoys for each one, 1-5), or rather somewhat more decoys (5-15-...) from each WU, at the expense of the number of tested WUs? Or does it really not matter? Peter |
53)
Message boards :
Current tests :
Help us debug minirosetta.
(Message 3711)
Posted 12 Feb 2008 by Pepo Post: Ralph (or the mini app) does not appear to honor the runtime preference set in Ralph Preferences. [... After] download new preferences. I still don't know if it has the correct run-time preference set. I configured it for 1 hour, and the "To Completion" in the Boinc Manager shows 5:24:14 for all tasks. Is it just the "To Completion" in Boinc Manager or also the real runtime, which does not honour your prefs? What's your Duration Correction Factor? It seems that the run time estimate is "overestimated" and will be correct just when the DCF settles down. For instance, my Pentium III has target run time 4 hours, DCF=0.15 and real runtime ~2:46 (average of 8 results, 2:18-3:31, 1-3 decoys). Other machine, C2D T7200 has target runtime 2 hours, DCF=0.29 and real runtime ~1:35 (average of 11 results 1:06-2:16, 1-6 decoys). Peter |
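As a rough illustration of the DCF remark: the client scales its raw estimate by the Duration Correction Factor, so the displayed estimate stays off until the factor has settled. A minimal Python sketch, assuming the simplified relation "displayed estimate = raw estimate × DCF" and assuming that a settled DCF makes the displayed estimate match the real runtime. The function names are hypothetical, and the raw estimates are back-calculated from the figures in this post, not read from the client:

```python
def hours(h: int, m: int) -> float:
    """Convert an h:mm runtime into fractional hours."""
    return h + m / 60.0


def corrected_estimate(raw_estimate_h: float, dcf: float) -> float:
    """Displayed estimate under the simplified model: raw estimate times DCF."""
    return raw_estimate_h * dcf


def implied_raw_estimate(real_runtime_h: float, dcf: float) -> float:
    """Raw estimate implied by a settled DCF and the observed real runtime."""
    return real_runtime_h / dcf


# Figures from the post: Pentium III (DCF=0.15, ~2:46) and C2D T7200 (DCF=0.29, ~1:35).
p3_raw = implied_raw_estimate(hours(2, 46), 0.15)
c2d_raw = implied_raw_estimate(hours(1, 35), 0.29)
```

Under this model the uncorrected estimate is several times longer than the real runtime on both machines, which fits the observation that a fresh host shows an "overestimated" time to completion until its DCF has come down.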
54)
Message boards :
RALPH@home bug list :
Rosetta Mini 1.06
(Message 3694)
Posted 7 Feb 2008 by Pepo Post: score13_hb_envtest62_A_1a68__3157_1 Link should be score13_hb_envtest62_A_1a68__3157_1. Peter |
55)
Message boards :
RALPH@home bug list :
Rosetta Mini 1.05 identified as a possible virus
(Message 3687)
Posted 6 Feb 2008 by Pepo Post: I use NOD32 as a virus scanner and it has identified Mini 1.05 as a possible variant of the Win32/Statik virus. Has anyone else had any trouble with this? Apparently alexpon had, with NOD32. Peter |
56)
Message boards :
RALPH@home bug list :
Bug Reports for Rosetta Mini Versions 1.+
(Message 3680)
Posted 31 Jan 2008 by Pepo Post: all the "score" 1.05 WU's are failing after 5 seconds with this error Yours is on a Linux host. My score_13_hb_env_test62_A_1louA_3083_1_0 WU on Windows is already running 45 minutes and does not complain. 1 more to go. And I see 2 more are waiting on my Linux host. To check for such fast death - I can do it immediately now... ...the score_13_hb_env_test62_A_1bkrA_3084_5_0 WU on Linux is running 5 minutes without crash. Hope to see it continuing this way. 1 more to go. Peter |
57)
Message boards :
RALPH@home bug list :
Rosetta min 1.03
(Message 3675)
Posted 27 Jan 2008 by Pepo Post: Thanks for the info! Was this ever a problem for the Mac app? Or only windows? According to Message 3651, apparently for Mac OS too. Peter |
58)
Message boards :
RALPH@home bug list :
Rosetta min 1.03
(Message 3646)
Posted 18 Jan 2008 by Pepo Post: Removed again and downloaded 5.10.38, amazingly this worked and all files kept going and are still running. With 5.10.38 you are done. Peter |
59)
Message boards :
Current tests :
Rosetta Mimi 1.03 locking up Boinc on windows
(Message 3641)
Posted 17 Jan 2008 by Pepo Post: Is this really a windows-only problem? If we are talking about read-only files unable to be deleted, doesn't that hit all OS? I think it should pertain to all Boinc versions. Except it did not :-) What does your Mac say to Rosetta mini? Possibly the reason is a rather small count of non-Windows hosts to receive AND notice such problematic WU? Peter |
60)
Message boards :
RALPH@home bug list :
Bug Reports for Rosetta Mini Versions 1.+
(Message 3639)
Posted 16 Jan 2008 by Pepo Post: Peter, you mean like it can be ported over to I.e. the ps3 in the future? How the heck should I know this? Ported - probably yes, for sure. Let it effectively use some/all of the SPEs - would be nice. Get an SPE SDK, plenty of dedicated time, some SPE-experienced programmer(s), cook it long enough and... Peter |