Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here
Author | Message |
---|---|
Moderator9 Volunteer moderator Send message Joined: 16 Feb 06 Posts: 251 Credit: 0 RAC: 0 |
Tony, I agree with you as well. The directive from the devs is *not* to abort work units unless specifically asked to. https://ralph.bakerlab.org/forum_thread.php?id=18 Tony, what is your Ralph time setting? I presume this Work Unit is WAY beyond the time it should have quit, since it seems to have not finished a single model yet. If you can find some way to make it think it is done so it can report normally (with errors), that would be great. But assuming you are running the "Debugger" stuff Ron has been talking about and the 5.4.x BOINC, it may return enough data no matter how it finishes. I see the system this is on is a Windows 98 machine. There are some issues with Windows 98 that they hope are fixed in version 5.0. There are not a lot of Windows 98 machines attached to RALPH right now. So (here is where I am going out on that limb) in MY opinion, you should abort the Work Unit and update to version 5.0. BUT WAIT!!! I am going to bring this to the Project team and let them chime in here for a final opinion, as you have something unique going on there. At the same time I will point out the need to clearly state what people should do at the time of a new release. Perhaps we can even get them to add something telling all of us what they are testing. Please stand by. EDIT: Message sent. |
Astro Send message Joined: 16 Feb 06 Posts: 141 Credit: 32,977 RAC: 0 |
Thank you. This old machine just keeps chugging and I have no problem letting it continue as long as it may be useful. I don't care if it wants to run 2000 hours. I'm getting interested in whether it'll finish at 100% (that's 100 days from now at this rate, LOL). There is no debug software on that old machine. It's stuffed under an end table in the corner (ultra microATX frame), has no mouse, no keyboard, and no monitor, and is only viewable via RealVNC. It is hooked to a UPS, but my fear is that memory leaks will be what causes this to stop crunching, not some other error. It's now at 3.5633% done, 122:54:06, stage Full atom relax, Model 1, Step 34155, 124:39:17 remaining. Oh yeah, there are 24 red dots now (whatever the red dots are). My Ralph prefs: Resource share (if you participate in multiple BOINC projects, this is the proportion of your resources used by RALPH@home): 10; Percentage of CPU time used for graphics: not selected; Number of frames per second for graphics: not selected; Target CPU run time: 4 hours. Miscellaneous: Should RALPH@home send you email newsletters? yes; Should RALPH@home show your computers on its web site? yes; Default computer location: home. My general prefs: Processor usage: Do work while computer is running on batteries? (matters only for portable computers): yes; Do work while computer is in use? yes; Do work only between the hours of: no restriction; Leave applications in memory while preempted? (suspended applications will consume swap space if 'yes'): yes; Switch between applications every (recommended: 60 minutes): 180 minutes; On multiprocessors, use at most: 1 processors. Disk and memory usage: Use no more than 400 GB disk space; Leave at least 0.25 GB disk space free (values smaller than 0.001 are ignored); Use no more than 85% of total disk space; Write to disk at most every 600 seconds; Use no more than 100% of total virtual memory. Network usage: Connect to network about every (determines size of work cache; maximum 10 days): 3 days; Confirm before connecting to Internet? (matters only if you have a modem, ISDN or VPN connection): no; Disconnect when done? (matters only if you have a modem, ISDN or VPN connection): no; Maximum download rate: 200 KB/s; Maximum upload rate: 200 KB/s; Use network only between the hours of (enforced by versions 4.46 and greater): no restriction; Skip image file verification? (check this ONLY if your Internet provider modifies image files, as UMTS does, for example; skipping verification reduces the security of BOINC): no |
Divide Overflow Send message Joined: 15 Feb 06 Posts: 12 Credit: 128,027 RAC: 0 |
I had a v4.99 FACONTACTS_NOFILTERS WU that was behaving in a similar manner. It was not stuck, but it computed constantly on the first model with incredibly slow increases in completion %. After much debate, I decided something was wrong and finally aborted it after it had run for over 33 hours and only reached 8% done. It was resent with the v5.00 app to another host and finished successfully in a normal length of time. https://ralph.bakerlab.org/result.php?resultid=86791 Since I was running this on a WinXP machine, I think this problem is specific to the application and not your operating system. |
Moderator9 Volunteer moderator Send message Joined: 16 Feb 06 Posts: 251 Credit: 0 RAC: 0 |
Tony, the word from David Kim is to abort it. There is a new release coming with a kind of auto-abort feature for these kinds of loops. That of course will need testing. The plan is to automatically award credit for work units that the system auto-aborts. While I have nothing from David Kim on this last point, I would assume the auto-abort feature would also provide some enhanced error messages. In any case, the plan is that this will solve any looping situation (including loops caused by not keeping the application in memory). So watch for the new version soon. |
Astro Send message Joined: 16 Feb 06 Posts: 141 Credit: 32,977 RAC: 0 |
Thanks, I aborted it. WU in question |
Rhiju Volunteer moderator Project developer Project scientist Send message Joined: 14 Feb 06 Posts: 161 Credit: 3,725 RAC: 0 |
Hi guys, we just posted the new ralph app 5.01, and are going to try to break it! I wanted to clarify one point. We *don't* yet have a fix for truly hanging jobs. We do have a rough fix for jobs that are constantly getting interrupted (say when BOINC switches to another project) and restarted without leaving Rosetta@home in memory. If that happens more than 5 times, we have Rosetta exit gracefully! But the more general problem -- if the client doesn't do anything for 10 minutes (or 100 hours as reported below!) -- isn't fixed. YET. Working on it. Please keep posting! |
Rollo Send message Joined: 13 Apr 06 Posts: 4 Credit: 610 RAC: 0 |
This WU got stuck for more than 5 minutes without any movement in the graphics. Then it crashed before I could abort it. Version 5.00 <core_client_version>5.4.4</core_client_version> <message>Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # random seed: 3887865 # cpu_run_time_pref: 3600 ERROR:: Exit at: .tether.cc line:411 |
Steven Purvis Send message Joined: 1 Mar 06 Posts: 1 Credit: 8,880 RAC: 0 |
I got two which seemed to keep going at 1%-ish (for ralph v4.99): Result 84721 and Result 84734. I allowed Result 84734 to run for 19 hours, whereas I only allowed Result 84721 to run for a couple of hours. The 19-hour result seemed to have terminated itself, as it never seemed to get any more points on the graphic. I have Windows XP and BOINC CC v 5.2.7. I have just downloaded a new set of workunits with ralph beta 5.00. |
Dotsch Send message Joined: 4 Mar 06 Posts: 12 Credit: 13,725 RAC: 0 |
https://ralph.bakerlab.org/result.php?resultid=97260 https://ralph.bakerlab.org/workunit.php?wuid=86093 https://ralph.bakerlab.org/show_host_detail.php?hostid=2323 |
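
For anyone wondering what the "exit gracefully after more than 5 restarts" fix Rhiju describes might look like, here is a rough Python sketch. It is not the actual Rosetta code. The threshold of 5 comes from Rhiju's post; the state file, the function name `register_start`, and the `made_progress` flag are all hypothetical and only there for illustration.

```python
import json
from pathlib import Path

MAX_RESTARTS = 5  # threshold from Rhiju's post; everything else here is assumed


def register_start(state_file: Path, made_progress: bool) -> bool:
    """Record one application launch; return True if the job should keep running.

    state_file is a hypothetical per-work-unit file that survives restarts.
    made_progress says whether the previous run saved any new work before
    it was interrupted (ignored on the very first launch).
    """
    if not state_file.exists():
        # First launch of this work unit: nothing has been restarted yet.
        count = 0
    else:
        previous = json.loads(state_file.read_text())["restarts_without_progress"]
        # A run that saved progress resets the counter; a fruitless one adds to it.
        count = 0 if made_progress else previous + 1
    state_file.write_text(json.dumps({"restarts_without_progress": count}))
    # "More than 5 times" means the sixth fruitless restart is the one that gives up.
    return count <= MAX_RESTARTS
```

The idea is that the app would call something like this once at startup and exit with an error report when it returns False. Note this only covers the restart loop; it does nothing for a job that hangs without ever being restarted, which Rhiju says is not fixed yet.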
©2024 University of Washington
http://www.bakerlab.org