Bug reports for 5.55

Rhiju
Volunteer moderator
Project developer
Project scientist

Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 2913 - Posted: 28 Mar 2007, 19:56:46 UTC

Ralph 5.55 -- there's quite a bit of new stuff packed into this update.
We'll be paying careful attention to the timer (reports of "percentage complete") as well as a new mode that folds and docks at the same time.
ID: 2913
ashriel
Joined: 3 Mar 07
Posts: 11
Credit: 648
RAC: 0
Message 2914 - Posted: 28 Mar 2007, 20:29:19 UTC

Hello

The WU started and jumped to 100% the next second.
It keeps running and time increases normally.
ID: 2914
[B^S] thierry@home
Joined: 15 Feb 06
Posts: 20
Credit: 17,624
RAC: 0
Message 2915 - Posted: 28 Mar 2007, 20:32:20 UTC
Last modified: 28 Mar 2007, 20:40:20 UTC

Hi,
I just got a 5.55 WU: 1l2x__BOINC_INCREASE_CYCLES10_RNA_ABINITIO-1l2x_-_1868_11_0

It started crunching with Progress already at 100%, but it keeps crunching.
The screensaver looks normal, except that the percentage is written as a 1 followed by 00000000.... across the entire screen.
ID: 2915
UBT - Mikeejones
Joined: 22 Mar 06
Posts: 2
Credit: 3,174
RAC: 0
Message 2916 - Posted: 28 Mar 2007, 21:57:14 UTC

I don't mess about if a WU says 100% complete and CPU time increases. Sorry but as soon as I saw that I aborted both WUs - been caught by this sort of thing before and wasted a lot of cycles! It may have carried on to completion but I wasn't going to try to find out just in case!

https://ralph.bakerlab.org/workunit.php?wuid=416831
https://ralph.bakerlab.org/workunit.php?wuid=416907 refers
ID: 2916
feet1st
Joined: 7 Mar 06
Posts: 313
Credit: 116,623
RAC: 0
Message 2917 - Posted: 28 Mar 2007, 22:02:09 UTC
Last modified: 28 Mar 2007, 22:06:36 UTC

Another usability issue, which may be simple to improve, is at step 340,000, which apparently is a magic number in the processing. This is where you clear out the histogram of energy and RMSD. It then "hangs" for 15 seconds or so (more like a minute, I suppose, on a slower machine) and then takes another 10 seconds or so to do the first step or two after that.

Any program that suddenly has portions of the screen blank out, and then shows no activity (unless of course you notice the CPU seconds counting up) for more than the attention span of the caffeine-loaded viewer is immediately diagnosed as being "hung" and requiring manual intervention... (as if the 5 seconds you've waited already wasn't enough for the program to trash your computer if it was going to).

...anyway, if you could just NOT blank out those graphs until you complete the initialization or whatever is happening there as step 340,000 chugs, then it would be a sizable smidge less alarming in appearance. It would be even better if you could impose a few more "steps" into that long processing of step 340,000.
ID: 2917
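The suggestion above amounts to keeping the old plot on screen until the replacement data is ready. Here is a minimal Python sketch of that idea, assuming a display that keeps drawing one buffer while the next one is being filled. `HistogramView` and its methods are hypothetical names for illustration and are not taken from the Rosetta graphics code.

```python
class HistogramView:
    """Keeps the last complete energy/RMSD plot visible during a reset.

    Hypothetical illustration only; not the real Rosetta screensaver code.
    """

    def __init__(self):
        self._visible = []      # points the screensaver currently draws
        self._pending = []      # points collected while a reset is in progress
        self._resetting = False

    def begin_reset(self):
        # Start a fresh data set, but leave the visible points untouched
        # so the graph does not blank out during a long step.
        self._pending = []
        self._resetting = True

    def add_point(self, energy, rmsd):
        # During a reset new points go to the hidden buffer; otherwise
        # they are shown immediately.
        target = self._pending if self._resetting else self._visible
        target.append((energy, rmsd))

    def commit(self):
        # Swap in the new data once the slow work has finished.
        if self._resetting:
            self._visible = self._pending
            self._pending = []
            self._resetting = False

    def points_to_draw(self):
        # Return a copy so the caller cannot modify the internal buffer.
        return list(self._visible)
```

With this arrangement the viewer keeps seeing the previous histogram until `commit()` is called, so a slow step no longer looks like a hang.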
Bober [B@P]
Joined: 18 Jun 06
Posts: 6
Credit: 15,427
RAC: 0
Message 2918 - Posted: 28 Mar 2007, 22:11:01 UTC - in response to Message 2914.  

I've got the same. But I'm not aborting them yet.
ID: 2918
idahofisherman
Joined: 7 Nov 06
Posts: 1
Credit: 9,435
RAC: 0
Message 2919 - Posted: 28 Mar 2007, 22:47:30 UTC

I am having the same thing happen. I will let them run for a couple of hours and then abort them if they have not completed.

Hopefully this will not be a waste of CPU time, just a simple programming error. Please post a message when this is fixed, as I have stopped this project from receiving any more tasks.
ID: 2919
Bober [B@P]
Joined: 18 Jun 06
Posts: 6
Credit: 15,427
RAC: 0
Message 2920 - Posted: 28 Mar 2007, 22:48:59 UTC - in response to Message 2919.  

My 5.55 WU has just finished...no error...points granted - I think there is no need to abort them.
ID: 2920
Rhiju
Volunteer moderator
Project developer
Project scientist

Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 2921 - Posted: 28 Mar 2007, 23:09:26 UTC - in response to Message 2917.  

Wow good eye! I haven't seen that hang, but your explanation makes sense. Let me see what we can do.

On the other issues -- my Mac screensaver says that the percentage complete is "inf%". This sounds like the same problem as the large percentage-complete values others have reported. Dang!

I haven't been able to reproduce the Mac issue (process not found) noted on the R@H message boards yet. But I'm hoping to find a fix for the next update.

Finally, one of our old style protein WUs is consistently failing, so I need to ask the other developer about that. Weird.

Thanks for all the posts so far! This kind of quick feedback helps tremendously!

ID: 2921
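One way a display such as "inf%" or "1000000..." can arise is a progress fraction computed by dividing by an expected step count that is zero or far too small. The thread does not confirm that this is the cause in 5.55, so the following is only a minimal Python sketch of a guarded calculation. `progress_fraction`, `steps_done`, and `expected_steps` are hypothetical names, not part of the Rosetta timer code.

```python
import math


def progress_fraction(steps_done, expected_steps):
    """Return a progress fraction that always lies in [0.0, 1.0].

    Hypothetical helper: the step-based model is an assumption made for
    illustration, not taken from the Rosetta or BOINC sources.
    """
    # A zero, negative, or non-finite denominator is what would produce
    # "inf%" or "nan%", so report no progress instead of dividing.
    if not math.isfinite(expected_steps) or expected_steps <= 0:
        return 0.0
    if not math.isfinite(steps_done) or steps_done <= 0:
        return 0.0
    # Clamp so an underestimated step count cannot push past 100%.
    return min(steps_done / expected_steps, 1.0)


def format_percent(fraction):
    """Format a fraction in [0, 1] as a short percentage string."""
    return f"{fraction * 100:.1f}%"
```

Clamping hides an underestimated step count rather than fixing it, but it keeps the displayed value readable while the estimate itself is being corrected.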
ashriel
Joined: 3 Mar 07
Posts: 11
Credit: 648
RAC: 0
Message 2922 - Posted: 29 Mar 2007, 2:30:39 UTC
Last modified: 29 Mar 2007, 2:32:01 UTC

The WU mentioned above finished normally.

    CPU time (sec)   Claimed credit   Granted credit
    3,347.68         9.89             7.60
ID: 2922
Pieface
Joined: 16 Feb 06
Posts: 64
Credit: 203,513
RAC: 0
Message 2923 - Posted: 29 Mar 2007, 3:18:56 UTC
Last modified: 29 Mar 2007, 3:20:56 UTC

This one errored out on 5.55:

Resid 472582

1wrpA_BOINC_SYMM_FOLD_AND_DOCK-1wrpA-truncate__1873_21_1

ERROR:: Exit at: .fold_tree.cc line:809
ID: 2923
Rhiju
Volunteer moderator
Project developer
Project scientist

Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 2924 - Posted: 29 Mar 2007, 3:43:25 UTC - in response to Message 2923.  

Yup, looking at it. Hopefully will be fixed in the next update (tonight or tomorrow).

ID: 2924
anders n
Joined: 16 Feb 06
Posts: 166
Credit: 131,419
RAC: 0
Message 2925 - Posted: 29 Mar 2007, 3:59:23 UTC - in response to Message 2921.  

I haven't been able to reproduce the Mac issue (process not found) noted on the R@H message boards yet. But I'm hoping to find a fix for the next update


How about the other Mac issue, where Ralph/Rosetta hangs after being preempted and then resumed?
I just checked my Mac and 1 WU on each project was hanging.

Anders n

ID: 2925
feet1st
Joined: 7 Mar 06
Posts: 313
Credit: 116,623
RAC: 0
Message 2926 - Posted: 29 Mar 2007, 4:44:47 UTC

This task is from Rosetta, but I was wondering: I've got a 24hr run time preference, and this bad boy has been crunching for 14hrs and isn't done with model 3 yet. The % complete shows 42.1%.

It still seems to be crunching just fine, but does this mean it has only taken 1 checkpoint during this third model? Or is there any way to tell from the graphic when a checkpoint has actually been taken? It's on step 395,000, so it must have been crunching for several hours.
ID: 2926
Rhiju
Volunteer moderator
Project developer
Project scientist

Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 2927 - Posted: 29 Mar 2007, 5:38:22 UTC - in response to Message 2926.  

Hi feet1st -- sorry that workunit is taking a while. You're right that the WU isn't checkpointing until the end of the model, and that could cause a problem for some users that preempt often. We're working on a general checkpointing scheme for all modes, but it won't be ready for another week or two...

ID: 2927
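Checkpointing only at the end of a model means a preempted task loses all work on the current model. A minimal Python sketch of checkpointing every so many steps within a model follows, assuming the state fits in a small JSON file. The function names, the state layout, and the file format are all hypothetical; this is not the scheme the project is developing.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically, so a preempted write cannot
    leave a half-written checkpoint behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the finished file into place in one step.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return the saved state, or a fresh state if none exists yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {"model": 0, "step": 0}


def run_model(path, total_steps, checkpoint_every):
    """Resume from the last checkpoint and save progress periodically."""
    state = load_checkpoint(path)
    step = state["step"]
    while step < total_steps:
        step += 1  # stand-in for one unit of real work
        if step % checkpoint_every == 0 or step == total_steps:
            save_checkpoint(path, {"model": state["model"], "step": step})
    return step
```

The interval is a trade-off: a small `checkpoint_every` loses less work on preemption but spends more time writing to disk.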
Rhiju
Volunteer moderator
Project developer
Project scientist

Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 2928 - Posted: 29 Mar 2007, 5:39:24 UTC - in response to Message 2925.  

Anders n, actually, wait, when did this start happening for you? Is there a discussion thread on this?

ID: 2928
anders n
Joined: 16 Feb 06
Posts: 166
Credit: 131,419
RAC: 0
Message 2930 - Posted: 29 Mar 2007, 6:18:23 UTC - in response to Message 2928.  

See Bug reports 5.52-5.54.

It started on 18 March.

Anders n
ID: 2930
Rhiju
Volunteer moderator
Project developer
Project scientist

Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 2931 - Posted: 29 Mar 2007, 8:17:20 UTC - in response to Message 2930.  

I see ... actually I thought this was a graphics bug, and thought it might be fixed in the latest update, but that's not the case. I wonder if I can reproduce it on my machine, switching between ralph and some other app.

ID: 2931
genes
Joined: 16 Feb 06
Posts: 45
Credit: 43,706
RAC: 20
Message 2934 - Posted: 29 Mar 2007, 11:46:33 UTC

Had these errors overnight on machines at work, so I didn't see what they did:

resultid=471512
resultid=472465

One's a -161, other's an "incorrect function". I've got one running here right now that has the 100000000000000000000.... problem, resultid=471927, but it looks like it otherwise is operating normally, so I'll let it finish.

ID: 2934
Conan
Joined: 16 Feb 06
Posts: 364
Credit: 1,368,421
RAC: 0
Message 2936 - Posted: 29 Mar 2007, 12:18:49 UTC

Had this WU fail with MAXIMUM DISK SPACE EXCEEDED. I have many gigabytes of disk space, so this should not be the problem:
https://ralph.bakerlab.org/result.php?resultid=471223

Also had these two fail with the old ERROR -161,
https://ralph.bakerlab.org/result.php?resultid=471479
https://ralph.bakerlab.org/result.php?resultid=471480

I currently have one running that may be a 5.55 or a 5.56, I'm not sure. It jumped straight to 100%, as some others have reported, with the time to complete still going up, but only 1 hour 40 minutes done on a 6 hour preference. Windows machine.

Strangely, I have two others that have switched and are 'Waiting to run', but the time to completion is still ticking over and the percentage done is moving up, yet the CPU time is not moving. I have a dual-CPU, dual-core machine, so 4 cores are running and they are all accounted for. So why is BOINC saying I have 6 cores doing something? Very strange. Linux machine.
ID: 2936

©2024 University of Washington
http://www.bakerlab.org