RALPH@home

Bug reports for 5.55

Rhiju
Forum moderator
Project developer
Project scientist

Joined: Feb 14 06
Posts: 161
ID: 4
Credit: 3,725
RAC: 0
Message 2913 - Posted 28 Mar 2007 19:56:46 UTC

    Ralph 5.55 -- there's quite a bit of new stuff packed into this update.
    We'll be paying careful attention to the timer (reports of "percentage complete") as well as a new mode that folds and docks at the same time.
    ____________

    ashriel

    Joined: Mar 3 07
    Posts: 11
    ID: 2714
    Credit: 648
    RAC: 0
    Message 2914 - Posted 28 Mar 2007 20:29:19 UTC

      Hello

      The WU started and jumped to 100% the next second.
      It keeps running and time increases normally.
      ____________

      [B^S] thierry@home

      Joined: Feb 15 06
      Posts: 20
      ID: 12
      Credit: 17,624
      RAC: 0
      Message 2915 - Posted 28 Mar 2007 20:32:20 UTC

        Last modified: 28 Mar 2007 20:40:20 UTC

        Hi,
        I just got a 5.55 WU: 1l2x__BOINC_INCREASE_CYCLES10_RNA_ABINITIO-1l2x_-_1868_11_0

        It started crunching with Progress already at 100%, but it continues to crunch.
        The screensaver looks normal, except that the percentage is drawn as a 1 followed by 00000000... across the entire screen.

        UBT - Mikeejones

        Joined: Mar 22 06
        Posts: 2
        ID: 1169
        Credit: 3,174
        RAC: 0
        Message 2916 - Posted 28 Mar 2007 21:57:14 UTC

          I don't mess about if a WU says 100% complete and CPU time increases. Sorry, but as soon as I saw that I aborted both WUs - been caught by this sort of thing before and wasted a lot of cycles! It may have carried on to completion, but I wasn't going to try to find out just in case!

          http://ralph.bakerlab.org/workunit.php?wuid=416831
          http://ralph.bakerlab.org/workunit.php?wuid=416907 refers
          ____________

          feet1st

          Joined: Mar 7 06
          Posts: 312
          ID: 1028
          Credit: 110,522
          RAC: 0
          Message 2917 - Posted 28 Mar 2007 22:02:09 UTC

            Last modified: 28 Mar 2007 22:06:36 UTC

            Another usability issue, which may be simple to improve, is at step 340,000, which apparently is a magic number in the processing. This is where you clear out the histogram of energy and RMSD. It then "hangs" for 15 seconds or so (more like a minute, I suppose, on a slower machine) and then takes another 10 seconds or so to do the first step or two after that.

            Any program that suddenly has portions of the screen blank out, and then shows no activity (unless of course you notice the CPU seconds counting up) for more than the attention span of the caffeine-loaded viewer, is immediately diagnosed as being "hung" and requiring manual intervention... (as if the 5 seconds you've waited already wasn't enough for the program to trash your computer if it was going to).

            ...anyway, if you could just NOT blank out those graphs until you complete the initialization or whatever is happening there as step 340,000 chugs, then it would be a sizable smidge less alarming in appearance. It would be even better if you could impose a few more "steps" into that long processing of step 340,000.
            ____________

            Bober [B@P]

            Joined: Jun 18 06
            Posts: 6
            ID: 1538
            Credit: 15,427
            RAC: 0
            Message 2918 - Posted 28 Mar 2007 22:11:01 UTC - in response to Message 2914.

              Hello

              The WU started and jumped to 100% the next second.
              It keeps running and time increases normally.


              I've got the same. But I'm not aborting them yet.
              ____________

              idahofisherman

              Joined: Nov 7 06
              Posts: 1
              ID: 2194
              Credit: 9,435
              RAC: 0
              Message 2919 - Posted 28 Mar 2007 22:47:30 UTC

                I am having the same thing happen. I will let them run for a couple of hours and then abort them if they have not completed.

                Hopefully this will not be a waste of CPU time, just a simple programming error. Please post a message when this is fixed, as I have stopped this project from receiving any more tasks.

                Bober [B@P]

                Joined: Jun 18 06
                Posts: 6
                ID: 1538
                Credit: 15,427
                RAC: 0
                Message 2920 - Posted 28 Mar 2007 22:48:59 UTC - in response to Message 2919.

                  My 5.55 WU has just finished...no error...points granted - I think there is no need to abort them.
                  ____________

                  Rhiju
                  Forum moderator
                  Project developer
                  Project scientist

                  Joined: Feb 14 06
                  Posts: 161
                  ID: 4
                  Credit: 3,725
                  RAC: 0
                  Message 2921 - Posted 28 Mar 2007 23:09:26 UTC - in response to Message 2917.

                    Wow, good eye! I haven't seen that hang, but your explanation makes sense. Let me see what we can do.

                    On the other issues -- my Mac screensaver says that the percentage complete is "inf%". This sounds like the issues reported below, too, with large percentage-complete values. Dang!

                    I haven't been able to reproduce the Mac issue (process not found) noted on the R@H message boards yet. But I'm hoping to find a fix for the next update.

                    Finally, one of our old style protein WUs is consistently failing, so I need to ask the other developer about that. Weird.

                    Thanks for all the posts so far! This kind of quick feedback helps tremendously!

                    Another usability issue, which may be simple to improve, is at step 340,000, which apparently is a magic number in the processing. This is where you clear out the histogram of energy and RMSD. It then "hangs" for 15 seconds or so (more like a minute, I suppose, on a slower machine) and then takes another 10 seconds or so to do the first step or two after that.

                    Any program that suddenly has portions of the screen blank out, and then shows no activity (unless of course you notice the CPU seconds counting up) for more than the attention span of the caffeine-loaded viewer, is immediately diagnosed as being "hung" and requiring manual intervention... (as if the 5 seconds you've waited already wasn't enough for the program to trash your computer if it was going to).

                    ...anyway, if you could just NOT blank out those graphs until you complete the initialization or whatever is happening there as step 340,000 chugs, then it would be a sizable smidge less alarming in appearance. It would be even better if you could impose a few more "steps" into that long processing of step 340,000.
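
The "inf%" and "1 followed by zeros" displays reported in this thread are what an unguarded progress ratio tends to look like when its denominator is zero or not yet set. The thread does not confirm the actual cause in 5.55, so the following is only a minimal sketch of one defensive pattern, assuming progress is computed as elapsed CPU time over a target run time. The names `fraction_done` and `format_percent` are hypothetical and are not taken from the Rosetta or BOINC sources.

```python
import math


def fraction_done(elapsed_cpu_seconds: float, target_cpu_seconds: float) -> float:
    """Return a progress fraction that is always finite and within [0, 0.99].

    Hypothetical helper: guards against a zero, negative, or non-finite
    target, which would otherwise produce inf/nan (or an exception in Python).
    """
    if not math.isfinite(target_cpu_seconds) or target_cpu_seconds <= 0:
        return 0.0  # target not known yet: report "just started"
    fraction = elapsed_cpu_seconds / target_cpu_seconds
    if not math.isfinite(fraction):
        return 0.0
    # Hold below 1.0 so a running task never shows 100% before it finishes.
    return min(max(fraction, 0.0), 0.99)


def format_percent(fraction: float) -> str:
    """Render a fraction in [0, 1] as a short percentage string."""
    return f"{100.0 * fraction:.1f}%"
```

With a guard like this, a missing target shows up as 0.0% rather than as an unbounded string on the screensaver, and a task that overruns its target stays at 99.0% until it actually completes.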


                    ____________

                    ashriel

                    Joined: Mar 3 07
                    Posts: 11
                    ID: 2714
                    Credit: 648
                    RAC: 0
                    Message 2922 - Posted 29 Mar 2007 2:30:39 UTC

                      Last modified: 29 Mar 2007 2:32:01 UTC

                      The WU mentioned above finished normally.


                        CPU time (sec)   Claimed credit   Granted credit
                        3,347.68         9.89             7.60



                      ____________

                      Pieface

                      Joined: Feb 16 06
                      Posts: 64
                      ID: 234
                      Credit: 203,513
                      RAC: 0
                      Message 2923 - Posted 29 Mar 2007 3:18:56 UTC

                        Last modified: 29 Mar 2007 3:20:56 UTC

                        This one errored out on 5.55:

                        Resid 472582

                        1wrpA_BOINC_SYMM_FOLD_AND_DOCK-1wrpA-truncate__1873_21_1

                        ERROR:: Exit at: .\fold_tree.cc line:809

                        Rhiju
                        Forum moderator
                        Project developer
                        Project scientist

                        Joined: Feb 14 06
                        Posts: 161
                        ID: 4
                        Credit: 3,725
                        RAC: 0
                        Message 2924 - Posted 29 Mar 2007 3:43:25 UTC - in response to Message 2923.

                          Yup, looking at it. Hopefully will be fixed in the next update (tonight or tomorrow).

                          This one errored out on 5.55:

                          Resid 472582

                          1wrpA_BOINC_SYMM_FOLD_AND_DOCK-1wrpA-truncate__1873_21_1

                          ERROR:: Exit at: .\fold_tree.cc line:809


                          ____________

                          anders n

                          Joined: Feb 16 06
                          Posts: 166
                          ID: 91
                          Credit: 131,419
                          RAC: 0
                          Message 2925 - Posted 29 Mar 2007 3:59:23 UTC - in response to Message 2921.

                            I haven't been able to reproduce the Mac issue (process not found) noted on the R@H message boards yet. But I'm hoping to find a fix for the next update.


                            How about the other Mac issue, where Ralph/Rosetta hangs after being preempted and then resumed?
                            I just checked my Mac and 1 WU on each project was hanging.

                            Anders n

                            ____________

                            feet1st

                            Joined: Mar 7 06
                            Posts: 312
                            ID: 1028
                            Credit: 110,522
                            RAC: 0
                            Message 2926 - Posted 29 Mar 2007 4:44:47 UTC

                              This task is Rosetta, but I was wondering: I've got a 24hr run time preference... this bad boy has been crunching for 14hrs and isn't complete with model 3 yet. The % complete shows 42.1%.

                              Still seems to be crunching just fine, but I was wondering, does this mean it's only taken 1 checkpoint during this third model? Or is there any way to tell from the graphic when a checkpoint has actually been taken? It's on step 395,000, so it must have been crunching for several hours.
                              ____________

                              Rhiju
                              Forum moderator
                              Project developer
                              Project scientist

                              Joined: Feb 14 06
                              Posts: 161
                              ID: 4
                              Credit: 3,725
                              RAC: 0
                              Message 2927 - Posted 29 Mar 2007 5:38:22 UTC - in response to Message 2926.

                                Hi feet1st -- sorry that workunit is taking a while. You're right that the WU isn't checkpointing until the end of the model, and that could cause a problem for some users that preempt often. We're working on a general checkpointing scheme for all modes, but it won't be ready for another week or two...

                                This task is Rosetta, but I was wondering: I've got a 24hr run time preference... this bad boy has been crunching for 14hrs and isn't complete with model 3 yet. The % complete shows 42.1%.

                                Still seems to be crunching just fine, but I was wondering, does this mean it's only taken 1 checkpoint during this third model? Or is there any way to tell from the graphic when a checkpoint has actually been taken? It's on step 395,000, so it must have been crunching for several hours.
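
To make the checkpointing trade-off above concrete: if state is saved only at the end of a model, a preempted task loses everything since the last model boundary, whereas saving every N steps bounds the loss to N steps. The following is a minimal sketch of periodic checkpointing inside a long loop, not the scheme the project was building. The function names, the JSON file format, and the `stop_after` parameter (used only to simulate a preemption) are all assumptions for illustration.

```python
import json
import os


def save_checkpoint(path: str, state: dict) -> None:
    """Write state to a temp file, then atomically swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_path, path)  # a reader never sees a half-written file


def load_checkpoint(path: str):
    """Return the saved state, or None if no checkpoint exists yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return None


def run_model(total_steps: int, path: str,
              checkpoint_every: int = 1000, stop_after=None) -> dict:
    """Run a stand-in computation, resuming from the last checkpoint if any."""
    state = load_checkpoint(path) or {"step": 0, "value": 0}
    while state["step"] < total_steps:
        state["value"] += state["step"]  # placeholder for one step of real work
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
        if stop_after is not None and state["step"] >= stop_after:
            break  # simulate the client preempting the task
    return state
```

In this sketch a task preempted at step 2,500 resumes from step 2,000 and redoes only 500 steps, instead of restarting the whole model. Writing to a temporary file and renaming it keeps a crash during the save from corrupting the previous checkpoint.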


                                ____________

                                Rhiju
                                Forum moderator
                                Project developer
                                Project scientist

                                Joined: Feb 14 06
                                Posts: 161
                                ID: 4
                                Credit: 3,725
                                RAC: 0
                                Message 2928 - Posted 29 Mar 2007 5:39:24 UTC - in response to Message 2925.

                                  Anders n, actually, wait, when did this start happening for you? Is there a discussion thread on this?

                                  I haven't been able to reproduce the Mac issue (process not found) noted on the R@H message boards yet. But I'm hoping to find a fix for the next update.


                                  How about the other Mac issue, where Ralph/Rosetta hangs after being preempted and then resumed?
                                  I just checked my Mac and 1 WU on each project was hanging.

                                  Anders n


                                  ____________

                                  anders n

                                  Joined: Feb 16 06
                                  Posts: 166
                                  ID: 91
                                  Credit: 131,419
                                  RAC: 0
                                  Message 2930 - Posted 29 Mar 2007 6:18:23 UTC - in response to Message 2928.

                                    Anders n, actually, wait, when did this start happening for you? Is there a discussion thread on this?

                                    I haven't been able to reproduce the Mac issue (process not found) noted on the R@H message boards yet. But I'm hoping to find a fix for the next update.


                                    How about the other Mac issue, where Ralph/Rosetta hangs after being preempted and then resumed?
                                    I just checked my Mac and 1 WU on each project was hanging.

                                    Anders n



                                    See Bug reports 5.52-5.54.

                                    It started 18/3.

                                    Anders n
                                    ____________

                                    Rhiju
                                    Forum moderator
                                    Project developer
                                    Project scientist

                                    Joined: Feb 14 06
                                    Posts: 161
                                    ID: 4
                                    Credit: 3,725
                                    RAC: 0
                                    Message 2931 - Posted 29 Mar 2007 8:17:20 UTC - in response to Message 2930.

                                      I see ... actually I thought this was a graphics bug, and thought it might be fixed in the latest update, but that's not the case. I wonder if I can reproduce it on my machine, switching between Ralph and some other app.

                                      Anders n, actually, wait, when did this start happening for you? Is there a discussion thread on this?

                                      I haven't been able to reproduce the Mac issue (process not found) noted on the R@H message boards yet. But I'm hoping to find a fix for the next update.


                                      How about the other Mac issue, where Ralph/Rosetta hangs after being preempted and then resumed?
                                      I just checked my Mac and 1 WU on each project was hanging.

                                      Anders n



                                      See Bug reports 5.52-5.54.

                                      It started 18/3.

                                      Anders n


                                      ____________

                                      genes

                                      Joined: Feb 16 06
                                      Posts: 45
                                      ID: 57
                                      Credit: 43,300
                                      RAC: 0
                                      Message 2934 - Posted 29 Mar 2007 11:46:33 UTC

                                        Had these errors overnight on machines at work, so I didn't see what they did:

                                        resultid=471512
                                        resultid=472465

                                        One's a -161, the other's an "incorrect function". I've got one running here right now that has the 100000000000000000000.... problem, resultid=471927, but it looks like it is otherwise operating normally, so I'll let it finish.

                                        ____________

                                        Conan

                                        Joined: Feb 16 06
                                        Posts: 345
                                        ID: 145
                                        Credit: 1,328,309
                                        RAC: 299
                                        Message 2936 - Posted 29 Mar 2007 12:18:49 UTC

                                          Had this WU fail with MAXIMUM DISK SPACE EXCEEDED. I have many gigabytes, so this should not be the problem:
                                          http://ralph.bakerlab.org/result.php?resultid=471223

                                          Also had these two fail with the old ERROR -161,
                                          http://ralph.bakerlab.org/result.php?resultid=471479
                                          http://ralph.bakerlab.org/result.php?resultid=471480

                                          I currently have one running that may be a 5.55 or a 5.56, not sure, but it has jumped straight to 100% as some others have reported, with the time to complete still going up but only 1 hour 40 minutes done on a 6 hour preference. Windows machine.

                                          Strangely, I have two others that have switched and are 'Waiting to run', but the Time to completion is still ticking over and the percentage done is moving up, yet the CPU Time is not moving. I have a dual-CPU, dual-core machine, so 4 cores are running and they are all accounted for, so why is BOINC saying I have 6 cores doing something? Very strange. Linux machine.
                                          ____________

                                          Dustin Ragan

                                          Joined: Mar 16 07
                                          Posts: 1
                                          ID: 2773
                                          Credit: 7,447
                                          RAC: 0
                                          Message 2938 - Posted 29 Mar 2007 13:37:48 UTC

                                            I have a workunit on my home laptop that's up to 203% completion right now.

                                            Curiously enough, BOINC says that Chess960 is running, but the only one accumulating % is RALPH.

                                            feet1st

                                            Joined: Mar 7 06
                                            Posts: 312
                                            ID: 1028
                                            Credit: 110,522
                                            RAC: 0
                                            Message 2940 - Posted 29 Mar 2007 15:44:21 UTC - in response to Message 2921.

                                              Wow, good eye! I haven't seen that hang, but your explanation makes sense. Let me see what we can do.

                                              On the other issues -- my Mac screensaver says that the percentage complete is "inf%". This sounds like the issues reported below, too, with large percentage-complete values. Dang!

                                              I haven't been able to reproduce the Mac issue (process not found) noted on the R@H message boards yet. But I'm hoping to find a fix for the next update.

                                              Finally, one of our old style protein WUs is consistently failing, so I need to ask the other developer about that. Weird.

                                              Thanks for all the posts so far! This kind of quick feedback helps tremendously!

                                              Another usability issue, which may be simple to improve, is at step 340,000, which apparently is a magic number in the processing. This is where you clear out the histogram of energy and RMSD. It then "hangs" for 15 seconds or so (more like a minute, I suppose, on a slower machine) and then takes another 10 seconds or so to do the first step or two after that.

                                              Any program that suddenly has portions of the screen blank out, and then shows no activity (unless of course you notice the CPU seconds counting up) for more than the attention span of the caffeine-loaded viewer, is immediately diagnosed as being "hung" and requiring manual intervention... (as if the 5 seconds you've waited already wasn't enough for the program to trash your computer if it was going to).

                                              ...anyway, if you could just NOT blank out those graphs until you complete the initialization or whatever is happening there as step 340,000 chugs, then it would be a sizable smidge less alarming in appearance. It would be even better if you could impose a few more "steps" into that long processing of step 340,000.



                                              Actually... thinking about it more, it would seem better to just leave the graphs, and since the rest of the screen will not be moving for an extended period of time, could you change the "stage" to something like "Initializing full atom relax..." with the strobing periods at the end? (if indeed that describes what Rosetta is actually DOING at that point, that is :)
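
The suggestion above (keep the display alive during a long setup phase by naming the stage and reporting intermediate steps) can be sketched as follows. This is only an illustration of the general idea, assuming the long phase can be split into independent work items. `run_in_substeps` and its `report` callback are hypothetical names, and nothing here reflects how Rosetta's graphics code is actually structured.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def run_in_substeps(items: Sequence[T],
                    process: Callable[[T], None],
                    report: Callable[[str, float], None],
                    stage: str = "Initializing...",
                    every: int = 100) -> None:
    """Process items in order, reporting the stage label and fraction done.

    `report` stands in for whatever updates the on-screen stage text; it is
    called once up front and then after every `every` items and at the end.
    """
    total = len(items)
    report(stage, 0.0)  # show the stage name before any slow work starts
    for index, item in enumerate(items, start=1):
        process(item)
        if index % every == 0 or index == total:
            report(stage, index / total)
```

A caller would pass a `report` function that redraws only the stage line, leaving the existing graphs untouched until the new phase has produced something to plot.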
                                              ____________

                                              Mark Reiss

                                              Joined: Aug 3 06
                                              Posts: 2
                                              ID: 1648
                                              Credit: 3,911
                                              RAC: 0
                                              Message 2948 - Posted 29 Mar 2007 19:53:59 UTC - in response to Message 2914.

                                                Hello
                                                The WU started and jumped to 100% the next second.
                                                It keeps running and time increases normally.


                                                Hi all:
                                                Mine also jumped to 100% right away, but since I have my preferences set to 8 hours I will let it run.
                                                Mark Reiss
                                                ____________


                                                Copyright © 2017 University of Washington

                                                Last Modified: 20 Nov 2008 19:41:56 UTC