RALPH@home

Bug Report for Ralph 5.28

Chu
Forum moderator
Project developer
Project scientist

Joined: Sep 26 06
Posts: 61
ID: 1900
Credit: 12,545
RAC: 0
Message 2305 - Posted 3 Oct 2006 0:26:02 UTC

    Last modified: 3 Oct 2006 0:21:05 UTC

    Ralph has been updated to 5.28. In this version, we added an all-atom sidechain display to the screensaver during the Rosetta sidechain refinement stage. Users can also now rotate any of the four displayed models on the screen using the mouse. Several new flags have been added to Rosetta as well.

    anders n

    Joined: Feb 16 06
    Posts: 166
    ID: 91
    Credit: 131,419
    RAC: 0
    Message 2306 - Posted 3 Oct 2006 5:18:26 UTC

      Last modified: 3 Oct 2006 5:05:50 UTC

      ERROR:: Unable to determine sequence length from pdb file

      http://ralph.bakerlab.org/result.php?resultid=274727

      http://ralph.bakerlab.org/result.php?resultid=274830

      http://ralph.bakerlab.org/result.php?resultid=274785

      http://ralph.bakerlab.org/result.php?resultid=274908

      Anders n
      ____________

      idlorj

      Joined: Feb 16 06
      Posts: 2
      ID: 282
      Credit: 20,030
      RAC: 0
      Message 2308 - Posted 3 Oct 2006 18:04:53 UTC

        I received the same error

        http://ralph.bakerlab.org/result.php?resultid=275739

        Of the ones completed only this workunit failed for me.

        Buffalo Bill

        Joined: May 13 06
        Posts: 5
        ID: 1397
        Credit: 12,223
        RAC: 0
        Message 2311 - Posted 5 Oct 2006 3:49:32 UTC

          Errors for these:

          277562
          277561

          Uakey

          Joined: Sep 8 06
          Posts: 1
          ID: 1807
          Credit: 141
          RAC: 0
          Message 2312 - Posted 5 Oct 2006 7:47:18 UTC

            Last modified: 5 Oct 2006 7:36:38 UTC

            Unhandled Exception Detected

            http://ralph.bakerlab.org/result.php?resultid=277671

            anders n

            Joined: Feb 16 06
            Posts: 166
            ID: 91
            Credit: 131,419
            RAC: 0
            Message 2313 - Posted 5 Oct 2006 8:11:30 UTC - in response to Message 2312.

              Unhandled Exception Detected

              http://ralph.bakerlab.org/result.php?resultid=277671



              This one too.

              http://ralph.bakerlab.org/result.php?resultid=276753

              Anders n
              ____________

              Krzychu P.

              Joined: Feb 16 06
              Posts: 19
              ID: 114
              Credit: 10,236
              RAC: 0
              Message 2314 - Posted 5 Oct 2006 8:45:35 UTC

                I get this in the Messages tab:
                2006-10-05 10:44:52|ralph@home|Unrecoverable error for result DOCK_1BQL_unbound_perturb_benchmark_1330_19_0 (Niepoprawna funkcja. (0x1) - exit code 1 (0x1))

                and with the next WU:

                2006-10-05 10:50:53|ralph@home|Unrecoverable error for result DOCK_1JHL_unbound_perturb_benchmark_1330_19_0 (Niepoprawna funkcja. (0x1) - exit code 1 (0x1))

                Niepoprawna funkcja = incorrect function
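A minimal Python sketch for pulling the fields out of message-tab lines like the ones above. The pipe-separated layout and the trailing `exit code N (0xN)` part are inferred only from the lines quoted in this thread, and the names `LINE_RE`, `ErrorLine` and `parse_error_line` are made up for illustration; they are not part of BOINC.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not an official BOINC log specification).
LINE_RE = re.compile(
    r"^(?P<timestamp>[^|]*)\|(?P<project>[^|]*)\|"
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\((?P<reason>.*) - exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)$"
)


class ErrorLine(NamedTuple):
    timestamp: str
    project: str
    result: str
    reason: str      # localized OS message, e.g. "Niepoprawna funkcja. (0x1)"
    exit_code: int   # decimal exit code as printed in the line


def parse_error_line(line: str) -> Optional[ErrorLine]:
    """Return the parsed fields, or None if the line has a different shape."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return ErrorLine(
        timestamp=match.group("timestamp"),
        project=match.group("project"),
        result=match.group("result"),
        reason=match.group("reason"),
        exit_code=int(match.group("code")),
    )


if __name__ == "__main__":
    sample = (
        "2006-10-05 10:44:52|ralph@home|Unrecoverable error for result "
        "DOCK_1BQL_unbound_perturb_benchmark_1330_19_0 "
        "(Niepoprawna funkcja. (0x1) - exit code 1 (0x1))"
    )
    print(parse_error_line(sample))
```

Grouping the parsed `result` names by `exit_code` would make it easier to see which batches of work units fail the same way.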
                ____________

                Krzychu P.

                Joined: Feb 16 06
                Posts: 19
                ID: 114
                Credit: 10,236
                RAC: 0
                Message 2315 - Posted 5 Oct 2006 9:57:41 UTC

                  Next two WU's with the same:

                  2006-10-05 10:58:45|ralph@home|Unrecoverable error for result DOCK_1FQ1_unbound_perturb_benchmark_1330_19_0 (Niepoprawna funkcja. (0x1) - exit code 1 (0x1))

                  2006-10-05 11:04:21|ralph@home|Unrecoverable error for result DOCK_1MEL_unbound_perturb_benchmark_1330_20_0 (Niepoprawna funkcja. (0x1) - exit code 1 (0x1))

                  ____________

                  feet1st

                  Joined: Mar 7 06
                  Posts: 312
                  ID: 1028
                  Credit: 110,522
                  RAC: 0
                  Message 2316 - Posted 5 Oct 2006 16:36:23 UTC

                    While displaying graphics for this WU, the first box seems stuck with status message "Try Rotamers", and the graphic in that first box is not refreshing. The Accepted box is jumping around, but no movement in first box.

                    ...while I was typing the above, it went to "Packing" status (step 340,000) and now it seems to be progressing normally and refreshing the graphic.
                    ____________

                    Chu
                    Forum moderator
                    Project developer
                    Project scientist

                    Joined: Sep 26 06
                    Posts: 61
                    ID: 1900
                    Credit: 12,545
                    RAC: 0
                    Message 2317 - Posted 5 Oct 2006 16:54:58 UTC - in response to Message 2315.

                      Thanks. This will be very useful for us to track down those hidden bugs in the code.

                      Next two WU's with the same:

                      2006-10-05 10:58:45|ralph@home|Unrecoverable error for result DOCK_1FQ1_unbound_perturb_benchmark_1330_19_0 (Niepoprawna funkcja. (0x1) - exit code 1 (0x1))

                      2006-10-05 11:04:21|ralph@home|Unrecoverable error for result DOCK_1MEL_unbound_perturb_benchmark_1330_20_0 (Niepoprawna funkcja. (0x1) - exit code 1 (0x1))

                      Chu
                      Forum moderator
                      Project developer
                      Project scientist

                      Joined: Sep 26 06
                      Posts: 61
                      ID: 1900
                      Credit: 12,545
                      RAC: 0
                      Message 2318 - Posted 5 Oct 2006 17:20:45 UTC - in response to Message 2316.

                        The current setup in the screensaver graphics is to flash protein sidechains only in the "Try Rotamers" and "Packing" stages. "Try Rotamers" is supposed to be very fast each time, and we call it a lot in the code, so people should expect to see protein sidechains flash on and off most of the time. I think the trajectory is still moving forward, as reflected by the moving protein in the Accepted box, so it looks like a graphics issue. I am wondering whether the more demanding graphics task (showing sidechains on and off quickly) is the reason for the more frequent "Unhandled Exception" errors we have seen in this updated application. I tried to google what the error code (0x0000005) means when it happens; some people suggest a hardware issue, but there are no firm answers. Does anybody have an idea?

                        While displaying graphics for this WU, the first box seems stuck with status message "Try Rotamers", and the graphic in that first box is not refreshing. The Accepted box is jumping around, but no movement in first box.

                        ...while I was typing the above, it went to "Packing" status (step 340,000) and now it seems to be progressing normally and refreshing the graphic.

                        Pieface

                        Joined: Feb 16 06
                        Posts: 64
                        ID: 234
                        Credit: 203,513
                        RAC: 0
                        Message 2320 - Posted 5 Oct 2006 19:54:33 UTC

                          We used to see them a lot before ?Rom? came in and worked thru the code.
                          I think that was about the time they set-up Ralph for testing. Maybe he can help? I know back then we had to keep Rosetta in-memory and not let it get swapped out to avoid problems also.
                          ____________

                          anders n

                          Joined: Feb 16 06
                          Posts: 166
                          ID: 91
                          Credit: 131,419
                          RAC: 0
                          Message 2321 - Posted 5 Oct 2006 20:02:23 UTC - in response to Message 2320.

                            Last modified: 5 Oct 2006 19:47:13 UTC

                            We used to see them a lot before ?Rom? came in and worked thru the code.
                            I think that was about the time they set-up Ralph for testing. Maybe he can help? I know back then we had to keep Rosetta in-memory and not let it get swapped out to avoid problems also.



                            I still have all my computers set to yes "leave in memory".

                            Anders n
                            ____________

                            feet1st

                            Joined: Mar 7 06
                            Posts: 312
                            ID: 1028
                            Credit: 110,522
                            RAC: 0
                            Message 2323 - Posted 6 Oct 2006 1:48:36 UTC

                              I think a part of what we should be testing here is the description of what to expect from the new release and the new graphics. We've had problems before where intentional changes are made, not really described to people, tested on Ralph, and rolled out to Rosetta, and then people start posting in the "bugs with this release" thread simply because they don't understand what is happening and that the change is intentional. They don't realize that the science is changing over time and that this is part of the point of the project.

                              So... could you take a stab at describing the changes to the graphics? And the phases and behavior we should expect to observe? Dr. Baker mentioned in his journal that sidechains would be shown... but that raises more questions than it answers. Because this is all second nature to you, let me list some questions I can see people having:

                              What are side chains?
                              Why do they blink on and off?
                              Why can't I see side chains in the other boxes? Mine must be broken.
                              Are we doing calculations differently now on the side chains? Or simply illustrating the calculations we've been doing all along?

                              I feel these types of questions should be answered HERE on Ralph, so that we can comment on them and ask further questions, have the description revised and improved to address those questions, and then use that to begin the "report problems with v x.yy in this thread" post, and then summarize it and post it to the news on the homepage.
                              ____________

                              genes

                              Joined: Feb 16 06
                              Posts: 45
                              ID: 57
                              Credit: 43,300
                              RAC: 0
                              Message 2324 - Posted 6 Oct 2006 2:59:59 UTC - in response to Message 2318.

                                I tried to google what the error code (0x0000005) means when it happens; some people suggest a hardware issue, but there are no firm answers. Does anybody have an idea?


                                I looked at some of the error results that were posted, and the error code I saw was 0xffffffffc0000005. The lower 32 bits of that are 0xc0000005, which is an access violation. You know, like using a bad pointer, or trying to read or write memory you don't have.
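To make the "lower 32 bits" step concrete, here is a small Python sketch. The constant 0xC0000005 is the Windows status code for an access violation; the function names `to_ntstatus` and `describe` are hypothetical and only for illustration. It also shows that the signed decimal spelling -1073741819 reported elsewhere in this thread is the same code.

```python
# Windows NTSTATUS value for an access violation.
STATUS_ACCESS_VIOLATION = 0xC0000005


def to_ntstatus(code: int) -> int:
    """Keep only the lower 32 bits, so the signed, sign-extended and
    unsigned spellings of the same exit code compare equal."""
    return code & 0xFFFFFFFF


def describe(code: int) -> str:
    """Give a short label for the few codes this sketch knows about."""
    status = to_ntstatus(code)
    if status == STATUS_ACCESS_VIOLATION:
        return f"0x{status:08x}: access violation"
    return f"0x{status:08x}: not decoded in this sketch"


if __name__ == "__main__":
    # 64-bit sign-extended form, signed 32-bit decimal form, plain form.
    for raw in (0xFFFFFFFFC0000005, -1073741819, 0xC0000005, 1):
        print(describe(raw))
```

The mask works for negative inputs because Python integers behave as if they were in two's complement with unlimited width.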

                                Hope that helps.
                                ____________

                                SafeAggie

                                Joined: Oct 5 06
                                Posts: 6
                                ID: 1943
                                Credit: 4,207
                                RAC: 0
                                Message 2325 - Posted 6 Oct 2006 12:13:22 UTC

                                  Unrecoverable error for result

                                  * 1tit__BOINC_ABRELAX_SAVE_ALL_OUT_truess__1329_17_0 ( - exit code -1073741819 (0xc0000005))
                                  * 2vik__BOINC_ABRELAX_SAVE_ALL_OUT_truess__1329_18_0 ( - exit code -1073741819 (0xc0000005))
                                  * 2vik__BOINC_ABRELAX_SAVE_ALL_OUT_truess__1329_19_1 ( - exit code -1073741819 (0xc0000005))

                                  resultid=279007
                                  resultid=277677
                                  resultid=277610
                                  resultid=277591
                                  resultid=277577

                                  Conan

                                  Joined: Feb 16 06
                                  Posts: 344
                                  ID: 145
                                  Credit: 1,309,534
                                  RAC: 0
                                  Message 2326 - Posted 6 Oct 2006 14:13:52 UTC

                                    Seems to be a graphics problem. The graphics just freeze and will not release. The workunit appears to still be running (checked with Task Manager), but I can't get rid of the graphics screen and have to end the process, which causes a computation error. If I let it go I get this error:
                                    **********************************************************************
                                    Rosetta score is stuck or going too long. Watchdog is ending the run!
                                    Stuck at score 0.966268 for 3600 seconds
                                    **********************************************************************
                                    GZIP SILENT FILE: .\xx1c9o.out

                                    The stuck score changes for different failed workunits. Only happening on Windows machine as the Linux machine has no graphics.
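The watchdog message above suggests a check along the lines of "end the run if the score has not changed for 3600 seconds". The following Python sketch shows that idea only; it is not Rosetta's actual watchdog code, and the class name `ScoreWatchdog`, its methods, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class ScoreWatchdog:
    """Flags a run whose score has not changed for `timeout_s` seconds."""

    def __init__(
        self,
        timeout_s: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.timeout_s = timeout_s
        self._clock = clock            # injectable so it can be tested quickly
        self._last_score: Optional[float] = None
        self._last_change = clock()

    def update(self, score: float) -> None:
        """Record the latest score; restart the timer only if it changed."""
        if score != self._last_score:
            self._last_score = score
            self._last_change = self._clock()

    def stuck_for(self) -> float:
        """Seconds elapsed since the score last changed."""
        return self._clock() - self._last_change

    def is_stuck(self) -> bool:
        return self.stuck_for() >= self.timeout_s


if __name__ == "__main__":
    now = [0.0]                        # fake clock for a quick demonstration
    watchdog = ScoreWatchdog(timeout_s=3600, clock=lambda: now[0])
    watchdog.update(0.966268)
    now[0] = 3600.0
    watchdog.update(0.966268)          # same score, timer keeps running
    if watchdog.is_stuck():
        print(f"Stuck at score 0.966268 for {watchdog.stuck_for():.0f} seconds")
```

A check like this cannot tell a frozen graphics thread from a genuinely stalled computation, which may be why the stuck score differs between failed workunits.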

                                    So far the above error is on
                                    http://ralph.bakerlab.org/result.php?resultid=279500
                                    http://ralph.bakerlab.org/result.php?resultid=279505
                                    http://ralph.bakerlab.org/result.php?resultid=247849

                                    Also have had Unhandled Exception Record errors
                                    Reason:Access Violation (0xc0000005) at address 0x00759D94 read attempt to address 0x00000011. Had this on http://ralph.bakerlab.org/result.php?resultid=247851
                                    Reason:Access Violation (0xc0000005) at address 0x00759D7D read attempt to address 0x00000011. had this on
                                    http://ralph.bakerlab.org/result.php?resultid=247850

                                    Have had no successful WU's on this batch of Windows WU's.
                                    Have extended the time for the screen saver to come on to 300 minutes from 20 to see if this stops the problem.
                                    ____________

                                    Chu
                                    Forum moderator
                                    Project developer
                                    Project scientist

                                    Joined: Sep 26 06
                                    Posts: 61
                                    ID: 1900
                                    Credit: 12,545
                                    RAC: 0
                                    Message 2327 - Posted 6 Oct 2006 18:43:59 UTC

                                      Thanks for the suggestion, feet1st. I will give it a try here.

                                      What are side chains?
                                      For general knowledge about proteins and amino acids, please visit here. I notice that some key terms like "sidechains" are missing, and we will update that soon. Here is something I wrote quickly; it definitely needs to be improved.
                                      The 20 natural amino acids have part of their structure in common, and they are connected via peptide-bond linkages to form the topology of a protein structure. The common part is called the "backbone". Each amino acid differs from the others by a unique chemical group that connects to the C-alpha atom of the backbone; that group is called the "side chain".

                                      Why do they blink on and off?
                                      Sidechains are part of protein structures, but we choose to show them on screen only during certain stages, when they are being changed by the program.

                                      Why can't I see side chains in the other boxes? Mine must be broken.
                                      So far we have only enabled sidechain drawing for the "searching" box.

                                      Are we doing calculations differently now on the side chains? Or simply illustrating the calculations we've been doing all along?
                                      Sidechain calculation has been part of the Rosetta program for a long time, but we only recently started illustrating it on the screen.

                                      I understand the above is still far from an adequate description. But I will try my best to provide more information on future updates, which will give people like you who volunteer to help with testing a better chance to send us feedback. Many thanks again.

                                      Conan

                                      Joined: Feb 16 06
                                      Posts: 344
                                      ID: 145
                                      Credit: 1,309,534
                                      RAC: 0
                                      Message 2329 - Posted 6 Oct 2006 23:43:40 UTC

                                        Well these new Ralph work units are impressive. After creating lots of problems such as freezing my computer when the graphics lock up, they now appear to have stuffed Boinc altogether. During the night Boinc closed down due to the graphics problem. I restarted Boinc, which ran for a while then errored out, and now Boinc won't run anymore due to Visual C++ runtime errors.
                                        I am now trying to get Boinc running again by replacing the MSVCP71.dll and MSVCP80.dll files in Boinc; otherwise I will just have to reload Boinc.
                                        This is only affecting my Windows XP machine with the latest updates.
                                        ____________

                                        Conan

                                        Joined: Feb 16 06
                                        Posts: 344
                                        ID: 145
                                        Credit: 1,309,534
                                        RAC: 0
                                        Message 2330 - Posted 7 Oct 2006 1:56:54 UTC

                                          I now have 2 problems. I can no longer get BOINC to run on my AMD 4800+ Windows XP machine. The problem started with the Ralph graphics freezes; now all I get is Visual C++ runtime errors when Boinc.exe attempts to run. I have re-installed Boinc and downloaded the .NET software for the VC libraries but still can't get Boinc to run. Also rebooted a few times, no joy. Help please.

                                          Problem 2: I just had 20 WU's download to one of my Linux machines, and all 20 errored out within minutes of downloading with this error:

                                          ERROR:: Unable to determine sequence length from pdb file

                                          http://ralph.bakerlab.org/workunit.php?wuid=247972, 247980, 247982, 247983, 247993, 247994, 247995, 247996, 247997, 247998, 247999, 248001, 248002, 248004, 248005, 248006, 248051, 248052, 248053, 248054
                                          ____________

                                          Brian B

                                          Joined: Feb 17 06
                                          Posts: 9
                                          ID: 362
                                          Credit: 2,632
                                          RAC: 0
                                          Message 2336 - Posted 7 Oct 2006 5:00:56 UTC

                                            Hi all. I know 5.28 is done now but wanted to let you know about an issue I saw this morning on my laptop, just in case it slips through to 5.29. It seems that even though BOINC switched to another project, 5.28 was still eating up CPU time. I had BOINC set to Run Based on Preferences, which is to suspend while the user is active. I was using the laptop and it was running very, very, very slow, so I brought up the Task Manager and found that 5.28 was still running and using 100% CPU, even though BOINC was suspended. I had to End Process 5.28 in order to get it to stop. The funny thing about it was that when I looked at the message log, 5.28 was not the last project running; a different project was running before BOINC went into suspension because I started using the computer. There were no errors running the WU either.

                                            I'm on Win2k running several projects with the following:
                                            10/06/2006 11:51:29 PM||Starting BOINC client version 5.2.13 for windows_intelx86
                                            10/06/2006 11:51:29 PM||libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
                                            10/06/2006 11:51:29 PM||Data directory: C:\Program Files\BOINC
                                            10/06/2006 11:51:30 PM||Processor: 1 GenuineIntel Mobile Intel(R) Pentium(R) 4 - M CPU 2.00GHz
                                            10/06/2006 11:51:30 PM||Memory: 766.98 MB physical, 1.08 GB virtual
                                            10/06/2006 11:51:30 PM||Disk: 9.76 GB total, 1.70 GB free

                                            Thanks and good luck!
                                            ____________

                                            Brian B

                                            Joined: Feb 17 06
                                            Posts: 9
                                            ID: 362
                                            Credit: 2,632
                                            RAC: 0
                                            Message 2342 - Posted 7 Oct 2006 13:00:02 UTC - in response to Message 2336.

                                              Hi again. Thought I should update you on my previous post. I checked the computer this morning and it was running slow again, so I checked BOINC to see what was up. Another project was running (different from last time), not 5.28. I checked Task Manager to see what was going on and, sure enough, the CPU was at 100% even though BOINC was in suspend mode, and 5.28 was at the top in CPU time (over 7 hours on a 3 hour WU). I decided to try something different and exited BOINC instead of ending 5.28's process, and right after it closed the CPU percentage dropped. Started BOINC back up and the WU/result was now back at '---' for 'CPU time' and '3:40:21' for 'To completion' (was at 0:21:35?? and 7:??:??, don't remember exact numbers). I decided to abort the WU since this was the second time I found it had locked up BOINC and the computer.

                                              Take care...

                                              Brian

                                              Conan

                                              Joined: Feb 16 06
                                              Posts: 344
                                              ID: 145
                                              Credit: 1,309,534
                                              RAC: 0
                                              Message 2347 - Posted 7 Oct 2006 16:53:41 UTC

                                                Last modified: 7 Oct 2006 16:40:08 UTC

                                                >>> After a day of trying to get rid of the runtime errors caused by the Ralph screensaver and not being able to get Boinc running at all, I ended up re-downloading Boinc and installing a different version, as my version would not run due to Visual C++ runtime errors no matter how many times I installed it.
                                                It ran and dumped all my work units, downloading a heap of new ones, which then got dumped again when I tried to add my backup files back in, and now Einstein and Ralph won't talk to the servers, saying they are backing off for 18 hours or so.
                                                I think this is due to all the WU's that got dumped.
                                                Anyway, I sorted things out, sort of, and I am crunching again.
                                                Not happy with what Ralph did to the Boinc programme and my computer, causing me to lose a day's work a couple of times.

                                                I have also found the fresh download has registered my computer twice now.
                                                ____________

                                                Chu
                                                Forum moderator
                                                Project developer
                                                Project scientist

                                                Joined: Sep 26 06
                                                Posts: 61
                                                ID: 1900
                                                Credit: 12,545
                                                RAC: 0
                                                Message 2367 - Posted 9 Oct 2006 19:13:20 UTC - in response to Message 2342.

                                                  Hi Brian, please let us know if you see this problem again with updates newer than 5.28, and then we will look into it to see if it is a general problem. One thing which might be helpful is to update your BOINC software... Thanks for your help and feedback.

                                                  Hi again. Thought I should update you on my previous post. I checked the computer this morning and it was running slow again, so I checked BOINC to see what was up. Another project was running (different from last time), not 5.28. I checked Task Manager to see what was going on and, sure enough, the CPU was at 100% even though BOINC was in suspend mode, and 5.28 was at the top in CPU time (over 7 hours on a 3 hour WU). I decided to try something different and exited BOINC instead of ending 5.28's process, and right after it closed the CPU percentage dropped. Started BOINC back up and the WU/result was now back at '---' for 'CPU time' and '3:40:21' for 'To completion' (was at 0:21:35?? and 7:??:??, don't remember exact numbers). I decided to abort the WU since this was the second time I found it had locked up BOINC and the computer.

                                                  Take care...

                                                  Brian

                                                  Chu
                                                  Forum moderator
                                                  Project developer
                                                  Project scientist

                                                  Joined: Sep 26 06
                                                  Posts: 61
                                                  ID: 1900
                                                  Credit: 12,545
                                                  RAC: 0
                                                  Message 2368 - Posted 9 Oct 2006 19:22:17 UTC - in response to Message 2347.

                                                    Last modified: 9 Oct 2006 19:06:53 UTC

                                                    Hi Conan, I do not know what is causing that problem. We did not change any BOINC program; we only updated the Rosetta application. Sorry for the inconvenience.

                                                    >>> After a day of trying to get rid of the runtime errors caused by the Ralph screensaver and not being able to get Boinc running at all, I ended up re-downloading Boinc and installing a different version, as my version would not run due to Visual C++ runtime errors no matter how many times I installed it.
                                                    It ran and dumped all my work units, downloading a heap of new ones, which then got dumped again when I tried to add my backup files back in, and now Einstein and Ralph won't talk to the servers, saying they are backing off for 18 hours or so.
                                                    I think this is due to all the WU's that got dumped.
                                                    Anyway, I sorted things out, sort of, and I am crunching again.
                                                    Not happy with what Ralph did to the Boinc programme and my computer, causing me to lose a day's work a couple of times.

                                                    I have also found the fresh download has registered my computer twice now.


                                                    Copyright © 2017 University of Washington

                                                    Last Modified: 20 Nov 2008 19:41:56 UTC