RALPH@home

Bug reports for version 5.93


Message boards : RALPH@home bug list : Bug reports for version 5.93

Ingemar
Forum moderator
Project developer
Project scientist

Joined: Mar 7 07
Posts: 9
ID: 2729
Credit: 76
RAC: 0
Message 3593 - Posted 4 Jan 2008 6:25:19 UTC

    Last modified: 4 Jan 2008 6:30:35 UTC

    Please report any weird behavior of rosetta version 5.93!

    Dr Who Fan

    Joined: Sep 2 06
    Posts: 63
    ID: 1787
    Credit: 43,843
    RAC: 22
    Message 3599 - Posted 12 Jan 2008 9:06:49 UTC

      Last modified: 12 Jan 2008 9:14:59 UTC

      This Work Unit 649494 exited with a "161" error:

      Outcome Client error
      Client state Compute error
      Exit status 0 (0x0)
      Computer ID 10396
      CPU time 6315.28125

      stderr out
      <core_client_version>6.1.0</core_client_version>
      <![CDATA[
      <stderr_txt>
      # cpu_run_time_pref: 7200
      # random seed: 1553865
      ======================================================
      DONE :: 1 starting structures 6315.06 cpu seconds
      This process generated 5 decoys from 5 attempts
      ======================================================


      BOINC :: Watchdog shutting down...
      BOINC :: BOINC support services shutting down...

      </stderr_txt>
      <message>
      <file_xfer_error>
      <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_27_0_0</file_name>
      <error_code>-161</error_code>
      </file_xfer_error>

      </message>
      ]]>
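
For anyone collecting these reports, here is a minimal sketch of how the file name and error code could be pulled out of a pasted stderr block like the one above. The helper name `find_xfer_errors` and the regular expression are illustrative assumptions, not part of BOINC or Rosetta; the sample text is copied from the report.

```python
import re

# Sample text copied from the report above (trimmed to the relevant part).
REPORT = """
<message>
<file_xfer_error>
<file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_27_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
"""

# Hypothetical pattern: pairs each <file_name> with the <error_code> that follows it.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)


def find_xfer_errors(text):
    """Return (file_name, error_code) tuples found in a pasted task report."""
    return [(m.group("name"), int(m.group("code"))) for m in XFER_ERROR.finditer(text)]


print(find_xfer_errors(REPORT))
```

Because the pattern allows any whitespace between the tags, it should also match the single-line form that appears in the BOINC Manager message log.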

      Dr Who Fan

      Joined: Sep 2 06
      Posts: 63
      ID: 1787
      Credit: 43,843
      RAC: 22
      Message 3600 - Posted 12 Jan 2008 9:14:18 UTC

        This Work Unit 649484 exited with a "161" error for me and my wingman.
        Details below from my result id:

        Outcome Client error
        Client state Compute error
        Exit status 0 (0x0)
        Computer ID 4500
        Report deadline 16 Jan 2008 1:09:00 UTC
        CPU time 7364.203125
        stderr out

        <core_client_version>6.1.0</core_client_version>
        <![CDATA[
        <stderr_txt>
        # cpu_run_time_pref: 7200
        # random seed: 1553875
        ======================================================
        DONE :: 1 starting structures 7363.47 cpu seconds
        This process generated 3 decoys from 3 attempts
        ======================================================


        BOINC :: Watchdog shutting down...
        BOINC :: BOINC support services shutting down...

        </stderr_txt>
        <message>
        <file_xfer_error>
        <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_17_1_0</file_name>
        <error_code>-161</error_code>
        </file_xfer_error>

        </message>
        ]]>

        Dr Who Fan

        Joined: Sep 2 06
        Posts: 63
        ID: 1787
        Credit: 43,843
        RAC: 22
        Message 3608 - Posted 14 Jan 2008 8:02:11 UTC

          Another 161 error to report:

          http://ralph.bakerlab.org/result.php?resultid=732877

          Exit status 0 (0x0)
          Computer ID 4500
          Report deadline 16 Jan 2008 1:09:00 UTC
          CPU time 7364.203125
          stderr out

          <core_client_version>6.1.0</core_client_version>
          <![CDATA[
          <stderr_txt>
          # cpu_run_time_pref: 7200
          # random seed: 1553875
          ======================================================
          DONE :: 1 starting structures 7363.47 cpu seconds
          This process generated 3 decoys from 3 attempts
          ======================================================


          BOINC :: Watchdog shutting down...
          BOINC :: BOINC support services shutting down...

          </stderr_txt>
          <message>
          <file_xfer_error>
          <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_17_1_0</file_name>
          <error_code>-161</error_code>
          </file_xfer_error>

          </message>
          ]]>

          Validate state Invalid
          Claimed credit 18.0472218006911
          ____________

          Dr Who Fan

          Joined: Sep 2 06
          Posts: 63
          ID: 1787
          Credit: 43,843
          RAC: 22
          Message 3609 - Posted 14 Jan 2008 8:05:31 UTC

            Another 161 error to report:

            http://ralph.bakerlab.org/result.php?resultid=732906

            Outcome Client error
            Client state Compute error
            Exit status 0 (0x0)
            Computer ID 10396
            Report deadline 16 Jan 2008 3:04:44 UTC
            CPU time 6187.515625
            stderr out

            <core_client_version>6.1.0</core_client_version>
            <![CDATA[
            <stderr_txt>
            # cpu_run_time_pref: 7200
            # random seed: 1553752
            ======================================================
            DONE :: 1 starting structures 6186.97 cpu seconds
            This process generated 5 decoys from 5 attempts
            ======================================================


            BOINC :: Watchdog shutting down...
            BOINC :: BOINC support services shutting down...

            </stderr_txt>
            <message>
            <file_xfer_error>
            <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2892_40_1_0</file_name>
            <error_code>-161</error_code>
            </file_xfer_error>

            </message>
            ]]>

            Validate state Invalid
            Claimed credit 16.9298421717622
            Granted credit 0
            application version 5.93
            ____________

            Snagletooth

            Joined: May 4 07
            Posts: 65
            ID: 3020
            Credit: 112,601
            RAC: 5
            Message 3612 - Posted 14 Jan 2008 11:12:48 UTC

              Another "161" error for trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_53_1

              workunit 649520 has now been sent to a third cruncher


              <core_client_version>5.10.20</core_client_version>
              <![CDATA[
              <stderr_txt>
              # cpu_run_time_pref: 36000
              # random seed: 1553839
              # cpu_run_time_pref: 36000
              ======================================================
              DONE :: 1 starting structures 35646.5 cpu seconds
              This process generated 6 decoys from 6 attempts
              ======================================================


              BOINC :: Watchdog shutting down...
              BOINC :: BOINC support services shutting down...

              </stderr_txt>
              <message>
              <file_xfer_error>
              <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_53_1_0</file_name>
              <error_code>-161</error_code>
              </file_xfer_error>

              </message>
              ]]>


              BigMike

              Joined: Feb 23 06
              Posts: 63
              ID: 738
              Credit: 58,730
              RAC: 0
              Message 3629 - Posted 16 Jan 2008 3:35:26 UTC

                Wow ... that didn't take long...

                <core_client_version>5.10.30</core_client_version>
                <![CDATA[
                <message>
                Incorrect function. (0x1) - exit code 1 (0x1)
                </message>
                <stderr_txt>
                # cpu_run_time_pref: 3600
                ERROR:: Unable to determine sequence length from pdb file
                ERROR:: Exit from: .\pose.cc line: 1983

                </stderr_txt>
                ]]>


                ____________
                Don't believe everything you think.

                Eric Ogletree

                Joined: Aug 27 07
                Posts: 1
                ID: 3430
                Credit: 24,361
                RAC: 0
                Message 3638 - Posted 16 Jan 2008 15:45:54 UTC

                  Got four of them here. Hope it helps. :)

                  16/01/2008 1:27:23 AM|ralph@home|Reason: Unrecoverable error for result mini_-1a32_-test_2898_200_0 (<file_xfer_error> <file_name>mini_-1a32_-test_2898_200_0_0</file_name> <error_code>-161</error_code></file_xfer_error>)

                  16/01/2008 5:34:01 AM|ralph@home|Task mini_-1a32_-test_2898_193_0 exited with zero status but no 'finished' file

                  16/01/2008 5:57:55 AM|ralph@home|Task mini_-1a32_-test_2898_206_0 exited with zero status but no 'finished' file

                  16/01/2008 8:36:34 AM|ralph@home|Reason: Unrecoverable error for result mini_-1a32_-test_2898_193_0 (<file_xfer_error> <file_name>mini_-1a32_-test_2898_193_0_0</file_name> <error_code>-161</error_code></file_xfer_error>)

                  ramostol

                  Joined: Mar 29 07
                  Posts: 24
                  ID: 2840
                  Credit: 31,121
                  RAC: 0
                  Message 3652 - Posted 20 Jan 2008 11:17:06 UTC

                    You probably know this by now, but anyhow:

                    (Some?) trunc_solit WUs seem unable to create proper output files.

                    3 invalid results for trunc_solit_BOINC_ABRELAX_-trunc_solit-_2934_25

                    Conan

                    Joined: Feb 16 06
                    Posts: 344
                    ID: 145
                    Credit: 1,309,534
                    RAC: 0
                    Message 3653 - Posted 20 Jan 2008 14:38:18 UTC

                      Getting the same error as some others here with WU type 'trunc_solit'

                      WU 732726
                      WU 732758
                      WU 732769
                      WU 732802
                      WU 732803
                      WU 733276
                      WU 736191

                      <core_client_version>5.10.21</core_client_version>
                      <![CDATA[
                      <stderr_txt>
                      Graphics are disabled due to configuration...
                      # cpu_run_time_pref: 21600
                      # random seed: 1553847
                      ======================================================
                      DONE :: 1 starting structures 21007.6 cpu seconds
                      This process generated 30 decoys from 30 attempts
                      ======================================================


                      BOINC :: Watchdog shutting down...
                      BOINC :: BOINC support services shutting down...

                      </stderr_txt>
                      <message>
                      <file_xfer_error>
                      <file_name>trunc_solit_BOINC_ABRELAX_-trunc_solit-_2891_45_0_0</file_name>
                      <error_code>-161</error_code>
                      </file_xfer_error>

                      ____________

                      Conan

                      Joined: Feb 16 06
                      Posts: 344
                      ID: 145
                      Credit: 1,309,534
                      RAC: 0
                      Message 3660 - Posted 21 Jan 2008 10:56:12 UTC

                        WU 736639

                        <core_client_version>5.10.21</core_client_version>
                        <![CDATA[
                        <message>
                        process exited with code 1 (0x1, -255)
                        </message>
                        <stderr_txt>
                        Graphics are disabled due to configuration...
                        ERROR:: Unable to obtain total_residue & sequence.
                        start pdb file must be provided.
                        ERROR:: Exit from: input_pdb.cc line: 2968
                        # cpu_run_time_pref: 21600

                        ____________

                        Conan

                        Joined: Feb 16 06
                        Posts: 344
                        ID: 145
                        Credit: 1,309,534
                        RAC: 0
                        Message 3664 - Posted 22 Jan 2008 21:22:01 UTC

                          I have a WU running at the moment (1 of 4, but I'm not sure exactly which one) that is behaving very strangely.
                          I noticed this morning that I had a Ralph WU that had completed at 100% after 17:29:51 but was still showing as running at High Priority.

                          Suspending and resuming made no difference, so I stopped BOINC Manager and restarted it.

                          The WU appeared to have gone, but on checking further I found that it had gone back to a process time of 4 hours 12 minutes and was running as normal again, but still at High Priority.

                          Is this normal for these 2h4o_BOINC_TWIST type work units?
                          ____________

                          Conan

                          Joined: Feb 16 06
                          Posts: 344
                          ID: 145
                          Credit: 1,309,534
                          RAC: 0
                          Message 3666 - Posted 23 Jan 2008 14:10:36 UTC - in response to Message 3664.



                            I suspect that the WU resets and starts again, so I possibly lost up to 17 hours of processing time.
                            Of the 4 I received, 1 has now completed normally with no indication of problems.
                            2 more are now up to 15 and 16 hours at 98.5% with 9 minutes 56 seconds left on both. One has switched out to let another project run, but the other is running at High Priority.
                            ____________

                            Conan

                            Joined: Feb 16 06
                            Posts: 344
                            ID: 145
                            Credit: 1,309,534
                            RAC: 0
                            Message 3667 - Posted 23 Jan 2008 22:57:59 UTC - in response to Message 3666.



                              OK, WU 736936 finished without error and within the normal 6-hour preference range. I believe this is the WU that got to 17:29:51 and then went back to normal after I restarted BM, but I can't prove that; it could have been one of the following WUs.

                              WU 736937 ran for 16:24:27 (59067.94 seconds) and then returned a computation error. That was a lot of wasted effort. Here is the error output:

                              CPU time 59067.941307
                              stderr out

                              <core_client_version>5.10.21</core_client_version>
                              <![CDATA[
                              <message>
                              process exited with code 193 (0xc1, -63)
                              </message>
                              <stderr_txt>
                              Graphics are disabled due to configuration...
                              # cpu_run_time_pref: 21600
                              # random seed: 1551605
                              **********************************************************************
                              Rosetta score is stuck or going too long. Watchdog is ending the run!
                              Stuck at score 16.1773 for 900 seconds
                              **********************************************************************
                              GZIP SILENT FILE: ./xx2h4o.out
                              *** glibc detected *** corrupted double-linked list: 0xae7e1098 ***
                              SIGABRT: abort called
                              Stack trace (14 frames):
                              [0x8da3037]
                              [0x8d9de2c]
                              [0xb7f8c420]
                              [0x8e0e444]
                              [0x8e2330f]
                              [0x8e28532]
                              [0x8e28653]
                              [0x8e0e9b4]
                              [0x8d9fab7]
                              [0x8d9ff27]
                              [0x8d2023d]
                              [0x8d20f35]
                              [0x8d9a0c5]
                              [0x8e3aa1a]

                              Exiting...
                              SIGSEGV: segmentation violation
                              Stack trace (18 frames):
                              [0x8da3037]
                              [0x8d9de2c]
                              [0xb7f8c420]
                              [0x8cad54d]
                              [0x8c11820]
                              [0x8c14e33]
                              [0x804c7c2]
                              [0x8a835ed]
                              [0x8a8586f]
                              [0x89363de]
                              [0x89380e3]
                              [0x893ba27]
                              [0x898ad7a]
                              [0x85e96d6]
                              [0x87289d2]
                              [0x8728af2]
                              [0x8e07384]
                              [0x8048111]

                              Exiting...
                              FILE_LOCK::unlock(): close failed.: Bad file descriptor
                              Graphics are disabled due to configuration...
                              # cpu_run_time_pref: 21600
                              SIGSEGV: segmentation violation
                              Stack trace (18 frames):
                              [0x8da3037]
                              [0x8d9de2c]
                              [0xb7f00420]
                              [0x8cad54d]
                              [0x8c11820]
                              [0x8c14e33]
                              [0x804c7c2]
                              [0x8a835ed]
                              [0x8a8586f]
                              [0x89363de]
                              [0x8938119]
                              [0x893ba27]
                              [0x898ad7a]
                              [0x85e96d6]
                              [0x87289d2]
                              [0x8728af2]
                              [0x8e07384]
                              [0x8048111]

                              Exiting...

                              WU 736938 ran for 21:48:59 (78,539.06 seconds) and was validated, but returned a very poor credit amount for such a long process time.

                              Both of the last two WUs were stopped by the Watchdog for being stuck.
                              ____________

                              RAD-Poland

                              Joined: Apr 6 07
                              Posts: 6
                              ID: 2897
                              Credit: 100,029
                              RAC: 0
                              Message 3668 - Posted 24 Jan 2008 10:23:44 UTC

                                Last modified: 24 Jan 2008 10:25:03 UTC

                                Workunit 652259

                                <core_client_version>5.10.10</core_client_version>
                                <![CDATA[
                                <message>
                                process exited with code 193 (0xc1, -63)
                                </message>
                                <stderr_txt>
                                Graphics are disabled due to configuration...
                                # cpu_run_time_pref: 3600
                                # random seed: 1551090
                                **********************************************************************
                                Rosetta score is stuck or going too long. Watchdog is ending the run!
                                CPU time: 17762 seconds. Greater than 4X preferred time: 3600 seconds
                                **********************************************************************
                                GZIP SILENT FILE: ./xxgp04.out
                                SIGSEGV: segmentation violation
                                Stack trace (25 frames):
                                [0x8da3037]
                                ...

                                Validate state Invalid
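
The watchdog message in this log says the CPU time was greater than 4X the preferred time. A quick sketch of that comparison, using only the numbers printed in the log, is below. The function name `exceeds_watchdog_limit` is an assumption for illustration; the real watchdog code is not shown in this thread and may check other things as well.

```python
def exceeds_watchdog_limit(cpu_seconds, preferred_seconds, factor=4):
    """True when CPU time is more than `factor` times the preferred run time."""
    return cpu_seconds > factor * preferred_seconds


cpu_time = 17762   # "CPU time: 17762 seconds" from the log
preferred = 3600   # "# cpu_run_time_pref: 3600" from the log

print(4 * preferred)                                # 14400
print(exceeds_watchdog_limit(cpu_time, preferred))  # True
```

So with a 1-hour preference the limit implied by the message is 14400 seconds (4 hours), and this task had already run well past it when it was stopped.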

                                Basilaris

                                Joined: Feb 16 06
                                Posts: 2
                                ID: 265
                                Credit: 10,006
                                RAC: 0
                                Message 3669 - Posted 24 Jan 2008 18:55:46 UTC

                                  2h4o_Boinc_Twist_Angle_Symm_Fold_and_Dock-2h4o_-native__2970_18_0 stopped progressing at Model 2, Step: 34817, RMSDE 1.187E+004, Energy: -68.98463. Time and percent complete kept advancing, but nothing happened. After restarting it was the same: it went up to step 34817 and stopped. The graphics were faulty too.

                                  Keith T.

                                  Joined: May 4 07
                                  Posts: 12
                                  ID: 3019
                                  Credit: 10,828
                                  RAC: 0
                                  Message 3672 - Posted 25 Jan 2008 23:47:35 UTC

                                    http://ralph.bakerlab.org/workunit.php?wuid=651754 2h4o__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK-2h4o_-native__2970_79 got stuck on the 9th decoy for over an hour at least twice.

                                    I eventually changed the CPU run time down to 4 hours from 8 to get the WU to finish before its deadline. I did try exiting BOINC a few times as well. The WU was stuck on the 9th decoy and restarted the same one at least twice.

                                    Keith
                                    ____________

                                    Conan

                                    Joined: Feb 16 06
                                    Posts: 344
                                    ID: 145
                                    Credit: 1,309,534
                                    RAC: 0
                                    Message 3684 - Posted 4 Feb 2008 5:22:35 UTC

                                      Just finished this one.
                                      It took over 24 hours before the watchdog stopped it. It should have claimed 300 credits but was granted 80, for less than 3.5 cr/h. Pretty miserable.

                                      So work units are still not adhering to preferences. Bug not fixed.
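
The credit-per-hour figure in this post can be checked with a couple of lines of arithmetic. This is only a sketch using the numbers quoted in the post; the run time is taken as exactly 24 hours even though the post says "over 24 hours", so the real rates would be slightly lower than what is printed.

```python
hours = 24.0     # "over 24 hours", so this is a lower bound on the run time
granted = 80.0   # credits granted, from the post
claimed = 300.0  # credits the poster expected to claim

granted_rate = granted / hours  # upper bound on the granted rate
claimed_rate = claimed / hours  # upper bound on the expected rate

print(round(granted_rate, 2))  # 3.33
print(claimed_rate)            # 12.5
```

That is consistent with the "less than 3.5 cr/h" figure in the post.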
                                      ____________

                                      [B^S] JoeB@Ky

                                      Joined: Oct 11 06
                                      Posts: 8
                                      ID: 1990
                                      Credit: 39,098
                                      RAC: 0
                                      Message 3703 - Posted 10 Feb 2008 5:35:08 UTC

                                        I had 2 WUs load on my 2.13 GHz C2D with a run time of about 1 hr. Both were stuck at ~84.3/84.4% after running 1:42/1:46 hrs. I let them stay that way for an additional ~2.25 hours before aborting them yesterday PM. No such problems on my 3.4 GHz P4 w/HT; the 2 WUs on it now loaded at ~2.0 hr run time, and after 1:07:04 of run time the 1st one is at 86.7% done, with no freeze-up.
                                        I just downloaded the code file listed in the news bulletin on the BOINC Synergy web site and put it in the Ralph project folder on the C2D box. I noticed at that time that there was a similar file named "minirosetta_1.03_windows_intelx86" dated 1-15-08, but it didn't have the .pbd file extension on the end of it. My P4 box's RALPH directory already has the current 1.07 code file with the .pbd extension. That might be why it wasn't working right on the C2D box!

                                        quimillo

                                        Joined: Feb 14 08
                                        Posts: 2
                                        ID: 4044
                                        Credit: 8,257
                                        RAC: 0
                                        Message 3741 - Posted 14 Feb 2008 21:10:24 UTC

                                          task tol5__BOINC_SYMM_FOLD_AND_DOCK_RELAX_ONLY-tol5_-lowres_dock_-dock_3218__3305_1_0
                                          using rosetta_beta version 593

                                          CPU time stopped at: 04:38:41

                                          Progress: 100%

                                          Status: Running, high priority

                                          BOINC client version 5.10.28 for i686-pc-linux-gnu

                                          What should I do?


                                          Copyright © 2017 University of Washington

                                          Last Modified: 20 Nov 2008 19:41:56 UTC