RALPH@home

Minirosetta Beta 3.06


Message boards : RALPH@home bug list : Minirosetta Beta 3.06

Profile [VENETO] boboviz

Joined: Apr 9 08
Posts: 508
ID: 4205
Credit: 727,228
RAC: 146
Message 5271 - Posted 2 May 2011 21:54:48 UTC

    Why beta? Everything on Ralph is beta....

    Profile robertmiles

    Joined: Jan 13 09
    Posts: 79
    ID: 5137
    Credit: 239,632
    RAC: 59
    Message 5272 - Posted 3 May 2011 4:43:14 UTC - in response to Message 5271.

      Are you sure? I thought it was alpha instead.

      Profile [VENETO] boboviz

      Joined: Apr 9 08
      Posts: 508
      ID: 4205
      Credit: 727,228
      RAC: 146
      Message 5273 - Posted 3 May 2011 5:00:49 UTC - in response to Message 5272.

        Are you sure? I thought it was alpha instead.


        :-)

        Profile [VENETO] boboviz

        Joined: Apr 9 08
        Posts: 508
        ID: 4205
        Credit: 727,228
        RAC: 146
        Message 5274 - Posted 3 May 2011 7:53:20 UTC - in response to Message 5272.

          Seriously, we know that Ralph is an "alpha/beta/not stable/etc." project.
          What does "beta" in the name of the 3.06 version stand for?

          Profile [VENETO] boboviz

          Joined: Apr 9 08
          Posts: 508
          ID: 4205
          Credit: 727,228
          RAC: 146
          Message 5275 - Posted 3 May 2011 17:20:07 UTC

            Graphics crash on 3.06 WUs....

            Profile [SG-FC] dingdong

            Joined: Mar 17 09
            Posts: 17
            ID: 5274
            Credit: 3,807,134
            RAC: 734
            Message 5276 - Posted 3 May 2011 17:44:17 UTC

              Hi,
              These WUs crashed BOINC Manager (a reboot was necessary):

              T515_ba_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_15177_88_1
              T515_ba_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_15177_94_1

              <core_client_version>6.10.58</core_client_version>
              <![CDATA[
              <message>
              Maximum disk usage exceeded
              </message>
              ]]>

              Invalid

              BoincView reports "resource limit exceeded",
              and the client state is "aborted by user".

              My preferences under "Disk and memory usage" are:
              Use at most 500 GB disk space
              Leave at least 1 GB disk space free
              Use at most 99% of total disk space
              Use at most 90% of page file (swap space)
              Use at most 95% of memory when computer is in use
              Use at most 100% of memory when computer is not in use

              The target CPU run time is 4 hours, but they ran for more than 6 hours before crashing.

              Profile Saenger

              Joined: Feb 28 06
              Posts: 12
              ID: 861
              Credit: 47,572
              RAC: 38
              Message 5277 - Posted 4 May 2011 4:41:14 UTC

                Last modified: 4 May 2011 4:44:29 UTC

                Woke up this morning to an idling computer, while it claimed to be running 3 RALPH tasks in parallel.
                I suspended RALPH for 15 seconds, and CPU usage resumed.
                I reactivated RALPH, and the WUs started again one after the other.
                One had its CPU time reset from 8h to 0, the second from 8:30 to 2:20; the last one is still at 4:40.
                The CPU usage is all right again; let's see what awaits me once I return from work ;)

                Here's a picture from my system monitor:

                ____________
                Greetings from Saenger

                For questions about Boinc look in the BOINC-Wiki

                Profile [VENETO] boboviz

                Joined: Apr 9 08
                Posts: 508
                ID: 4205
                Credit: 727,228
                RAC: 146
                Message 5279 - Posted 4 May 2011 11:22:21 UTC

                  2028060

                  ERROR: Cannot open PDB file "1xngA.pdb"
                  ERROR:: Exit from: ..\..\..\src\core\import_pose\import_pose.cc line: 199
                  BOINC:: Error reading and gzipping output datafile: default.out
                  called boinc_finish

                  Invalid

                  Profile Conan

                  Joined: Feb 16 06
                  Posts: 344
                  ID: 145
                  Credit: 1,310,876
                  RAC: 5
                  Message 5280 - Posted 4 May 2011 12:17:56 UTC - in response to Message 5279.

                    2028060

                    ERROR: Cannot open PDB file "1xngA.pdb"
                    ERROR:: Exit from: ..\..\..\src\core\import_pose\import_pose.cc line: 199
                    BOINC:: Error reading and gzipping output datafile: default.out
                    called boinc_finish

                    Invalid


                    I have the same error on 2 Windows work units

                    this one
                    and this one

                    I also had an "Unhandled Exception Record":
                    "Access Violation (0xc0000005) at address 0x00470309 read attempt to address 0x00000014"
                    on WU 2028097.

                    All errors were on my Windows machines. The Linux ones are all OK.

                    Conan

                    Profile robertmiles

                    Joined: Jan 13 09
                    Posts: 79
                    ID: 5137
                    Credit: 239,632
                    RAC: 59
                    Message 5281 - Posted 4 May 2011 16:37:43 UTC

                      Rosetta Mini Beta 3.06
                      T515_ba_rs_stg0_lrljcst_t000__casp9_SAVE_ALL_OUT_15177_82

                      Elapsed 08:53:16
                      Progress 8.771%
                      To completion 22:18:51
                      CPU time at last checkpoint 00:51:55
                      CPU time 00:52:37

                      Looks like a good example of a problem I've seen recently at Rosetta@Home - BOINC thinks it is running constantly, but it is actually using no CPU time at all now.
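The gap between elapsed time and CPU time in the figures above can be put into a single number. The following is only a minimal sketch: the helper names `to_seconds` and `cpu_fraction` are hypothetical, and it assumes the times are given as HH:MM:SS strings, as they appear in the task properties quoted above.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_fraction(cpu_time: str, elapsed: str) -> float:
    """Return CPU time divided by elapsed (wall-clock) time."""
    elapsed_s = to_seconds(elapsed)
    if elapsed_s == 0:
        raise ValueError("elapsed time must be greater than zero")
    return to_seconds(cpu_time) / elapsed_s


if __name__ == "__main__":
    # Values taken from the task properties quoted in this post.
    fraction = cpu_fraction("00:52:37", "08:53:16")
    print(f"CPU time is {fraction:.1%} of elapsed time")
```

For this workunit the task received CPU time for only about a tenth of the time BOINC reported it as running. A healthy single-threaded task would normally be much closer to 100%.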

                      I packed most of the contents of that slot into a .zip file just before I aborted that workunit. Do I need to send it somewhere?

                      Profile Sysadm@Nbg

                      Joined: Dec 9 09
                      Posts: 7
                      ID: 12983
                      Credit: 208,060
                      RAC: 0
                      Message 5282 - Posted 4 May 2011 18:02:09 UTC - in response to Message 5280.

                        2028060

                        ERROR: Cannot open PDB file "1xngA.pdb"
                        ERROR:: Exit from: ..\..\..\src\core\import_pose\import_pose.cc line: 199
                        BOINC:: Error reading and gzipping output datafile: default.out
                        called boinc_finish

                        Invalid


                        I have the same error on 2 Windows work units

                        this one
                        and this one

                        ...

                        All errors were on my Windows machines. Linux are all OK.

                        Conan


                        Got the same error on a 64-bit Linux machine >>click<<

                        Profile [VENETO] boboviz

                        Joined: Apr 9 08
                        Posts: 508
                        ID: 4205
                        Credit: 727,228
                        RAC: 146
                        Message 5283 - Posted 4 May 2011 19:45:29 UTC

                          2028070

                          <core_client_version>6.10.60</core_client_version>
                          <![CDATA[
                          <message>
                          Maximum disk usage exceeded
                          </message>
                          ]]>

                          On Windows 7, 32-bit

                          Profile dekim
                          Forum moderator
                          Project administrator
                          Project developer
                          Project scientist

                          Joined: Jan 20 06
                          Posts: 214
                          ID: 1
                          Credit: 481,740
                          RAC: 29
                          Message 5284 - Posted 4 May 2011 21:16:30 UTC - in response to Message 5271.

                            Why beta? Everything on Ralph is beta....


                            It has been quite a long time since we last updated the application, and we are worried about backwards compatibility. So we created a minirosetta_beta application, which is the updated app we are testing; the minirosetta application is the actual production application running on R@h.

                            Profile [VENETO] boboviz

                            Joined: Apr 9 08
                            Posts: 508
                            ID: 4205
                            Credit: 727,228
                            RAC: 146
                            Message 5285 - Posted 5 May 2011 4:50:13 UTC - in response to Message 5284.


                              It has been quite a long time since we last updated the application, and we are worried about backwards compatibility. So we created a minirosetta_beta application, which is the updated app we are testing; the minirosetta application is the actual production application running on R@h.


                              I thought the old code had been abandoned.....

                              Profile feet1st

                              Joined: Mar 7 06
                              Posts: 312
                              ID: 1028
                              Credit: 110,522
                              RAC: 0
                              Message 5286 - Posted 7 May 2011 20:26:33 UTC - in response to Message 5275.

                                Last modified: 7 May 2011 20:29:19 UTC

                                Graphics just crashes on Windows. Window starts to open and then dies.

                                And shortly thereafter, BOINC seems to lose control of the process and it no longer gets CPU time, even though BOINC Manager says it is running.

                                Profile robertmiles

                                Joined: Jan 13 09
                                Posts: 79
                                ID: 5137
                                Credit: 239,632
                                RAC: 59
                                Message 5292 - Posted 10 May 2011 3:26:50 UTC

                                  T0617_casp9_symm_cm_SAVE_ALL_OUT_IGNORE_THE_REST_control_15317_68

                                  Another workunit that stopped using any CPU time at all shortly after a checkpoint, WITHOUT boinc.exe recognizing this.

                                  CPU time at last checkpoint 00:04:28
                                  CPU time 00:04:30
                                  Elapsed time 02:37:06

                                  Still not clear if the Tthrottle extension I'm using to prevent the computer from overheating has anything to do with the problem.

                                  Hope you at least got enough debugging output to pin down the problem more.

                                  Ironworker16

                                  Joined: Nov 17 09
                                  Posts: 3
                                  ID: 12640
                                  Credit: 41,840
                                  RAC: 0
                                  Message 5293 - Posted 10 May 2011 21:40:01 UTC

                                    I have had 8 work units running for 18 hours, and when I just checked the CPU time, most were between 14 and 18 minutes and one was at 53 minutes. All are running at high priority.

                                    skgiven

                                    Joined: Dec 15 07
                                    Posts: 8
                                    ID: 3873
                                    Credit: 158,220
                                    RAC: 27
                                    Message 5295 - Posted 10 May 2011 23:23:41 UTC - in response to Message 5293.

                                      Last modified: 10 May 2011 23:48:12 UTC

                                      Running 6 WUs on an SB (3.7 GHz). The estimated run time is about 1h 15min for the task at 65%; the next two estimate 2h and 4h, but I expect this may change.

                                      Glad to see a new app version in progress (3.06).

                                      Task ID | Work unit ID | Sent | Time reported | Server state | Outcome | Client state | CPU time (sec) | Claimed credit | Granted credit
                                      2035968 | 1791676 | 10 May 2011 19:56:12 UTC | 10 May 2011 23:00:39 UTC | Over | Success | Done | 3,588.55 | 31.75 | 22.08
                                      2035937 | 1791645 | 10 May 2011 19:51:54 UTC | 10 May 2011 22:16:57 UTC | Over | Success | Done | 3,579.52 | 31.67 | 25.30
                                      2035891 | 1791599 | 10 May 2011 19:47:45 UTC | 10 May 2011 22:25:13 UTC | Over | Success | Done | 4,109.05 | 36.35 | 29.56
                                      2035844 | 1791552 | 10 May 2011 19:43:33 UTC | 10 May 2011 22:00:24 UTC | Over | Success | Done | 3,610.23 | 31.94 | 21.31
                                      2035829 | 1791537 | 10 May 2011 19:39:13 UTC | 10 May 2011 20:48:45 UTC | Over | Success | Done | 3,559.34 | 31.49 | 24.60

                                      Ironworker16

                                      Joined: Nov 17 09
                                      Posts: 3
                                      ID: 12640
                                      Credit: 41,840
                                      RAC: 0
                                      Message 5296 - Posted 10 May 2011 23:53:16 UTC - in response to Message 5293.

                                        I closed and restarted BOINC; the times were reset to the CPU time and all of them have started to make progress again. One is complete.

                                        Profile [VENETO] boboviz

                                        Joined: Apr 9 08
                                        Posts: 508
                                        ID: 4205
                                        Credit: 727,228
                                        RAC: 146
                                        Message 5297 - Posted 11 May 2011 19:08:27 UTC

                                          2036148

                                          Outcome Client error
                                          Client state Compute error
                                          <message>
                                          Incorrect function. (0x1) - exit code 1 (0x1)
                                          </message>
                                          Unpacking zip data: ../../projects/ralph.bakerlab.org/minirosetta_database_rev41800.zip
                                          Unpacking WU data ...
                                          Unpacking data: ../../projects/ralph.bakerlab.org/T0589_symm_cm_SAVE_ALL_OUT_IGNORE_THE_REST_control.zip
                                          Setting database description ...
                                          Setting up checkpointing ...
                                          Setting up graphics native ...
                                          BOINC:: Worker startup.
                                          Starting watchdog...
                                          Watchdog active.
                                          Continuing computation from checkpoint: chk_S_3LC0A_0001_FastRelax__chk1_fa ... success!
                                          Continuing computation from checkpoint: chk_S_3LC0A_0001_FastRelax__chk2_fa ... success!
                                          Continuing computation from checkpoint: chk_S_3LC0A_0001_FastRelax__chk3_fa ... success!
                                          Continuing computation from checkpoint: chk_S_3LC0A_0001_FastRelax__chk4_fa ... success!
                                          Continuing computation from checkpoint: chk_S_3LC0A_0001_FastRelax__chk5_fa ... success!

                                          </stderr_txt>
                                          ]]>

                                          Validate state Invalid

                                          Profile [VENETO] boboviz

                                          Joined: Apr 9 08
                                          Posts: 508
                                          ID: 4205
                                          Credit: 727,228
                                          RAC: 146
                                          Message 5298 - Posted 12 May 2011 6:33:05 UTC

                                            2036473
                                            2036489
                                            2036441


                                            Exit status -177 (0xffffffffffffff4f)
                                            <core_client_version>6.10.60</core_client_version>
                                            <![CDATA[
                                            <message>
                                            Maximum disk usage exceeded
                                            </message>
                                            ]]>
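The two forms of the exit status above are the same number: the hexadecimal value is -177 read as an unsigned 64-bit integer (two's complement). A minimal sketch of that conversion follows; the helper names `as_unsigned_hex` and `as_signed` are hypothetical. As far as I know, -177 is the code BOINC uses for a resource limit being exceeded, which would match the "Maximum disk usage exceeded" message, but treat that mapping as an assumption.

```python
def as_unsigned_hex(value: int, bits: int = 64) -> str:
    """Show a (possibly negative) integer as an unsigned two's-complement hex string."""
    mask = (1 << bits) - 1
    return hex(value & mask)


def as_signed(value: int, bits: int = 64) -> int:
    """Interpret an unsigned value in [0, 2**bits) as a signed two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


if __name__ == "__main__":
    print(as_unsigned_hex(-177))          # hex form of the exit status
    print(as_signed(0xFFFFFFFFFFFFFF4F))  # back to the signed form
```

Running it should show that `-177` and `0xffffffffffffff4f` convert into each other.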

                                            Profile [SG-FC] dingdong

                                            Joined: Mar 17 09
                                            Posts: 17
                                            ID: 5274
                                            Credit: 3,807,134
                                            RAC: 734
                                            Message 5299 - Posted 12 May 2011 22:23:00 UTC

                                              "Maximum disk usage exceeded" errors in three more WUs:


                                              2036502
                                              2036548
                                              2036783


                                              Copyright © 2017 University of Washington

                                              Last Modified: 20 Nov 2008 19:41:56 UTC