RALPH@home

Rosetta mini 3.18


Message boards : RALPH@home bug list : Rosetta mini 3.18

[VENETO] boboviz

Joined: Apr 9 08
Posts: 517
ID: 4205
Credit: 750,753
RAC: 91
Message 5403 - Posted 1 Nov 2011 11:43:52 UTC

    New version of Rosetta mini. Is it new functionality or simply a debug release?

    Conan

    Joined: Feb 16 06
    Posts: 344
    ID: 145
    Credit: 1,325,487
    RAC: 50
    Message 5404 - Posted 1 Nov 2011 11:50:22 UTC

      Last modified: 1 Nov 2011 12:19:13 UTC

      These work units take a very long time to run (around 10 hours on a 6-hour preference) and return very poor credit for the effort (31 points).
      See WU 2305565

      I have another two of these running at the moment and both are already over 9 hours and still going.

      I also have a 3.17 WU that is over 9 hours as well. Most only run 1 to 2 hours.

      Conan
      ____________

      Conan

      Joined: Feb 16 06
      Posts: 344
      ID: 145
      Credit: 1,325,487
      RAC: 50
      Message 5405 - Posted 1 Nov 2011 13:07:26 UTC

        Last modified: 1 Nov 2011 13:08:42 UTC

        Well, I have had another 2 work units run to 10 hours 11 minutes before reporting as successful, but the points are even lower than before:
        WU 2306108 gave 8.77 points
        WU 2306211 gave 17.87 points

        I have aborted WU 2306685 at 27,849 seconds (over 7 hours) and WU 2306474 at 25,481 seconds (just over 7 hours), as they were going to take over 10 hours and give very little in return.

        My preference is 6 hours, so I take a close look at anything that runs longer than that.

        All of the long-running work units have been on my two Linux machines; Windows is not affected.
        Both the 3.17 and 3.18 work unit types are involved.
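To put the figures above on a common scale, here is a minimal sketch that converts granted credit and run time into credit per hour. The function name `credit_per_hour` is made up for illustration and is not part of BOINC or Rosetta; the inputs are the two work units and the 10 hours 11 minutes run time quoted in this post.

```python
def credit_per_hour(credit: float, run_seconds: float) -> float:
    """Return granted credit per hour of run time (illustrative helper)."""
    if run_seconds <= 0:
        raise ValueError("run_seconds must be positive")
    return credit / (run_seconds / 3600.0)


# 10 hours 11 minutes, the run time reported for both work units
LONG_RUN_SECONDS = 10 * 3600 + 11 * 60

for wu, credit in [("2306108", 8.77), ("2306211", 17.87)]:
    rate = credit_per_hour(credit, LONG_RUN_SECONDS)
    print(f"WU {wu}: {rate:.2f} credits/hour")
```

That works out to roughly 0.86 and 1.75 credits per hour for the two work units.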

        Conan
        ____________

        robertmiles

        Joined: Jan 13 09
        Posts: 80
        ID: 5137
        Credit: 246,177
        RAC: 16
        Message 5406 - Posted 1 Nov 2011 16:19:43 UTC - in response to Message 5404.

          These work units take a very long time to run (around 10 hours on a 6-hour preference) and return very poor credit for the effort (31 points).
          See WU 2305565

          I have another two of these running at the moment and both are already over 9 hours and still going.

          I also have a 3.17 WU that is over 9 hours as well. Most only run 1 to 2 hours.

          Conan


          These could be looking for a bug I saw earlier in 3.14. If so, here is something useful to watch for: the workunit stops using any CPU time at all WITHOUT telling the BOINC manager that there is a problem, so another workunit cannot run in its place. If that's the problem, the workunit can easily sit there not really running for days, since the time limit detection can't run either.
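One way to watch for that kind of stall from outside the application is to sample a task's cumulative CPU time at intervals and flag it when the CPU time stops advancing while wall-clock time keeps going. The sketch below is only an illustration: `is_stalled`, the 600-second window, and the 1-second threshold are assumptions, not anything BOINC or minirosetta provides, and how the samples are collected is left open.

```python
from typing import List, Tuple

# Each sample is (wall-clock seconds, cumulative CPU seconds) for one task.
Sample = Tuple[float, float]


def is_stalled(samples: List[Sample], window: float = 600.0,
               min_cpu: float = 1.0) -> bool:
    """Return True if CPU time grew by less than min_cpu seconds
    over at least `window` seconds of wall-clock time."""
    if len(samples) < 2:
        return False
    last_wall, last_cpu = samples[-1]
    # Walk backwards to the most recent sample that is at least
    # `window` seconds older than the latest one.
    for wall, cpu in reversed(samples[:-1]):
        if last_wall - wall >= window:
            return (last_cpu - cpu) < min_cpu
    return False  # not enough history yet to decide


healthy = [(0, 0.0), (300, 290.0), (600, 585.0), (900, 880.0)]
stuck = [(0, 0.0), (300, 290.0), (600, 290.2), (900, 290.3)]
print(is_stalled(healthy), is_stalled(stuck))
```

Comparing against an older sample rather than the previous one avoids flagging a task that is only briefly waiting on disk or being throttled.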

          Rocco Moretti
          Forum moderator
          Project developer
          Project scientist

          Joined: May 18 10
          Posts: 11
          ID: 15514
          Credit: 30,188
          RAC: 0
          Message 5408 - Posted 1 Nov 2011 17:20:40 UTC - in response to Message 5403.

            New version of Rosetta mini. Is it new functionality or simply a debug release?


            As mentioned by cmiles over in the Rosetta@home forums, minirosetta 3.18 is identical to minirosetta_beta 3.17, and is identical to the version (3.17) currently being run on Rosetta@home. (The difference in numbering is because of technical reasons.)

            Conan

            Joined: Feb 16 06
            Posts: 344
            ID: 145
            Credit: 1,325,487
            RAC: 50
            Message 5409 - Posted 2 Nov 2011 0:57:31 UTC

              A few errors are starting to show up today.

              First two are on Linux
              WU 2316336 shows this

              ERROR: seqpos <= size()
              ERROR:: Exit from: src/core/conformation/Conformation.hh line: 267
              BOINC:: Error reading and gzipping output datafile: default.out

              WU 2049320 shows
              ERROR: Illegal value specified for option -run: protocol : medal_abinitio

              Next two are on Windows
              WU 2316455 and WU 2316209

              Both show this

              ERROR: seqpos <= size()
              ERROR:: Exit from: d:\boinc_build\minirosetta_beta_3.17\rosetta_source\src\core/conformation/Conformation.hh line: 267
              BOINC:: Error reading and gzipping output datafile: default.out
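The Linux and Windows messages above point at the same source line, but the Windows build prefixes the path and mixes separators. A small sketch for normalizing the "Exit from" line so reports from both platforms can be grouped follows. The helper name `exit_location` is hypothetical, and the line format is assumed only from the messages quoted in this thread.

```python
import re

# Matches lines such as:
#   ERROR:: Exit from: src/core/conformation/Conformation.hh line: 267
EXIT_RE = re.compile(r"ERROR:: Exit from: (?P<path>.+?) line: (?P<line>\d+)")


def exit_location(stderr_text):
    """Return (path starting at 'src/', line number), or None if absent."""
    m = EXIT_RE.search(stderr_text)
    if m is None:
        return None
    # Use forward slashes everywhere so Windows and Linux paths compare equal.
    path = m.group("path").replace("\\", "/")
    # Drop any build-machine prefix in front of the last "src/".
    idx = path.rfind("src/")
    if idx != -1:
        path = path[idx:]
    return path, int(m.group("line"))


linux = "ERROR:: Exit from: src/core/conformation/Conformation.hh line: 267"
windows = (r"ERROR:: Exit from: d:\boinc_build\minirosetta_beta_3.17"
           r"\rosetta_source\src\core/conformation/Conformation.hh line: 267")

print(exit_location(linux))
print(exit_location(linux) == exit_location(windows))
```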

              Conan
              ____________

              [VENETO] boboviz

              Joined: Apr 9 08
              Posts: 517
              ID: 4205
              Credit: 750,753
              RAC: 91
              Message 5410 - Posted 2 Nov 2011 17:02:08 UTC - in response to Message 5408.

                Thanks for the answer!!

                [VENETO] boboviz

                Joined: Apr 9 08
                Posts: 517
                ID: 4205
                Credit: 750,753
                RAC: 91
                Message 5411 - Posted 2 Nov 2011 17:02:12 UTC - in response to Message 5408.

                  Thanks for the answer!!

                  Conan

                  Joined: Feb 16 06
                  Posts: 344
                  ID: 145
                  Credit: 1,325,487
                  RAC: 50
                  Message 5412 - Posted 2 Nov 2011 23:57:01 UTC

                    A few more Windows errors
                    WU 2305445
                    - Unhandled Exception Record -
                    Reason: Breakpoint Encountered (0x80000003) at address 0x7C90120E

                    WU 2304479
                    "Maximum Elapsed Time Exceeded"

                    WU 2325467 and WU 2324219

                    ERROR: seqpos <= size()
                    ERROR:: Exit from: d:\boinc_build\minirosetta_beta_3.17\rosetta_source\src\core/conformation/Conformation.hh line: 267
                    BOINC:: Error reading and gzipping output datafile: default.out

                    Conan
                    ____________

                    [VENETO] boboviz

                    Joined: Apr 9 08
                    Posts: 517
                    ID: 4205
                    Credit: 750,753
                    RAC: 91
                    Message 5413 - Posted 4 Nov 2011 9:19:52 UTC

                      Some validate errors
                      2338233
                      2338190

                      BOINC:: Worker startup.
                      Starting watchdog...
                      Watchdog active.
                      ======================================================
                      DONE :: 1 starting structures 1201 cpu seconds
                      This process generated 1 decoys from 1 attempts
                      ======================================================
                      BOINC :: WS_max 0

                      BOINC :: Watchdog shutting down...
                      BOINC :: BOINC support services shutting down cleanly ...
                      called boinc_finish

                      </stderr_txt>
                      ]]>

                      Validate state Invalid
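When comparing many logs like the one above, it can help to pull the summary numbers out of the stderr text. The sketch below assumes only the two summary lines shown in this post ("DONE :: ..." and "This process generated ..."); the function name `summarize` and the returned keys are made up for illustration.

```python
import re

DONE_RE = re.compile(
    r"DONE ::\s+(\d+)\s+starting structures\s+([\d.]+)\s+cpu seconds")
DECOY_RE = re.compile(
    r"This process generated\s+(\d+)\s+decoys from\s+(\d+)\s+attempts")


def summarize(stderr_text):
    """Extract run summary fields from stderr, or None if they are missing."""
    done = DONE_RE.search(stderr_text)
    decoys = DECOY_RE.search(stderr_text)
    if done is None or decoys is None:
        return None
    return {
        "starting_structures": int(done.group(1)),
        "cpu_seconds": float(done.group(2)),
        "decoys": int(decoys.group(1)),
        "attempts": int(decoys.group(2)),
    }


log = """BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
DONE :: 1 starting structures 1201 cpu seconds
This process generated 1 decoys from 1 attempts
BOINC :: WS_max 0
"""

info = summarize(log)
print(info)
```

A log that ends with an ERROR line instead of a DONE line returns None, which separates crashed tasks from tasks that finished but failed validation.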

                      Trotador

                      Joined: May 7 10
                      Posts: 19
                      ID: 15474
                      Credit: 9,099,589
                      RAC: 9,551
                      Message 5414 - Posted 4 Nov 2011 17:22:46 UTC

                        Last modified: 4 Nov 2011 17:30:35 UTC

                        Hi

                        Many validation errors today, around 90 out of 320 units. Most of them finish in less than 100-200 seconds, with a few of them reaching 600 or 1000 seconds. So far all the wingmen have also failed on these workunits, on both Linux and W7.

                        Regarding the extra-long units with very low scores, I think they are all TO538... units. It first happened with the beta 3.17 (TBC), and the subsequent releases behave the same way. Lately I tend to abort them.

                        regards

                        Edit: I've noticed that many units with validation errors are over 200 seconds

                        [VENETO] boboviz

                        Joined: Apr 9 08
                        Posts: 517
                        ID: 4205
                        Credit: 750,753
                        RAC: 91
                        Message 5415 - Posted 6 Nov 2011 18:12:22 UTC

                          2361173

                          ERROR: seqpos <= size()
                          ERROR:: Exit from: d:\boinc_build\minirosetta_beta_3.17\rosetta_source\src\core/conformation/Conformation.hh line: 267
                          BOINC:: Error reading and gzipping output datafile: default.out
                          called boinc_finish

                          [SG-FC] dingdong

                          Joined: Mar 17 09
                          Posts: 17
                          ID: 5274
                          Credit: 4,036,063
                          RAC: 916
                          Message 5416 - Posted 6 Nov 2011 18:54:36 UTC

                            Work Unit ID 2074393: 50 minutes without action, cpu - load = 0%

                            Trotador

                            Joined: May 7 10
                            Posts: 19
                            ID: 15474
                            Credit: 9,099,589
                            RAC: 9,551
                            Message 5417 - Posted 8 Nov 2011 23:13:25 UTC

                              2KZU_... units are erroring just at the start

                              ERROR: in::file::boinc_wu_zip 4-boinc-submit/2KZU_chromodomain.zip does not exist!
                              ERROR:: Exit from: src/apps/public/boinc/minirosetta.cc line: 168
                              BOINC:: Error reading and gzipping output datafile: default.out
                              called boinc_finish


                              It looks like a problem with the names of the files.

                              svincent

                              Joined: Apr 4 08
                              Posts: 34
                              ID: 4182
                              Credit: 51,768
                              RAC: 0
                              Message 5418 - Posted 11 Nov 2011 2:50:03 UTC

                                Task 2379638 gave a Validate Error, but without anything noteworthy appearing in the log file.

                                Setting up graphics native ...
                                BOINC:: Worker startup.
                                Starting watchdog...
                                Watchdog active.
                                ======================================================
                                DONE :: 1 starting structures 1201 cpu seconds
                                This process generated 1 decoys from 1 attempts
                                ======================================================
                                BOINC :: WS_max 0

                                BOINC :: Watchdog shutting down...
                                BOINC :: BOINC support services shutting down cleanly ...
                                called boinc_finish

                                </stderr_txt>
                                ]]>
                                Validate state Invalid

                                [VENETO] boboviz

                                Joined: Apr 9 08
                                Posts: 517
                                ID: 4205
                                Credit: 750,753
                                RAC: 91
                                Message 5422 - Posted 17 Nov 2011 22:27:56 UTC

                                  Validate errors
                                  2429175
                                  2429148

                                  # cpu_run_time_pref: 7200
                                  ======================================================
                                  DONE :: 32 starting structures 7150.81 cpu seconds
                                  This process generated 32 decoys from 32 attempts
                                  ======================================================
                                  BOINC :: WS_max 8.35789e+008

                                  BOINC :: Watchdog shutting down...
                                  BOINC :: BOINC support services shutting down cleanly ...
                                  called boinc_finish

                                  </stderr_txt>

                                  Conan

                                  Joined: Feb 16 06
                                  Posts: 344
                                  ID: 145
                                  Credit: 1,325,487
                                  RAC: 50
                                  Message 5423 - Posted 19 Nov 2011 14:36:27 UTC

                                    Heaps of bugs/errors

                                    The following error appeared on one Windows machine, which now seems to be processing OK:
                                    WU 2435485
                                    WU 2436196
                                    WU 2436362
                                    WU 2436427
                                    WU 2436489
                                    WU 2436965

                                    app_version download error: couldn't get input files:
                                    <file_xfer_error>
                                    <file_name>minirosetta_database_rev45517.zip</file_name>
                                    <error_code>-120</error_code>
                                    <error_message>signature verification failed</error_message>
                                    </file_xfer_error>

                                    WU 2436832 had the following error
                                    ERROR:Option matching -in:file:boinc_wu_fix:zip not found in command line top-level context

                                    ALL of the following Linux Work Units have VALIDATE ERRORS

                                    WU 2441745
                                    WU 2441633
                                    WU 2441023
                                    WU 2439837
                                    WU 2439085
                                    WU 2438445
                                    WU 2438250
                                    WU 2438113
                                    WU 2437946
                                    WU 2437831
                                    WU 2437828
                                    WU 2437700
                                    WU 2437662
                                    WU 2433162
                                    WU 2433080
                                    WU 2441742
                                    WU 2441538
                                    WU 2441458
                                    WU 2441308
                                    WU 2441223
                                    WU 2440847
                                    WU 2440310
                                    WU 2438270
                                    WU 2437847
                                    WU 2437609
                                    WU 2437451
                                    WU 2437440
                                    WU 2437361
                                    WU 2437278
                                    WU 2436307
                                    WU 2429070
                                    WU 2429068

                                    There seems to be a problem

                                    Conan
                                    ____________

                                    Conan

                                    Joined: Feb 16 06
                                    Posts: 344
                                    ID: 145
                                    Credit: 1,325,487
                                    RAC: 50
                                    Message 5424 - Posted 20 Nov 2011 0:09:38 UTC

                                      ALL LINUX WORK UNITS GET VALIDATE ERRORS

                                      NONE are successful

                                      Most Windows WUs validate, but some are now starting to get validate errors as well.

                                      Conan
                                      ____________

                                      [VENETO] boboviz

                                      Joined: Apr 9 08
                                      Posts: 517
                                      ID: 4205
                                      Credit: 750,753
                                      RAC: 91
                                      Message 5427 - Posted 20 Nov 2011 8:09:31 UTC

                                        Random validate errors on Win7 aside, I see this:
                                        I set my WUs to 1h, but some WUs run for more than 3h.
                                        The 1h WUs give me 13 to 16 points, while the 3h WUs give me only 4 to 7 points.

                                        [VENETO] boboviz

                                        Joined: Apr 9 08
                                        Posts: 517
                                        ID: 4205
                                        Credit: 750,753
                                        RAC: 91
                                        Message 5432 - Posted 4 Dec 2011 19:33:55 UTC

                                          This batch seems very good! Not a single error...

                                          [VENETO] boboviz

                                          Joined: Apr 9 08
                                          Posts: 517
                                          ID: 4205
                                          Credit: 750,753
                                          RAC: 91
                                          Message 5434 - Posted 18 Dec 2011 22:18:08 UTC

                                            2512058

                                            BOINC:: Worker startup.
                                            Starting watchdog...
                                            Watchdog active.
                                            ======================================================
                                            DONE :: 10 starting structures 1201 cpu seconds
                                            This process generated 10 decoys from 10 attempts
                                            ======================================================
                                            BOINC :: WS_max 0

                                            BOINC :: Watchdog shutting down...
                                            BOINC :: BOINC support services shutting down cleanly ...
                                            called boinc_finish

                                            </stderr_txt>
                                            <message>
                                            upload failure: <file_xfer_error>
                                            <file_name>2B87_perturbation_2B87_start4_16268_1_0_0</file_name>
                                            <error_code>-161</error_code>
                                            </file_xfer_error>

                                            </message>

                                            [VENETO] boboviz

                                            Joined: Apr 9 08
                                            Posts: 517
                                            ID: 4205
                                            Credit: 750,753
                                            RAC: 91
                                            Message 5435 - Posted 18 Dec 2011 22:19:36 UTC

                                              2512048 (http://ralph.bakerlab.org/result.php?resultid=2512048)

                                              BOINC:: Worker startup.
                                              Starting watchdog...
                                              Watchdog active.

                                              ERROR: Cannot open PDB file "start1.pdb.gz"
                                              ERROR:: Exit from: ..\..\..\src\core\import_pose\import_pose.cc line: 191
                                              BOINC:: Error reading and gzipping output datafile: default.out
                                              called boinc_finish

                                              [VENETO] boboviz

                                              Joined: Apr 9 08
                                              Posts: 517
                                              ID: 4205
                                              Credit: 750,753
                                              RAC: 91
                                              Message 5439 - Posted 26 Dec 2011 19:48:38 UTC

                                                2514839

                                                CPU time 0
                                                stderr out

                                                <core_client_version>6.10.60</core_client_version>
                                                <![CDATA[
                                                <message>
                                                Maximum disk usage exceeded
                                                </message>
                                                ]]>

                                                Validate state Invalid
                                                Claimed credit 0
                                                Granted credit 0
                                                application version 3.18


                                                Copyright © 2017 University of Washington

                                                Last Modified: 20 Nov 2008 19:41:56 UTC