| Author | Message |
|
|
|
Fixes a stack overflow bug in 2.02.
Please report issues here :)
|
|
|
|
|
|
No wu, no debug.... |
|
|
|
|
|
11 out of 15 died within 30 seconds; the other 4 are still running at 10+ minutes.
Task ID 1686211
Name homopt1.t369_.t369_.IGNORE_THE_REST.S_00001_0000857_00042.pdb.JOB_13568_1_0
Workunit 1489861
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 147 fold_tree nres: 148
ERROR:: Exit from: ..\..\src\core\conformation\Conformation.cc line: 234
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
</stderr_txt>
]]>
____________
|
|
|
|
|
|
Have so far had 5 fail and 3 succeed.
Result 1686388 had this error
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 147 fold_tree nres: 148
ERROR:: Exit from: src/core/conformation/Conformation.cc line: 234
BOINC:: Error reading and gzipping output datafile: default.out
Results 1687361, 1687351, 1687325 and 1687296 all had the following error:
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 139 fold_tree nres: 140
ERROR:: Exit from: src/core/conformation/Conformation.cc line: 234
BOINC:: Error reading and gzipping output datafile: default.out
____________
 |
|
|
|
|
|
homopt1.t369_.t369_.IGNORE_THE_REST.S_00002_0000860_027.pdb.JOB_13568_1
homopt1.t369_.t369_.IGNORE_THE_REST.S_00002_0000712_020.pdb.JOB_13568_1
after 19-22 seconds ended with:
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 147 fold_tree nres: 148
ERROR:: Exit from: src/core/conformation/Conformation.cc line: 234
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
homopt1.t331_.t331_.IGNORE_THE_REST.S_00002_0000988_00044.pdb.JOB_13563_1
ended after 6 seconds with:
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 139 fold_tree nres: 140
ERROR:: Exit from: src/core/conformation/Conformation.cc line: 234
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
homopt1.t368_.t368_.IGNORE_THE_REST.S_00001_0000086_00055.pdb.JOB_13567_1_0 ran successfully completing 11 decoys in 13877 seconds
homopt1.t313_.t313_.IGNORE_THE_REST.S_00001_0000429_10056.pdb.JOB_13559_1 appears to be crunching successfully, currently working on Model 1 step 4206 after one hour.
|
|
|
|
|
I have had another 11 fail with a similar error. They all exit at the same line (234); only the nres values in the conformation error vary, across a few different numbers.
They all error out after less than 20 minutes.
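The pattern in these reports can be checked mechanically. Below is a minimal Python sketch, assuming the stderr text of the failed tasks has been pasted into one string. The `summarize` helper and both regular expressions are my own illustration, not part of BOINC or minirosetta. The sample lines are copied from the errors quoted in this thread, where fold_tree nres is always exactly one more than conformation nres.

```python
import re
from collections import Counter

# Error lines copied from the reports in this thread.
LOG = """\
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 147 fold_tree nres: 148
ERROR:: Exit from: src/core/conformation/Conformation.cc line: 234
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 139 fold_tree nres: 140
ERROR:: Exit from: src/core/conformation/Conformation.cc line: 234
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 154 fold_tree nres: 155
ERROR:: Exit from: src/core/conformation/Conformation.cc line: 234
"""

# Hypothetical patterns written for this sketch.
NRES = re.compile(r"conformation nres: (\d+) fold_tree nres: (\d+)")
EXIT = re.compile(r"Exit from: (\S+) line: (\d+)")


def summarize(stderr_text):
    """Return (Counter of (conformation, fold_tree) pairs, set of exit line numbers)."""
    pairs = Counter()
    exit_lines = set()
    for line in stderr_text.splitlines():
        match = NRES.search(line)
        if match:
            pairs[(int(match.group(1)), int(match.group(2)))] += 1
        match = EXIT.search(line)
        if match:
            exit_lines.add(int(match.group(2)))
    return pairs, exit_lines


pairs, exit_lines = summarize(LOG)
for (conf, fold), count in sorted(pairs.items()):
    print(f"conformation={conf} fold_tree={fold} diff={fold - conf} count={count}")
print("exit lines:", sorted(exit_lines))
```

Running the same helper over the full stderr of every failed result would show whether the off-by-one holds for all of them or only for the ones quoted here.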
____________
 |
|
|
|
|
|
6 tasks with the same kind of bugs as listed by the others:
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 147 fold_tree nres: 148
ERROR:: Exit from: ..\..\src\core\conformation\Conformation.cc line: 234
BOINC:: Error reading and gzipping output datafile: default.out
http://ralph.bakerlab.org/result.php?resultid=1686083
http://ralph.bakerlab.org/result.php?resultid=1686082
http://ralph.bakerlab.org/result.php?resultid=1686077
and
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 154 fold_tree nres: 155
ERROR:: Exit from: ..\..\src\core\conformation\Conformation.cc line: 234
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
http://ralph.bakerlab.org/result.php?resultid=1686064
http://ralph.bakerlab.org/result.php?resultid=1686063
http://ralph.bakerlab.org/result.php?resultid=1686062 |
|
|
|
|
|
http://ralph.bakerlab.org/result.php?resultid=1693258
ERROR: Cannot open PDB file "native.pdb"
ERROR:: Exit from: ..\..\src\core\io\pdb\pose_io.cc line: 170
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
|
|
|
|
|
|
I got a bunch of errors, but more valid results so far.
Typical error:
stderr out
<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
[2010- 1- 5 2:32: 2:] :: BOINC:: Initializing ... ok.
[2010- 1- 5 2:32: 2:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/ralph.bakerlab.org/minirosetta_database_rev34260.zip
Unpacking WU data ...
Unpacking data: ../../projects/ralph.bakerlab.org/job_boinc_t373__loopbuild_threading_tex.boinc.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
ERROR: Cannot open PDB file "native.pdb"
ERROR:: Exit from: src/core/io/pdb/pose_io.cc line: 170
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
</stderr_txt>
]]>
____________
Greetings from Saenger

For questions about BOINC, look in the BOINC-Wiki |
|
|
|
|
|
One wu errored out in the recent bunch of tasks:
job_boinc_t318__filtered_loopbuild_threading_tex_IGNORE_THE_REST_13730_7_0
ERROR: can't find residue type at pos 11in sequence MTQVLVRNGI QAVGDGLTSL IIVGKKSVLK NVTFEGKFKE VAQKFVTDGD SWNSMISRIP ASGRHPLHYE LAHLITVPDASSRGNTPTNA HSIYKELKPI NYPEDTKNVH FVLFAEYPDV LSHVAAIART FCKFSMKTSG IRELNVNIDV VCDKLTNEDAVFLTDLSESV RETARLIDTP ANILTTDALV DEAVKVGNAT GSKITVIRGE ELLKAGFGGI YHVGKAGPTP PAFVVLSHEVPGSTEHIALV GKGVVYDTGG LQIKTKTGMP NMKRDMGGAA GMLEAYSALV KHGFSQTLHA CLCIVENNVS PIANKPDDIIKMLSGKTVEI NNTDAEGRLI LADGVFYAKE TLKATTIFDM ATLTGAQAWL SGRLHGAAMT NDEQLENEII KAGKASGDLVAPMLFAPDLF FGDLKSSIAD MKNSNLGKMD GPPSAVAGLL IGAHIGFGEG LRWLHLDIAA PAEVGDRGTG YGPALFSTLLGKYTSVPMLK Q
ERROR:: Exit from: src/core/chemical/util.cc line: 379
BOINC:: Error reading and gzipping output datafile: default.out
|
|
|
|
|
|
One ran about 11 hours instead of the 6 hours I asked for, and produced just one decoy. Appears OK otherwise, though.
http://ralph.bakerlab.org/result.php?resultid=1694077 |
|
|
|
|
|
ha_notyr_3gbn_1ksk_6Jan2010_13738_2_0 ran for the full 4-hour preferred runtime and completed 567 decoys. It also ran the 4 hours straight through, which is quite unexpected. Judging by the current short-term debts, the new Ralph wu I received almost two hours ago won't begin crunching for about 3 more hours. Every other wu appears to be switching out as expected, which also suggests it may indeed be a case of Ralph not giving way when asked, rather than BOINC suddenly ignoring its preferences.
Earlier a similarly named wu ha_notyr_3gbn_1osh_6Jan2010_13734_2_0 wrapped things up after completing 100 models in 2719.38 seconds.
Snags |
|
|
|
|
|
I have a wu running 18+ hours showing 5,4% complete. Should I abort?
____________
|
|
|
|
|
I have a wu running 18+ hours showing 5,4% complete. Should I abort?
Is that 18 hours of CPU time or wall clock time? (The newer BOINC managers show both.) If it is CPU time, you could try quitting and restarting BOINC (that means quitting the BOINC client, not merely closing the manager). Sometimes this gets a stuck WU moving again; sometimes it errors out immediately upon restart. If neither happens, there is nothing left to do but abort. |
|
|
|
|
|
Task: 1703705
Workunit: ha_notyr_3gbn_2z84_6Jan2010_13734_1
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
SIGSEGV: segmentation violation
Crashed executable name: minirosetta_2.03_i686-apple-darwin
built using BOINC library version 6.5.0
Machine type Intel 80486 (32-bit executable)
System version: Macintosh OS 10.5.8 build 9L30
Thu Jan 7 06:48:34 2010
Same workunit segfaults on Windows. |
|
|
|
|
|
Validate errors:
ha_notyr_3gbn_2img_6Jan2010_13734_2
ha_notyr_3gbn_2fu2_6Jan2010_13734_1 |
|
|
|
|
|
Task 1707216 (homopt_cstmc_1.t317_.t317_.IGNORE_THE_REST.S_00006_0000222_090.pdb.JOB_13800_1)
Task 1707217 (homopt_cstmc_1.t317_.t317_.IGNORE_THE_REST.S_00006_0000272_00016.pdb.JOB_13800_1)
Task 1707218 (homopt_cstmc_1.t317_.t317_.IGNORE_THE_REST.S_00006_0000781_045.pdb.JOB_13800_1)
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
FNAME: native.pdb
FNAME: native.pdb
ERROR: [ERROR] Unable to open constraints file: /work/tex/projects/cm/benchmark/cross_filt/t317_/t317_.aln_list_mike_chosen_bestaln.alns.combined.csts
ERROR:: Exit from: ..\..\src\core\scoring\constraints\ConstraintIO.cc line: 332
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish |
|
|
|
|
|
homopt_cstmc_1.t317_.t317_.IGNORE_THE_REST.S_00001_0000752_10003.pdb.JOB_13800_1_0
process exited with code 1 (0x1, -255)
ERROR: [ERROR] Unable to open constraints file: /work/tex/projects/cm/benchmark/cross_filt/t317_/t317_.aln_list_mike_chosen_bestaln.alns.combined.csts
ERROR:: Exit from: src/core/scoring/constraints/ConstraintIO.cc line: 332
BOINC:: Error reading and gzipping output datafile: default.out
|
|
|
|
|
|
Had an error on this work unit
1719676
ERROR: [ERROR] Unable to open constraints file: i2240.dist_csts
ERROR:: Exit from: src/core/scoring/constraints/ConstraintIO.cc line: 332
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
____________
 |
|
|
|
|
|
A couple of Validate Errors for your perusal
1721498
1721395
I could not see anything wrong with the work units, but apparently they don't validate.
____________
 |
|
|