Posts by Snagletooth

21) Message boards : RALPH@home bug list : minirosetta 2.05 (Message 5052)
Posted 24 Jan 2010 by Snagletooth
Post:
validate errors

tyrsim_3gbn_2qzq_20Jan2010_14017_2

There are a couple of 2.05 bug reports in the 2.03 thread. You might want to sticky this thread and unsticky the 2.03 thread.

Snags
22) Message boards : RALPH@home bug list : minirosetta 2.03 (Message 5038)
Posted 9 Jan 2010 by Snagletooth
Post:
homopt_cstmc_1.t317_.t317_.IGNORE_THE_REST.S_00001_0000752_10003.pdb.JOB_13800_1_0

process exited with code 1 (0x1, -255)

ERROR: [ERROR] Unable to open constraints file: /work/tex/projects/cm/benchmark/cross_filt/t317_/t317_.aln_list_mike_chosen_bestaln.alns.combined.csts
ERROR:: Exit from: src/core/scoring/constraints/ConstraintIO.cc line: 332
BOINC:: Error reading and gzipping output datafile: default.out
23) Message boards : RALPH@home bug list : minirosetta 2.03 (Message 5034)
Posted 8 Jan 2010 by Snagletooth
Post:
I have a wu running 18+ hours showing 5,4% complete. Should I abort?

Is that 18 hours of CPU time or wall clock time? (The newer BOINC managers show both.) If it is CPU time, you could try quitting and restarting BOINC (that's quitting the BOINC client, not merely closing the manager). Sometimes this gets a stuck WU moving again, sometimes it errors out immediately upon restart; if neither, there is nothing left to do but abort.
24) Message boards : RALPH@home bug list : minirosetta 2.03 (Message 5031)
Posted 7 Jan 2010 by Snagletooth
Post:
ha_notyr_3gbn_1ksk_6Jan2010_13738_2_0 ran for the full 4 hour preferred runtime to complete 567 decoys. It also ran the 4 hours straight through, which is quite unexpected. Checking the current short term debts, the new Ralph wu I received almost two hours ago won't begin crunching for about 3 more hours. Every other wu appears to be switching out as expected, which also suggests it may indeed be a case of Ralph not giving way when asked rather than BOINC suddenly ignoring its preferences.

Earlier a similarly named wu ha_notyr_3gbn_1osh_6Jan2010_13734_2_0 wrapped things up after completing 100 models in 2719.38 seconds.

Snags
25) Message boards : RALPH@home bug list : minirosetta 2.03 (Message 5019)
Posted 22 Dec 2009 by Snagletooth
Post:
homopt1.t369_.t369_.IGNORE_THE_REST.S_00002_0000860_027.pdb.JOB_13568_1
homopt1.t369_.t369_.IGNORE_THE_REST.S_00002_0000712_020.pdb.JOB_13568_1
after 19-22 seconds ended with:
ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 147 fold_tree nres: 148
ERROR:: Exit from: src/core/conformation/Conformation.cc line: 234
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

homopt1.t331_.t331_.IGNORE_THE_REST.S_00002_0000988_00044.pdb.JOB_13563_1
ended after 6 seconds with ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 139 fold_tree nres: 140
ERROR:: Exit from: src/core/conformation/Conformation.cc line: 234
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

homopt1.t368_.t368_.IGNORE_THE_REST.S_00001_0000086_00055.pdb.JOB_13567_1_0 ran successfully completing 11 decoys in 13877 seconds

homopt1.t313_.t313_.IGNORE_THE_REST.S_00001_0000429_10056.pdb.JOB_13559_1 appears to be crunching successfully, currently working on Model 1 step 4206 after one hour.


26) Message boards : RALPH@home bug list : minirosetta 2.01/2.02 (Message 5007)
Posted 8 Dec 2009 by Snagletooth
Post:
1dnA_2bkf_0210.redesign.pdb.gz_dckCFA.xml__13366_1_0

ERROR: ERROR: Unable to open silent_input file: 'default.out'
ERROR:: Exit from: src/core/io/silent/SilentFileData.cc line: 86
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

27) Message boards : RALPH@home bug list : Minirosetta 2.00 (Message 5000)
Posted 15 Nov 2009 by Snagletooth
Post:
Two quick failures:
2d4f_rebuild_loop_perturb_kic_refine_kic_relax_13002_5_1
ERROR: ERROR: FragmentIO: could not open file /work/mtyka/bench/fragments/2d4f/aa2d4fA09_05.200_v1_3.gz
ERROR:: Exit from: src/core/fragment/FragmentIO.cc line: 258
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>
28) Message boards : RALPH@home bug list : Minirosetta 2.00 (Message 4998)
Posted 13 Nov 2009 by Snagletooth
Post:
test1_t367_.1.pdb_t367_.1.loopfile_12945_1

appears to run fine but the output file is absent:

======================================================
DONE :: 1 starting structures 10949.1 cpu seconds
This process generated 99 decoys from 99 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>test1_t367_.1.pdb_t367_.1.loopfile_12945_1_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>



Snags
29) Message boards : RALPH@home bug list : minirosetta 1.98 (Message 4985)
Posted 18 Oct 2009 by Snagletooth
Post:
Snagletooth,

Are you sure that workunit was under minirosetta 1.98? All of those I've had with similar names recently were under rosetta beta 5.98 (a different application) instead.

You're right, my mistake, it was Rosetta Beta not Minirosetta. Points to the value of providing links to problem tasks, though in this instance I'm sure the developers spotted my mistake immediately.
30) Message boards : RALPH@home bug list : minirosetta 1.98 (Message 4982)
Posted 13 Oct 2009 by Snagletooth
Post:
rossmann2x3_f003_12432_4
exit code 1
ERROR:: Unable to determine sequence length from pdb file
ERROR:: Exit from: .pose.cc line: 2013

Claims to be application version 5.98 not 1.98

Snags
31) Message boards : Number crunching : No new tasks (Message 4956)
Posted 16 Sep 2009 by Snagletooth
Post:
Hi,

I have not got even 1 WU from Ralph since I attached to this project last month. I have been on Rosetta longer, and am crunching regularly on it. Any reasons why I am not getting WUs from Ralph. Can both Rosetta and Ralph run together on the same machine?

Thanks,
Suneet


They run together just fine. Ralph doesn't always have work to send, and when they do create WUs it is in much smaller batches than used on Rosetta, making it possible that they were all distributed in the time between your requests. Just be patient and eventually you will get one.

Snags
32) Message boards : RALPH@home bug list : Minirosetta 1.95 (Message 4940)
Posted 28 Aug 2009 by Snagletooth
Post:
Another file transfer error:

thioredoxin_PCS_BOINC_abrelax.1xcycles.v1_SAVE_ALL_OUT_12132_19_1
33) Message boards : RALPH@home bug list : Minirosetta 1.95 (Message 4937)
Posted 27 Aug 2009 by Snagletooth
Post:
theta_PCS_BOINC_abrelax.1xcycles.v1_SAVE_ALL_OUT_12132_2_0

21 decoys completed but output file absent:

Thu Aug 27 00:03:14 2009|ralph@home|Computation for task theta_PCS_BOINC_abrelax.1xcycles.v1_SAVE_ALL_OUT_12132_2_0 finished
Thu Aug 27 00:03:14 2009|ralph@home|Output file theta_PCS_BOINC_abrelax.1xcycles.v1_SAVE_ALL_OUT_12132_2_0_0 for task theta_PCS_BOINC_abrelax.1xcycles.v1_SAVE_ALL_OUT_12132_2_0 absent

stderr out:
======================================================
DONE :: 21 starting structures 13985.5 cpu seconds
This process generated 21 decoys from 21 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>theta_PCS_BOINC_abrelax.1xcycles.v1_SAVE_ALL_OUT_12132_2_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
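
The report fragment above pairs an output file name with a transfer error code. Here is a minimal sketch for pulling those pairs out of a pasted task report. It uses a regular expression because the pasted fragment is not well-formed XML. The function name and the lookup table are hypothetical, and the reading of -161 as ERR_NOT_FOUND is an assumption based on BOINC's error_numbers.h that should be checked against your client version.

```python
import re

# Assumption: -161 corresponds to ERR_NOT_FOUND in BOINC's error_numbers.h.
# Verify against the BOINC source for your client version.
KNOWN_CODES = {-161: "ERR_NOT_FOUND (file not found)"}

# Matches one <file_xfer_error> block as it appears in a pasted task report.
XFER_RE = re.compile(
    r"<file_xfer_error>\s*<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*</file_xfer_error>"
)

def find_xfer_errors(report):
    """Return (file_name, error_code, description) for each file_xfer_error block."""
    results = []
    for match in XFER_RE.finditer(report):
        code = int(match.group("code"))
        results.append((match.group("name"), code, KNOWN_CODES.get(code, "unknown")))
    return results

# Sample input copied from the report above.
report = """</stderr_txt>
<message>
<file_xfer_error>
<file_name>theta_PCS_BOINC_abrelax.1xcycles.v1_SAVE_ALL_OUT_12132_2_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
"""

for name, code, description in find_xfer_errors(report):
    print(f"{name}: {code} {description}")
```

If that reading of the code is right, it agrees with the client message that the output file was absent when the upload was attempted.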

34) Message boards : RALPH@home bug list : Minirosetta 1.95 (Message 4935)
Posted 26 Aug 2009 by Snagletooth
Post:
238l_A_103_V_ddg_predictions_82409_010_MUT.238l_A_103_V_.out_12126_1_0

Another one. I was away from the computer while this one ran, but it ended itself in less than an hour with the same report as the last WU. I'm sorry I wasn't around to take note of the amount of memory it claimed. Obviously if it's taking up all the computer's memory it's going to get killed by the cruncher. Has anyone tried quitting (not suspending) BOINC (or even just that task), thus removing it from memory, then restarting? I wonder if it would error out immediately on restart and, further, if that error report would contain anything more useful than a simple abort.
I vaguely recall the project adding code to end those WUs that never seem to start, and I assume that's catching mine, though I see nothing obvious in the stderr out unless the clue is this line: reached end of minirosetta::main(). I further assume that claiming one decoy is just to give me some credit without having to run a special validator script. If my assumptions are correct, the question of most interest would be: why did my WUs end gracefully but Nflight's run on for 16 hours?

Snags
35) Message boards : RALPH@home bug list : Minirosetta 1.95 (Message 4931)
Posted 25 Aug 2009 by Snagletooth
Post:
242l_A_50_I_ddg_predictions_82409_001_MUT.242l_A_50_I_.out_12117_1_0

According to the information in the graphics window, this one was still initializing after 52 minutes. A minute or so after I closed the window it uploaded, and it now reports that it has completed one decoy successfully. As my target runtime is 4 hours, I suspect that it is not simply a matter of an error in the graphics. The question then is: would it be just as well to abort WUs that don't initialize within x minutes and report them here, as stefanob has done, or is there valuable information in the 4.41MB upload that would be lost if the WU is not allowed to end on its own?


Snags
36) Message boards : RALPH@home bug list : minirosetta 1.90 (Message 4919)
Posted 8 Aug 2009 by Snagletooth
Post:
4lyz_I_55_A_ddg_predictions_lm_rel0.5_2_8_8709_run12_MUT.4lyz_I_55_A_.out_12067_1_0 was still initializing when I opened the graphics window after 32 minutes. I quit and restarted BOINC. When I checked BOINC again several hours later this WU had completed successfully and I'd picked up another WU:

5azu_V_95_A_ddg_predictions_lm_0.5_0.5_10_8709_run9_MUT.5azu_V_95_A_.out_12065_1 which failed in less than 5 seconds with:

Exit status 1 (0x1)
Setting up graphics native ...
ERROR:: Exit from: src/core/pack/task/ResfileReader.cc line: 987
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

37) Message boards : RALPH@home bug list : minirosetta 1.90 (Message 4901)
Posted 1 Aug 2009 by Snagletooth
Post:
WU received after the detachment referenced above lr8_rama_map_all_ss_iter08_rlbn_1acf_IGNORE_THE_REST_NATIVE_12056_1_1

Both crunchers failed with:

process exited with code 1 (0x1, -255)

Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: src/protocols/relax/ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

Snags
38) Message boards : RALPH@home bug list : minirosetta 1.90 (Message 4900)
Posted 1 Aug 2009 by Snagletooth
Post:
I apologize for posting this here as it is not an app error but I was afraid anywhere else it would languish perhaps past the point where relevant log files are available.

I was the second to receive this workunit. The website lists the outcome as "client detached" at 10:56:25 UTC.

Here are the messages from my client. Times are UTC -4 and a few lines about other projects have been left out.
Sat Aug 1 05:56:19 2009|ralph@home|Sending scheduler request: To fetch work. Requesting 3298 seconds of work, reporting 0 completed tasks
Sat Aug 1 06:01:35 2009|ralph@home|Scheduler request failed: Timeout was reached
Sat Aug 1 06:02:35 2009|ralph@home|Sending scheduler request: To fetch work. Requesting 3299 seconds of work, reporting 0 completed tasks
Sat Aug 1 06:02:40 2009|ralph@home|Scheduler request succeeded: got 1 new tasks
Sat Aug 1 06:02:42 2009|ralph@home|Started download of 0.5_0.5_10.relaxed_input.run3_73109.5azu.pdb
Sat Aug 1 06:02:42 2009|ralph@home|Started download of 0.5_0.5_10.relaxed_input.run3_73109.5azu_V_60_G.mutfile
Sat Aug 1 06:02:44 2009|ralph@home|Finished download of 0.5_0.5_10.relaxed_input.run3_73109.5azu_V_60_G.mutfile
Sat Aug 1 06:02:44 2009|ralph@home|Started download of 0.5_0.5_10.relaxed_input.run3_73109.5azu_V_60_G.cst
Sat Aug 1 06:02:47 2009|ralph@home|Finished download of 0.5_0.5_10.relaxed_input.run3_73109.5azu.pdb
Sat Aug 1 06:02:47 2009|ralph@home|Finished download of 0.5_0.5_10.relaxed_input.run3_73109.5azu_V_60_G.cst
Sat Aug 1 06:02:47 2009|ralph@home|Started download of soft_rep_design_mod.wts
Sat Aug 1 06:02:49 2009|ralph@home|Finished download of soft_rep_design_mod.wts
Sat Aug 1 06:55:37 2009|ralph@home|Starting 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1
Sat Aug 1 06:55:39 2009|ralph@home|Starting task 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1 using minirosetta version 190
Sat Aug 1 06:55:40 2009|boincsimap|Started upload of 9080101.093552_1_0
Sat Aug 1 06:55:48 2009|ralph@home|Computation for task 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1 finished
Sat Aug 1 06:55:48 2009|ralph@home|Output file 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1_0 for task 5azu_V_60_G_ddg_predictions_test_local_min_0.5_0.5_10_relaxed_starting_4__12055_1_1 absent
Sat Aug 1 06:56:49 2009|ralph@home|Sending scheduler request: To fetch work. Requesting 3305 seconds of work, reporting 1 completed tasks
Sat Aug 1 06:56:54 2009|ralph@home|Scheduler request succeeded: got 0 new tasks

You can see there is no entry matching the time the server marked the client detached and it was obviously not detached as my client subsequently attempted to report this result and requested more work eventually receiving another WU.
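
The timestamp comparison above can be checked mechanically. What follows is a minimal sketch, assuming the pipe-delimited message format shown in the log and the fixed UTC -4 offset stated in the post. The function names and the 10-second window are hypothetical choices made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: client log times are local time at UTC -4, as stated in the post.
LOCAL_TZ = timezone(timedelta(hours=-4))

def parse_line(line):
    """Split 'timestamp|project|message' and return (utc_time, project, message)."""
    stamp, project, message = line.split("|", 2)
    local = datetime.strptime(stamp.strip(), "%a %b %d %H:%M:%S %Y")
    return local.replace(tzinfo=LOCAL_TZ).astimezone(timezone.utc), project, message

def entries_near(lines, server_time_utc, window_seconds=10):
    """Return parsed entries within window_seconds of the server's timestamp."""
    hits = []
    for line in lines:
        when, project, message = parse_line(line)
        if abs((when - server_time_utc).total_seconds()) <= window_seconds:
            hits.append((when, project, message))
    return hits

# Three entries copied from the client log above, bracketing the server's time.
log = [
    "Sat Aug 1 06:55:40 2009|boincsimap|Started upload of 9080101.093552_1_0",
    "Sat Aug 1 06:56:49 2009|ralph@home|Sending scheduler request: To fetch work. Requesting 3305 seconds of work, reporting 1 completed tasks",
    "Sat Aug 1 06:56:54 2009|ralph@home|Scheduler request succeeded: got 0 new tasks",
]

# Time at which the website lists the outcome as "client detached".
server_time = datetime(2009, 8, 1, 10, 56, 25, tzinfo=timezone.utc)

print(entries_near(log, server_time))
```

Under those assumptions, 10:56:25 UTC is 06:56:25 local time, and the search within 10 seconds of it finds nothing in these three entries.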

I am attached to and regularly crunch WUs for nine projects. I have had a number of these spontaneous detachments on one other project. It appears that these detachments are related to hanging scheduler requests as described in this thread at SIMAP. I was hoping you could check your logs and confirm whether the same has occurred here.

Snags

Mac OS 10.5.7, BOINC 6.2.18

39) Message boards : RALPH@home bug list : minirosetta 1.90 (Message 4895)
Posted 30 Jul 2009 by Snagletooth
Post:
Same here: lr13_seq_score12_ss2.5_rlbd_1py9_IGNORE_THE_REST_DECOY_12034_3_1

ERROR: Unable to open weights. Neither ./dslf_weights.wts nor dslf_weights.wts nor minirosetta_database/scoring/weights/dslf_weights.wts exist
ERROR:: Exit from: src/core/scoring/ScoreFunctionFactory.cc line: 177
BOINC:: Error reading and gzipping output datafile: default.out
40) Message boards : RALPH@home bug list : Minirosetta 1.81 (Message 4872)
Posted 13 Jul 2009 by Snagletooth
Post:
frb_0_8_cl1_5_hb_t297__IGNORE_THE_REST_1FXWF_9_11640_1_1

ERROR: [ERROR] Unable to open constraints file: t297_.5.cst
ERROR:: Exit from: src/core/scoring/constraints/ConstraintIO.cc line: 331
BOINC:: Error reading and gzipping output datafile: default.out





©2024 University of Washington
http://www.bakerlab.org