Task 5461426

Name RF_SAVE_ALL_OUT_NOJRAN_IGNORE_THE_REST_validation_env_f_pred_185_16902_2_1
Workunit 4847714
Created 14 Jun 2024, 0:30:28 UTC
Sent 14 Jun 2024, 4:31:58 UTC
Report deadline 15 Jun 2024, 4:31:58 UTC
Received 14 Jun 2024, 7:22:03 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 35361
Run time 2 hours 9 min 33 sec
CPU time
Validate state Invalid
Credit 0.00
Device peak FLOPS 2.80 GFLOPS
Application version Generalized biomolecular modeling and design with RoseTTAFold All-Atom v0.02 (nvidia_alpha) windows_x86_64
Peak working set size 6,600.14 MB
Peak swap size 11,353.25 MB
Peak disk usage 3.07 MB

Stderr output

<core_client_version>7.24.1</core_client_version>
<![CDATA[
<stderr_txt>
Traceback (most recent call last):
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\predict.py", line 708, in <module>
    pred.predict(out_name+f'_{n}', 
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\predict.py", line 551, in predict
    logit_s, logit_aa_s, logit_pae, logit_pde, p_bind, pred_crds, alpha, pred_allatom, pred_lddt_binned,                msa_prev, pair_prev, state_prev = self.model(
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\RoseTTAFoldModel.py", line 358, in forward
    msa, pair, xyz, alpha_s, xyz_allatom, state, symmsub = self.simulator(
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\Track_module.py", line 1106, in forward
    msa, pair, xyz, state, alpha, symmsub = self.main_block[i_m](msa, pair,
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\Track_module.py", line 929, in forward
    xyz, state, alpha = self.str2str(
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\cuda\amp\autocast_mode.py", line 141, in decorate_autocast
    return func(*args, **kwargs)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\Track_module.py", line 476, in forward
    neighbor = get_seqsep_protein_sm(idx, bond_feats, dist_matrix, rotation_mask)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\util_module.py", line 106, in get_seqsep_protein_sm
    res_dist, atom_dist = get_res_atom_dist(idx, bond_feats, dist_matrix, sm_mask)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\util_module.py", line 140, in get_res_atom_dist
    i_s, j_s = torch.where(bond_feats==6)
RuntimeError: CUDA error: the launch timed out and was terminated
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
02:19:41 (25548): called boinc_finish(0)

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>RF_SAVE_ALL_OUT_NOJRAN_IGNORE_THE_REST_validation_env_f_pred_185_16902_2_1_r578841716_0</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
</message>
]]>

©2024 University of Washington
http://www.bakerlab.org