Task 5458468

Name RF_SAVE_ALL_OUT_NOJRAN_IGNORE_THE_REST_validation_env_f_pred_44_16902_6_1
Workunit 4846900
Created 13 Jun 2024, 4:21:34 UTC
Sent 13 Jun 2024, 5:31:50 UTC
Report deadline 14 Jun 2024, 5:31:50 UTC
Received 14 Jun 2024, 1:52:11 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 43305
Run time 18 hours 52 min 59 sec
CPU time 14 min 54 sec
Validate state Invalid
Credit 0.00
Device peak FLOPS 3.55 GFLOPS
Application version Generalized biomolecular modeling and design with RoseTTAFold All-Atom v0.02 (nvidia_alpha) windows_x86_64
Peak working set size 4,887.94 MB
Peak swap size 10,783.29 MB
Peak disk usage 4.32 MB

Stderr output

<core_client_version>8.0.2</core_client_version>
<![CDATA[
<stderr_txt>
Traceback (most recent call last):
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\predict.py", line 708, in <module>
    pred.predict(out_name+f'_{n}', 
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\predict.py", line 551, in predict
    logit_s, logit_aa_s, logit_pae, logit_pde, p_bind, pred_crds, alpha, pred_allatom, pred_lddt_binned,                msa_prev, pair_prev, state_prev = self.model(
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\RoseTTAFoldModel.py", line 358, in forward
    msa, pair, xyz, alpha_s, xyz_allatom, state, symmsub = self.simulator(
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\Track_module.py", line 1106, in forward
    msa, pair, xyz, state, alpha, symmsub = self.main_block[i_m](msa, pair,
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\Track_module.py", line 929, in forward
    xyz, state, alpha = self.str2str(
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\cuda\amp\autocast_mode.py", line 141, in decorate_autocast
    return func(*args, **kwargs)
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\Track_module.py", line 476, in forward
    neighbor = get_seqsep_protein_sm(idx, bond_feats, dist_matrix, rotation_mask)
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\util_module.py", line 106, in get_seqsep_protein_sm
    res_dist, atom_dist = get_res_atom_dist(idx, bond_feats, dist_matrix, sm_mask)
  File "D:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\util_module.py", line 140, in get_res_atom_dist
    i_s, j_s = torch.where(bond_feats==6)
RuntimeError: CUDA error: the launch timed out and was terminated
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
03:49:19 (35336): called boinc_finish(0)

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>RF_SAVE_ALL_OUT_NOJRAN_IGNORE_THE_REST_validation_env_f_pred_44_16902_6_1_r50771128_0</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
</message>
]]>

©2024 University of Washington
http://www.bakerlab.org