Name | RF_SAVE_ALL_OUT_NOJRAN_IGNORE_THE_REST_validation_env_f_pred_350_16902_3_0 |
Workunit | 4848701 |
Created | 13 Jun 2024, 5:41:15 UTC |
Sent | 13 Jun 2024, 8:57:52 UTC |
Report deadline | 14 Jun 2024, 8:57:52 UTC |
Received | 17 Jun 2024, 22:44:50 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 43234 |
Run time | 1 days 17 hours 21 min 48 sec |
CPU time | 52 sec |
Validate state | Invalid |
Credit | 0.00 |
Device peak FLOPS | 3.38 GFLOPS |
Application version | Generalized biomolecular modeling and design with RoseTTAFold All-Atom v0.02 (nvidia_alpha) windows_x86_64 |
Peak working set size | 7,222.87 MB |
Peak swap size | 12,688.20 MB |
Peak disk usage | 8.10 MB |
<core_client_version>7.24.1</core_client_version>
<![CDATA[
<stderr_txt>
Traceback (most recent call last):
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\predict.py", line 708, in <module>
    pred.predict(out_name+f'_{n}',
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\predict.py", line 551, in predict
    logit_s, logit_aa_s, logit_pae, logit_pde, p_bind, pred_crds, alpha, pred_allatom, pred_lddt_binned, msa_prev, pair_prev, state_prev = self.model(
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\RoseTTAFoldModel.py", line 358, in forward
    msa, pair, xyz, alpha_s, xyz_allatom, state, symmsub = self.simulator(
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\Track_module.py", line 1106, in forward
    msa, pair, xyz, state, alpha, symmsub = self.main_block[i_m](msa, pair,
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\Track_module.py", line 929, in forward
    xyz, state, alpha = self.str2str(
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\ev0\lib\site-packages\torch\cuda\amp\autocast_mode.py", line 141, in decorate_autocast
    return func(*args, **kwargs)
  File "B:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv2\rf2aa\Track_module.py", line 525, in forward
    v = xyz - xyz[:,:,1:2,:]
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
23:21:16 (9516): Can't acquire lockfile (32) - waiting 35s
23:21:51 (9516): Can't acquire lockfile (32) - exiting
23:21:51 (9516): Error: The process cannot access the file because it is being used by another process. (0x20)
23:31:57 (2112): Can't acquire lockfile (32) - waiting 35s
23:32:32 (2112): Can't acquire lockfile (32) - exiting
23:32:32 (2112): Error: The process cannot access the file because it is being used by another process. (0x20)
23:42:43 (10172): Can't acquire lockfile (32) - waiting 35s
23:43:18 (10172): Can't acquire lockfile (32) - exiting
23:43:18 (10172): Error: The process cannot access the file because it is being used by another process. (0x20)
23:54:08 (12064): Can't acquire lockfile (32) - waiting 35s
23:54:43 (12064): Can't acquire lockfile (32) - exiting
23:54:43 (12064): Error: The process cannot access the file because it is being used by another process. (0x20)
00:04:53 (9084): Can't acquire lockfile (32) - waiting 35s
00:05:28 (9084): Can't acquire lockfile (32) - exiting
00:05:28 (9084): Error: The process cannot access the file because it is being used by another process. (0x20)
00:05:44 (18260): Can't acquire lockfile (32) - waiting 35s
00:06:19 (18260): Can't acquire lockfile (32) - exiting
00:06:19 (18260): Error: The process cannot access the file because it is being used by another process. (0x20)
00:16:22 (5252): Can't acquire lockfile (32) - waiting 35s
00:16:57 (5252): Can't acquire lockfile (32) - exiting
00:16:57 (5252): Error: The process cannot access the file because it is being used by another process. (0x20)
00:27:16 (9140): Can't acquire lockfile (32) - waiting 35s
00:27:51 (9140): Can't acquire lockfile (32) - exiting
00:27:51 (9140): Error: The process cannot access the file because it is being used by another process. (0x20)
00:37:52 (6428): Can't acquire lockfile (32) - waiting 35s
00:38:27 (6428): Can't acquire lockfile (32) - exiting
00:38:27 (6428): Error: The process cannot access the file because it is being used by another process. (0x20)
00:48:33 (19568): Can't acquire lockfile (32) - waiting 35s
00:49:08 (19568): Can't acquire lockfile (32) - exiting
00:49:08 (19568): Error: The process cannot access the file because it is being used by another process. (0x20)
00:59:42 (944): Can't acquire lockfile (32) - waiting 35s
01:00:17 (944): Can't acquire lockfile (32) - exiting
01:00:17 (944): Error: The process cannot access the file because it is being used by another process. (0x20)
01:10:25 (14192): Can't acquire lockfile (32) - waiting 35s
01:11:00 (14192): Can't acquire lockfile (32) - exiting
01:11:00 (14192): Error: The process cannot access the file because it is being used by another process. (0x20)
01:21:05 (5664): Can't acquire lockfile (32) - waiting 35s
01:21:40 (5664): Can't acquire lockfile (32) - exiting
01:21:40 (5664): Error: The process cannot access the file because it is being used by another process. (0x20)
01:31:55 (1252): Can't acquire lockfile (32) - waiting 35s
01:32:30 (1252): Can't acquire lockfile (32) - exiting
01:32:30 (1252): Error: The process cannot access the file because it is being used by another process. (0x20)
01:43:02 (9104): Can't acquire lockfile (32) - waiting 35s
01:43:37 (9104): Can't acquire lockfile (32) - exiting
01:43:37 (9104): Error: The process cannot access the file because it is being used by another process. (0x20)
01:54:22 (14256): Can't acquire lockfile (32) - waiting 35s
01:54:57 (14256): Can't acquire lockfile (32) - exiting
01:54:57 (14256): Error: The process cannot access the file because it is being used by another process. (0x20)
02:05:04 (11624): Can't acquire lockfile (32) - waiting 35s
02:05:39 (11624): Can't acquire lockfile (32) - exiting
02:05:39 (11624): Error: The process cannot access the file because it is being used by another process. (0x20)
02:16:12 (14072): Can't acquire lockfile (32) - waiting 35s
02:16:47 (14072): Can't acquire lockfile (32) - exiting
02:16:47 (14072): Error: The process cannot access the file because it is being used by another process. (0x20)
02:27:08 (8132): Can't acquire lockfile (32) - waiting 35s
02:27:43 (8132): Can't acquire lockfile (32) - exiting
02:27:43 (8132): Error: The process cannot access the file because it is being used by another process. (0x20)
02:27:49 (18236): Can't acquire lockfile (32) - waiting 35s
02:28:24 (18236): Can't acquire lockfile (32) - exiting
02:28:24 (18236): Error: The process cannot access the file because it is being used by another process. (0x20)
02:38:40 (15048): Can't acquire lockfile (32) - waiting 35s
02:39:15 (15048): Can't acquire lockfile (32) - exiting
02:39:15 (15048): Error: The process cannot access the file because it is being used by another process. (0x20)
02:49:22 (10132): Can't acquire lockfile (32) - waiting 35s
02:49:57 (10132): Can't acquire lockfile (32) - exiting
02:49:57 (10132): Error: The process cannot access the file because it is being used by another process. (0x20)
02:50:10 (21328): Can't acquire lockfile (32) - waiting 35s
02:50:45 (21328): Can't acquire lockfile (32) - exiting
02:50:45 (21328): Error: The process cannot access the file because it is being used by another process. (0x20)
03:01:03 (2340): Can't acquire lockfile (32) - waiting 35s
03:01:38 (2340): Can't acquire lockfile (32) - exiting
03:01:38 (2340): Error: The process cannot access the file because it is being used by another process. (0x20)
03:11:52 (8700): Can't acquire lockfile (32) - waiting 35s
03:12:27 (8700): Can't acquire lockfile (32) - exiting
03:12:27 (8700): Error: The process cannot access the file because it is being used by another process. (0x20)
03:22:30 (19520): Can't acquire lockfile (32) - waiting 35s
03:23:05 (19520): Can't acquire lockfile (32) - exiting
03:23:05 (19520): Error: The process cannot access the file because it is being used by another process. (0x20)
03:23:12 (18576): Can't acquire lockfile (32) - waiting 35s
03:23:47 (18576): Can't acquire lockfile (32) - exiting
03:23:47 (18576): Error: The process cannot access the file because it is being used by another process. (0x20)
03:33:51 (6272): Can't acquire lockfile (32) - waiting 35s
03:34:26 (6272): Can't acquire lockfile (32) - exiting
03:34:26 (6272): Error: The process cannot access the file because it is being used by another process. (0x20)
03:44:35 (14308): Can't acquire lockfile (32) - waiting 35s
03:45:10 (14308): Can't acquire lockfile (32) - exiting
03:45:10 (14308): Error: The process cannot access the file because it is being used by another process. (0x20)
03:55:28 (20372): Can't acquire lockfile (32) - waiting 35s
03:56:03 (20372): Can't acquire lockfile (32) - exiting
03:56:03 (20372): Error: The process cannot access the file because it is being used by another process. (0x20)
04:06:04 (8016): Can't acquire lockfile (32) - waiting 35s
04:06:39 (8016): Can't acquire lockfile (32) - exiting
04:06:39 (8016): Error: The process cannot access the file because it is being used by another process. (0x20)
04:17:00 (10752): Can't acquire lockfile (32) - waiting 35s
04:17:35 (10752): Can't acquire lockfile (32) - exiting
04:17:35 (10752): Error: The process cannot access the file because it is being used by another process. (0x20)
04:27:59 (7916): Can't acquire lockfile (32) - waiting 35s
04:28:34 (7916): Can't acquire lockfile (32) - exiting
04:28:34 (7916): Error: The process cannot access the file because it is being used by another process. (0x20)
04:38:52 (13132): Can't acquire lockfile (32) - waiting 35s
04:39:27 (13132): Can't acquire lockfile (32) - exiting
04:39:27 (13132): Error: The process cannot access the file because it is being used by another process. (0x20)
04:39:29 (17096): Can't acquire lockfile (32) - waiting 35s
04:40:04 (17096): Can't acquire lockfile (32) - exiting
04:40:04 (17096): Error: The process cannot access the file because it is being used by another process. (0x20)
04:50:27 (13580): Can't acquire lockfile (32) - waiting 35s
04:51:02 (13580): Can't acquire lockfile (32) - exiting
04:51:02 (13580): Error: The process cannot access the file because it is being used by another process. (0x20)
05:01:18 (5208): Can't acquire lockfile (32) - waiting 35s
05:01:53 (5208): Can't acquire lockfile (32) - exiting
05:01:53 (5208): Error: The process cannot access the file because it is being used by another process. (0x20)
05:11:57 (18184): Can't acquire lockfile (32) - waiting 35s
05:12:32 (18184): Can't acquire lockfile (32) - exiting
05:12:32 (18184): Error: The process cannot access the file because it is being used by another process. (0x20)
05:12:50 (4396): Can't acquire lockfile (32) - waiting 35s
05:13:25 (4396): Can't acquire lockfile (32) - exiting
05:13:25 (4396): Error: The process cannot access the file because it is being used by another process. (0x20)
05:23:31 (20264): Can't acquire lockfile (32) - waiting 35s
05:24:06 (20264): Can't acquire lockfile (32) - exiting
05:24:06 (20264): Error: The process cannot access the file because it is being used by another process. (0x20)
05:34:47 (14100): Can't acquire lockfile (32) - waiting 35s
05:35:22 (14100): Can't acquire lockfile (32) - exiting
05:35:22 (14100): Error: The process cannot access the file because it is being used by another process. (0x20)
05:45:52 (928): Can't acquire lockfile (32) - waiting 35s
05:46:27 (928): Can't acquire lockfile (32) - exiting
05:46:27 (928): Error: The process cannot access the file because it is being used by another process. (0x20)
05:56:39 (14216): Can't acquire lockfile (32) - waiting 35s
05:57:14 (14216): Can't acquire lockfile (32) - exiting
05:57:14 (14216): Error: The process cannot access the file because it is being used by another process. (0x20)
06:07:57 (19976): Can't acquire lockfile (32) - waiting 35s
06:08:32 (19976): Can't acquire lockfile (32) - exiting
06:08:32 (19976): Error: The process cannot access the file because it is being used by another process. (0x20)
06:19:12 (3436): Can't acquire lockfile (32) - waiting 35s
06:19:47 (3436): Can't acquire lockfile (32) - exiting
06:19:47 (3436): Error: The process cannot access the file because it is being used by another process. (0x20)
06:30:24 (21164): Can't acquire lockfile (32) - waiting 35s
06:30:59 (21164): Can't acquire lockfile (32) - exiting
06:30:59 (21164): Error: The process cannot access the file because it is being used by another process. (0x20)
06:41:07 (3240): Can't acquire lockfile (32) - waiting 35s
06:41:42 (3240): Can't acquire lockfile (32) - exiting
06:41:42 (3240): Error: The process cannot access the file because it is being used by another process. (0x20)
06:52:27 (12132): Can't acquire lockfile (32) - waiting 35s
06:53:02 (12132): Can't acquire lockfile (32) - exiting
06:53:02 (12132): Error: The process cannot access the file because it is being used by another process. (0x20)
07:03:06 (14432): Can't acquire lockfile (32) - waiting 35s
07:03:41 (14432): Can't acquire lockfile (32) - exiting
07:03:41 (14432): Error: The process cannot access the file because it is being used by another process. (0x20)
17:42:54 (7220): called boinc_finish(0)
</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>RF_SAVE_ALL_OUT_NOJRAN_IGNORE_THE_REST_validation_env_f_pred_350_16902_3_0_r1266607577_0</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
</message>
]]>
©2024 University of Washington
http://www.bakerlab.org