Message boards : RALPH@home bug list : RoseTTAFold All-Atom 0.03 (nvidia_alpha)
Author | Message |
---|---|
rilian Joined: 7 Sep 07 Posts: 35 Credit: 107,666 RAC: 725 |
"I got 6 units of the latest batch and all errored out after 4 seconds because of an access code violation. Same fate for my wingmen. Windows 10, RTX 4080, 32 GB RAM. Anybody who managed to finish off a bunch of them care to share your system details?" I completed all 20 tasks from the last batch. Windows 10 64-bit, RTX 3060. I had Folding@home running in parallel. -- I crunch for Ukraine |
Sabroe_SMC Joined: 10 Sep 10 Posts: 6 Credit: 1,067,564 RAC: 77 |
Hello guys. Not as foolish as you for running two at a time. On my RTX 4090, 1 task of the WUs from 3.7.2024 takes about 440 sec; 2 tasks take about 650 sec. With 1 task the GPU utilisation was about 50-60%; with 2 tasks it was about 98%. 1 task: 440-450 sec = 100%; 2 tasks: 610-640 sec = 142.8%; 3 tasks: 904-910 sec = 149%; 4 tasks: 1199-1202 sec = 149.8%. But now 2 tasks take 1.5 times longer. |
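The percentages in the post above read like relative throughput: the number of tasks run at once, times the single-task time, divided by the time for the whole batch. A minimal Python sketch of that arithmetic, assuming a 450 sec single-task baseline and one representative value from inside each quoted range (both are a reading of the figures, not something the post states):

```python
# Relative throughput when running n tasks at once on one GPU.
# Assumption: 450 s is the single-task baseline; the batch times below are
# representative values picked from inside the ranges quoted in the post.

def relative_throughput(n_tasks: int, batch_seconds: float,
                        single_seconds: float = 450.0) -> float:
    """Return throughput as a percentage of running one task at a time."""
    return 100.0 * n_tasks * single_seconds / batch_seconds

# tasks at once -> seconds until all of them finish
timings = {1: 450.0, 2: 630.0, 3: 907.0, 4: 1200.5}

for n, seconds in timings.items():
    pct = relative_throughput(n, seconds)
    print(f"{n} task(s): {seconds:7.1f} s -> {pct:.1f}% of single-task throughput")
```

Under these assumptions the gain flattens out at roughly 1.5x from 3 tasks onward, which is consistent with the figures quoted in the post.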
Grant (SSSF) Joined: 13 Jun 24 Posts: 124 Credit: 193,939 RAC: 2,635 |
"I got 6 units of the latest batch and all errored out after 4 seconds because of an access code violation. Same fate for my wingmen. Windows 10, RTX 4080, 32 GB RAM. Anybody who managed to finish off a bunch of them care to share your system details?" I had the same thing happening with my systems initially: i7-8700K, 32 GB RAM, Win10 Pro, 552.22 video driver, RTX 2060, and an RTX 2060 Super in another system with otherwise the same specs. I reset the project, it re-downloaded all the files, and then they worked OK (no idea why downloading the same things all over again made any difference, but it did). Grant Darwin NT |
Grant (SSSF) Joined: 13 Jun 24 Posts: 124 Credit: 193,939 RAC: 2,635 |
"But now 2 tasks are 1,5 longer" As I mentioned in my other post, while these Tasks take longer to process than the last batch, they are still taking less time than the initial batch that I was able to process. For my RTX 2060: first batch 27 min 50 sec, last batch 16 min 50 sec, current batch 25 min 45 sec. How many CPU cores/threads of your system are in use? Either try 1 Task at a time again, or limit the number of running CPU Tasks to free up a CPU core/thread, or reserve 3 cores/threads for your 2 GPU Ralph Tasks. The default for Ralph is to reserve only 0.997 cores/threads per GPU Task, but the actual usage is more like 1.3, with frequent bumps of 2.5 every 20 seconds or so (corresponding with the drop in GPU load & power draw). See if reserving 3 cores/threads helps enough to make a difference. I tried reserving 2 cores/threads, but the improvement in Ralph processing time was so small that it did not come close to offsetting the lost output of CPU work from that core/thread. For your high-end GPU running 2 Tasks at a time, it might be worth it. Grant Darwin NT |
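In BOINC, a per-task CPU reservation like the one described above is normally set with an `app_config.xml` file in the project's folder. A hedged Python sketch that generates such a file: the app name `APP_NAME_PLACEHOLDER` is hypothetical (the real short name has to be looked up, for example in `client_state.xml`), and 0.5 GPU plus 1.5 CPU per task is just one way to express "2 tasks sharing one GPU with 3 threads reserved".

```python
import xml.etree.ElementTree as ET

def build_app_config(app_name: str, gpu_usage: float, cpu_usage: float) -> str:
    """Build BOINC app_config.xml text for one GPU application."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # Fraction of a GPU per task: 0.5 lets two tasks share one GPU.
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    # CPU threads budgeted per task: 1.5 x 2 tasks = 3 threads in total.
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")

# APP_NAME_PLACEHOLDER is NOT the real app name; replace it before use.
config_text = build_app_config("APP_NAME_PLACEHOLDER", 0.5, 1.5)
print(config_text)
```

The printed text would be saved as `app_config.xml` in the Ralph project folder and picked up via "Read config files" in BOINC Manager or a client restart. The values are illustrative; whether 3 reserved threads pays off is exactly what the post suggests testing.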
Sabroe_SMC Joined: 10 Sep 10 Posts: 6 Credit: 1,067,564 RAC: 77 |
At the moment I have NO CPU tasks running. |
Drago75 Joined: 29 Jul 22 Posts: 3 Credit: 70,604 RAC: 227 |
OK, so now my three Windows hosts run these WUs successfully. I had to detach from Ralph on all of them and rejoin the project after a reboot, to be on the safe side, as was mentioned in a previous post. Evidently the data in the project folder had been corrupted. Now they all work flawlessly. Another thing: my RTX 3070 Ti was initially rejected for not having enough VRAM, which is nonsense because it has 8 GB. The solution was to install the latest version of BOINC Manager. |
rilian Joined: 7 Sep 07 Posts: 35 Credit: 107,666 RAC: 725 |
Something strange is going on with the tasks. My tasks page shows that I have 3 tasks in progress, but BOINC on the computer shows only one. Nothing in transfers. I definitely have all tasks visible, not only the active ones... I will go through the logs carefully tomorrow. Has anyone seen this issue? -- I crunch for Ukraine |
Grant (SSSF) Joined: 13 Jun 24 Posts: 124 Credit: 193,939 RAC: 2,635 |
"my tasks page shows that i have 3 tasks in progress, but BOINC on the computer shows only one. Nothing in transfers" Sounds like what's known as Ghost Tasks: some sort of network issue while contacting the Scheduler and getting the work. The Scheduler thinks you got it, but you didn't. Looking at my Tasks, I've had some weirdness as well. I had a Task error out with the reason "Timed out - no response", and that was only 8 minutes after getting it... Grant Darwin NT |
rilian Joined: 7 Sep 07 Posts: 35 Credit: 107,666 RAC: 725 |
"my tasks page shows that i have 3 tasks in progress, but BOINC on the computer shows only one. Nothing in transfers" "Sounds like what's known as Ghost Tasks." Thank you, I think you are right. I had some networking issues yesterday due to work on another project. Those 2 extra tasks are still shown as In Progress and will time out in a few days. -- I crunch for Ukraine |
rilian Joined: 7 Sep 07 Posts: 35 Credit: 107,666 RAC: 725 |
I've got a few tasks and they all fail after 4000 sec with: C:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv1\rf2aa\util.py:450: UserWarning: Using torch.cross without specifying the dim arg is deprecated. Please either pass the dim explicitly or simply use torch.linalg.cross. The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\Cross.cpp:66.) Z = torch.cross(Xn,Yn) -- I crunch for Ukraine |
Grant (SSSF) Joined: 13 Jun 24 Posts: 124 Credit: 193,939 RAC: 2,635 |
"I've got few tasks and they all fail after 4000 sec with" They didn't necessarily fail. That error message also occurred with Tasks that completed and validated on the previous runs. It looks like they completed processing OK, but there was a problem with Validation. I'm getting the same issue. Either they didn't process correctly, or there is an issue with the Validators. Grant Darwin NT |
dcdc Joined: 15 Aug 06 Posts: 27 Credit: 90,652 RAC: 0 |
I'm getting this error: 655 ralph@home 02/09/2024 23:21:46 Requesting new tasks for NVIDIA GPU; 656 ralph@home 02/09/2024 23:21:47 Scheduler request completed: got 0 new tasks; 657 ralph@home 02/09/2024 23:21:47 A minimum of 5120 MB (preferably 5120 MB) of video RAM is needed to process tasks using your computer's NVIDIA GPU; 658 ralph@home 02/09/2024 23:21:47 Project requested delay of 31 seconds. I've got a Quadro P2200 with exactly that (5120 MB) according to GPU-Z. Anyone else getting that error? D |
rjs5 Joined: 5 Jul 15 Posts: 22 Credit: 135,787 RAC: 2,494 |
"I've got few tasks and they all fail after 4000 sec with" "They didn't necessarily fail- that error message was occurring with Taks that completed and validated on the previous runs." Mine seem to finish with the WARNING message and then fail to VALIDATE. They are using 7 GB of 24 GB of dedicated GPU memory. Stderr output: <core_client_version>8.0.2</core_client_version> <![CDATA[ <stderr_txt> C:\ProgramData\BOINC\projects\ralph.bakerlab.org\cv1\rf2aa\util.py:450: UserWarning: Using torch.cross without specifying the dim arg is deprecated. Please either pass the dim explicitly or simply use torch.linalg.cross. The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\Cross.cpp:66.) Z = torch.cross(Xn,Yn) 16:09:10 (7564): called boinc_finish(0) </stderr_txt> ]]> |
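The warning in the stderr above is about which axis is used when `dim` is omitted. Per the PyTorch documentation, legacy `torch.cross` falls back to the first dimension of size 3, while `torch.linalg.cross` defaults to the last dimension. A pure-Python sketch of that difference follows; the helper names are hypothetical (not PyTorch functions) and the example shapes are made up, since the shapes of `Xn` and `Yn` in `util.py` are not visible in the log.

```python
# Hypothetical helpers that mimic how each function picks its default axis.
# They only inspect shapes; no PyTorch install is needed to run this.

def legacy_default_dim(shape):
    """First axis of size 3: the documented torch.cross default without dim."""
    for axis, size in enumerate(shape):
        if size == 3:
            return axis
    raise ValueError("no dimension of size 3")

def linalg_default_dim(shape):
    """Last axis, i.e. dim=-1: the torch.linalg.cross default."""
    return len(shape) - 1

# Made-up example shapes, not taken from the project code.
for shape in [(128, 3), (3, 3), (3, 10, 3)]:
    old = legacy_default_dim(shape)
    new = linalg_default_dim(shape)
    verdict = "same axis" if old == new else "DIFFERENT axis"
    print(f"shape {shape}: legacy dim={old}, linalg dim={new} -> {verdict}")
```

So the two defaults only disagree when an earlier axis also has size 3. On its own the message is a deprecation warning, not an error, which fits the reports above that tasks printing it still reached `boinc_finish(0)`.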
rilian Joined: 7 Sep 07 Posts: 35 Credit: 107,666 RAC: 725 |
|
©2024 University of Washington
http://www.bakerlab.org