Message boards : RALPH@home bug list : RoseTTAFold All-Atom
Author | Message |
---|---|
kotenok2000 Send message Joined: 26 Feb 21 Posts: 22 Credit: 1,893 RAC: 0 |
I am talking about intel GPU. |
Vester Send message Joined: 29 Apr 20 Posts: 17 Credit: 1,176 RAC: 0 |
"I am talking about intel GPU." The Intel graphics are not running a task. Utilization of the GPU is about 3%. |
zombie67 [MM] Send message Joined: 8 Aug 06 Posts: 75 Credit: 2,396,363 RAC: 6,299 |
I had a full set of 16 tasks running just fine, apparently. But once they passed 25 hours, with 24 hours being the maximum, I knew they were broken and aborted them just now. |
Grant (SSSF) Send message Joined: 13 Jun 24 Posts: 126 Credit: 193,939 RAC: 2,635 |
"It uses GPU, but boinc manager doesn't reflect that." This is a 100% CPU application. No GPU work is being done at all. |
Grant (SSSF) Send message Joined: 13 Jun 24 Posts: 126 Credit: 193,939 RAC: 2,635 |
"I had a full set of 16 tasks running just fine, apparently. But once they passed 25 hours, with 24 hours being the maximum, I knew they were broken and aborted them just now." Mine have been going for 18 hours. I will give them until 24 hours, then abort. |
Grant (SSSF) Send message Joined: 13 Jun 24 Posts: 126 Credit: 193,939 RAC: 2,635 |
"Was anyone able to return task successfully?" Yes, a few people have. In most cases, the Runtime is less than an hour (and they certainly make use of system resources, e.g. Peak working set size 6,323.86 MB, Peak swap size 10,856.63 MB). The CPU time counter is definitely broken, and i suspect checkpointing is done based on CPU time (not Run time), which is probably why the checkpointing isn't working. They need to fix the CPU time counter, as well as put a watchdog timer on the tasks as per the Rosetta 4.20 Tasks, but i suspect the watchdog timer also uses CPU time, not run time. So they really need to fix the CPU time issue, along with the no-end-in-sight processing time of these Tasks. |
Vester Send message Joined: 29 Apr 20 Posts: 17 Credit: 1,176 RAC: 0 |
My twelve tasks ran for more than 1 day before failing with a message stating that I did not have enough paging file space, although I have a fixed minimum and maximum of 84470 MB. I have limited the number of cores in BIOS to five, and I am running 5 tasks. Also, I had updated my Ralph preferences to 4 hours. Note: One can also limit the number of cores in Windows 11 by setting "number of processors" in Advanced Boot options (run msconfig). |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 913 Credit: 1,892,541 RAC: 294 |
It seems a multi-core app (or, better, windows/boinc consider it like a multi-core app). I killed all my WUs except 5. Task Manager says the CPU is still at 100% and all cores are running. But my CPU has 16 cores!! Now the remaining WUs are at 99.950% after 23 hrs and i think they will not finish in time. |
Grant (SSSF) Send message Joined: 13 Jun 24 Posts: 126 Credit: 193,939 RAC: 2,635 |
"It seems a multi-core app (or, better, windows/boinc consider it like a multi-core app)." If it's a multiprocessing application, then it would explain the odd CPU utilisation showing up in Task Manager, in which case they need to give us the option to set the number of cores available to the application for each Task running, i.e. 1, 2, 4 etc. That way people won't end up with overcommitted systems (i.e. Runtime being multiple times longer than CPU time, and missing deadlines because it's taking 20 hours to do 2 hours of work; they should pretty much be equal on a dedicated cruncher). |
Grant (SSSF) Send message Joined: 13 Jun 24 Posts: 126 Credit: 193,939 RAC: 2,635 |
These TTAFold tasks are very, very broken. I limited them to only 4 running, but due to the broken nature of suspend etc, the suspended ones still kept running in the background. And when the Rosetta Beta tasks started up, while their elapsed time ticked away, they received absolutely 0 CPU time. So i exited BOINC & threw away all 20 hours of processing time on a dozen TTAFold Tasks. When it restarted, those 4 TTAFold Tasks were using 100% of the CPU, still none at all available for the Rosetta Beta Tasks. So i then limited the TTAFold Tasks to only 1 running Task. That single Task is using 8 threads. There needs to be a way to limit the number of threads a single Task can use. And at the moment, the indications are that 1 Task using 8 threads performs no better than when 12 Tasks were trying to use 8 threads each, when there were only 12 threads available. I'll keep an eye on things to see if they don't slow down later, but the initial signs are that the extra threads are providing not even the slightest improvement in processing time- they're just being wasted. |
Grant (SSSF) Send message Joined: 13 Jun 24 Posts: 126 Credit: 193,939 RAC: 2,635 |
"Note: One can also limit the number of cores in Windows 11 by setting "number of processors" in Advanced Boot options (run msconfig)." I've opted to use max_concurrent to limit the number of cores/threads available to the TTAFold Tasks, leaving the others available for other processes. As i have found, they are pigs. 1 Task = 8 threads. |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 913 Credit: 1,892,541 RAC: 294 |
"So i then limited the TTAFold Tasks to only 1 running Task." The app, probably, is NOT multi-threaded. As i wrote, it seems to be a "misunderstanding" between the app and BOINC/Windows. |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 913 Credit: 1,892,541 RAC: 294 |
"As i have found, they are pigs. 1 Task = 8 threads." Pigs?? |
zioriga Send message Joined: 16 Feb 06 Posts: 8 Credit: 323,279 RAC: 1,175 |
I'm running only 1 WU and I have the GPU (NVIDIA 3050) running at 99-100% (GPU-Z, no other WUs using the GPU). In the WU Properties: CPU time 0, Elapsed time 21:20:15, Fraction done 99.334%. In other words: is the WU running only on CPU or only on GPU ???? |
Grant (SSSF) Send message Joined: 13 Jun 24 Posts: 126 Credit: 193,939 RAC: 2,635 |
"In other words: is the WU running only on CPU or only on GPU ????" CPU only. |
Mr P Hucker Send message Joined: 3 Mar 23 Posts: 31 Credit: 9,510 RAC: 3 |
They use 10 threads each, but BOINC isn't told this, so it runs far too many. I only noticed this on my main machine because everything seemed so sluggish. It was trying to run 7 of these 10-thread tasks on a 24-thread CPU. Hence the GPU running asteroids slowed right down, and the interface was terribly sluggish. |
Grant (SSSF) Send message Joined: 13 Jun 24 Posts: 126 Credit: 193,939 RAC: 2,635 |
"As i have found, they are pigs. 1 Task = 8 threads." "Pigs??" They'll take everything you give them, even if they don't need it. Resource Hog definition: A process which consumes a large amount of system resources compared to its importance or function. |
Grant (SSSF) Send message Joined: 13 Jun 24 Posts: 126 Credit: 193,939 RAC: 2,635 |
"They use 10 threads each" On my system i have limited them to only one TTAFold Task at a time. It's using a maximum of 62% of my CPU time, which works out at just under 8 threads. So it's effectively using 8. Yet the processing rate is the same as when 12 of them were fighting over 12 threads in total, so they really only need 1. A word to the developers: limit these TTAFold tasks to 1 thread per Task until such time as using more threads results in improved processing rates. Even then, the default should still remain 1, with the option in the Ralph@home preferences (and eventually the Rosetta@home preferences) for people to select a higher value for the TTAFold application if they choose to. |
Mr P Hucker Send message Joined: 3 Mar 23 Posts: 31 Credit: 9,510 RAC: 3 |
"They use 10 threads each" "On my system i have limited them to only one TTAFold Task at a time." How do you know the rate is the same? We don't know how the task is counting progress. It could be timed like the standard Rosetta 4.2 tasks. Those take 8 hours on any speed of CPU. |
Grant (SSSF) Send message Joined: 13 Jun 24 Posts: 126 Credit: 193,939 RAC: 2,635 |
"How do you know the rate is the same?" Because, as i said in my previous post, i had already processed 12 of those Tasks. The progress rate for the currently running single Task is the same as it was for those 12 other Tasks: it starts off fast, and continues to drop as the Task just keeps on going, well after the initial 4 hour estimate. True, if a Task were ever to complete, then hopefully we could see whether it actually did any more work in that time, but at present i've got a single Task that is on the very same course as the previous ones, with the same processing rate, the same slowing rate of fraction done, and heading for missing the deadline because it's going to take over 24 hours to process (if it ever does manage to process it). Given that others have run into memory issues after letting it run for 24hrs+, and that I've now got 8 threads for the one Task, i'd have expected to start running into similar issues 8 times sooner, but that's yet to happen. So i think it's a pretty reasonable assumption that it's not doing 8 times as much work as it was, which is what it would have to do to make using that many threads worthwhile. |
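
The thread counts argued over in the posts above come down to simple arithmetic, sketched here in Python. The function names are illustrative only and are not part of BOINC. The inputs are the figures the posters reported: 62% utilisation on a 12-thread machine, 7 tasks of 10 threads on a 24-thread CPU, and 12 tasks of 8 threads on 12 threads.

```python
def effective_threads(cpu_utilisation: float, logical_threads: int) -> float:
    """Threads' worth of CPU in use, given overall utilisation as a fraction (0-1)."""
    return cpu_utilisation * logical_threads


def oversubscription(tasks: int, threads_per_task: int, logical_threads: int) -> float:
    """Ratio of threads demanded by running tasks to threads the CPU actually has."""
    return tasks * threads_per_task / logical_threads


# One task at 62% of a 12-thread CPU: "just under 8 threads".
print(round(effective_threads(0.62, 12), 2))   # 7.44

# Seven 10-thread tasks scheduled on a 24-thread CPU.
print(round(oversubscription(7, 10, 24), 2))   # 2.92

# Twelve 8-thread tasks fighting over 12 threads.
print(oversubscription(12, 8, 12))             # 8.0
```

A ratio above 1 means the scheduler has started more threads than the CPU can run at once, which is consistent with the sluggishness and the run time far exceeding CPU time described in the thread.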
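
For anyone wanting to reproduce the max_concurrent limit mentioned in the thread: BOINC reads that setting from an app_config.xml file placed in the project's folder inside the BOINC data directory. The following is a minimal sketch that generates such a file with Python's standard library. The app name APP_SHORT_NAME is a placeholder and not the real name of the RoseTTAFold All-Atom application; the actual short name would have to be looked up in client_state.xml.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return app_config.xml text limiting how many tasks of one app run at once."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    if hasattr(ET, "indent"):  # pretty-printing is only available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# APP_SHORT_NAME is a hypothetical placeholder; substitute the real app name.
xml_text = build_app_config("APP_SHORT_NAME", 1)
print(xml_text)
```

The output would be saved as app_config.xml and picked up via "Read config files" in the BOINC Manager Options menu, or on a client restart. Note that max_concurrent caps how many tasks of the app run at once; it does not limit how many threads each task spawns, which is the missing control the posters are asking for.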
©2024 University of Washington
http://www.bakerlab.org