Posts by Fardringle

1) Message boards : RALPH@home bug list : RoseTTAFold All-Atom 0.02 (env) (Message 7758)
Posted 22 Jun 2024 by Fardringle
Post:
If you ever get CPU Tasks on it again, I'd check Task Manager when one is running.
On my 12-thread system, a single CPU Task used 8 threads. So on that 8-thread system, I'd expect CPU usage to be 100% just for the single Task. The fact that it was lower (and very variable) and took such a long time indicates something else was using CPU time as well.

That's exactly what I did before, and where I got the numbers that I reported here.

A single Ralph CPU task running on the 4 core/8 thread i7-4790. Absolutely nothing else running on the computer, and therefore nothing else using the CPU. Not in BOINC and not outside of BOINC other than the normal few percent now and then from Windows processes. That one task was varying between about 30% and 70% CPU usage while I was watching it. Of course I didn't watch it the entire 12+ hours so it might have been close to 100% at some point, but it definitely didn't stay anywhere close to 100% over the 30 minutes or so that I was actively watching it.
2) Message boards : RALPH@home bug list : RoseTTAFold All-Atom 0.02 (env) (Message 7756)
Posted 22 Jun 2024 by Fardringle
Post:
New app was released and there are new tasks

0.03 (nvidia_alpha)

They disappeared quickly, but as far as I can tell there were zero errors on my RTX 3060Ti and on my Quadro P5000 cards.
3) Message boards : RALPH@home bug list : RoseTTAFold All-Atom 0.02 (env) (Message 7755)
Posted 22 Jun 2024 by Fardringle
Post:
I used the <project_max_concurrent> line in the app_config.xml file to only allow the computer to run a single task at a time. And it did actually finally complete two tasks so far (running one at a time) after about 12.5 hours each. So it looks like they can actually finish successfully on CPU, it just takes a LOT longer than the 30 minute target goal in the preferences settings. And probably a huge amount longer if running multiple tasks at the same time.
Just on its own, RoseTTAFold would have used 100% of your CPU. The fact that it was showing 30-70% indicates there were other things (other BOINC projects?) trying to use the CPU at the same time, hence the extra-long processing time. Otherwise your crunching time should have been around the 4 hour mark or thereabouts.


Nope. Absolutely nothing else was running on that computer. It is a pretty old CPU (i7-4790) so longer run times are expected. But it was definitely a lot longer than the target time.
4) Message boards : RALPH@home bug list : RoseTTAFold All-Atom 0.02 (env) (Message 7733)
Posted 20 Jun 2024 by Fardringle
Post:
I used the <project_max_concurrent> line in the app_config.xml file to only allow the computer to run a single task at a time. And it did actually finally complete two tasks so far (running one at a time) after about 12.5 hours each. So it looks like they can actually finish successfully on CPU, it just takes a LOT longer than the 30 minute target goal in the preferences settings. And probably a huge amount longer if running multiple tasks at the same time.
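
For anyone who wants to reproduce the setting described above, here is a minimal sketch in Python that writes an app_config.xml containing only the <project_max_concurrent> element. The function name and the idea of passing the project directory as an argument are my own assumptions for illustration; the file has to go in this project's folder inside the BOINC data directory, and that location differs between installations.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def write_app_config(project_dir, max_concurrent=1):
    """Write a minimal app_config.xml that caps concurrent tasks for one project.

    project_dir is an assumption: it must be the project's own folder inside
    the BOINC data directory, which varies from machine to machine.
    """
    root = ET.Element("app_config")
    # Limit how many tasks from this project may run at the same time.
    ET.SubElement(root, "project_max_concurrent").text = str(max_concurrent)

    target = Path(project_dir) / "app_config.xml"
    target.write_text(ET.tostring(root, encoding="unicode") + "\n", encoding="utf-8")
    return target
```

Calling write_app_config(some_project_folder, 1) produces a one-task limit like the one used here. The client only picks up the file after it re-reads its config files or is restarted.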
5) Message boards : RALPH@home bug list : RoseTTAFold All-Atom 0.02 (env) (Message 7728)
Posted 20 Jun 2024 by Fardringle
Post:
I set up the app_config.xml file to only allow the computer to run a single Ralph task at a time, then rebooted. All tasks reset to zero progress, and it is now running the one task by itself, using around 3-4 of the 8 CPU threads. I'll watch it for a while to see if it can manage to actually complete a task with this configuration.


After 11 hours so far of running a single task, using between 30% and 70% of the CPU (3-6 of the 8 CPU threads), that single task says it is at 99.961% complete with 14 seconds left. But it was at 75% with 20 minutes left more than 9 hours ago, so it seems to be behaving the same way as when several tasks were running at the same time on this computer...
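
As a rough illustration of why the remaining time can look tiny while the task keeps running, here is a sketch of a naive linear estimate. This is not necessarily how the BOINC client computes its figure; it simply assumes the rest of the work proceeds at the average rate so far. The function name is made up for this example.

```python
def linear_remaining_seconds(elapsed_seconds, fraction_done):
    """Naive estimate: assume the remaining work runs at the average rate so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_seconds * (1 - fraction_done) / fraction_done


# Numbers from the post above: 11 hours elapsed, 99.961% reported complete.
elapsed = 11 * 3600
print(linear_remaining_seconds(elapsed, 0.99961))  # roughly 15 seconds
```

With those numbers the naive estimate comes out at about 15 seconds, close to the 14 seconds shown. If the reported progress only creeps toward 100% without reaching it, an estimate like this keeps shrinking toward zero even though the task is not close to finishing.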
6) Message boards : RALPH@home bug list : RoseTTAFold All-Atom (Message 7727)
Posted 19 Jun 2024 by Fardringle
Post:
GPU tasks are still failing immediately on my RTX 3060Ti. It's not a VRAM problem as the 3060Ti has 8GB.

It looks like it's actually a programming problem: there need to be quotation marks around the file path (or whatever method is appropriate in Python to allow spaces in file paths), since it is failing when it tries to access the BOINC directory inside Program Files.

<core_client_version>8.0.2</core_client_version>
<![CDATA[
<message>
The access code is invalid.
 (0xc) - exit code 12 (0xc)</message>
<stderr_txt>
'C:\Program' is not recognized as an internal or external command,
operable program or batch file.

</stderr_txt>
]]>
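
To illustrate the kind of failure I mean, here is a small Python sketch. It is not the project's actual code, and the executable path is a made-up example chosen only because it contains a space. It shows how an unquoted command line gets cut at the first space, and one way to build a command line with quotes added where needed, using the list2cmdline helper that ships in Python's subprocess module.

```python
import subprocess

# Hypothetical path, used only because it contains a space like "Program Files".
EXE = r"C:\Program Files\BOINC\example_app.exe"


def naive_first_token(command_line):
    """Return what a shell treats as the program name when nothing is quoted."""
    return command_line.split(" ", 1)[0]


def quoted_command_line(exe, args):
    """Build a Windows-style command line, quoting any argument that contains spaces."""
    return subprocess.list2cmdline([exe, *args])


print(naive_first_token(EXE + " --help"))    # the path is cut off at the space
print(quoted_command_line(EXE, ["--help"]))  # the path is wrapped in double quotes
```

The first print shows only the part of the path before the space, which matches the 'C:\Program' in the error message. In Python the usual way to avoid this is to pass the program and its arguments to subprocess.run as a list instead of as one string, so no manual quoting is needed.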
7) Message boards : RALPH@home bug list : RoseTTAFold All-Atom 0.02 (env) (Message 7721)
Posted 19 Jun 2024 by Fardringle
Post:
Target computation time is set at 1 hour.

The 4 running tasks are still "running" with zero estimated time left after 32 hours. I want to just kill the tasks, but also have a bit of morbid curiosity to see if they will actually finish...

I'd suggest using HWiNFO64 or MSI Afterburner to check whether it really is using the GPU.

Btw, why is it running 4 tasks? That is only possible if your machine has 4 GPUs; otherwise it would require a custom app_config to make them run in parallel...


This computer is not using a GPU. They are CPU tasks only. (I did get a few GPU tasks on another computer that has an RTX 3060ti, but they failed immediately, the same as other people have reported.)

I set up the app_config.xml file to only allow the computer to run a single Ralph task at a time, then rebooted. All tasks reset to zero progress, and it is now running the one task by itself, using around 3-4 of the 8 CPU threads. I'll watch it for a while to see if it can manage to actually complete a task with this configuration.
8) Message boards : RALPH@home bug list : RoseTTAFold All-Atom 0.02 (env) (Message 7719)
Posted 19 Jun 2024 by Fardringle
Post:
Target computation time is set at 1 hour.

The 4 running tasks are still "running" with zero estimated time left after 32 hours. I want to just kill the tasks, but also have a bit of morbid curiosity to see if they will actually finish...
9) Message boards : RALPH@home bug list : RoseTTAFold All-Atom 0.02 (env) (Message 7706)
Posted 19 Jun 2024 by Fardringle
Post:
One of my computers was able to download several of these tasks, and after about 3 hours of CPU time, the estimated remaining time was 15-20 minutes.

After about 15 hours, the estimated remaining time was finally down to 1 second.

At 17+ hours, the remaining time is zero (empty) but the tasks are still showing as running, and are using a significant amount of RAM and CPU power.
10) Message boards : RALPH@home bug list : RoseTTAFold All-Atom (Message 7514)
Posted 30 May 2024 by Fardringle
Post:
All tasks downloaded on a laptop with an i7-8650U CPU failed due to missing/corrupt data in the downloaded ZIP file. And only one task actually ran for a few seconds. The others all failed at zero seconds of run time.

<core_client_version>7.24.1</core_client_version>
<![CDATA[
<message>
The access code is invalid.
 (0xc) - exit code 12 (0xc)</message>
<stderr_txt>
[venv_a_pred_alpha_179.zip]
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in venv_a_pred_alpha_179.zip,
        and cannot find venv_a_pred_alpha_179.zip.zip, period.

</stderr_txt>
]]>
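
For reference, a downloaded archive can be checked for this kind of damage before anything tries to unpack it. The sketch below uses Python's standard zipfile module; the function name is my own, and I am not claiming the project's application does or should do it this way.

```python
import zipfile


def archive_is_usable(path):
    """Return True if path is a readable ZIP whose members pass their CRC checks."""
    # is_zipfile looks for the end-of-central-directory record, the same
    # structure that unzip reports as missing in the log above.
    if not zipfile.is_zipfile(path):
        return False
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip returns the name of the first corrupt member, or None.
            return zf.testzip() is None
    except zipfile.BadZipFile:
        return False
```

A truncated or incomplete download usually fails the first check, because the end-of-central-directory record sits at the very end of the file.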


https://ralph.bakerlab.org/results.php?hostid=49652&offset=0&show_names=0&state=0&appid=14


And all tasks on a much more robust Ryzen 9 3950X with an RTX 3060ti have failed in exactly the same way.

https://ralph.bakerlab.org/results.php?hostid=41408&offset=0&show_names=0&state=0&appid=14
11) Message boards : RALPH@home bug list : RoseTTAFold All-Atom (Message 7513)
Posted 30 May 2024 by Fardringle
Post:
All tasks downloaded on a laptop with an i7-8650U CPU failed due to missing/corrupt data in the downloaded ZIP file. And only one task actually ran for a few seconds. The others all failed at zero seconds of run time.

<core_client_version>7.24.1</core_client_version>
<![CDATA[
<message>
The access code is invalid.
 (0xc) - exit code 12 (0xc)</message>
<stderr_txt>
[venv_a_pred_alpha_179.zip]
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in venv_a_pred_alpha_179.zip,
        and cannot find venv_a_pred_alpha_179.zip.zip, period.

</stderr_txt>
]]>


https://ralph.bakerlab.org/results.php?hostid=49652&offset=0&show_names=0&state=0&appid=14
12) Message boards : Number crunching : Tasks are not actually using the CPU (Message 7493)
Posted 24 Feb 2024 by Fardringle
Post:
Progress/estimates seem to move normally for a while according to the BOINC Manager, but then slow down to nothing once the task gets to around 99%. This appears to be at least partially due to the fact that the tasks don't ever actually use the CPU, no matter how long they are allowed to run. For example, the current batch of tasks running on one of my computers has been in progress for about 36 hours, with only 34 SECONDS of actual processing time.

I probably should have just aborted the tasks a long time ago, but I was curious to see if they would ever actually complete...
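
To put the numbers above in perspective, here is the arithmetic as a short Python sketch. The function name is just for illustration.

```python
def cpu_efficiency(cpu_seconds, elapsed_seconds):
    """Fraction of wall-clock time during which the task was actually using the CPU."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return cpu_seconds / elapsed_seconds


# 34 seconds of CPU time over about 36 hours of elapsed time.
ratio = cpu_efficiency(34, 36 * 3600)
print(f"{ratio:.4%}")  # about 0.026%
```

So the tasks were on the CPU for roughly 0.026% of the time they were listed as running.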
13) Message boards : Number crunching : Request: Increase of deadline for Virtualbox-using rosetta python app (Message 7476)
Posted 8 Jan 2024 by Fardringle
Post:
Adding to the complexity of completing these tasks on time, most of the tasks have finished in about 1-2 hours on my Ryzen 9 3900X, but I have several that have been running for 12+ hours already with no solid estimate of when they might be completed. So it could be very easy to get stuck on some of these long tasks and have the rest of the cached tasks expire before their deadline.
14) Message boards : Current tests : Test spotted (Message 7435)
Posted 12 Nov 2023 by Fardringle
Post:
Yes, let them try to run to completion, especially on an older computer that might take a while. I had a few take 4-6 hours on my pretty fast Ryzen 9 3900X, so a slow computer could easily take several times that long to finish.
15) Message boards : Current tests : Test spotted (Message 7433)
Posted 10 Nov 2023 by Fardringle
Post:
I received a total of 12 tasks from this test batch. Five completed successfully. Seven have failed due to the virtual machine becoming "stale" when the application thinks that it was stuck. In each of the failed tasks, it looks like the VM task was put into a paused state instead of shutting down, and then because it was in an unexpected state the application/task marked itself as invalid.

And it does not seem to be related to the actual run time. Some of the successful tasks completed in about 15-16 minutes. Others took several hours to finish. The failed tasks have a pretty wide variety of run times as well.

https://ralph.bakerlab.org/results.php?userid=717&offset=0&show_names=0&state=6&appid=
16) Message boards : RALPH@home bug list : Rosetta 4.23 (Message 7319)
Posted 24 Mar 2023 by Fardringle
Post:
All of the tasks that I received for this app say that they completed successfully, but had a 100% validation failure.

https://ralph.bakerlab.org/results.php?userid=717
17) Message boards : RALPH@home bug list : All fail (Message 7053)
Posted 6 Sep 2021 by Fardringle
Post:
Surprisingly, I did have two tasks in the current batch complete successfully, but the vast majority failed the same way as others have reported here.






©2024 University of Washington
http://www.bakerlab.org