Message boards : News : Rosetta version 4.20 released for testing
Author | Message |
---|---|
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
This version includes a fallback to the original method of extracting into the slot directory for each job if extracting into the project directory fails. Please provide feedback in the discussion thread. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 908 Credit: 1,892,541 RAC: 294 |
"This version includes a fallback to the original method of extracting into the slot directory for each job if extracting into the project directory fails." Why this fallback? |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
I noticed a few hosts had issues extracting into the project directory, so instead of just failing, this option allows them to continue with the previous method of extracting into the run directory. I'm not sure what exactly caused it: one was a permissions issue and another was missing files, but both are rare. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 908 Credit: 1,892,541 RAC: 294 |
Please, if you are not interested in the 4.17 and 4.18 WUs, abort them on the server and release more 4.20. |
sgaboinc Joined: 8 Jul 14 Posts: 20 Credit: 4,159 RAC: 0 |
Testing on Linux: https://ralph.bakerlab.org/result.php?resultid=5072130 https://ralph.bakerlab.org/result.php?resultid=5072139 boinc-client pulled 10 WUs in one go; it used to be fewer, normally 1 task per core (8 WUs). It seems there is a risk that those who set a large task cache may download a large number of tasks. I'm changing my task cache to 0 from now on. |
sgaboinc Joined: 8 Jul 14 Posts: 20 Credit: 4,159 RAC: 0 |
But literally how does the fallback happen? Would that mean a new selection option in preferences? It may help to compile an FAQ for this new feature. My thought is that, in addition, users can examine log files or online logs for errors from the failed jobs and perhaps fix permissions problems in the project folder. |
sgaboinc Joined: 8 Jul 14 Posts: 20 Credit: 4,159 RAC: 0 |
I've got one 4.20 thread started on a Pi 4 (ARM AArch64): https://ralph.bakerlab.org/result.php?resultid=5078692 The download went through OK and it is running. Due to low disk space, and because I'm running 3 concurrent threads (2 of them Rosetta threads and one 4.20 from Ralph), I'll await the next fetch for more threads on Ralph. |
sgaboinc Joined: 8 Jul 14 Posts: 20 Credit: 4,159 RAC: 0 |
Got 2 additional 4.20 WUs running on the Pi 4: https://ralph.bakerlab.org/result.php?resultid=5082099 https://ralph.bakerlab.org/result.php?resultid=5082059 database_357d5d93529_n_methyl.zip is downloaded only once, when the first task is received. It is more space efficient as well: previously 3-4 tasks pretty much consumed 4 GB of space; now only 1.55 GB is used after 3 WUs are downloaded and running. |
Raistmer Joined: 24 Apr 20 Posts: 8 Credit: 2,010 RAC: 0 |
"But literally how does the fallback happen? Would that mean a new selection option in preferences?" At least it would be good if the app sent a notice when it falls back, so the user can fix the issue. A failsafe feature is good, but when it leads to such inefficiency, it is better if the user has a chance to help. |
Raistmer Joined: 24 Apr 20 Posts: 8 Credit: 2,010 RAC: 0 |
"I noticed a few hosts had issues extracting into the project directory so instead of just failing, this option allows them to continue on with the previous method of extracting into the run directory. I'm not sure what exactly caused it, one was a permissions issue and another was missing files, but rare." "Missing files": maybe the user deliberately removed the new directory? Regarding permissions: how could they be set so that BOINC can write files to the project dir (all task and result data is actually stored in project dirs, with just soft links into the slot) while the app cannot do the same? The project dir can't be read-only or BOINC would not work at all. Anyway, 8 tasks completed OK for 4.20 on my test host. I can't say whether the fallback was active or not (are there any traces of it, at least in the result stderr?). Currently the host is asking for new work but doesn't get any. EDIT: I think one trace definitely should be "Peak disk usage 6.13 MB"; with the fallback this result field should be ~1 GB. |
WezH Joined: 24 Apr 20 Posts: 6 Credit: 181,771 RAC: 0 |
"Anyway 8 tasks completed OK for 4.20 on my test host." I checked 100+ completed tasks; none of them had over 10 MB peak disk usage. In Rosetta v4.15 it is about 960 MB. |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
Detaching from the project, cleaning out the project directory, and then reattaching the project would hopefully solve this rare issue. I don't think it is a general issue but this needs more investigation. |
WezH Joined: 24 Apr 20 Posts: 6 Credit: 181,771 RAC: 0 |
No surprise: about one third of the tasks in progress are on the Top 10 hosts. A bad example for beta testing: https://ralph.bakerlab.org/results.php?hostid=45441 |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
Things are looking pretty good so far with this latest version. I'm going to look into the extraction issue to see if I can reproduce the behavior and come up with a fix, but if I don't make much progress today, then I'll go ahead and release this version on R@h. Thanks for all your feedback, and suggesting this change. |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
I'll send out more tasks. It would be nice to have more android and linux-arm devices in the test pool. I'm still seeing some "finish file present too long" errors. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 908 Credit: 1,892,541 RAC: 294 |
4.20 seems good. Up to now, no errors |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
I updated most of the R@h apps to 4.20. The remaining platforms will get updated later today. |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
I was not able to reproduce the extraction issues, so if anyone notices this issue on their host, please report back here and maybe we can work together to come up with a fix. You will know that your host is having the issue if the "Peak disk usage" is close to 1 GB and if you see a file ending with ".is_bad" in the project directory. |
CIA Joined: 5 Apr 20 Posts: 13 Credit: 111,953 RAC: 0 |
Just want to give a shoutout to the mods/admins/programmers involved with both Rosetta and Ralph. I followed the issue that was brought up over on Rosetta about file sizes and watched the dev team respond quickly, research the proposed solution, and then act on it so quickly. Great job overall! Well done chaps! |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
Thanks! I'm very happy that these updates will prevent a ton of disk usage and cpu waste. |
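
For anyone who wants to script the check dekim describes in this thread (a "Peak disk usage" close to 1 GB together with a file ending in ".is_bad" in the project directory), here is a minimal Python sketch. It is not part of Rosetta or BOINC. The function names and the 900 MB threshold are assumptions made for illustration, and the peak disk usage value has to be copied by hand from the task's result page.

```python
from pathlib import Path

# Assumption: "close to 1 GB" is read here as 900 MB or more. The thread
# reports about 960 MB for the old behaviour and under 10 MB for 4.20.
FALLBACK_PEAK_MB = 900.0


def find_bad_markers(project_dir):
    """Return sorted names of files ending in '.is_bad' directly inside project_dir."""
    return sorted(
        p.name
        for p in Path(project_dir).iterdir()
        if p.is_file() and p.name.endswith(".is_bad")
    )


def looks_like_fallback(peak_disk_mb, threshold_mb=FALLBACK_PEAK_MB):
    """True when the reported peak disk usage suggests extraction into the slot directory."""
    return peak_disk_mb >= threshold_mb


def check_host(project_dir, peak_disk_mb):
    """Combine both signs mentioned in the thread into one small report."""
    markers = find_bad_markers(project_dir)
    high_peak = looks_like_fallback(peak_disk_mb)
    return {
        "bad_markers": markers,
        "high_peak_disk": high_peak,
        # Both signs together are what the thread describes as the issue.
        "likely_affected": bool(markers) and high_peak,
    }
```

Call `check_host(project_dir, peak_disk_mb)` with the path to your own project directory (its location depends on the BOINC installation) and the peak disk usage in MB from a recent result. If `likely_affected` comes back true, that would be worth reporting in this thread.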
©2024 University of Washington
http://www.bakerlab.org