Rosetta version 4.20 released for testing

Message boards : News : Rosetta version 4.20 released for testing

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 542,654
RAC: 0
Message 6775 - Posted: 1 May 2020, 3:32:50 UTC
Last modified: 1 May 2020, 3:33:15 UTC

This version includes a fallback to the original method of extracting into the slot directory for each job if extracting into the project directory fails.

Please provide feedback in the discussion thread.
ID: 6775
[VENETO] boboviz
Joined: 9 Apr 08
Posts: 685
Credit: 1,321,040
RAC: 223
Message 6777 - Posted: 1 May 2020, 7:50:46 UTC - in response to Message 6775.  

This version includes a fallback to the original method of extracting into the slot directory for each job if extracting into the project directory fails.

Why this fallback?
ID: 6777
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 542,654
RAC: 0
Message 6778 - Posted: 1 May 2020, 7:57:07 UTC - in response to Message 6777.  

I noticed a few hosts had issues extracting into the project directory, so instead of just failing, this option lets them continue with the previous method of extracting into the run (slot) directory. I'm not sure what exactly caused it; one case was a permissions issue and another was missing files, but both are rare.
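
A minimal sketch of the fallback behavior described here, written in Python for illustration only. It is not the actual Rosetta code: the function name `extract_with_fallback`, its parameters, and the printed message are all hypothetical, and the sketch assumes the database ships as a zip archive.

```python
import zipfile
from pathlib import Path


def extract_with_fallback(archive: Path, project_dir: Path, slot_dir: Path) -> Path:
    """Extract `archive` into the shared project directory if possible.

    If that fails with an OS-level error (for example a permissions
    problem), extract into the per-job slot directory instead.
    Returns the directory that ended up holding the extracted files.
    """
    try:
        project_dir.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(project_dir)
        return project_dir
    except OSError as err:
        # Hypothetical notice; the real app may log this differently or not at all.
        print(f"extraction into project directory failed ({err}); using slot directory")
        slot_dir.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(slot_dir)
        return slot_dir
```

The trade-off is disk space: a successful extraction into the project directory is shared by all jobs, while the fallback repeats the extraction for every job.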
ID: 6778
[VENETO] boboviz
Joined: 9 Apr 08
Posts: 685
Credit: 1,321,040
RAC: 223
Message 6779 - Posted: 1 May 2020, 8:02:32 UTC - in response to Message 6778.  

Please, if you are not interested in the 4.17 and 4.18 WUs, abort them on the server and release more 4.20.
ID: 6779
sgaboinc
Joined: 8 Jul 14
Posts: 20
Credit: 4,159
RAC: 0
Message 6782 - Posted: 1 May 2020, 9:28:20 UTC
Last modified: 1 May 2020, 9:39:39 UTC

testing on linux
https://ralph.bakerlab.org/result.php?resultid=5072130
https://ralph.bakerlab.org/result.php?resultid=5072139
boinc-client pulled 10 wu in one go; it used to be fewer, normally 1 task per core (8 wu).
it seems there is a risk that those who set a large task cache may download a large number of tasks.
i'm changing the task cache to 0 going forward.
ID: 6782
sgaboinc
Joined: 8 Jul 14
Posts: 20
Credit: 4,159
RAC: 0
Message 6783 - Posted: 1 May 2020, 11:06:43 UTC
Last modified: 1 May 2020, 11:08:26 UTC

but how exactly does the fallback happen? would that mean a new selection option in preferences?
it may help to compile an 'faq' for this new 'feature'.
in addition, i think users could examine the log files or online logs for errors from the failed jobs and perhaps fix permission problems in the project folder.
ID: 6783
sgaboinc
Joined: 8 Jul 14
Posts: 20
Credit: 4,159
RAC: 0
Message 6784 - Posted: 1 May 2020, 11:13:00 UTC

i've got one 4.20 thread started on Pi4 Arm Aarch64
https://ralph.bakerlab.org/result.php?resultid=5078692
download went through ok and it is running.
due to low disk space, i'm running 3 concurrent threads: 2 are rosetta threads and one is 4.20 from ralph.
i'll wait for the next fetch to get more threads on ralph.
ID: 6784
sgaboinc
Joined: 8 Jul 14
Posts: 20
Credit: 4,159
RAC: 0
Message 6786 - Posted: 1 May 2020, 13:35:41 UTC
Last modified: 1 May 2020, 13:36:55 UTC

got 2 additional 4.20 wu running on Pi4
https://ralph.bakerlab.org/result.php?resultid=5082099
https://ralph.bakerlab.org/result.php?resultid=5082059

database_357d5d93529_n_methyl.zip is downloaded only once, when the first task is received.
it is more space efficient as well: previously 3-4 tasks consumed pretty much 4 GB of space.
now only 1.55 GB is used after 3 wu are downloaded and running.
ID: 6786
Raistmer
Joined: 24 Apr 20
Posts: 8
Credit: 2,010
RAC: 0
Message 6788 - Posted: 1 May 2020, 13:41:58 UTC - in response to Message 6783.  

but how exactly does the fallback happen? would that mean a new selection option in preferences?
it may help to compile an 'faq' for this new 'feature'.
in addition, i think users could examine the log files or online logs for errors from the failed jobs and perhaps fix permission problems in the project folder.


At the very least, it would be good if the app sent a notice when it falls back, so the user can fix the issue.
A failsafe feature is good, but when it leads to such inefficiency, it is better if the user has a chance to help.
ID: 6788
Raistmer
Joined: 24 Apr 20
Posts: 8
Credit: 2,010
RAC: 0
Message 6789 - Posted: 1 May 2020, 13:47:47 UTC - in response to Message 6778.  
Last modified: 1 May 2020, 13:53:26 UTC

I noticed a few hosts had issues extracting into the project directory, so instead of just failing, this option lets them continue with the previous method of extracting into the run (slot) directory. I'm not sure what exactly caused it; one case was a permissions issue and another was missing files, but both are rare.


"Missing files" - maybe the user deliberately removed the new directory?
Regarding permissions - how could they be set so that BOINC can write files to the project dir (all task and result data is actually stored in the project dirs, with just soft links in the slot) while the app cannot do the same? The project dir can't be read-only or BOINC would not work at all...

Anyway, 8 tasks completed OK for 4.20 on my test host.
Can't say whether the fallback was active or not (are there any traces of it, at least in the result stderr?)
Currently the host is asking for new work but isn't getting any.

EDIT: I think one trace definitely should be:
Peak disk usage 6.13 MB - with the fallback this result field should be ~1 GB...
ID: 6789
WezH
Joined: 24 Apr 20
Posts: 4
Credit: 67,498
RAC: 0
Message 6790 - Posted: 1 May 2020, 16:23:21 UTC - in response to Message 6789.  

Anyway, 8 tasks completed OK for 4.20 on my test host.
Can't say whether the fallback was active or not (are there any traces of it, at least in the result stderr?)
Currently the host is asking for new work but isn't getting any.

EDIT: I think one trace definitely should be:
Peak disk usage 6.13 MB - with the fallback this result field should be ~1 GB...


I checked 100+ completed tasks; none of them had a Peak disk usage over 10 MB.

In Rosetta v4.15 it is about 960MB.
ID: 6790
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 542,654
RAC: 0
Message 6791 - Posted: 1 May 2020, 16:56:28 UTC

Detaching from the project, cleaning out the project directory, and then reattaching the project would hopefully solve this rare issue. I don't think it is a general issue but this needs more investigation.
ID: 6791
WezH
Joined: 24 Apr 20
Posts: 4
Credit: 67,498
RAC: 0
Message 6792 - Posted: 1 May 2020, 16:58:06 UTC - in response to Message 6789.  


Currently the host is asking for new work but isn't getting any.


No surprise: about one third of the tasks in progress are on the top 10 hosts.

Bad example for beta testing: https://ralph.bakerlab.org/results.php?hostid=45441
ID: 6792
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 542,654
RAC: 0
Message 6793 - Posted: 1 May 2020, 17:03:28 UTC

Things are looking pretty good so far with this latest version. I'm going to look into the extraction issue to see if I can reproduce the behavior and come up with a fix, but if I don't make much progress today, I'll go ahead and release this version on R@h. Thanks for all your feedback and for suggesting this change.
ID: 6793
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 542,654
RAC: 0
Message 6794 - Posted: 1 May 2020, 17:04:04 UTC - in response to Message 6792.  
Last modified: 1 May 2020, 17:21:51 UTC

I'll send out more tasks. It would be nice to have more android and linux-arm devices in the test pool.

I'm still seeing some "finish file present too long" errors.
ID: 6794
[VENETO] boboviz
Joined: 9 Apr 08
Posts: 685
Credit: 1,321,040
RAC: 223
Message 6795 - Posted: 1 May 2020, 21:00:33 UTC

4.20 seems good.
Up to now, no errors
ID: 6795
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 542,654
RAC: 0
Message 6796 - Posted: 1 May 2020, 21:04:46 UTC

I updated most of the R@h apps to 4.20. The remaining platforms will get updated later today.
ID: 6796
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 542,654
RAC: 0
Message 6797 - Posted: 1 May 2020, 21:07:19 UTC

I was not able to reproduce the extraction issues, so if anyone notices this issue on their host, please report back here and maybe we can work together to come up with a fix. You will know that your host is having the issue if the "Peak disk usage" is close to 1 GB and you see a file ending with ".is_bad" in the project directory.
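
The two symptoms above can be checked with a small script. This is only a sketch in Python: the helper names and the 500 MB threshold are assumptions made for illustration (the post only says "close to 1 GB"), and the peak disk usage value has to be copied by hand from the result page.

```python
from pathlib import Path

# Assumption: treat anything above 500 MB as "close to 1 GB". Normal 4.20
# results reported in this thread stayed under 10 MB.
FALLBACK_PEAK_BYTES = 500 * 1000 * 1000


def find_bad_markers(project_dir: Path) -> list[Path]:
    """Return the files directly inside project_dir whose names end with '.is_bad'."""
    return sorted(
        p for p in project_dir.iterdir()
        if p.is_file() and p.name.endswith(".is_bad")
    )


def looks_like_fallback(peak_disk_bytes: int, project_dir: Path) -> bool:
    """True only when both symptoms are present: high peak disk usage and a marker file."""
    high_usage = peak_disk_bytes > FALLBACK_PEAK_BYTES
    return high_usage and bool(find_bad_markers(project_dir))
```

The location of the project directory depends on how BOINC was installed, so it is passed in rather than guessed.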
ID: 6797
CIA
Joined: 5 Apr 20
Posts: 13
Credit: 84,696
RAC: 1
Message 6798 - Posted: 1 May 2020, 21:26:07 UTC - in response to Message 6797.  

Just want to give a shoutout to the mods/admins/programmers involved with both Rosetta and Ralph. I followed the issue that was brought up over on Rosetta about file sizes and watched the dev team respond quickly, research the proposed solution, and then act on it. Great job overall!

Well done chaps!
ID: 6798
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 542,654
RAC: 0
Message 6799 - Posted: 1 May 2020, 21:40:03 UTC - in response to Message 6798.  

Thanks! I'm very happy that these updates will prevent a ton of disk usage and cpu waste.
ID: 6799

©2020 University of Washington
http://www.bakerlab.org