Rosetta mini beta and/or android 3.61-3.83

Message boards : RALPH@home bug list : Rosetta mini beta and/or android 3.61-3.83

Mad_Max

Joined: 15 Nov 12
Posts: 15
Credit: 404,700
RAC: 0
Message 5992 - Posted: 14 Jan 2016, 23:05:02 UTC - in response to Message 5991.  
Last modified: 14 Jan 2016, 23:35:21 UTC

This is not about the situation where a few Rosetta WUs are running at the same time. Running 4 WUs, or even 6-8 WUs, in parallel on an HDD is OK if the PC has enough RAM, because at the running stage R@H uses the disk only a little and does not cause any problems.

At the loading stage (after a BOINC restart / PC reboot) there is much more disk load from R@H/RALPH, but an HDD can still cope with at least 4 WUs starting up in parallel without problems (it takes a few minutes of high HDD load, but does not cause any errors).

It is about the initial starting/initialization of a few new WUs in parallel, because at this point the full Rosetta database (which is now HUGE: ~4000 files, ~350 MB) plus the WU data is extracted from the project folder to the data\slots\x folder of each WU. This puts really high stress on HDDs and slows them down badly.
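
As a rough back-of-the-envelope sketch of the point above (not a measurement): if each WU extracts its own copy of the database, the amount the disk has to write grows linearly with the number of WUs that initialize at once. The file count and size are the approximate figures from this post; the function name is made up for illustration.

```python
# Approximate figures quoted in the post for the current minirosetta database.
DB_FILES = 4000    # ~4000 files
DB_SIZE_MB = 350   # ~350 MB

def extraction_load(num_wus, files=DB_FILES, size_mb=DB_SIZE_MB):
    """Return (files_written, mb_written) when num_wus WUs each extract
    their own copy of the database at the same time (hypothetical helper)."""
    return num_wus * files, num_wus * size_mb

for n in (1, 4, 8):
    files, mb = extraction_load(n)
    print(f"{n} WUs starting together: ~{files} files, ~{mb} MB to write")
```

On an HDD the many small files matter more than the raw megabytes, since parallel extractions keep the disk seeking between slot folders.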

Usually BOINC initializes just one WU at a time, because it begins a new WU only after one of the previous ones has finished, and all WUs have different run times, so the start times naturally drift apart. So no problems there either.
But under some conditions, such as:
- a mere coincidence of the initialization times of a few WUs
- after a long outage of the project (BOINC is "hungry" for jobs from this project and tries to start as many WUs from it as it can immediately after the downloads finish)
- after a project reset by the user
- at the initial addition of the project to BOINC
multiple WUs try to initialize at the same time and the problem arises.

So it is not a frequent situation, and it is seen only on HDDs, while SSDs do not have any problems at all.
So a hard limit on the CPU cores allowed to run R@H would be like being beheaded instead of shaved :) Though of course it would also solve the problem with shaving, lol
ID: 5992

Mad_Max
Joined: 15 Nov 12
Posts: 15
Credit: 404,700
RAC: 0
Message 5993 - Posted: 14 Jan 2016, 23:07:46 UTC - in response to Message 5991.  
Last modified: 14 Jan 2016, 23:35:42 UTC

P.S.
The problem of huge slowdowns at initialization on HDDs has been known for a long time for R@H/RALPH. But it keeps growing over time, because the main minirosetta database also grows with almost every new release:
- when I started crunching for R@H it was "only" ~1500 files and 100 MB
- 2 years ago it was ~2500 files and 150 MB
- now it is ~4000 files and ~350 MB for the latest minirosetta v3.70

So perhaps there are no new bugs in BOINC or R@H, and with the latest update we have simply reached the limits of HDDs and started triggering some sort of BOINC timeout/watchdog.
ID: 5993

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 5994 - Posted: 15 Jan 2016, 0:36:56 UTC

We can add a random sleep before extracting the database but that time would be wasted of course. Maybe the sleep time could be a function of the number of cpus allowed to run jobs.
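
A minimal sketch of that idea, in Python rather than the actual Rosetta C++ code: each task sleeps for a random time before extracting, and the upper bound of that time grows with the number of CPUs allowed to run jobs. The function names, the 2.5 seconds per CPU, and the 30 second cap are assumptions for illustration, not values from the project.

```python
import random
import time

def startup_delay(num_cpus, seconds_per_cpu=2.5, max_delay=30.0, rng=random):
    """Pick a random delay in [0, bound], where bound grows with num_cpus.

    seconds_per_cpu and max_delay are illustrative assumptions.
    With a single CPU there is nothing to collide with, so no delay.
    """
    if num_cpus <= 1:
        return 0.0
    bound = min(max_delay, seconds_per_cpu * (num_cpus - 1))
    return rng.uniform(0.0, bound)

def wait_before_extract(num_cpus, sleep=time.sleep):
    """Sleep for a random startup delay and return the delay used."""
    delay = startup_delay(num_cpus)
    sleep(delay)
    return delay
```

Calling `wait_before_extract(4)` just before unpacking the database would spread four simultaneous starts over up to 7.5 seconds with these example numbers. The cap keeps the wasted time bounded on machines with many cores.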
ID: 5994

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 5995 - Posted: 15 Jan 2016, 9:32:47 UTC - in response to Message 5993.  

So perhaps there are no new bugs in BOINC or R@H, and with the latest update we have simply reached the limits of HDDs and started triggering some sort of BOINC timeout/watchdog.


The future is SSD... :-)
ID: 5995

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 5996 - Posted: 15 Jan 2016, 9:34:55 UTC

This version seems to be quite stable.
Are you thinking of moving it to "production" on Rosetta?
ID: 5996

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 5997 - Posted: 15 Jan 2016, 18:26:19 UTC

I discovered a bug in the graphics app caused by the recent changes which can cause it to hang. I need to fix this.
ID: 5997

sgaboinc
Joined: 8 Jul 14
Posts: 20
Credit: 4,159
RAC: 0
Message 5998 - Posted: 15 Jan 2016, 23:15:38 UTC - in response to Message 5992.  
Last modified: 15 Jan 2016, 23:20:33 UTC


It is about the initial starting/initialization of a few new WUs in parallel, because at this point the full Rosetta database (which is now HUGE: ~4000 files, ~350 MB) plus the WU data is extracted from the project folder to the data\slots\x folder of each WU. This puts really high stress on HDDs and slows them down badly.



You may like to try running R@h/Ralph on Linux. I saw that Linux tends to use a huge disk cache (1 GB or so is quite common), and I'd think that could account for part of the efficiency of running R@h/Ralph on Linux vs MS Windows. I'd think that could even beat an SSD in speed. You'd just need sufficient RAM to buffer the number of threads running concurrently :)
ID: 5998

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 5999 - Posted: 16 Jan 2016, 2:19:12 UTC

Plus the majority of developers/labs use Linux, and very few support or use Windows. I added a short random sleep of up to 10 seconds. Just posted 3.71.
ID: 5999

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6000 - Posted: 16 Jan 2016, 2:32:41 UTC

3.71 is not much different from 3.70. It includes a fix for a recently introduced bug; the fix prevents the graphics app from hanging in an infinite loop if the worker app dies quickly, before the graphics app is done initializing.
ID: 6000

Etienne Guyot
Joined: 28 Apr 06
Posts: 3
Credit: 64,332
RAC: 0
Message 6001 - Posted: 17 Jan 2016, 13:15:24 UTC - in response to Message 6000.  

Hello,
v3.71 freezes BOINC Manager when I try to display the graphics (this was not the case with the previous 3.68 beta with blue graphics).
I have to kill BOINC Manager and restart it (fortunately, this didn't kill the other project tasks running in parallel).
Computer: i7, 4GHz, 16GB, Win7pro 64-bit, SSD (prog), HDD (data), integrated graphics (MB: Asus Z97-Pro)

ID: 6001

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6002 - Posted: 18 Jan 2016, 5:38:10 UTC - in response to Message 6001.  

Did you wait at least one to a few minutes? It may just be waiting for things to get initialized.
ID: 6002

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 6003 - Posted: 18 Jan 2016, 8:29:55 UTC

Just out of curiosity: what failure rate is "acceptable" for a version to move on to Rosetta? 5%? 10%?
ID: 6003

Etienne Guyot
Joined: 28 Apr 06
Posts: 3
Credit: 64,332
RAC: 0
Message 6004 - Posted: 18 Jan 2016, 18:54:33 UTC - in response to Message 6002.  

Of course, yes... At least long enough to try several ways of killing the graphics app, which stayed in memory... but without success.
Reading previous similar posts, I believe a kernel thread was not responding.
(I could not kill the process, even with the /force option.)
ID: 6004

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6005 - Posted: 19 Jan 2016, 4:17:52 UTC

Does this happen every time you try the graphics app? I'm not sure what is causing this.

I wonder if anyone else is having this issue.
ID: 6005

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 6006 - Posted: 19 Jan 2016, 7:57:31 UTC - in response to Message 6004.  

Reading previous similar posts, I believe a kernel thread was not responding. (I could not kill the process, even with the /force option.)


Try Process Explorer to see the problem
ID: 6006

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6007 - Posted: 19 Jan 2016, 18:52:48 UTC

I'd like to push this version out to R@h soon but this issue is worrying me. Is this reproducible and are others seeing this same issue?
ID: 6007

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6008 - Posted: 19 Jan 2016, 19:17:51 UTC

Aside from this, the current app looks pretty stable. Most of the errors are due to manual aborts.


RALPH@home: Pass percentage by platform

Application | OS | Total Results | Pass Rate | Fail Rate | Failed Downloading | Failed Downloaded | Failed Computing | Failed Uploading | Failed Uploaded | Aborted
371 | Darwin x86 | 461 | 100.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000%
371 | Linux | 2956 | 94.3843% | 5.4804% | 0.0000% | 0.0000% | 0.2368% | 0.0000% | 0.0000% | 5.2436%
371 | Windows | 6832 | 93.7207% | 6.2500% | 0.0439% | 0.0000% | 0.8636% | 0.0000% | 0.0000% | 5.3425%
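
To make the "mostly manual aborts" point concrete, here is a small sketch that subtracts the Aborted column from the Fail Rate column for each row of the table above. The numbers are copied from the table; the names are just for illustration, and treating the Aborted column as manual aborts is an assumption.

```python
# (OS, fail rate %, aborted %) copied from the table for app version 371.
rows = [
    ("Darwin x86", 0.0000, 0.0000),
    ("Linux",      5.4804, 5.2436),
    ("Windows",    6.2500, 5.3425),
]

def non_abort_fail_rate(fail_pct, aborted_pct):
    """Failure rate with aborted results excluded, in percent."""
    return round(fail_pct - aborted_pct, 4)

for os_name, fail_pct, aborted_pct in rows:
    rate = non_abort_fail_rate(fail_pct, aborted_pct)
    print(f"{os_name}: {rate:.4f}% of results failed for reasons other than aborts")
```

With aborts excluded, the remaining failure rate is well under 1% on every platform.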

ID: 6008

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 6009 - Posted: 20 Jan 2016, 6:42:40 UTC

"Rosetta Mini for Android is not available for your type of computer"

??
ID: 6009

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 6010 - Posted: 20 Jan 2016, 7:58:03 UTC - in response to Message 6009.  

"Rosetta Mini for Android is not available for your type of computer"

??


Edit: SOLVED. I restarted the BOINC client and received new work.

P.S.
The graphics are all OK on the new WUs too.
ID: 6010

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 6011 - Posted: 20 Jan 2016, 9:35:36 UTC - in response to Message 6008.  


Application | OS | Total Results | Pass Rate | Fail Rate | Failed Downloading | Failed Downloaded | Failed Computing | Failed Uploading | Failed Uploaded | Aborted
371 | Darwin x86 | 461 | 100.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000%
371 | Linux | 2956 | 94.3843% | 5.4804% | 0.0000% | 0.0000% | 0.2368% | 0.0000% | 0.0000% | 5.2436%
371 | Windows | 6832 | 93.7207% | 6.2500% | 0.0439% | 0.0000% | 0.8636% | 0.0000% | 0.0000% | 5.3425%


Seems to be ok.
ID: 6011

©2024 University of Washington
http://www.bakerlab.org