Message boards : RALPH@home bug list : Rosetta mini beta and/or android 3.61-3.83
Author | Message |
---|---|
Mad_Max Joined: 15 Nov 12 Posts: 15 Credit: 404,700 RAC: 0 |
This is not about the situation where a few Rosetta WUs are running at the same time. Running 4 WUs, or even 6-8 WUs, in parallel on an HDD is OK if the PC has enough RAM, because at the running stage R@H uses the disk only a little and does not cause any problems. At the loading stage (after a BOINC restart / PC reboot) there is much more disk load from R@H/RALPH, but an HDD can still cope with at least 4 WUs starting up in parallel without problems (it takes a few minutes of high HDD load, but does not cause any errors).

It is about the initial start/initialization of a few new WUs in parallel, because at this point the full Rosetta database (which is now HUGE: ~4000 files, ~350 MB) plus the WU data is extracted from the project folder to the dataslotsx folders for each WU. This puts really high stress on HDDs and slows them down.

Usually BOINC initializes just one WU at a time, because it begins new WUs only after one of the previous ones has finished, and all WUs have different run times, so start times naturally shift. So no problems there either. But under some conditions, like:
- a mere coincidence of the initialization time of a few WUs
- after a long outage of the project (BOINC is "hungry" for jobs from this project and tries to start as many WUs from it as it can immediately after the download has finished)
- after a project reset by the user
- at the initial addition of the project to BOINC

multiple WUs try to initialize at the same time and the problem arises. So it is not a frequent situation. And it is seen only on HDDs, while SSDs do not have any problems at all.

So a hard limit on the CPU cores allowed to run R@H would be like being beheaded instead of shaved :) Though of course the problem with shaving is also solved this way lol |
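As a rough illustration of why simultaneous starts hurt an HDD, the sketch below multiplies the database size quoted in this post (~4000 files, ~350 MB) by the number of WUs initializing at once. The function name `extraction_load` and its parameters are hypothetical, and the figures ignore the per-WU data, so treat the result as a lower bound rather than a measurement.

```python
def extraction_load(parallel_tasks, files_per_db=4000, db_size_mb=350):
    """Estimate how much one burst of simultaneous WU starts has to extract.

    files_per_db and db_size_mb default to the approximate figures quoted
    in the thread; they are assumptions, not values read from BOINC.
    """
    total_files = parallel_tasks * files_per_db  # each WU gets its own copy
    total_mb = parallel_tasks * db_size_mb
    return total_files, total_mb
```

For example, `extraction_load(4)` gives 16000 files and 1400 MB to write in one burst, compared with 4000 files and 350 MB when WUs start one at a time.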
Mad_Max Joined: 15 Nov 12 Posts: 15 Credit: 404,700 RAC: 0 |
P.S. The problem with huge slowdowns at initialization on HDD has been known for a long time for R@H/RALPH. But over time it keeps growing, because the main minirosetta database grows too with almost every new release:
- when I started crunching for R@H it was "only" ~1500 files and 100 MB
- 2 years ago it was ~2500 files and 150 MB
- now it is ~4000 files and ~350 MB for the latest minirosetta v.3.70

So perhaps there are no new bugs in BOINC or R@H, and we have just finally reached HDD limits with the last update and started triggering some sort of BOINC timeouts/watchdog. |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
We can add a random sleep before extracting the database but that time would be wasted of course. Maybe the sleep time could be a function of the number of cpus allowed to run jobs. |
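The idea in this post, a random sleep whose length depends on the number of CPUs allowed to run jobs, could look roughly like the sketch below. This is not the code that went into minirosetta; the function name `startup_delay_seconds`, the 2.5 second per-CPU window and the 30 second cap are all assumptions chosen for illustration.

```python
import random


def startup_delay_seconds(active_cpus, per_cpu_window=2.5, max_window=30.0, rng=random):
    """Pick a random delay to wait before extracting the database.

    The window grows with the number of CPUs allowed to run jobs, so more
    potential simultaneous starts are spread over a longer interval.
    All numeric defaults are hypothetical.
    """
    if active_cpus <= 1:
        return 0.0  # a single task has nothing to collide with
    window = min(per_cpu_window * active_cpus, max_window)  # cap the wasted time
    return rng.uniform(0.0, window)
```

A worker would call `time.sleep(startup_delay_seconds(n))` once before unpacking the database. Capping the window keeps the wasted time bounded on machines with many cores, which is the trade-off raised in the post.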
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"So perhaps there are no new bugs in BOINC or R@H, and we have just finally reached HDD limits with the last update and started triggering some sort of BOINC timeouts/watchdog."

The future is SSD... :-) |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
This version seems to be quite stable. Do you plan to move it to "production" on Rosetta? |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
I discovered a bug in the graphics app caused by the recent changes which can cause it to hang. I need to fix this. |
sgaboinc Joined: 8 Jul 14 Posts: 20 Credit: 4,159 RAC: 0 |
You may like to try running R@h/Ralph on Linux. I saw that Linux tends to use a huge disk cache (some 1 GB is quite common), and I'd think that could account for part of the efficiency of running R@h/Ralph on Linux vs MS Windows. I'd think that'd even beat an SSD in speed. You'd just need sufficient RAM to buffer the number of parallel threads running concurrently :) |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
Plus the majority of developers/labs use Linux and very few support/use Windows. I added a short random sleep of up to 10 seconds. Just posted 3.71 now. |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
3.71 is not much different from 3.70. It includes a fix for a recently introduced bug that could cause the graphics app to hang in an infinite loop if the worker app died quickly, before the graphics app was done initializing. |
Etienne Guyot Joined: 28 Apr 06 Posts: 3 Credit: 64,332 RAC: 0 |
Hello, v3.71 freezes BOINC Manager when I try to display the graphics (this was not the case with the previous 3.68 beta with blue graphics). I have to kill BOINC Manager and restart it (fortunately, it didn't kill the other project tasks running in parallel). Computer: i7, 4GHz, 16GB, Win7pro 64-bit, SSD (prog), HDD (data), integrated graphics (MB: Asus Z97-Pro) |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
Did you wait at least a minute to a few? It may just be waiting for things to get initialized. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
Just out of curiosity: what failure rate is "acceptable" for moving to Rosetta? 5%? 10%? |
Etienne Guyot Joined: 28 Apr 06 Posts: 3 Credit: 64,332 RAC: 0 |
Of course, yes... At least the time to try several ways to kill the graphic app which stayed in memory... But without success. Reading previous similar posts, I believe a kernel thread was not responding. (I could not kill the process, even with the /force option.) |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
Does this happen every time you try the graphics app? I'm not sure what is causing this. I wonder if anyone else is having this issue. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"Reading previous similar posts, I believe a kernel thread was not responding. (I could not kill the process, even with the /force option.)"

Try Process Explorer to see the problem. |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
I'd like to push this version out to R@h soon but this issue is worrying me. Is this reproducible and are others seeing this same issue? |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
Aside from this, the current app looks pretty stable. Most of the errors are due to manual aborts.

RALPH@home: Pass percentage by platform

Application | OS | Total Results | Pass Rate | Fail Rate | Failed Downloading | Failed Downloaded | Failed Computing | Failed Uploading | Failed Uploaded | Aborted |
---|---|---|---|---|---|---|---|---|---|---|
371 | Darwin x86 | 461 | 100.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000% | 0.0000% |
371 | Linux | 2956 | 94.3843% | 5.4804% | 0.0000% | 0.0000% | 0.2368% | 0.0000% | 0.0000% | 5.2436% |
371 | Windows | 6832 | 93.7207% | 6.2500% | 0.0439% | 0.0000% | 0.8636% | 0.0000% | 0.0000% | 5.3425% |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"Rosetta Mini for Android is not available for your type of computer" ?? |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"Rosetta Mini for Android is not available for your type of computer"

Edit: SOLVED. I restarted the BOINC client and received new work.

P.S. Graphics are also OK on the new WUs. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
Seems to be ok. |
©2024 University of Washington
http://www.bakerlab.org