Rosetta mini beta and/or android 3.61-3.83

Message boards : RALPH@home bug list : Rosetta mini beta and/or android 3.61-3.83

Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 221
Credit: 504,833
RAC: 12
Message 5967 - Posted: 6 Jan 2016, 21:22:01 UTC - in response to Message 5966.  

It looks like the quota was set for 20 on our server and I just updated it to 50. I'm not sure where the 6 is coming from. Thanks for helping with this.
ID: 5967
Vikram K. Mulligan

Joined: 6 Jan 16
Posts: 2
Credit: 0
RAC: 0
Message 5968 - Posted: 6 Jan 2016, 22:29:11 UTC - in response to Message 5956.  

>Ok, new graphic is pretty but seems to be no "coherent".
>1) All blue, included "accepted energy" box. Not an optimal solution, imho.
>2) Blue only simulations boxes (searching, accepted, low energy) and return to black for others boxes.

Could you provide a screenshot? Also, what's your hardware and OS?
ID: 5968
Profile [VENETO] boboviz

Joined: 9 Apr 08
Posts: 534
Credit: 782,800
RAC: 88
Message 5970 - Posted: 7 Jan 2016, 8:21:01 UTC - in response to Message 5967.  
Last modified: 7 Jan 2016, 8:42:32 UTC

Thanks for helping with this.


No problem, it's a beta project, so it's normal.

On my notebook I cleared the project's folder, and the scheduler downloaded these files:
minirosetta_3.68_windows_x86_64
minirosetta_beta_3.64_windows_x86_64
minirosetta_database_a73b1f4
minirosetta_database_b7c7d78
minirosetta_graphics_3.64_windows_x86_64
minirosetta_graphics_3.68_windows_x86_64

+input files


P.S.
The problem remains :-(
ID: 5970
Profile [VENETO] boboviz

Joined: 9 Apr 08
Posts: 534
Credit: 782,800
RAC: 88
Message 5971 - Posted: 7 Jan 2016, 8:31:04 UTC - in response to Message 5968.  
Last modified: 7 Jan 2016, 8:31:25 UTC

Could you provide a screenshot?

Screen

Also, what's your hardware and OS?

3 PCs with integrated Intel, integrated Nvidia, and AMD 260x
ID: 5971
Profile [VENETO] boboviz

Joined: 9 Apr 08
Posts: 534
Credit: 782,800
RAC: 88
Message 5973 - Posted: 7 Jan 2016, 8:47:36 UTC - in response to Message 5966.  

I have to wait 12 hours to try new wus


OK. On this PC, after the detach/re-attach, the scheduler downloads only these files (unlike on the notebook):
minirosetta_3.68_windows_x86_64
minirosetta_database_a73b1f4
minirosetta_graphics_3.68_windows_x86_64

+input files


Hope this helps.
ID: 5973
Profile [VENETO] boboviz

Joined: 9 Apr 08
Posts: 534
Credit: 782,800
RAC: 88
Message 5974 - Posted: 7 Jan 2016, 10:30:24 UTC

Same errors even after detaching.
The upload for these WUs is over 2 MB.


CPU time 258.3438
ERROR: false
ERROR:: Exit from: ..\..\..\src\apps\public\boinc\minirosetta.cc line: 96
DummyMover::apply() should never have been called! (JobDistributor/Parser should have replaced DummyMover.)

ERROR: false
ERROR:: Exit from: ..\..\..\src\apps\public\boinc\minirosetta.cc line: 96
======================================================
DONE :: 99 starting structures 1201 cpu seconds
This process generated 99 decoys from 99 attempts
======================================================
BOINC :: WS_max 0

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish
ID: 5974
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 221
Credit: 504,833
RAC: 12
Message 5976 - Posted: 7 Jan 2016, 19:02:31 UTC - in response to Message 5974.  

Can you provide the slot directory data of a job to me?

https://boinc.berkeley.edu/trac/wiki/BoincFiles

It should be in your BOINC data directory at boinc/slots/N, where N is a number. If you are running multiple projects, you'll have to look for a slot that is running a Ralph@h job.
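
One way to find the right slot is to scan each slot directory for a mention of the project. The sketch below is only an illustration: it assumes each slot holds an init_data.xml file that names the project somewhere in its text, and the helper name find_slots_mentioning and the default search string "ralph" are made up for this example.

```python
from pathlib import Path


def find_slots_mentioning(data_dir, needle="ralph"):
    """Return slot directories whose init_data.xml mentions `needle`.

    Assumption: every active slot contains an init_data.xml file and the
    project shows up somewhere in its text. The match is a plain
    case-insensitive substring search, not real XML parsing.
    """
    matches = []
    slots_root = Path(data_dir) / "slots"
    if not slots_root.is_dir():
        return matches
    for slot in sorted(slots_root.iterdir(), key=lambda p: p.name):
        init_file = slot / "init_data.xml"
        if not init_file.is_file():
            continue  # empty or unused slot
        text = init_file.read_text(encoding="utf-8", errors="replace")
        if needle.lower() in text.lower():
            matches.append(slot)
    return matches
```

Calling find_slots_mentioning with the path of your BOINC data directory returns the candidate slot folders; the whole folder for a matching slot is what is being asked for here.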


ID: 5976
Profile [VENETO] boboviz

Joined: 9 Apr 08
Posts: 534
Credit: 782,800
RAC: 88
Message 5977 - Posted: 7 Jan 2016, 19:19:52 UTC - in response to Message 5976.  

Can you provide the slot directory data of a job to me?


Slot 1
Slot1screen
ID: 5977
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 221
Credit: 504,833
RAC: 12
Message 5978 - Posted: 7 Jan 2016, 19:55:25 UTC

I think we are close to finding the cause of this bug. I'll keep you all posted. Thanks [VENETO] boboviz.
ID: 5978
Profile [VENETO] boboviz

Joined: 9 Apr 08
Posts: 534
Credit: 782,800
RAC: 88
Message 5979 - Posted: 8 Jan 2016, 14:18:44 UTC - in response to Message 5978.  

I think we are close to finding the cause of this bug. I'll keep you all posted.


Meanwhile, I'll continue to crunch :-)
ID: 5979
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 221
Credit: 504,833
RAC: 12
Message 5980 - Posted: 8 Jan 2016, 19:47:05 UTC

The bug was found and the fixed version is currently building. I'll post the updated app tomorrow if all goes as planned. Thanks!
ID: 5980
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 221
Credit: 504,833
RAC: 12
Message 5981 - Posted: 9 Jan 2016, 0:57:41 UTC

It's taking some time to build so I may have to post the update on Monday.
ID: 5981
Profile [VENETO] boboviz

Joined: 9 Apr 08
Posts: 534
Credit: 782,800
RAC: 88
Message 5982 - Posted: 10 Jan 2016, 17:02:11 UTC - in response to Message 5981.  

It's taking some time to build so I may have to post the update on Monday.


Do we have to clean the project's folder?
ID: 5982
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 221
Credit: 504,833
RAC: 12
Message 5983 - Posted: 11 Jan 2016, 22:38:06 UTC - in response to Message 5982.  

No need to clean up anything. I still need to build the 32-bit Linux app, but the other platforms are done now, so I hope to post the update later today after some quick tests.
ID: 5983
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 221
Credit: 504,833
RAC: 12
Message 5984 - Posted: 12 Jan 2016, 19:48:43 UTC

I ran into a bug in the 32-bit Linux build, and someone in the lab also requested a minor update, so I have to update and rebuild the apps while tracking down the bug. I think I know what is causing it, so it shouldn't take too long.
ID: 5984
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 221
Credit: 504,833
RAC: 12
Message 5986 - Posted: 13 Jan 2016, 19:50:46 UTC

I just updated the minirosetta_beta app to version 3.70. This version has various bug fixes in addition to protocol updates. The major additions to this version are an improved score function and a new cyclic peptide modeling protocol. The graphics application was also updated to include new colors and a light source for spacefill rendering, which is used for the new cyclic peptide modeling protocol. Spacefill rendering is used by default only for this protocol, since the additional graphics load is minimal due to the small size of the proteins to be modeled.
ID: 5986
Mad_Max

Joined: 15 Nov 12
Posts: 11
Credit: 297,065
RAC: 0
Message 5988 - Posted: 14 Jan 2016, 14:29:11 UTC - in response to Message 5967.  

It looks like the quota was set for 20 on our server and I just updated it to 50. I'm not sure where the 6 is coming from. Thanks for helping with this.


AFAIK the quota in BOINC became dynamic long ago: if a computer reports tasks with errors, its quota is cut, and it goes back to the default (set in the server settings) once the computer begins reporting successful tasks.
So it can be anything from 1 per CPU core per day up to the default value. The 6 is probably 1 WU/day on a 6-core/thread CPU.
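
The arithmetic behind that guess can be written out as a tiny sketch. This is not BOINC's actual scheduler code: it only encodes the range described above (1 task per core per day at the low end, the server default at the high end), and the function name daily_quota_bounds is made up for this example.

```python
from typing import Tuple


def daily_quota_bounds(server_default: int, cpu_threads: int) -> Tuple[int, int]:
    """Return (lowest, highest) daily task quota for one host.

    Assumption taken from the post above, not from BOINC source code:
    the quota can drop to 1 task per core/thread per day after errors
    and can recover up to the server-side default.
    """
    lowest = 1 * cpu_threads
    highest = server_default
    return lowest, highest


# A 6-thread host with the server default set to 50:
low, high = daily_quota_bounds(server_default=50, cpu_threads=6)
print(low, high)
```

With these inputs the low end comes out to 6, which matches the number seen in the earlier report.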
ID: 5988
Mad_Max

Joined: 15 Nov 12
Posts: 11
Credit: 297,065
RAC: 0
Message 5989 - Posted: 14 Jan 2016, 15:04:45 UTC

With the 3.70 release I got a strange bug: it looks like BOINC goes into an infinite loop while extracting new RALPH WUs. The disk works very hard non-stop (this PC has a classic HDD, not an SSD), but the WUs cannot load.
After ~15 min of non-stop disk activity I opened Process Explorer to see what was happening. I noticed this:
The boinc.exe process was constantly reading from and writing to the disk.
The minirosetta_beta processes (there were 3 of them on a 4-core CPU; the 4th task was from WCG and worked fine) start, run for some time with low CPU utilization, and exit. Then they start again, run for some time (about 1 min), and exit. And so on.
BOINC Manager (GUI) was unresponsive during this time: it was running, but it did not update any status and did not respond to commands such as pausing or aborting WUs (it looks like it lost its connection to boinc.exe, or boinc.exe was not responding).

So I killed all BOINC and Rosetta processes via Process Explorer and restarted BOINC.
The same thing happened again: BOINC got stuck trying to start 3 new RALPH WUs in parallel and stressed the HDD hard.
This time I tried something else: instead of killing the minirosetta_beta processes, I suspended (paused at the OS level) 2 of the 3. The one still running began working normally after some time: it used a full CPU core and stopped hammering the HDD, and BOINC Manager began working normally too.
Later I resumed the 2nd minirosetta_beta process and it started OK; then the 3rd, which was OK too.

I did a full restart, and everything worked fine afterwards too. I cannot reproduce this bug anymore.

It looks to me like the latest BOINC (I use 7.6.9 x86) or RALPH has some sort of timeout for loading (starting) new WUs, and it is set to a relatively low value (about 1 min). If a few Rosetta WUs try to start at the same time, they slow down a classic HDD so much (because each WU extracts a few thousand small files from an archive) that they run out of this timeout and get restarted in a loop.
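
A toy model of that hypothesis, as a rough sketch only: every number here is invented, the timeout itself is a guess, and sharing the disk evenly is the most optimistic assumption (a real HDD seeking between thousands of small files would do worse). The function name startup_outcome is made up for this example.

```python
from typing import Tuple


def startup_outcome(solo_seconds: float, concurrent_starts: int,
                    timeout_seconds: float) -> Tuple[float, bool]:
    """Estimate per-task startup time and whether it beats the timeout.

    Hypothetical model: the disk is shared evenly, so extracting N
    archives at once makes each extraction take N times longer.
    """
    effective = solo_seconds * concurrent_starts
    return effective, effective <= timeout_seconds


# Invented numbers: 30 s to extract one WU alone, 60 s timeout.
for n in (1, 2, 3):
    seconds, ok = startup_outcome(30, n, 60)
    print(n, "task(s):", seconds, "s,", "starts" if ok else "times out and restarts")
```

Under these made-up numbers, three simultaneous starts never finish in time, while starting the tasks one after another (as happened when two processes were suspended) lets each one through.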
ID: 5989
Old man

Joined: 16 Jul 15
Posts: 1
Credit: 9,427
RAC: 0
Message 5990 - Posted: 14 Jan 2016, 19:11:25 UTC - in response to Message 5989.  

Hey. I started running RALPH today and I haven't seen any trouble with my HDD. I'm running Rosetta mini beta version 3.70.
ID: 5990
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 221
Credit: 504,833
RAC: 12
Message 5991 - Posted: 14 Jan 2016, 19:41:36 UTC

That sounds like a bad situation. I am not aware of a way to increase the timeout setting(s). If you continue to see this issue, I would limit the number of CPUs that can run Ralph and R@h jobs to 1 or 2.
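
One possible way to apply such a limit is an app_config.xml file placed in the project's folder under the BOINC data directory. The sketch below just generates a minimal file body. It assumes the installed client supports the project_max_concurrent option in app_config.xml; the helper name build_app_config is made up for this example.

```python
import xml.etree.ElementTree as ET


def build_app_config(max_concurrent: int) -> str:
    """Return the text of a minimal app_config.xml.

    Assumption: the client honors <project_max_concurrent>, which caps
    how many tasks of this one project run at the same time.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    root = ET.Element("app_config")
    limit = ET.SubElement(root, "project_max_concurrent")
    limit.text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


# Text to save as app_config.xml in the project's folder:
print(build_app_config(2))
```

Since the limit is per project, Ralph and R@h would each need their own copy of the file, and the client has to re-read its config files (or be restarted) before the limit takes effect.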
ID: 5991



©2018 University of Washington
http://www.bakerlab.org