Minirosetta Beta 3.06

Message boards : RALPH@home bug list : Minirosetta Beta 3.06

Profile [VENETO] boboviz
Joined: 9 Apr 08
Posts: 574
Credit: 993,059
RAC: 843
Message 5271 - Posted: 2 May 2011, 21:54:48 UTC

Why beta? Everything on Ralph is beta....
ID: 5271
Profile robertmiles
Joined: 13 Jan 09
Posts: 82
Credit: 292,716
RAC: 20
Message 5272 - Posted: 3 May 2011, 4:43:14 UTC - in response to Message 5271.  

Are you sure? I thought it was alpha instead.
ID: 5272
Profile [VENETO] boboviz
Joined: 9 Apr 08
Posts: 574
Credit: 993,059
RAC: 843
Message 5273 - Posted: 3 May 2011, 5:00:49 UTC - in response to Message 5272.  

Are you sure? I thought it was alpha instead.


:-)
ID: 5273
Profile [VENETO] boboviz
Joined: 9 Apr 08
Posts: 574
Credit: 993,059
RAC: 843
Message 5274 - Posted: 3 May 2011, 7:53:20 UTC - in response to Message 5272.  

Seriously, we know that Ralph is an "alpha/beta/not stable/etc." project.
So what does "beta" in the name of the 3.06 version stand for?
ID: 5274
Profile [VENETO] boboviz
Joined: 9 Apr 08
Posts: 574
Credit: 993,059
RAC: 843
Message 5275 - Posted: 3 May 2011, 17:20:07 UTC

Graphics crash on 3.06 WUs....
ID: 5275
Profile [SG-FC] dingdong
Joined: 17 Mar 09
Posts: 17
Credit: 4,532,381
RAC: 1,832
Message 5276 - Posted: 3 May 2011, 17:44:17 UTC

Hi,
These WUs crashed BOINC Manager (a reboot was necessary):

T515_ba_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_15177_88_1
T515_ba_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_15177_94_1

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
]]>

Invalid

BoincView reports "resource limit exceeded",
client state is "aborted by user".

My preferences under "Disk and memory usage" are:
Use at most 500 GB disk space
Leave at least 1 GB disk space free
Use at most 99% of total disk space
Use at most 90% of page file (swap space)
Use at most 95% of memory when computer is in use
Use at most 100% of memory when computer is not in use

The target CPU run time is 4 hours, but they ran for more than 6 hours before crashing.
ID: 5276
Profile Saenger
Joined: 28 Feb 06
Posts: 12
Credit: 57,474
RAC: 7
Message 5277 - Posted: 4 May 2011, 4:41:14 UTC
Last modified: 4 May 2011, 4:44:29 UTC

Woke up this morning to an idling computer, while it pretended to run 3 RALPH tasks in parallel.
I suspended RALPH for 15 seconds, and CPU usage resumed.
I reactivated RALPH, and the WUs started again one after the other.
One was reset in CPU time from 8h to 0, the second from 8:30 to 2:20, and the last is still at 4:40.
The CPU usage is all right again; let's see what will be waiting for me once I return from work ;)

Here's a picture from my system monitor:

Greetings from Saenger

For questions about Boinc look in the BOINC-Wiki
ID: 5277
Profile [VENETO] boboviz
Joined: 9 Apr 08
Posts: 574
Credit: 993,059
RAC: 843
Message 5279 - Posted: 4 May 2011, 11:22:21 UTC

2028060

ERROR: Cannot open PDB file "1xngA.pdb"
ERROR:: Exit from: ..\..\..\src\core\import_pose\import_pose.cc line: 199
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

Invalid
ID: 5279
Profile Conan
Joined: 16 Feb 06
Posts: 358
Credit: 1,357,923
RAC: 14
Message 5280 - Posted: 4 May 2011, 12:17:56 UTC - in response to Message 5279.  

2028060

ERROR: Cannot open PDB file "1xngA.pdb"
ERROR:: Exit from: ..\..\..\src\core\import_pose\import_pose.cc line: 199
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

Invalid


I have the same error on 2 Windows work units

this one
and this one

Also had an "Unhandled Exception Record":
"Access Violation (0xc0000005) at address 0x00470309 read attempt to address 0x00000014"
on WU 2028097

All errors were on my Windows machines. Linux are all OK.

Conan
ID: 5280
Profile robertmiles
Joined: 13 Jan 09
Posts: 82
Credit: 292,716
RAC: 20
Message 5281 - Posted: 4 May 2011, 16:37:43 UTC

Rosetta Mini Beta 3.06
T515_ba_rs_stg0_lrljcst_t000__casp9_SAVE_ALL_OUT_15177_82

Elapsed 08:53:16
Progress 8.771%
To completion 22:18:51
CPU time at last checkpoint 00:51:55
CPU time 00:52:37

Looks like a good example of a problem I've seen recently at Rosetta@Home - BOINC thinks it is running constantly, but it is actually using no CPU time at all now.
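A minimal sketch of how the gap between elapsed time and CPU time above could be checked, assuming the times are copied by hand from BOINC Manager as HH:MM:SS strings. The function names and the 0.5 threshold are hypothetical choices for illustration, not anything BOINC itself provides:

```python
def hms_to_seconds(text):
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(elapsed, cpu_time):
    """Return CPU time as a fraction of elapsed wall-clock time."""
    elapsed_s = hms_to_seconds(elapsed)
    if elapsed_s == 0:
        return 0.0
    return hms_to_seconds(cpu_time) / elapsed_s


def looks_stalled(elapsed, cpu_time, threshold=0.5):
    """Flag a task whose CPU time is far below its elapsed time.

    The threshold is an arbitrary assumption, not a BOINC setting.
    """
    return cpu_efficiency(elapsed, cpu_time) < threshold


if __name__ == "__main__":
    # Figures taken from the workunit reported in this post.
    ratio = cpu_efficiency("08:53:16", "00:52:37")
    print(f"CPU/elapsed ratio: {ratio:.3f}")
    print("Looks stalled:", looks_stalled("08:53:16", "00:52:37"))
```

A task that is computing normally on a dedicated core should show a ratio close to 1; a ratio far below that, as with this workunit, is consistent with the process no longer receiving CPU time.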

I packed most of the contents of that slot into a .zip file just before I aborted that workunit. Do I need to send it somewhere?
ID: 5281
Profile Sysadm@Nbg
Joined: 9 Dec 09
Posts: 7
Credit: 210,188
RAC: 0
Message 5282 - Posted: 4 May 2011, 18:02:09 UTC - in response to Message 5280.  

2028060

ERROR: Cannot open PDB file "1xngA.pdb"
ERROR:: Exit from: ..\..\..\src\core\import_pose\import_pose.cc line: 199
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

Invalid


I have the same error on 2 Windows work units

this one
and this one

...

All errors were on my Windows machines. Linux are all OK.

Conan


Got the same error on a 64-bit Linux machine >>click<<
ID: 5282
Profile [VENETO] boboviz
Joined: 9 Apr 08
Posts: 574
Credit: 993,059
RAC: 843
Message 5283 - Posted: 4 May 2011, 19:45:29 UTC

2028070

<core_client_version>6.10.60</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
]]>

On Windows 7, 32-bit
ID: 5283
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 221
Credit: 524,571
RAC: 19
Message 5284 - Posted: 4 May 2011, 21:16:30 UTC - in response to Message 5271.  

Why beta? Everything on Ralph is beta....


It has been quite a long time since we've updated the application, and we are worried about backwards compatibility. So we created a minirosetta_beta application, which is the updated app we are testing; the minirosetta application is the actual production application running on R@h.
ID: 5284
Profile [VENETO] boboviz
Joined: 9 Apr 08
Posts: 574
Credit: 993,059
RAC: 843
Message 5285 - Posted: 5 May 2011, 4:50:13 UTC - in response to Message 5284.  


It has been quite a long time since we've updated the application, and we are worried about backwards compatibility. So we created a minirosetta_beta application, which is the updated app we are testing; the minirosetta application is the actual production application running on R@h.


I thought the old code had been abandoned.....
ID: 5285
Profile feet1st
Joined: 7 Mar 06
Posts: 313
Credit: 113,747
RAC: 0
Message 5286 - Posted: 7 May 2011, 20:26:33 UTC - in response to Message 5275.  
Last modified: 7 May 2011, 20:29:19 UTC

Graphics just crashes on Windows. Window starts to open and then dies.

And shortly thereafter, BOINC seems to lose control of the process and it no longer gets CPU time, even though BOINC Manager says it is running.
ID: 5286
Profile robertmiles
Joined: 13 Jan 09
Posts: 82
Credit: 292,716
RAC: 20
Message 5292 - Posted: 10 May 2011, 3:26:50 UTC

T0617_casp9_symm_cm_SAVE_ALL_OUT_IGNORE_THE_REST_control_15317_68

Another workunit that stopped using any CPU time at all shortly after a checkpoint, WITHOUT boinc.exe recognizing this.

CPU time at last checkpoint 00:04:28
CPU time 00:04:30
Elapsed time 02:37:06

Still not clear whether the TThrottle extension I'm using to prevent the computer from overheating has anything to do with the problem.

Hope you at least got enough debugging output to pin down the problem more.
ID: 5292
Ironworker16
Joined: 17 Nov 09
Posts: 3
Credit: 41,840
RAC: 0
Message 5293 - Posted: 10 May 2011, 21:40:01 UTC

I have had 8 work units running for 18 hours. When I just checked the CPU time, most were between 14 and 18 minutes, one was at 53 minutes, and all are running at high priority.
ID: 5293
skgiven
Joined: 15 Dec 07
Posts: 8
Credit: 210,259
RAC: 23
Message 5295 - Posted: 10 May 2011, 23:23:41 UTC - in response to Message 5293.  
Last modified: 10 May 2011, 23:48:12 UTC

Running 6 WUs on an SB (3.7 GHz). Estimated run time is about 1h 15min for the task at 65%; two others estimate 2h and 4h, but I expect this may change.

Glad to see a new app version in progress (3.06).

2035968 1791676 10 May 2011 19:56:12 UTC 10 May 2011 23:00:39 UTC Over Success Done 3,588.55 31.75 22.08
2035937 1791645 10 May 2011 19:51:54 UTC 10 May 2011 22:16:57 UTC Over Success Done 3,579.52 31.67 25.30
2035891 1791599 10 May 2011 19:47:45 UTC 10 May 2011 22:25:13 UTC Over Success Done 4,109.05 36.35 29.56
2035844 1791552 10 May 2011 19:43:33 UTC 10 May 2011 22:00:24 UTC Over Success Done 3,610.23 31.94 21.31
2035829 1791537 10 May 2011 19:39:13 UTC 10 May 2011 20:48:45 UTC Over Success Done 3,559.34 31.49 24.60
ID: 5295
Ironworker16
Joined: 17 Nov 09
Posts: 3
Credit: 41,840
RAC: 0
Message 5296 - Posted: 10 May 2011, 23:53:16 UTC - in response to Message 5293.  

I closed and restarted BOINC; the times were reset to the CPU time, and all of them have started to make progress again. One is complete.
ID: 5296
Profile [VENETO] boboviz
Joined: 9 Apr 08
Posts: 574
Credit: 993,059
RAC: 843
Message 5297 - Posted: 11 May 2011, 19:08:27 UTC

2036148

Outcome Client error
Client state Compute error
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
Unpacking zip data: ../../projects/ralph.bakerlab.org/minirosetta_database_rev41800.zip
Unpacking WU data ...
Unpacking data: ../../projects/ralph.bakerlab.org/T0589_symm_cm_SAVE_ALL_OUT_IGNORE_THE_REST_control.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Continuing computation from checkpoint: chk_S_3LC0A_0001_FastRelax__chk1_fa ... success!
Continuing computation from checkpoint: chk_S_3LC0A_0001_FastRelax__chk2_fa ... success!
Continuing computation from checkpoint: chk_S_3LC0A_0001_FastRelax__chk3_fa ... success!
Continuing computation from checkpoint: chk_S_3LC0A_0001_FastRelax__chk4_fa ... success!
Continuing computation from checkpoint: chk_S_3LC0A_0001_FastRelax__chk5_fa ... success!

</stderr_txt>
]]>

Validate state Invalid
ID: 5297



©2018 University of Washington
http://www.bakerlab.org