Bug reports for 5.65

Message boards : RALPH@home bug list : Bug reports for 5.65

Rhiju
Volunteer moderator
Project developer
Project scientist
Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 3119 - Posted: 22 May 2007, 6:57:32 UTC

So far things have been pretty stable with 5.64; thanks to everyone for posting about crashes on Ralph, it's helped us fine-tune our workunits. This update just has a small addition to give us more control over the energy function assumed in RNA workunits.
ID: 3119
k6
Joined: 16 May 07
Posts: 3
Credit: 3,025
RAC: 0
Message 3120 - Posted: 22 May 2007, 8:32:20 UTC
Last modified: 22 May 2007, 8:36:50 UTC

So far I've computed 2 units using the 5.65 beta, but both ended with a compute error. Here they are:

521641
521561

My computer is now working on the next units. I'll edit this post and add links to any further failed WUs if they occur.

Sorry for my bad English.
ID: 3120
k6
Joined: 16 May 07
Posts: 3
Credit: 3,025
RAC: 0
Message 3121 - Posted: 22 May 2007, 9:46:34 UTC
Last modified: 22 May 2007, 10:17:21 UTC

Next bad WU:
521745

Good WUs:
521746
521770
ID: 3121
anders n
Joined: 16 Feb 06
Posts: 166
Credit: 131,419
RAC: 0
Message 3122 - Posted: 22 May 2007, 11:06:22 UTC

https://ralph.bakerlab.org/result.php?resultid=521786

- exit code -1073741819 (0xc0000005)

Anders n
ID: 3122
feet1st
Joined: 7 Mar 06
Posts: 313
Credit: 116,623
RAC: 0
Message 3123 - Posted: 22 May 2007, 14:36:07 UTC

Mine failed too after just 17:27.

Unrecoverable error for result 1urnA_BOINC_ABRELAX_BARCODE-1urnA-frags83__2061_1_1 ( - exit code -1073741819 (0xc0000005))
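
A note for anyone decoding the exit code quoted in these reports: -1073741819 is the signed 32-bit reading of 0xc0000005, which on Windows is the NTSTATUS value for an access violation. The sketch below shows the conversion in Python; the helper names and the one-entry lookup table are illustrative assumptions, not part of BOINC or the Ralph application.

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


# Deliberately tiny lookup table (assumption: only the code seen in this thread).
KNOWN_NTSTATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def describe_exit_code(code: int) -> str:
    """Format an exit code the way the BOINC messages show it, plus a name if known."""
    unsigned = to_unsigned32(code)
    name = KNOWN_NTSTATUS.get(unsigned, "unknown")
    return f"{code} = 0x{unsigned:08x} ({name})"


print(describe_exit_code(-1073741819))
# prints: -1073741819 = 0xc0000005 (STATUS_ACCESS_VIOLATION)
```

So the decimal and hexadecimal values in the error message are the same number, and both point to the application reading or writing memory it does not own.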

ID: 3123
mdettweiler
Joined: 4 Apr 07
Posts: 11
Credit: 1,010
RAC: 0
Message 3124 - Posted: 22 May 2007, 22:03:15 UTC

I got an error for this workunit. Here's what my BOINC client logged about the error:

5/22/2007 5:56:27 PM|ralph@home|Deferring communication for 1 min 0 sec
5/22/2007 5:56:27 PM|ralph@home|Reason: Unrecoverable error for result CNTRL_01RELAXNATIVE_SAVE_ALL_OUT_-1n0u_-_2064_9_0 ( - exit code -1073741819 (0xc0000005))
5/22/2007 5:56:28 PM|ralph@home|Computation for task CNTRL_01RELAXNATIVE_SAVE_ALL_OUT_-1n0u_-_2064_9_0 finished
5/22/2007 5:56:28 PM|ralph@home|Output file CNTRL_01RELAXNATIVE_SAVE_ALL_OUT_-1n0u_-_2064_9_0_0 for task CNTRL_01RELAXNATIVE_SAVE_ALL_OUT_-1n0u_-_2064_9_0 absent


The odd thing is, after it was done, my firewall told me that the Ralph application needed to access the internet. According to my firewall's logs, it sent back a couple of megabytes worth of information to the Ralph server after I clicked to allow internet access for the Ralph application. I've noticed that sometimes Ralph (and Rosetta, for that matter) workunits will oddly need to send back tons of data to the server if there is an error and the workunit has to stop. Is this because BOINC otherwise won't send back any data if the workunit errors out, and the Rosetta/Ralph admins want to see more error data than BOINC sends back?
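
For anyone collecting these failures from their own message log, the lines above follow a simple `timestamp|project|message` layout. Here is a minimal Python sketch that splits such a line and pulls out the result name and exit code. The function names and the regular expression are assumptions based only on the four lines quoted in this post, not on any documented BOINC log format.

```python
import re
from typing import NamedTuple, Optional, Tuple


class LogLine(NamedTuple):
    timestamp: str
    project: str
    message: str


def parse_log_line(line: str) -> LogLine:
    """Split on the first two pipes only, in case the message contains '|'."""
    timestamp, project, message = line.split("|", 2)
    return LogLine(timestamp, project, message)


# Pattern inferred from the "Unrecoverable error" line quoted above (assumption).
ERROR_RE = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\( - exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)"
)


def extract_error(message: str) -> Optional[Tuple[str, int]]:
    """Return (result name, signed exit code), or None if the line is not an error."""
    match = ERROR_RE.search(message)
    if match is None:
        return None
    return match.group("result"), int(match.group("code"))


line = (
    "5/22/2007 5:56:27 PM|ralph@home|Reason: Unrecoverable error for result "
    "CNTRL_01RELAXNATIVE_SAVE_ALL_OUT_-1n0u_-_2064_9_0 "
    "( - exit code -1073741819 (0xc0000005))"
)
parsed = parse_log_line(line)
print(parsed.project, extract_error(parsed.message))
```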
ID: 3124
Snagletooth
Joined: 4 May 07
Posts: 67
Credit: 134,427
RAC: 0
Message 3125 - Posted: 22 May 2007, 22:04:57 UTC

unrecoverable error

522925
ID: 3125
feet1st
Joined: 7 Mar 06
Posts: 313
Credit: 116,623
RAC: 0
Message 3126 - Posted: 22 May 2007, 22:43:20 UTC - in response to Message 3124.  



mdettweiler wrote: "The odd thing is, after it was done, my firewall told me that the Ralph application needed to access the internet... Is this because BOINC otherwise won't send back any data if the workunit errors out, and the Rosetta/Ralph admins want to see more error data than BOINC sends back?"


When a failure occurs, additional details about the failure are collected and reported directly to the project by the application rather than via BOINC Manager. I always end up with the firewall message, and by then it has been sitting there so long that I assume the upload times out and doesn't send the goods. So, when I remember, and see a new Ralph application, I always download it from here (if it hasn't come down already), then identify it to my firewall to allow internet access.
ID: 3126
feet1st
Joined: 7 Mar 06
Posts: 313
Credit: 116,623
RAC: 0
Message 3127 - Posted: 22 May 2007, 22:49:35 UTC

We have a pulse!


ID: 3127
EvoDude
Joined: 18 Feb 06
Posts: 28
Credit: 639,833
RAC: 0
Message 3128 - Posted: 23 May 2007, 0:59:41 UTC - in response to Message 3124.  

I've had 7 'Computation Errors' in the last couple of days too. They report a client error and grant 0 credit.

The affected result IDs are: 524262 524263 524220 524165 524163 524121 522749

Any chance someone could look into this problem and get back to us?

ID: 3128
Dr Who Fan
Joined: 2 Sep 06
Posts: 76
Credit: 107,857
RAC: 0
Message 3129 - Posted: 23 May 2007, 1:07:59 UTC
Last modified: 23 May 2007, 1:08:36 UTC

Error:
https://ralph.bakerlab.org/result.php?resultid=522994

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 7200
# random seed: 2664719


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x009B9479 read attempt to address 0x000C0010

Engaging BOINC Windows Runtime Debugger...



********************
ID: 3129
Dr Who Fan
Joined: 2 Sep 06
Posts: 76
Credit: 107,857
RAC: 0
Message 3130 - Posted: 23 May 2007, 1:09:57 UTC

Error:
https://ralph.bakerlab.org/result.php?resultid=523659

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 7200
# random seed: 2662174


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x009B93DB read attempt to address 0x1133FE5C

Engaging BOINC Windows Runtime Debugger...



********************
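
The two stderr reports above can be compared mechanically. The Python sketch below parses the "Access Violation" line from each and computes how far apart the two faulting instruction addresses are. The regular expression and names are assumptions based only on the two lines quoted in this thread.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the two stderr excerpts above (assumption).
AV_RE = re.compile(
    r"Access Violation \(0xc0000005\) at address (0x[0-9A-Fa-f]+) "
    r"(read|write) attempt to address (0x[0-9A-Fa-f]+)"
)


def parse_access_violation(line: str) -> Optional[Tuple[int, str, int]]:
    """Return (instruction address, access kind, target address) or None."""
    match = AV_RE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), match.group(2), int(match.group(3), 16)


reports = [
    "Reason: Access Violation (0xc0000005) at address 0x009B9479 read attempt to address 0x000C0010",
    "Reason: Access Violation (0xc0000005) at address 0x009B93DB read attempt to address 0x1133FE5C",
]
parsed = [parse_access_violation(r) for r in reports]
delta = parsed[0][0] - parsed[1][0]
print(f"instruction addresses differ by {delta} bytes (0x{delta:x})")
# prints: instruction addresses differ by 158 bytes (0x9e)
```

Both faults are reads, and the faulting instructions sit 158 bytes apart. That may mean both crashes come from the same routine, but without symbols this is only a guess.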
ID: 3130
Rhiju
Volunteer moderator
Project developer
Project scientist
Joined: 14 Feb 06
Posts: 161
Credit: 3,725
RAC: 0
Message 3131 - Posted: 23 May 2007, 1:24:31 UTC - in response to Message 3130.  

Hi everybody:

Looks like there are a lot of problems with this version, actually -- a very high error rate. I'll track it down! Thanks for posting.


ID: 3131
Admin
Joined: 20 Apr 07
Posts: 1
Credit: 218
RAC: 0
Message 3132 - Posted: 23 May 2007, 5:40:36 UTC

ID: 3132
Deborah Goldsmith
Joined: 16 Feb 06
Posts: 3
Credit: 253,789
RAC: 0
Message 3133 - Posted: 23 May 2007, 6:13:35 UTC

Lots of crashes on Mac OS X Intel -- here's a representative:

Exception: EXC_BAD_ACCESS (0x0001)
Codes: KERN_INVALID_ADDRESS (0x0001) at 0x0895fe3c

Thread 0:
0 libSystem.B.dylib 0x90038297 mach_wait_until + 7
1 libSystem.B.dylib 0x90037f19 sleep + 121
2 ...beta_5.65_i686-apple-darwin 0x00ea6402 0x1000 + 15356930
3 ...beta_5.65_i686-apple-darwin 0x00e97baa 0x1000 + 15297450
4 ...beta_5.65_i686-apple-darwin 0x00e97c3a 0x1000 + 15297594
5 ...beta_5.65_i686-apple-darwin 0x00e97210 0x1000 + 15294992
6 ...beta_5.65_i686-apple-darwin 0x007685fb 0x1000 + 7763451
7 ...beta_5.65_i686-apple-darwin 0x0000260e 0x1000 + 5646
8 ...beta_5.65_i686-apple-darwin 0x00002535 0x1000 + 5429

Thread 1 Crashed:
0 ...beta_5.65_i686-apple-darwin 0x00abf8cb 0x1000 + 11266251
1 ...beta_5.65_i686-apple-darwin 0x004e9ad2 0x1000 + 5147346
2 ...beta_5.65_i686-apple-darwin 0x008874fd 0x1000 + 8938749
3 ...beta_5.65_i686-apple-darwin 0x0088a88c 0x1000 + 8951948
4 ...beta_5.65_i686-apple-darwin 0x00555758 0x1000 + 5588824
5 ...beta_5.65_i686-apple-darwin 0x00556c59 0x1000 + 5594201
6 ...beta_5.65_i686-apple-darwin 0x00bd587a 0x1000 + 12404858
7 ...beta_5.65_i686-apple-darwin 0x00bd8444 0x1000 + 12416068
8 ...beta_5.65_i686-apple-darwin 0x00084547 0x1000 + 537927
9 ...beta_5.65_i686-apple-darwin 0x006064d7 0x1000 + 6313175
10 ...beta_5.65_i686-apple-darwin 0x00768548 0x1000 + 7763272
11 ...beta_5.65_i686-apple-darwin 0x00e97a25 0x1000 + 15297061
12 libSystem.B.dylib 0x90024987 _pthread_body + 84

Thread 2:
0 libSystem.B.dylib 0x90038297 mach_wait_until + 7
1 libSystem.B.dylib 0x90037f19 sleep + 121
2 ...beta_5.65_i686-apple-darwin 0x00e9a1cc 0x1000 + 15307212
3 ...beta_5.65_i686-apple-darwin 0x00e8e606 0x1000 + 15259142
4 libSystem.B.dylib 0x90024987 _pthread_body + 84

Thread 3:
0 libSystem.B.dylib 0x90038297 mach_wait_until + 7
1 libSystem.B.dylib 0x90037f19 sleep + 121
2 ...beta_5.65_i686-apple-darwin 0x00dfb342 0x1000 + 14656322
3 libSystem.B.dylib 0x90024987 _pthread_body + 84

Thread 1 crashed with X86 Thread State (32-bit):
eax: 0x0895fe38 ebx: 0x00abf80e ecx: 0x00000004 edx: 0xb3fff190
edi: 0x01490c00 esi: 0x0c2848bc ebp: 0xb3ffe098 esp: 0xb3ffe040
ss: 0x0000001f efl: 0x00010203 eip: 0x00abf8cb cs: 0x00000017
ds: 0x0000001f es: 0x0000001f fs: 0x0000001f gs: 0x00000037

I have the full report if you want it.
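
A note on reading the unsymbolicated frames in this report: each application frame lists an absolute address, then the image load address (0x1000 here), then the decimal offset of the frame within the image. The Python sketch below parses a frame line and checks that address minus load address equals the printed offset. The regular expression and function names are assumptions based on the lines quoted above, not an official crash-report parser.

```python
import re
from typing import Optional, Tuple

# Matches frames of the form "<n> <image> 0x<addr> 0x<base> + <offset>" (assumption).
FRAME_RE = re.compile(
    r"^\d+\s+\S+\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+) \+ (\d+)$"
)


def parse_frame(line: str) -> Optional[Tuple[int, int, int]]:
    """Return (address, load base, offset), or None for symbolicated frames."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16), int(match.group(3))


def offset_is_consistent(line: str) -> bool:
    """True when address - base equals the decimal offset printed on the line."""
    frame = parse_frame(line)
    if frame is None:
        return False
    address, base, offset = frame
    return address - base == offset


crashed_frame = "0 ...beta_5.65_i686-apple-darwin 0x00abf8cb 0x1000 + 11266251"
print(parse_frame(crashed_frame), offset_is_consistent(crashed_frame))
```

The offset (11266251 for the crashed frame) is what a developer with the matching symbol file would look up. Separately, the faulting address 0x0895fe3c is exactly 4 bytes past the value in eax (0x0895fe38). That would be consistent with a read through a bad pointer held in eax, though this cannot be confirmed from the report alone.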
ID: 3133
Odysseus
Joined: 4 May 07
Posts: 23
Credit: 16,331
RAC: 0
Message 3134 - Posted: 23 May 2007, 7:06:11 UTC

My dual-G5 Mac (OS 10.4.9) had an error with exit status 193 (0xc1) after about five minutes of crunching on CNTRL_01ABRELAX_SAVE_ALL_OUT_-1cc8A-_filters_2065_17_2, having successfully completed two other v5.65 tasks. My G4/733 (OS 10.3.9) also has returned two v5.65 results without errors.
ID: 3134
Conan
Joined: 16 Feb 06
Posts: 364
Credit: 1,368,421
RAC: 0
Message 3135 - Posted: 23 May 2007, 8:55:31 UTC

Work units starting with TST1 are not validating. After completing 6 hours of crunching, they generate huge amounts of bug reports, then are marked invalid and don't validate.

https://ralph.bakerlabs.org/result.php?resultid=523246
https://ralph.bakerlabs.org/result.php?resultid=523511
https://ralph.bakerlabs.org/result.php?resultid=523862
https://ralph.bakerlabs.org/result.php?resultid=523863

Also https://ralph.bakerlabs.org/result.php?resultid=523529
gave the following

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 21600
# random seed: 2663137
ERROR:: Unable to determine sequence length from pdb file
ERROR:: Exit from: pose.cc line: 1929

Hope this helps.
ID: 3135
anders n
Joined: 16 Feb 06
Posts: 166
Credit: 131,419
RAC: 0
Message 3136 - Posted: 23 May 2007, 9:45:36 UTC

Interesting WU. I have a 4-hour setting and it errored out after 1 h 23 min. The next cruncher has a 1-hour setting and it came out OK after 62 min.

https://ralph.bakerlab.org/workunit.php?wuid=462868

Anders n
ID: 3136
HTH
Joined: 6 Mar 06
Posts: 9
Credit: 10,226
RAC: 0
Message 3137 - Posted: 23 May 2007, 10:20:06 UTC

A compute error: 521610.
ID: 3137
Billy
Joined: 29 Jan 07
Posts: 14
Credit: 7,865
RAC: 0
Message 3138 - Posted: 23 May 2007, 13:10:44 UTC
Last modified: 23 May 2007, 13:12:00 UTC

Result 522762

Intel Mac on OSX
ID: 3138



©2024 University of Washington
http://www.bakerlab.org