Switching between projects with applications removed from memory

Message boards : Current tests : Switching between projects with applications removed from memory

[B^S] sTrey
Joined: 15 Feb 06
Posts: 58
Credit: 15,430
RAC: 0
Message 909 - Posted: 19 Mar 2006, 0:57:20 UTC
Last modified: 19 Mar 2006, 0:59:19 UTC

FWIW, Ralph had behaved fine both when swapped and not, but it didn't survive a PC restart forced by a Windows lockup. It was not the active project at the time, chkdsk found nothing scrambled, and none of the other projects lost their work (even CPDN seasonal!), but the Ralph WU, which was at hour 14 of 16, has restarted at zero. Bummer.

Fuzzy Hollynoodles
Joined: 19 Feb 06
Posts: 37
Credit: 2,089
RAC: 0
Message 914 - Posted: 19 Mar 2006, 7:12:48 UTC
Last modified: 19 Mar 2006, 7:14:25 UTC

Rosetta crashed BIG time!

3/19/2006 8:07:15 AM|rosetta@home|Pausing result HOMSti_homDB019_1tif__352_1732_1 (removed from memory)
3/19/2006 8:07:15 AM|SETI@home Beta Test|Restarting result 01jl01ab.16610.114.798576.3.175_4 using setiathome_enhanced version 506
3/19/2006 8:07:16 AM||Rescheduling CPU: project op

...

3/19/2006 8:07:24 AM|rosetta@home|Unrecoverable error for result HOMSti_homDB019_1tif__352_1732_1 ( - exit code -164 (0xffffff5c))
3/19/2006 8:07:24 AM||Rescheduling CPU: process exited
3/19/2006 8:07:24 AM|rosetta@home|Computation for result HOMSti_homDB019_1tif__352_1732_1 finished

This WU: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10786875
Result: https://boinc.bakerlab.org/rosetta/result.php?resultid=13549302

I see, though, that this WU has also crashed for somebody else, so maybe it's a coincidence? Even so, I don't think it is.

The Ralph WU runs fine. I've tried to force it to run by suspending the others and then resuming them, so that the Ralph WU gets preempted, and there have been no crashes (so far).


[color=navy][b]"I'm trying to maintain a shred of dignity in this world." - Me[/b][/color]
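
For anyone collecting these failures from their own message log, here is a minimal sketch that pulls the "Unrecoverable error" entries out of pipe-delimited lines like the ones pasted above. The `timestamp|project|message` layout is inferred only from the excerpts in this thread, and the names `ErrorEvent` and `parse_error_line` are made up for illustration; they are not part of BOINC.

```python
import re
from typing import NamedTuple, Optional


class ErrorEvent(NamedTuple):
    """One 'Unrecoverable error' entry (hypothetical helper type)."""
    timestamp: str
    project: str
    result: str
    exit_code: int


# Matches the message text seen in this thread, e.g.
# "Unrecoverable error for result NAME ( - exit code -164 (0xffffff5c))"
_ERROR_RE = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\( - exit code (?P<code>-?\d+) \(0x[0-9a-fA-F]+\)\)"
)


def parse_error_line(line: str) -> Optional[ErrorEvent]:
    """Return an ErrorEvent for an error line, or None for any other line."""
    # Assumed layout: timestamp|project|message (project may be empty).
    parts = line.strip().split("|", 2)
    if len(parts) != 3:
        return None
    timestamp, project, message = parts
    match = _ERROR_RE.search(message)
    if match is None:
        return None
    return ErrorEvent(timestamp, project, match["result"], int(match["code"]))
```

Lines that are not exit-code errors (rescheduling notices, "aborted via GUI RPC", and so on) simply come back as `None`, so the function can be mapped over a whole log file and the results filtered.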


Marky-UK
Joined: 16 Feb 06
Posts: 5
Credit: 1,530
RAC: 0
Message 917 - Posted: 19 Mar 2006, 12:02:40 UTC

Rosetta's just crashed for me too: https://ralph.bakerlab.org/workunit.php?wuid=17490

Unrecoverable error for result HB_BARCODE_30_1enh__352_83_0 ( - exit code -1073741811 (0xc000000d))
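
A side note on reading these numbers: the negative decimal and the hex value in each message are the same 32-bit quantity, shown signed and unsigned. The sketch below does that conversion and looks the result up in a small table of well-known Windows NTSTATUS values. That the large codes reported in this thread really are NTSTATUS values is an assumption on my part, and -164 is not in the table, so it comes back as unknown. The function names are hypothetical.

```python
# Well-known Windows NTSTATUS values. Treating the exit codes in this
# thread as NTSTATUS codes is an assumption, not something the logs state.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000000D: "STATUS_INVALID_PARAMETER",
    0xC0000142: "STATUS_DLL_INIT_FAILED",
}


def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return exit_code & 0xFFFFFFFF


def describe_exit_code(exit_code: int) -> str:
    """Format an exit code the way the log does, plus a name if known."""
    unsigned = to_unsigned32(exit_code)
    name = NTSTATUS_NAMES.get(unsigned, "unknown (not in this table)")
    return f"{exit_code} = 0x{unsigned:08x} ({name})"
```

For example, `describe_exit_code(-1073741811)` gives the `0xc000000d` shown in the message above, which the table labels as an invalid-parameter status.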

Fuzzy Hollynoodles
Joined: 19 Feb 06
Posts: 37
Credit: 2,089
RAC: 0
Message 918 - Posted: 19 Mar 2006, 16:31:09 UTC

It happened again.

3/19/2006 5:36:43 PM|rosetta@home|Pausing result HB_BARCODE_30_1bq9A_351_14302_0 (removed from memory)
3/19/2006 5:36:44 PM|rosetta@home|Unrecoverable error for result HB_BARCODE_30_1bq9A_351_14302_0 ( - exit code -164 (0xffffff5c))
3/19/2006 5:36:44 PM||Rescheduling CPU: process exited
3/19/2006 5:36:44 PM|rosetta@home|Computation for result HB_BARCODE_30_1bq9A_351_14302_0 finished

Rosetta WU: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11537937
Result: https://boinc.bakerlab.org/rosetta/result.php?resultid=14251499

So this is it, I'm changing back to keeping WUs in memory while preempted until you get this bug fixed. Otherwise you devs should say that we can't crunch Rosetta and Ralph WUs on the same computer!



[color=navy][b]"I'm trying to maintain a shred of dignity in this world." - Me[/b][/color]


Contact
Joined: 16 Feb 06
Posts: 19
Credit: 132,286
RAC: 0
Message 930 - Posted: 20 Mar 2006, 1:14:21 UTC

Looks good. No matter what I do, I can't get Ralph to fail under Win98 or XP while switching with the app removed from memory.

Mike Gelvin
Joined: 17 Feb 06
Posts: 50
Credit: 55,397
RAC: 0
Message 938 - Posted: 22 Mar 2006, 0:28:26 UTC - in response to Message 918.  

[quote]It happened again.

3/19/2006 5:36:43 PM|rosetta@home|Pausing result HB_BARCODE_30_1bq9A_351_14302_0 (removed from memory)
3/19/2006 5:36:44 PM|rosetta@home|Unrecoverable error for result HB_BARCODE_30_1bq9A_351_14302_0 ( - exit code -164 (0xffffff5c))
3/19/2006 5:36:44 PM||Rescheduling CPU: process exited
3/19/2006 5:36:44 PM|rosetta@home|Computation for result HB_BARCODE_30_1bq9A_351_14302_0 finished

Rosetta WU: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11537937
Result: https://boinc.bakerlab.org/rosetta/result.php?resultid=14251499

So this is it, I'm changing back to keeping WUs in memory while preempted until you get this bug fixed. Otherwise you devs should say that we can't crunch Rosetta and Ralph WUs on the same computer![/quote]




You can crunch Ralph and Rosetta on the same computer. But if you don’t leave the apps in memory, then Rosetta (from the Rosetta project, NOT Ralph) does indeed have a good chance of failing. They have not yet updated Rosetta with the Ralph “leave in memory” fix.
When Rosetta gets fixed, I suspect you will see an app version greater than or equal to 4.93
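
If you want to check which app version a result actually started with, a rough sketch like the one below reads it from a "Starting result" or "Restarting result" log line. It assumes the integer in the log is major*100 + minor (so 499 means 4.99 and the suspected fix would be 493); the 4.93 threshold is only the guess stated above, and the function names are hypothetical.

```python
import re
from typing import Optional, Tuple

# 4.93 written the way the log prints versions (assumed: major*100 + minor).
SUSPECTED_FIX_VERSION = 493

# Matches "... using rosetta_beta version 499" in a log message.
_VERSION_RE = re.compile(r"using (?P<app>\S+) version (?P<version>\d+)")


def app_version_from_log(line: str) -> Optional[Tuple[str, int]]:
    """Return (app_name, version) from a start/restart line, else None."""
    match = _VERSION_RE.search(line)
    if match is None:
        return None
    return match["app"], int(match["version"])


def has_suspected_fix(version: int, fixed: int = SUSPECTED_FIX_VERSION) -> bool:
    """True if the version is at or above the suspected fix version."""
    return version >= fixed
```

The check only makes sense for the Rosetta or Ralph app names; other projects number their applications independently, so filter on the returned app name first.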


Big Whiskey
Joined: 21 Mar 06
Posts: 3
Credit: 3,342
RAC: 0
Message 943 - Posted: 22 Mar 2006, 3:14:52 UTC

My crash was BIGGER!!!

When I opened BOINC Manager, it said it was running Rosetta, so I tried to open Show Graphics. I caught a brief look at the SETI screen before it closed itself. I tried a few more times, until it didn't open at all.

So I did a restart without suspending BOINC. Wrong move, apparently!
When BOINC started again, I had lost several WUs: one SETI, one Rosetta, and two SETI Beta. And what was left, BOINC would run for a minute and then switch to the next one.
I checked the message log file and found that, five minutes before I restarted, Rosetta and Ralph had started switching WUs every second for some reason.

All the units that I lost have the same error code:

2006-03-21 17:29:11 [ralph@home] Unrecoverable error for result HB_BARCODE_30_1fna__352_134_0 ( - exit code -1073741502 (0xc0000142))
2006-03-21 17:29:12 [rosetta@home] Unrecoverable error for result FA_RLXti_hom027_1tit__362_302_0 ( - exit code -1073741502 (0xc0000142))






Greg C. TNO
Joined: 26 Mar 06
Posts: 1
Credit: 51,485
RAC: 0
Message 1003 - Posted: 28 Mar 2006, 2:53:29 UTC

I just started running Ralph ON SEVERAL MACHINES; SO FAR NOT A SINGLE ONE has crashed, stalled, or pulled the 1% trick. I have it set to remove applications from memory and have discontinued Rosetta (no new work). The reliability of this version (4.92) seems great.

If you're curious, I'm running older equipment and Win2k Pro. Right now I have Ralph on a slew of PIII 600Es and an AMD Athlon XP 1500. I could be more specific if it is relevant.

Regards.

david baker
Joined: 25 Mar 06
Posts: 3
Credit: 411
RAC: 0
Message 1004 - Posted: 28 Mar 2006, 4:16:41 UTC - in response to Message 1003.  

[quote]I just started running Ralph ON SEVERAL MACHINES; SO FAR NOT A SINGLE ONE has crashed, stalled, or pulled the 1% trick. I have it set to remove applications from memory and have discontinued Rosetta (no new work). The reliability of this version (4.92) seems great.

If you're curious, I'm running older equipment and Win2k Pro. Right now I have Ralph on a slew of PIII 600Es and an AMD Athlon XP 1500. I could be more specific if it is relevant.

Regards.[/quote]


That is great! What are other people finding?


Rayflic
Joined: 16 Feb 06
Posts: 2
Credit: 2,886
RAC: 0
Message 1039 - Posted: 7 Apr 2006, 22:00:10 UTC

A couple of problems (today)

4/6/2006 10:46:16 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_1hz6_420_2926_0 ( - exit code -1073741811 (0xc000000d))


4/7/2006 12:55:47 PM|ralph@home|Unrecoverable error for result BARCODE_30_1ten__NATIVE_374_23_0 ( - exit code -1073741811 (0xc000000d))


[AF>France>Est>Lorraine]Le Zam
Joined: 2 Mar 06
Posts: 9
Credit: 3,278
RAC: 0
Message 1083 - Posted: 12 Apr 2006, 8:42:03 UTC

Hello, I got this error today, at 1%:
12/04/2006 10:52:10|ralph@home|Unrecoverable error for result HBLR_1.0_1b72_378_92_1 ( - exit code -1073741819 (0xc0000005))
Go ahead.
Thanks, and have fun with Ralph!

[AF>France>Est>Lorraine]Le Zam
Joined: 2 Mar 06
Posts: 9
Credit: 3,278
RAC: 0
Message 1100 - Posted: 12 Apr 2006, 19:11:54 UTC

Another couple of bad work units!

12/04/2006 16:41:38|ralph@home|Unrecoverable error for result HBLR_1.0_1ogw_377_17_2 ( - exit code -1073741819 (0xc0000005))

12/04/2006 17:25:40|ralph@home|Unrecoverable error for result HBLR_1.0_1r69_378_63_2 ( - exit code -1073741819 (0xc0000005))

Bye


[AF>France>Est>Lorraine]Le Zam
Joined: 2 Mar 06
Posts: 9
Credit: 3,278
RAC: 0
Message 1127 - Posted: 13 Apr 2006, 16:04:15 UTC

13/04/2006 13:31:58|ralph@home|Starting result FACONTACTS_NOFILTERS_1c9oA_381_9_0 using rosetta_beta version 499
13/04/2006 13:53:31|ralph@home|Unrecoverable error for result FACONTACTS_NOFILTERS_1c9oA_381_9_0 ( - exit code -1073741819 (0xc0000005))
13/04/2006 13:53:31||request_reschedule_cpus: process exited
13/04/2006 13:53:31|ralph@home|Computation for result FACONTACTS_NOFILTERS_1c9oA_381_9_0 finished

13/04/2006 18:17:40|ralph@home|Unrecoverable error for result FACONTACTS_NOFILTERS_1cc8A_381_9_0 (aborted via GUI RPC)
I stopped this WU myself: it had run for 2 h 48 min and reached only 1.31%.


©2024 University of Washington
http://www.bakerlab.org