Report "failure when switching projects without keeping applications in memory" bugs here

Message boards : RALPH@home bug list : Report "failure when switching projects without keeping applications in memory" bugs here

Psycodad
Joined: 16 Feb 06
Posts: 14
Credit: 2,157
RAC: 0
Message 630 - Posted: 25 Feb 2006, 14:51:34 UTC

Actually, I did not pay attention to this and was not aware that setting "leave in memory" to no could cost a lot of cycles.

How can I test or see whether this preference is costing me a lot of cycles?

Anyway, the first project switch with Ralph version 4.88 went well.
I will keep an eye on it ^^


Edit: Sorry for my bad English
ID: 630
Psycodad
Joined: 16 Feb 06
Posts: 14
Credit: 2,157
RAC: 0
Message 632 - Posted: 25 Feb 2006, 15:28:16 UTC

Thank you for this information!

ID: 632
Psycodad
Joined: 16 Feb 06
Posts: 14
Credit: 2,157
RAC: 0
Message 652 - Posted: 25 Feb 2006, 20:50:23 UTC
Last modified: 25 Feb 2006, 20:50:40 UTC

It happened again!
This WU crashed after switching to CPDN. This is the second time :(



25.02.2006 21:25:23|climateprediction.net|Restarting result sulphur_in3i_100869742_0 using sulphur_cycle version 422
25.02.2006 21:25:23|ralph@home|Pausing result BARCODE_30_1aiu__221_19_0 (removed from memory)
25.02.2006 21:25:28|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
25.02.2006 21:25:33|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
25.02.2006 21:25:36|ralph@home|Unrecoverable error for result BARCODE_30_1aiu__221_19_0 ( - exit code -1073741819 (0xc0000005))
25.02.2006 21:25:36||request_reschedule_cpus: process exited
25.02.2006 21:25:36|ralph@home|Computation for result BARCODE_30_1aiu__221_19_0 finished




Result

WU


Should I set the preference "Leave in memory" to yes next time and see what happens then?
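
A note on reading the exit code in the log above: the client prints it both as a signed decimal and as hex, and the two are the same 32-bit value. 0xC0000005 is the Windows status code for an access violation. Here is a minimal Python sketch of the conversion; the function names and the one-entry lookup table are only for illustration and are not part of BOINC or the Ralph application.

```python
# Hypothetical helper, not part of BOINC: reinterpret a signed 32-bit
# exit code (as printed in the message log) as an unsigned hex value.

# Illustrative table; only the code seen in this thread is listed.
KNOWN_WINDOWS_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}

def to_unsigned32(exit_code: int) -> int:
    """Map a signed 32-bit integer onto its unsigned two's-complement value."""
    return exit_code & 0xFFFFFFFF

def describe(exit_code: int) -> str:
    """Format an exit code the way the log shows it, plus a name if known."""
    value = to_unsigned32(exit_code)
    name = KNOWN_WINDOWS_STATUS.get(value, "unknown")
    return f"{exit_code} = 0x{value:08x} ({name})"

print(describe(-1073741819))
```

Positive exit codes such as 1 pass through unchanged, so the same helper can be used on any exit code pasted from the message log.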
ID: 652
Mike Gelvin
Joined: 17 Feb 06
Posts: 50
Credit: 55,397
RAC: 0
Message 699 - Posted: 27 Feb 2006, 8:48:08 UTC

Many Client errors on this computer due to swap outs.

https://ralph.bakerlab.org/show_host_detail.php?hostid=611

ID: 699
sslickerson
Joined: 15 Feb 06
Posts: 17
Credit: 4,006
RAC: 0
Message 709 - Posted: 27 Feb 2006, 23:34:34 UTC
Last modified: 27 Feb 2006, 23:43:18 UTC

Swap out failure:

3/27/2006 4:05:54 PM|ralph@home|Computation for result BARCODE_30_1cc8A_215_1_1 finished
3/27/2006 4:05:54 PM|ralph@home|Output file BARCODE_30_1cc8A_215_1_1_0 for result BARCODE_30_1cc8A_215_1_1 exceeds size limit.
3/27/2006 4:05:54 PM|ralph@home|File size: 148638304.000000 bytes. Limit: 25000000.000000 bytes
3/27/2006 4:05:54 PM||Allowing work fetch again.
3/27/2006 4:05:54 PM||Resuming round-robin CPU scheduling.
3/27/2006 4:05:55 PM|ralph@home|Unrecoverable error for result BARCODE_30_1cc8A_215_1_1 (<file_xfer_error> <file_name>BARCODE_30_1cc8A_215_1_1_0</file_name> <error_code>-131</error_code> <error_message></error_message></file_xfer_error>)
3/27/2006 4:05:56 PM|ralph@home|Started upload of BARCODE_30_1cc8A_215_1_1_1
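
To put the size-limit message in the log above in perspective, here is a small Python sketch that pulls the two numbers out of that line and compares them. The function name `size_and_limit` is made up for this example, and the regular expression assumes the wording is exactly as pasted here.

```python
import re

# The log line as pasted in this post.
line = ("3/27/2006 4:05:54 PM|ralph@home|File size: 148638304.000000 bytes. "
        "Limit: 25000000.000000 bytes")

def size_and_limit(log_line):
    """Return (file_size, limit) in bytes, or None if the line does not match."""
    m = re.search(r"File size: ([\d.]+) bytes\. Limit: ([\d.]+) bytes", log_line)
    if m is None:
        return None
    return float(m.group(1)), float(m.group(2))

size, limit = size_and_limit(line)
# Decimal megabytes (1 MB = 1,000,000 bytes) are used here for readability.
print(f"{size / 1e6:.1f} MB written, limit {limit / 1e6:.1f} MB, "
      f"{size / limit:.1f}x over")
```

So the output file was roughly six times larger than the upload limit.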

Result

Workunit

Other Recent failures




ID: 709
sslickerson
Joined: 15 Feb 06
Posts: 17
Credit: 4,006
RAC: 0
Message 724 - Posted: 28 Feb 2006, 6:54:02 UTC

Had this failure after I pulled Ralph out of memory--My fault, sorry :(

2/27/2006 11:53:05 PM|ralph@home|Unrecoverable error for result HOMSdc_homDB024_1dcj__229_3_0 (Incorrect function. (0x1) - exit code 1 (0x1))
2/27/2006 11:53:05 PM||request_reschedule_cpus: process exited
2/27/2006 11:53:05 PM|ralph@home|Computation for result HOMSdc_homDB024_1dcj__229_3_0 finished
2/27/2006 11:54:08 PM|ralph@home|Sending scheduler request to https://ralph.bakerlab.org/ralph_cgi/cgi
2/27/2006 11:54:08 PM|ralph@home|Reason: To fetch work
2/27/2006 11:54:08 PM|ralph@home|Requesting 8640 seconds of new work, and reporting 1 results
2/27/2006 11:54:13 PM|ralph@home|Scheduler request to https://ralph.bakerlab.org/ralph_cgi/cgi succeeded
2/27/2006 11:54:13 PM|ralph@home|Message from server: No work sent
2/27/2006 11:54:13 PM|ralph@home|Message from server: (reached daily quota of 1 results)
2/27/2006 11:54:13 PM|ralph@home|No work from project

Result

WUID

Host




ID: 724
Mike Gelvin
Joined: 17 Feb 06
Posts: 50
Credit: 55,397
RAC: 0
Message 776 - Posted: 1 Mar 2006, 20:59:32 UTC

Four successful runs and one client error (Incorrect function. (0x1) - exit code 1 (0x1)) since resetting to leave the app in memory.
XP Pro.
Not sure when the switch to 4.90 happened during this run.

ID: 776
Psycodad
Joined: 16 Feb 06
Posts: 14
Credit: 2,157
RAC: 0
Message 777 - Posted: 1 Mar 2006, 22:41:46 UTC - in response to Message 679.  

(Quoting the earlier exchange; the crash log is in Message 652 above.)

Should I set the preference "Leave in memory" to yes next time and see what happens then?

Yes I would try that and see if it fixes the problem. Let us know what happens



All right ^^
The test with the preference "Leave in memory" set to yes finished without any problems.
For the next WU, I will change it back to "no".

ID: 777
Mike Gelvin
Joined: 17 Feb 06
Posts: 50
Credit: 55,397
RAC: 0
Message 814 - Posted: 5 Mar 2006, 7:10:05 UTC

It is safe to say this problem was not fixed with 4.91.

I ran 3 units with remove from memory, all failed:

3/4/2006 1:08:11 AM|ralph@home|Pausing result BARCODE_30_1opd__236_10_0 (removed from memory)
3/4/2006 1:08:11 AM|SETI@home|Starting result 04ap01ab.1337.20720.97162.1.69_2 using setiathome version 418
3/4/2006 1:08:13 AM|ralph@home|Unrecoverable error for result BARCODE_30_1opd__236_10_0 ( - exit code -1073741819 (0xc0000005))
3/4/2006 1:08:13 AM||request_reschedule_cpus: process exited
3/4/2006 1:08:13 AM|ralph@home|Computation for result BARCODE_30_1opd__236_10_0 finished

https://ralph.bakerlab.org/result.php?resultid=13917

3/4/2006 2:59:48 AM|ralph@home|Scheduler request to https://ralph.bakerlab.org/ralph_cgi/cgi succeeded
3/4/2006 4:22:24 AM|ralph@home|Pausing result BARCODE_30_1fna__236_10_0 (removed from memory)
3/4/2006 4:22:24 AM|SETI@home|Starting result 21mr01aa.28899.15153.17324.1.120_0 using setiathome version 418
3/4/2006 4:22:26 AM|ralph@home|Unrecoverable error for result BARCODE_30_1fna__236_10_0 ( - exit code -1073741819 (0xc0000005))
3/4/2006 4:22:26 AM||request_reschedule_cpus: process exited
3/4/2006 4:22:26 AM|ralph@home|Computation for result BARCODE_30_1fna__236_10_0 finished

https://ralph.bakerlab.org/result.php?resultid=13918


3/4/2006 12:45:18 PM||request_reschedule_cpus: files downloaded
3/4/2006 12:45:18 PM|ralph@home|Pausing result BARCODE_30_1bm8__236_10_0 (removed from memory)
3/4/2006 12:45:18 PM|SETI@home|Starting result 15fe03aa.3675.3361.892344.1.108_1 using setiathome version 418
3/4/2006 12:45:20 PM|SETI@home|Finished download of 07au01ab.18308.26112.359658.1.146
3/4/2006 12:45:20 PM|SETI@home|Throughput 93799 bytes/sec
3/4/2006 12:45:20 PM|SETI@home|Finished download of 07au01ab.18308.26112.359658.1.141
3/4/2006 12:45:20 PM|SETI@home|Throughput 201484 bytes/sec
3/4/2006 12:45:20 PM|SETI@home|Started download of 07au01ab.18308.26112.359658.1.142
3/4/2006 12:45:21 PM|ralph@home|Unrecoverable error for result BARCODE_30_1bm8__236_10_0 ( - exit code -1073741819 (0xc0000005))
3/4/2006 12:45:21 PM||request_reschedule_cpus: process exited
3/4/2006 12:45:21 PM||request_reschedule_cpus: files downloaded
3/4/2006 12:45:21 PM||request_reschedule_cpus: files downloaded
3/4/2006 12:45:21 PM|ralph@home|Computation for result BARCODE_30_1bm8__236_10_0 finished

https://ralph.bakerlab.org/result.php?resultid=13919

I will now change over to leave in memory and report back.
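
For anyone who wants to check their own message log for the same pattern as the three failures above, here is a rough Python sketch. It assumes the `timestamp|project|message` layout and the exact message wording seen in this thread; the function name `find_swap_out_failures` is hypothetical and nothing here comes from the BOINC client itself.

```python
# Illustrative sketch only: scan BOINC message-log lines (as pasted in this
# thread) for results that error out after being removed from memory.
import re

PAUSED = re.compile(r"Pausing result (\S+) \(removed from memory\)")
FAILED = re.compile(r"Unrecoverable error for result (\S+) .*exit code (-?\d+)")

def find_swap_out_failures(lines):
    """Return (result_name, exit_code) pairs for results that were paused
    with 'removed from memory' and later reported an unrecoverable error."""
    removed = set()
    failures = []
    for line in lines:
        # Layout assumed from the logs in this thread: timestamp|project|message
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue
        message = parts[2]
        m = PAUSED.search(message)
        if m:
            removed.add(m.group(1))
            continue
        m = FAILED.search(message)
        if m and m.group(1) in removed:
            failures.append((m.group(1), int(m.group(2))))
            removed.discard(m.group(1))
    return failures

sample = [
    "3/4/2006 1:08:11 AM|ralph@home|Pausing result BARCODE_30_1opd__236_10_0 (removed from memory)",
    "3/4/2006 1:08:11 AM|SETI@home|Starting result 04ap01ab.1337.20720.97162.1.69_2 using setiathome version 418",
    "3/4/2006 1:08:13 AM|ralph@home|Unrecoverable error for result BARCODE_30_1opd__236_10_0 ( - exit code -1073741819 (0xc0000005))",
]
print(find_swap_out_failures(sample))
```

The sketch matches on the result name only and ignores timestamps, since the date format in the log depends on the local Windows settings (compare the logs from different posters in this thread).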


ID: 814
Psycodad
Joined: 16 Feb 06
Posts: 14
Credit: 2,157
RAC: 0
Message 822 - Posted: 6 Mar 2006, 19:50:12 UTC

I have to agree with Mike.

This WU crashed after 5 hours of crunching.
Before this failure, it switched several times without any problems.

06.03.2006 20:48:25|ralph@home|Pausing result BARCODE_30_1a19A_225_3_1 (removed from memory)
06.03.2006 20:48:25|Einstein@Home|Restarting result r1_0197.0__162_S4R2a_0 using albert version 437
06.03.2006 20:49:14|ralph@home|Unrecoverable error for result BARCODE_30_1a19A_225_3_1 ( - exit code -1073741819 (0xc0000005))
06.03.2006 20:49:14||request_reschedule_cpus: process exited
06.03.2006 20:49:14|ralph@home|Computation for result BARCODE_30_1a19A_225_3_1 finished


WU
Result
ID: 822
Mike Gelvin
Joined: 17 Feb 06
Posts: 50
Credit: 55,397
RAC: 0
Message 823 - Posted: 6 Mar 2006, 20:02:22 UTC

I only had one more work unit when I switched over to leave app in memory. It completed successfully.

https://ralph.bakerlab.org/result.php?resultid=13901

ID: 823
pisi78
Joined: 16 Feb 06
Posts: 7
Credit: 2,020
RAC: 0
Message 828 - Posted: 7 Mar 2006, 12:37:41 UTC

https://ralph.bakerlab.org/workunit.php?wuid=11287


Naturally, this was with the application not kept in memory and with SETI and Einstein running :)


ID: 828
MatthewBChambers
Joined: 13 Mar 06
Posts: 4
Credit: 5,367
RAC: 0
Message 961 - Posted: 23 Mar 2006, 5:07:10 UTC

Hi, I am not sure whether this belongs here or not--it is an error from Rosetta, but it happened while I had Ralph running.


3/22/2006 6:47:26 PM|rosetta@home|Starting result FA_RLXwh_hom015_1who__362_15_0 using rosetta version 482
3/22/2006 6:47:27 PM||request_reschedule_cpus: process exited
3/22/2006 7:27:20 PM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
3/22/2006 7:27:20 PM|rosetta@home|Reason: To report results
3/22/2006 7:27:20 PM|rosetta@home|Reporting 1 results
3/22/2006 7:27:25 PM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
3/22/2006 7:47:28 PM|climateprediction.net|Restarting result sulphur_igi8_000861200_0 using sulphur_cycle version 422
3/22/2006 7:47:28 PM|rosetta@home|Pausing result FA_RLXwh_hom015_1who__362_15_0 (removed from memory)
3/22/2006 7:47:30 PM|rosetta@home|Unrecoverable error for result FA_RLXwh_hom015_1who__362_15_0 ( - exit code -164 (0xffffff5c))
3/22/2006 7:47:30 PM||request_reschedule_cpus: process exited
3/22/2006 7:47:30 PM|rosetta@home|Computation for result FA_RLXwh_hom015_1who__362_15_0 finished



Here are the BOINC 'startup' messages if it helps

3/21/2006 11:44:05 AM||Starting BOINC client version 5.2.13 for windows_intelx86
3/21/2006 11:44:05 AM||libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
3/21/2006 11:44:05 AM||Data directory: C:\Program Files\BOINC
3/21/2006 11:44:06 AM||Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 1.60GHz
3/21/2006 11:44:06 AM||Memory: 511.23 MB physical, 1.22 GB virtual
3/21/2006 11:44:06 AM||Disk: 37.30 GB total, 8.68 GB free
3/21/2006 11:44:06 AM|rosetta@home|Computer ID: 57038; location: home; project prefs: default
3/21/2006 11:44:06 AM|boincsimap|Computer ID: 6371; location: home; project prefs: default
3/21/2006 11:44:06 AM|climateprediction.net|Computer ID: 265183; location: ; project prefs: default
3/21/2006 11:44:06 AM|Einstein@Home|Computer ID: 450344; location: home; project prefs: default
3/21/2006 11:44:06 AM|LHC@home|Computer ID: 77254; location: ; project prefs: default
3/21/2006 11:44:06 AM|Predictor @ Home|Computer ID: 169507; location: home; project prefs: default
3/21/2006 11:44:06 AM|ralph@home|Computer ID: 1791; location: ; project prefs: default
3/21/2006 11:44:06 AM|SETI@home|Computer ID: 1585570; location: home; project prefs: default
3/21/2006 11:44:06 AM|SZTAKI Desktop Grid|Computer ID: 11497; location: home; project prefs: default
3/21/2006 11:44:06 AM|World Community Grid|Computer ID: 22915; location: ; project prefs: default
3/21/2006 11:44:06 AM||General prefs: from ralph@home (last modified 2006-03-18 11:57:57)
3/21/2006 11:44:06 AM||General prefs: using your defaults
3/21/2006 11:44:07 AM||Remote control not allowed; using loopback address

ID: 961
Mike Gelvin
Joined: 17 Feb 06
Posts: 50
Credit: 55,397
RAC: 0
Message 1001 - Posted: 27 Mar 2006, 20:01:21 UTC

Eight work units in a row have completed without a hitch. All were 4.94 running 8 hours of CPU time with swap outs (out of memory) every 2 hours.

Version 4.95 is in the Queue.

Win 2000 SP4 Intel Pent 4 2.40GHz

Looks like this one is put to bed. Thanks!

ID: 1001

©2024 University of Washington
http://www.bakerlab.org