Message boards : RALPH@home bug list : 4.87 - result exceeds size limit
Author | Message |
---|---|
doc :) Joined: 16 Feb 06 Posts: 46 Credit: 4,437 RAC: 0 |
just finished my first 4.87 wu on my main pc:
24/02/2006 04:00:50|ralph@home|Computation for result BARCODE_30_1bm8__219_7_0 finished
24/02/2006 04:00:50|ralph@home|Output file BARCODE_30_1bm8__219_7_0_0 for result BARCODE_30_1bm8__219_7_0 exceeds size limit.
24/02/2006 04:00:50|ralph@home|File size: 39159730.000000 bytes. Limit: 25000000.000000 bytes
24/02/2006 04:00:51|ralph@home|Unrecoverable error for result BARCODE_30_1bm8__219_7_0 (<file_xfer_error> <file_name>BARCODE_30_1bm8__219_7_0_0</file_name> <error_code>-131</error_code> <error_message></error_message></file_xfer_error>)
24/02/2006 04:00:52|ralph@home|Started upload of BARCODE_30_1bm8__219_7_0_1
24/02/2006 04:00:58|ralph@home|Finished upload of BARCODE_30_1bm8__219_7_0_1
24/02/2006 04:00:58|ralph@home|Throughput 11314 bytes/sec
i had the graphics open, but was busy doing something else and can't say whether it reached the end or not; it was pretty close to the expected runtime last time i looked at it though. my other pc finished one 4.87 wu without problems, it took only barely over 1 hour instead of the target time of 2 (could be because of some long running model, remote machine, didn't look at graphics). WU - result |
doc :) Joined: 16 Feb 06 Posts: 46 Credit: 4,437 RAC: 0 |
can't edit my first post anymore, so here's a new one :) looking at the wu page now it shows cancelled, and so do all other wus i got here on both pcs, all are of the same type/batch. got them all suspended right now, should i still crunch them while they all show cancelled under error on their pages? |
dekim Volunteer moderator Project administrator Project developer Project scientist Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0 |
If you see that they are cancelled, please abort them. |
doc :) Joined: 16 Feb 06 Posts: 46 Credit: 4,437 RAC: 0 |
aborted all that showed cancelled on their pages, thx for your answer. |
STE\/E Joined: 16 Feb 06 Posts: 27 Credit: 2,226,442 RAC: 783 |
I've had a lot that showed canceled also, 70% - 80% failure rate so far. Something else I've noticed with the v4.87 is that they don't like to Upload either. Even if they finish successfully they will sit on my Pc's for hours doing nothing, it's like they are not even trying to Upload. If I try to Manually Upload them I get the same results, no message shows up that they will Retry or anything ... ??? |
STE\/E Joined: 16 Feb 06 Posts: 27 Credit: 2,226,442 RAC: 783 |
The WU's that I said were not Uploading have all finally Uploaded. It only took them close to 8 hr's to do it though. This was a problem on all my PC's; the other Projects that I am running had no problem Uploading their WU's right after finishing a WU. Maybe it's just something on the Server side ... ??? I ended up Aborting 48 WU's that showed Canceled already in the WU ID Pages & had 42 WU's that Erred out after Completing Successfully. From my Results Pages it looks like only about 25, give or take a few, out of the 115 v4.87 WU's that I got actually Completed & Reported successfully ... |
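The counts in the post above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes the 48 aborted and 42 errored work units do not overlap, and the variable names are illustrative, not taken from BOINC.

```python
# Counts quoted in the post above (assumed to be disjoint groups).
total = 115     # v4.87 work units received
aborted = 48    # aborted because the WU page already showed them cancelled
errored = 42    # errored out after computation finished

completed = total - aborted - errored
failure_rate = (aborted + errored) / total

print(f"completed: {completed}, failure rate: {failure_rate:.1%}")
```

Under that assumption the remainder matches the "about 25" reported, and the failure rate falls inside the 70% - 80% range mentioned earlier in the thread.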
KB7RZF Joined: 16 Feb 06 Posts: 7 Credit: 1,426 RAC: 0 |
I had one that didn't show cancelled, but got the same message:
2/24/2006 5:48:34 AM|ralph@home|Output file BARCODE_30_1c9oA_219_24_0_0 for result BARCODE_30_1c9oA_219_24_0 exceeds size limit.
2/24/2006 5:48:34 AM|ralph@home|File size: 47170282.000000 bytes. Limit: 25000000.000000 bytes
2/24/2006 5:48:35 AM|ralph@home|Unrecoverable error for result BARCODE_30_1c9oA_219_24_0 (<file_xfer_error> <file_name>BARCODE_30_1c9oA_219_24_0_0</file_name> <error_code>-131</error_code> <error_message></error_message></file_xfer_error>)
But mine uploaded and reported. And I have the Target CPU run time set to 2 hours also. Dunno if it helps. :-)
Jeremy
[edit] Ok, I lied, found out it was cancelled. :-) Aborted the other 2 also that showed cancelled now. So nevermind. |
Aglarond Joined: 16 Feb 06 Posts: 11 Credit: 1,094 RAC: 0 |
Hi, I've got the same message with result ID = 8090:
24. 2. 2006 13:43:17|ralph@home|Restarting result BARCODE_30_1fna__219_21_0 using rosetta_beta version 487
24. 2. 2006 14:16:53||Rescheduling CPU: application exited
24. 2. 2006 14:16:53|ralph@home|Computation for result BARCODE_30_1fna__219_21_0 finished
24. 2. 2006 14:16:53|ralph@home|Output file BARCODE_30_1fna__219_21_0_0 for result BARCODE_30_1fna__219_21_0 exceeds size limit.
24. 2. 2006 14:16:53|ralph@home|File size: 163498460.000000 bytes. Limit: 25000000.000000 bytes
24. 2. 2006 14:16:54|ralph@home|Unrecoverable error for result BARCODE_30_1fna__219_21_0 (<file_xfer_error> <file_name>BARCODE_30_1fna__219_21_0_0</file_name> <error_code>-131</error_code> <error_message></error_message></file_xfer_error>)
24. 2. 2006 14:16:55|ralph@home|Started upload of BARCODE_30_1fna__219_21_0_1
24. 2. 2006 14:17:03|ralph@home|Finished upload of BARCODE_30_1fna__219_21_0_1
24. 2. 2006 14:17:03|ralph@home|Throughput 34639 bytes/sec
My WU is not cancelled, though; it shows Outcome = Client error, Client state = Computing. I hope it helps. |
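The reports in this thread all contain the same "File size ... Limit ..." message, so the overshoot can be pulled out of a saved message log mechanically. The following is a minimal sketch, assuming the log lines look exactly like the ones quoted here; `parse_size_limit` is a hypothetical helper, not part of BOINC.

```python
import re

# Pattern for the size message as it appears in the logs quoted in this thread.
SIZE_RE = re.compile(
    r"File size: (\d+(?:\.\d+)?) bytes\. Limit: (\d+(?:\.\d+)?) bytes"
)

def parse_size_limit(line):
    """Return (size_bytes, limit_bytes) as ints, or None if the line has no size message."""
    match = SIZE_RE.search(line)
    if match is None:
        return None
    return int(float(match.group(1))), int(float(match.group(2)))

# Sample lines copied from posts in this thread.
lines = [
    "24/02/2006 04:00:50|ralph@home|File size: 39159730.000000 bytes. Limit: 25000000.000000 bytes",
    "24. 2. 2006 14:16:53|ralph@home|File size: 163498460.000000 bytes. Limit: 25000000.000000 bytes",
    "24/02/2006 04:00:52|ralph@home|Started upload of BARCODE_30_1bm8__219_7_0_1",
]

for line in lines:
    parsed = parse_size_limit(line)
    if parsed is not None:
        size, limit = parsed
        print(f"{size} bytes is {size / limit:.2f}x the {limit} byte limit")
```

Matching only the message text, rather than the timestamp, keeps the sketch independent of the different date formats the posters' clients produce.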
napolj2 Joined: 20 Feb 06 Posts: 3 Credit: 293,823 RAC: 0 |
I got the same error message:
2006-02-24 7:56:11 AM|ralph@home|Computation for result BARCODE_30_1acf__219_10_0 finished
2006-02-24 7:56:11 AM|ralph@home|Output file BARCODE_30_1acf__219_10_0_0 for result BARCODE_30_1acf__219_10_0 exceeds size limit.
2006-02-24 7:56:11 AM|ralph@home|File size: 59286690.000000 bytes. Limit: 25000000.000000 bytes
2006-02-24 7:56:11 AM|ralph@home|Starting result BARCODE_30_1elwA_215_49_1 using rosetta_beta version 487
2006-02-24 7:56:12 AM|ralph@home|Unrecoverable error for result BARCODE_30_1acf__219_10_0 (<file_xfer_error> <file_name>BARCODE_30_1acf__219_10_0_0</file_name> <error_code>-131</error_code> <error_message></error_message></file_xfer_error>)
2006-02-24 7:56:13 AM|ralph@home|Started upload of BARCODE_30_1acf__219_10_0_1
2006-02-24 7:56:18 AM|ralph@home|Finished upload of BARCODE_30_1acf__219_10_0_1
Result Workunit
I had BOINC running as a service so I didn't see any of the graphics. Since I'm new to RALPH@Home, I had a question. The FAQ says that "If you have a work unit that fails, it is important to report any imformation you have about the failure in the appropreate RALPH@home forum." So, I am supposed to make a post to the forum (presumably in the 'bug list' section) for every single failure? Isn't all the information automatically reported by BOINC to the RALPH@Home server? What additional info am I supposed to report? |
Divide Overflow Joined: 15 Feb 06 Posts: 12 Credit: 128,027 RAC: 0 |
I'm getting the same error on a few of my results as well:
https://ralph.bakerlab.org/result.php?resultid=7947
https://ralph.bakerlab.org/result.php?resultid=7921
https://ralph.bakerlab.org/result.php?resultid=7903
2/24/2006 12:50:07 PM|ralph@home|Computation for result BARCODE_30_1elwA_219_15_0 finished
2/24/2006 12:50:07 PM|ralph@home|Output file BARCODE_30_1elwA_219_15_0_0 for result BARCODE_30_1elwA_219_15_0 exceeds size limit.
2/24/2006 12:50:07 PM|ralph@home|File size: 64445797.000000 bytes. Limit: 25000000.000000 bytes
2/24/2006 12:50:08 PM|ralph@home|Starting result BARCODE_30_1enh__219_15_0 using rosetta_beta version 487
2/24/2006 12:50:09 PM|ralph@home|Unrecoverable error for result BARCODE_30_1elwA_219_15_0 (<file_xfer_error> <file_name>BARCODE_30_1elwA_219_15_0_0</file_name> <error_code>-131</error_code> <error_message></error_message></file_xfer_error>) |
doc :) Joined: 16 Feb 06 Posts: 46 Credit: 4,437 RAC: 0 |
had one more of these, this time the wu was from an older batch that was reissued twice before i got it (it errored out with an older beta and with different errors than the size limit thing before though) WU - result
24/02/2006 14:42:43|ralph@home|Computation for result BARCODE_30_1c8cA_215_43_2 finished
24/02/2006 14:42:43|ralph@home|Output file BARCODE_30_1c8cA_215_43_2_0 for result BARCODE_30_1c8cA_215_43_2 exceeds size limit.
24/02/2006 14:42:43|ralph@home|File size: 45546511.000000 bytes. Limit: 25000000.000000 bytes
24/02/2006 14:42:45|ralph@home|Unrecoverable error for result BARCODE_30_1c8cA_215_43_2 (<file_xfer_error> <file_name>BARCODE_30_1c8cA_215_43_2_0</file_name> <error_code>-131</error_code> <error_message></error_message></file_xfer_error>)
24/02/2006 14:42:46|ralph@home|Started upload of BARCODE_30_1c8cA_215_43_2_1
24/02/2006 14:42:54|ralph@home|Finished upload of BARCODE_30_1c8cA_215_43_2_1
24/02/2006 14:42:54|ralph@home|Throughput 7915 bytes/sec
on reporting the result i got a further error:
24/02/2006 23:12:26|ralph@home|Couldn't delete file projects/ralph.bakerlab.org/BARCODE_30_1c8cA_215_43_2_0
i still got the file here, it's more than 40mb big. |
[B^S] sTrey Joined: 15 Feb 06 Posts: 58 Credit: 15,430 RAC: 0 |
Old news I guess, but here are two more; wu/resultids are 7138/7794 and 7136/7792. These show as invalid [contributing to my currently used-up quota :( ] These finished apparently normally, only getting the error on the upload. |
Astro Joined: 16 Feb 06 Posts: 141 Credit: 32,977 RAC: 0 |
Oops, I had one of these too, should have posted it here, but I posted it in the "unclassified" bug thread. |
Brian B Joined: 17 Feb 06 Posts: 9 Credit: 2,632 RAC: 0 |
Here's mine:
02/25/2006 11:04:26 AM|ralph@home|Computation for result BARCODE_30_4ubpA_219_2_0 finished
02/25/2006 11:04:26 AM|ralph@home|Output file BARCODE_30_4ubpA_219_2_0_0 for result BARCODE_30_4ubpA_219_2_0 exceeds size limit.
02/25/2006 11:04:26 AM|ralph@home|File size: 30720726.000000 bytes. Limit: 25000000.000000 bytes
02/25/2006 11:04:27 AM|ralph@home|Unrecoverable error for result BARCODE_30_4ubpA_219_2_0 (<file_xfer_error> <file_name>BARCODE_30_4ubpA_219_2_0_0</file_name> <error_code>-131</error_code> <error_message></error_message></file_xfer_error>)
02/25/2006 11:04:28 AM|ralph@home|Started upload of BARCODE_30_4ubpA_219_2_0_1
02/25/2006 11:04:35 AM|ralph@home|Finished upload of BARCODE_30_4ubpA_219_2_0_1
02/25/2006 11:04:35 AM|ralph@home|Throughput 12687 bytes/sec
Running v4.87
Result ID 7516
Workunit ID 6860
Server state = Over
Outcome = Client error
Client state = Computing
Target CPU = 2 hours
Hope this helps. Good Luck! |
Divide Overflow Joined: 15 Feb 06 Posts: 12 Credit: 128,027 RAC: 0 |
Almost all of my 4.87 WU's are producing this error now. Should we abort 4.87 work now that the 4.89 (Windows) application is out? |
doc :) Joined: 16 Feb 06 Posts: 46 Credit: 4,437 RAC: 0 |
well, bad news i think, the result exceeds size limit stuff is still there with 4.89, at least my last wu ended with that again. WU - result
25/02/2006 19:19:01|ralph@home|Computation for result BARCODE_30_1ubi__215_41_2 finished
25/02/2006 19:19:01|ralph@home|Output file BARCODE_30_1ubi__215_41_2_0 for result BARCODE_30_1ubi__215_41_2 exceeds size limit.
25/02/2006 19:19:01|ralph@home|File size: 27694002.000000 bytes. Limit: 25000000.000000 bytes
25/02/2006 19:19:02|ralph@home|Unrecoverable error for result BARCODE_30_1ubi__215_41_2 (<file_xfer_error> <file_name>BARCODE_30_1ubi__215_41_2_0</file_name> <error_code>-131</error_code> <error_message></error_message></file_xfer_error>) |
Colin Porter Joined: 16 Feb 06 Posts: 3 Credit: 24 RAC: 0 |
Unfortunately I've had it happen as well with 4.89. At least the problem with "Leave applications in memory while preempted" seems to be sorted.
(edit) I've had three WU's succeed today running 4.89, so this may just be a problem with this particular WU. My Results (end edit)
25/02/2006 20:31:11|ralph@home|Pausing result BARCODE_30_1pgx__215_36_1 (removed from memory)
25/02/2006 20:31:13||request_reschedule_cpus: process exited
25/02/2006 20:31:26||Resuming computation and network activity
25/02/2006 20:31:26||request_reschedule_cpus: Resuming activities
25/02/2006 20:31:26|ralph@home|Restarting result BARCODE_30_1pgx__215_36_1 using rosetta_beta version 489
25/02/2006 21:31:27|SETI@home|Restarting result 09fe01aa.135.10161.129826.1.89_3 using setiathome version 418
25/02/2006 21:31:27|ralph@home|Pausing result BARCODE_30_1pgx__215_36_1 (removed from memory)
25/02/2006 21:31:28|ralph@home|Unrecoverable error for result BARCODE_30_1pgx__215_36_1 ( - exit code -164 (0xffffff5c))
25/02/2006 21:31:28||request_reschedule_cpus: process exited
25/02/2006 21:31:28|ralph@home|Computation for result BARCODE_30_1pgx__215_36_1 finished
25/02/2006 21:31:28|ralph@home|Output file BARCODE_30_1pgx__215_36_1_0 for result BARCODE_30_1pgx__215_36_1 exceeds size limit.
25/02/2006 21:31:28|ralph@home|File size: 35555962.000000 bytes. Limit: 25000000.000000 bytes |
hugothehermit Joined: 17 Feb 06 Posts: 17 Credit: 2,170 RAC: 0 |
Add my name to the list :)
26/02/2006 8:09:26 AM|ralph@home|Aborting result BARCODE_30_1bm8__221_10_0: exceeded disk limit: 200247907.000000 > 200000000.000000
26/02/2006 8:09:26 AM|ralph@home|Unrecoverable error for result BARCODE_30_1bm8__221_10_0 (Maximum disk usage exceeded)
ver 4.88
Edit: to clean up |
hugothehermit Joined: 17 Feb 06 Posts: 17 Credit: 2,170 RAC: 0 |
I got another. I had a look at the slots n stdout.txt and found it to be 80,562KB and full of "WARNING:: fullatom scorefxn terms requested" error messages. Maybe you're making the stdout file(s) too big and causing the BOINC disk space to be used up? |
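One way to check whether a single repeated message is what inflates a stdout.txt like the one described above is to count identical lines. This is a rough sketch only: `top_repeated_lines` is a hypothetical helper, and the second sample line ("model 1 done") is invented purely for illustration.

```python
from collections import Counter

def top_repeated_lines(lines, n=3):
    """Return (text, count, approx_bytes) for the n most frequent non-empty lines."""
    counts = Counter(line.rstrip("\n") for line in lines if line.strip())
    return [
        # +1 accounts for the newline that terminated each line in the file
        (text, count, count * (len(text.encode("utf-8")) + 1))
        for text, count in counts.most_common(n)
    ]

# Small in-memory stand-in for a real stdout.txt.
sample = ["WARNING:: fullatom scorefxn terms requested\n"] * 5 + ["model 1 done\n"]

for text, count, approx_bytes in top_repeated_lines(sample):
    print(f"{count:>6} x {text!r} (~{approx_bytes} bytes)")
```

For a real file, the same function can be fed an open file object, for example `with open(path, errors="replace") as f: top_repeated_lines(f)`, where `path` points at the stdout.txt in the slot directory. The file is read line by line, so a file of roughly 80 MB does not have to be loaded into memory at once.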
[B^S] sTrey Joined: 15 Feb 06 Posts: 58 Credit: 15,430 RAC: 0 |
Yes, still happening with 4.89: the "output file exceeds limit" message is in the client output file, the -131 error in the result file. Most recent for me is result 11865; cpu pref set to 2 hours, quits very close to that time. |
©2024 University of Washington
http://www.bakerlab.org