Message boards : RALPH@home bug list : RALPH Version News! - Version 4.97 (Win/Lin/Mac) released!
Author | Message |
---|---|
Carlos_Pfitzner Send message Joined: 16 Feb 06 Posts: 182 Credit: 22,792 RAC: 0 |
|
Carlos_Pfitzner Send message Joined: 16 Feb 06 Posts: 182 Credit: 22,792 RAC: 0 |
First bug -:( Exit status -1073741819 (0xffffffffc0000005) - windows https://ralph.bakerlab.org/result.php?resultid=77811
Details of how this error occurred:
Date Host Project ID Message
4/7/2006 5:27:13 AM carlos.cp3 ralph@home 195 Throughput 7 bytes/sec
4/7/2006 5:27:13 AM carlos.cp3 ralph@home 196 Started download of 1hz6A.psipred_ss2.gz
4/7/2006 5:27:28 AM carlos.cp3 ralph@home 197 Finished download of 1hz6A.psipred_ss2.gz
4/7/2006 5:27:28 AM carlos.cp3 ralph@home 198 Throughput 47 bytes/sec
4/7/2006 5:27:28 AM carlos.cp3 ralph@home 199 Started download of aa1hz6A03_05.400_v1_3.gz
4/7/2006 5:36:17 AM carlos.cp3 ralph@home 200 Finished download of aa1hz6A03_05.400_v1_3.gz
4/7/2006 5:36:17 AM carlos.cp3 ralph@home 201 Throughput 2078 bytes/sec
4/7/2006 5:36:17 AM carlos.cp3 ralph@home 202 Started download of 1hz6.pdb.gz
4/7/2006 5:36:40 AM carlos.cp3 ralph@home 203 Finished download of 1hz6.pdb.gz
4/7/2006 5:36:40 AM carlos.cp3 ralph@home 204 Throughput 427 bytes/sec
4/7/2006 5:36:40 AM carlos.cp3 ralph@home 205 Started download of aa1hz6A09_05.400_v1_3.gz
4/7/2006 5:36:58 AM carlos.cp3 ralph@home 206 Finished download of aa1ogw_03_05.400_v1_3.gz
4/7/2006 5:36:58 AM carlos.cp3 ralph@home 207 Throughput 1028 bytes/sec
4/7/2006 5:36:58 AM carlos.cp3 ralph@home 208 Started download of 1hz6A.fasta
4/7/2006 5:36:59 AM carlos.cp3 --- 209 request_reschedule_cpus: files downloaded
4/7/2006 5:36:59 AM carlos.cp3 --- 210 request_reschedule_cpus: files downloaded
4/7/2006 5:36:59 AM carlos.cp3 World Community Grid 211 Pausing result faah0372_d205n016_x1hpv_02_0 (removed from memory)
4/7/2006 5:36:59 AM carlos.cp3 ralph@home 212 Starting result TOP_SAMPLE_output_flavor_1ogw_373_482_0 using rosetta_beta version 497
4/7/2006 5:37:00 AM carlos.cp3 --- 213 request_reschedule_cpus: process exited
4/7/2006 5:37:08 AM carlos.cp3 ralph@home 214 Finished download of 1hz6A.fasta
4/7/2006 5:37:08 AM carlos.cp3 ralph@home 215 Throughput 7 bytes/sec
4/7/2006 5:37:08 AM carlos.cp3 ralph@home 216 Started download of 1hz6_3-27_lowenergy.cst
4/7/2006 5:39:59 AM carlos.cp3 ralph@home 217 Finished download of 1hz6_3-27_lowenergy.cst
4/7/2006 5:39:59 AM carlos.cp3 ralph@home 218 Throughput 158 bytes/sec
4/7/2006 5:52:41 AM carlos.cp3 ralph@home 219 Finished download of aa1hz6A09_05.400_v1_3.gz
4/7/2006 5:52:41 AM carlos.cp3 ralph@home 220 Throughput 2949 bytes/sec
4/7/2006 5:52:42 AM carlos.cp3 --- 221 request_reschedule_cpus: files downloaded
4/7/2006 5:52:42 AM carlos.cp3 --- 222 request_reschedule_cpus: files downloaded
4/7/2006 5:52:42 AM carlos.cp3 --- 223 request_reschedule_cpus: files downloaded
4/7/2006 5:52:42 AM carlos.cp3 World Community Grid 224 Restarting result faah0372_d205n016_x1hpv_02_0 using faah version 509
4/7/2006 5:52:42 AM carlos.cp3 ralph@home 225 Pausing result TOP_SAMPLE_output_flavor_1ogw_373_482_0 (removed from memory)
4/7/2006 6:36:11 AM carlos.cp3 --- 226 request_reschedule_cpus: project op
4/7/2006 6:36:12 AM carlos.cp3 World Community Grid 227 Pausing result faah0372_d205n016_x1hpv_02_0 (removed from memory)
4/7/2006 6:36:12 AM carlos.cp3 ralph@home 228 Resuming result TOP_SAMPLE_output_flavor_1ogw_373_482_0 using rosetta_beta version 497
4/7/2006 6:36:13 AM carlos.cp3 --- 229 request_reschedule_cpus: process exited
4/7/2006 7:14:17 AM carlos.cp3 ralph@home 230 Unrecoverable error for result TOP_SAMPLE_output_flavor_1ogw_373_482_0 ( - exit code -1073741819 (0xc0000005))
4/7/2006 7:14:17 AM carlos.cp3 --- 231 request_reschedule_cpus: process exited
4/7/2006 7:14:17 AM carlos.cp3 ralph@home 232 Computation for result TOP_SAMPLE_output_flavor_1ogw_373_482_0 finished
4/7/2006 7:14:17 AM carlos.cp3 rosetta@home 233 Restarting result FA_RLXwi_hom007_1wit__362_426_1 using rosetta version 483
4/7/2006 7:15:21 AM carlos.cp3 ralph@home 234 Sending scheduler request to https://ralph.bakerlab.org/ralph_cgi/cgi
4/7/2006 7:15:21 AM carlos.cp3 ralph@home 235 Reason: To report results
4/7/2006 7:15:21 AM carlos.cp3 ralph@home 236 Reporting 1 results |
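For anyone wondering why the same crash is printed as -1073741819 in one place and 0xc0000005 in another: they are the same 32-bit value, shown once as a signed integer and once as unsigned hex. The longer 0xffffffffc0000005 form is that number sign-extended to 64 bits. On Windows, 0xC0000005 is the status code for an access violation. A minimal sketch of the conversion follows; the helper names are made up for illustration and are not part of BOINC.

```python
# Hypothetical helper names for illustration; this is not BOINC code.
STATUS_ACCESS_VIOLATION = 0xC0000005  # Windows status code for an access violation


def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def describe_exit(exit_code: int) -> str:
    """Format an exit code as hex, naming the access violation case."""
    value = to_unsigned32(exit_code)
    if value == STATUS_ACCESS_VIOLATION:
        return f"0x{value:08x} (access violation)"
    return f"0x{value:08x}"


print(describe_exit(-1073741819))
```

Only the access violation code is named here because it is the one reported in this thread; other exit codes are printed as plain hex.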
Carlos_Pfitzner Send message Joined: 16 Feb 06 Posts: 182 Credit: 22,792 RAC: 0 |
Exit status 2 (0x2) Linux https://ralph.bakerlab.org/result.php?resultid=78679 |
Carlos_Pfitzner Send message Joined: 16 Feb 06 Posts: 182 Credit: 22,792 RAC: 0 |
Exit status 131 (0x83) Linux SIGSEGV: segmentation violation Stack trace (14 frames): https://ralph.bakerlab.org/result.php?resultid=78732 |
Nuadormrac Send message Joined: 22 Feb 06 Posts: 68 Credit: 11,362 RAC: 0 |
Don't know if this is worth mentioning or not, but thus far 4.97 has been working fine on my WinXP SP2 A64 box... All WUs I've started crunching have been returning successfully. If I do run into a problem, I'll be sure to report it, but all seems well on my comp for now. BTW, this is a change from previous versions, where 1 out of every 3 or so WUs I had gotten tended to exit with a problem. I've gone through 6 or so now on this version... |
genes Send message Joined: 16 Feb 06 Posts: 45 Credit: 43,706 RAC: 20 |
Had my very first one fail -- 4/7/2006 9:46:24 PM|ralph@home|Unrecoverable error for result HBLR_1.0_2tif_375_54_0 ( - exit code -1073741819 (0xc0000005)) Result: https://ralph.bakerlab.org/result.php?resultid=79821 |
Nikolay A. Saharov Send message Joined: 17 Feb 06 Posts: 6 Credit: 25,102 RAC: 0 |
I also have one error result, 79474. 08.04.2006 06:24|ralph@home|Unrecoverable error for result HBLR_1.0_1di2_375_11_0 ( - exit code -1073741819 (0xc0000005)) 08.04.2006 06:24|ralph@home|Computation for task HBLR_1.0_1di2_375_11_0 finished OS: WinXP SP2 Pro PC: PIV-2.6GHz HT 2CPUs, RAM 3 GB, ATI Radeon 9600 Series Video with 128 MB BOINC 5.3.31, with 17 projects |
Nuadormrac Send message Joined: 22 Feb 06 Posts: 68 Credit: 11,362 RAC: 0 |
Perhaps I spoke a bit too soon, but then again, perhaps not. Percentage-wise, I've had far more successes than before, but one did just fail... https://ralph.bakerlab.org/result.php?resultid=80038 stderr out |
Nuadormrac Send message Joined: 22 Feb 06 Posts: 68 Credit: 11,362 RAC: 0 |
|
Nikolay A. Saharov Send message Joined: 17 Feb 06 Posts: 6 Credit: 25,102 RAC: 0 |
This WU 75610 has 2 results with some errors.
|
Nuadormrac Send message Joined: 22 Feb 06 Posts: 68 Credit: 11,362 RAC: 0 |
OK, a pattern seems to have emerged. The WUs that are having problems (I had another 7 go bad) are all of type HBLR_1.0_*, with only the other parts of the WU name changing. I'm gathering this is a new WU type. The run of successes was all of type barcode*, a WU type I had seen in the past, so I guess that WU type had been nailed down already. 4 failed late morning, so I rebooted. Never had that problem on Rosetta itself, but had seen on the boards that some mentioned it fixed things, so thought after 4 failures I'd try it here anyhow... I then got 2 successful completions with 2 failures, and tried rebooting again just to see what would happen. Shutting down BOINC, rebooting, and then restarting the comp then caused this one to fail... (Definitely a problem associated with removing from memory in my case.) https://ralph.bakerlab.org/result.php?resultid=81520 However, one other person also got an access violation on that unit, so not sure what was going on around the time of failure in their case. Definitely a repeated failure... The other failed WUs (and these do not have the same removal-from-memory problem associated with them, at least in my case): https://ralph.bakerlab.org/workunit.php?wuid=75104 This one also has a second failure beyond the one on my comp --------------------- https://ralph.bakerlab.org/workunit.php?wuid=75045 3 failure results -------------------- https://ralph.bakerlab.org/workunit.php?wuid=75894 2 failures but one success --------------------- https://ralph.bakerlab.org/workunit.php?wuid=75780 --------------------- https://ralph.bakerlab.org/workunit.php?wuid=75778 all 3 results show failure --------------------- https://ralph.bakerlab.org/workunit.php?wuid=75777 --------------------- All failures are for the same error that both I and others have been reporting: the access violation... |
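The pattern described above (failures clustering in one WU type) can be checked by tallying results per name pattern. This is only a sketch: the two patterns come from the post, the "other" bucket and the function names are assumptions added for illustration, and the sample list contains just the three failed results whose names were posted in this thread.

```python
from collections import Counter
from fnmatch import fnmatchcase

# Name patterns mentioned in the post; "other" is a catch-all added for this sketch.
FAMILY_PATTERNS = ["HBLR_1.0_*", "barcode*"]


def family_of(result_name):
    """Return the first pattern the result name matches, or "other"."""
    for pattern in FAMILY_PATTERNS:
        if fnmatchcase(result_name, pattern):
            return pattern
    return "other"


def tally_failures(results):
    """results: iterable of (result_name, exit_code); a nonzero exit code counts as a failure.

    Returns {family: (failed_count, total_count)}.
    """
    failed = Counter()
    total = Counter()
    for name, exit_code in results:
        family = family_of(name)
        total[family] += 1
        if exit_code != 0:
            failed[family] += 1
    return {family: (failed[family], total[family]) for family in total}


# Failed results reported by name earlier in this thread.
reported = [
    ("HBLR_1.0_2tif_375_54_0", -1073741819),
    ("HBLR_1.0_1di2_375_11_0", -1073741819),
    ("TOP_SAMPLE_output_flavor_1ogw_373_482_0", -1073741819),
]
print(tally_failures(reported))
```

Feeding in a host's full result list, successes included, would show whether one WU type really accounts for most of the failures.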
Carlos_Pfitzner Send message Joined: 16 Feb 06 Posts: 182 Credit: 22,792 RAC: 0 |
|
Nuadormrac Send message Joined: 22 Feb 06 Posts: 68 Credit: 11,362 RAC: 0 |
On checking Rosetta's forum, version 4.98 is really a rollback to version 4.83, not a newer version than 4.97. Apparently 4.97 was released onto Rosetta and caused many a problem/discussion there. This prompted a rollback, which was given a later version number so as to convince BOINC to download the older version, I gather... https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1106 I just got this message from David Kim, who is currently addressing this problem. Some of the users' error reports and whatnot can be seen in that thread. I imagine they still need us to test 4.97, and that a newer version to test will be coming out soon. Which brings up a bit of an aside. What happens when we hit 4.99 and possibly need another version number? Do we go to 5.00 or? I only ask because, with many software packages, the number before the decimal point is a major version number, commonly associated with major changes/feature updates. Think of BOINC 4.x vs. 5.x for instance... Not that it entirely matters in the grand scheme of things if we do need to move to 5.xx on the science app number... |
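A rough sketch of why the rollback needed a higher number. It assumes the client simply takes the highest app version number on offer, which is how the post above reads, and that "4.97" maps to the integer 497 as in the "rosetta_beta version 497" lines of the client log earlier in this thread. The function names are made up for illustration and are not BOINC's own.

```python
def version_num(version: str) -> int:
    """Map '4.97' to 497, matching the 'version 497' form seen in the client log."""
    major, minor = version.split(".")
    return int(major) * 100 + int(minor)


def pick_version(offered):
    """Assumed client behaviour: take the highest version number on offer."""
    return max(offered, key=version_num)


# With only 4.83 and 4.97 on offer, the broken 4.97 still wins.
print(pick_version(["4.83", "4.97"]))
# Republishing the old code as 4.98 makes it the one that gets picked.
print(pick_version(["4.83", "4.97", "4.98"]))
# Under this encoding the number after 4.99 (499) is simply 500, i.e. 5.00.
print(version_num("4.99") + 1)
```

If that encoding holds, moving from 4.99 to 5.00 is just the next integer and need not signal a major change.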
doc :) Send message Joined: 16 Feb 06 Posts: 46 Credit: 4,437 RAC: 0 |
i am also seeing a lot of crashes with 4.97 and HBLR_* WUs. both my pcs are affected while nobody is doing anything directly on them, only a couple of browser windows and stuff like that open. one is an athlonXP, the other an old duron, both winxp, first sp2, second sp1. couple examples: wu - wu - wu - wu - wu - wu - wu - wu i still got some of those HBLR_* WUs here, is it worth crunching them? |
Carlos_Pfitzner Send message Joined: 16 Feb 06 Posts: 182 Credit: 22,792 RAC: 0 |
Which brings up a bit of an aside. What happens when we hit 4.99 and possibly need another version number? Do we go to 5.00 or? I only ask because, with many software packages, the number before the decimal point is a major version number, commonly associated with major changes/feature updates. Think of BOINC 4.x vs. 5.x for instance... Not that it entirely matters in the grand scheme of things if we do need to move to 5.xx on the science app number... I believe they can start using 4.97.01, 4.97.02 and on and on ... 99 more runs of ralph@home guaranteed before we hit 4.98.00! |
Carlos_Pfitzner Send message Joined: 16 Feb 06 Posts: 182 Credit: 22,792 RAC: 0 |
Version 4.97 in Rosetta was version 4.95 in Ralph, NOT version 4.97 as you might suspect. Unfortunately I no longer have Ralph 4.95 on my HD to do a diff with Rosetta 4.97 and prove otherwise. In the absence of evidence, the fact remains that 4.95 worked very well on my PC ... I crunched a lot of WUs with 4.95 without a single error, while the same errors I get with Ralph 4.97 *I get identically with Rosetta 4.97. Your feelings may vary ... however you can check the WUs I crunched with each version here https://ralph.bakerlab.org/results.php?hostid=1294 and here https://boinc.bakerlab.org/rosetta/results.php?hostid=170243 so you can verify that the errors of Ralph 4.97 are the same as those of Rosetta 4.97. Look at the ones that errored out: "Dump of the Graphics thread:" *this was introduced with Ralph 4.97 -> not before *and that "Dump of the Graphics thread:" appears on the Rosetta 4.97 ones that errored out too. So, indubitably, Ralph 4.97 was put out as Rosetta 4.97, and not any other version. Maybe you have access to the 4.95 WUs at the first URL too ... (a lot w/o any errors)! |
Carlos_Pfitzner Send message Joined: 16 Feb 06 Posts: 182 Credit: 22,792 RAC: 0 |
Indeed, Rosetta 4.97 was Ralph 4.97 *click below -:! https://ralph.bakerlab.org/result.php?resultid=77755 https://boinc.bakerlab.org/rosetta/result.php?resultid=16646578 So, it is not possible to cover the sun with a sieve; it would have been better to tell the truth. Definitely, at the least, there was a mistake at the time of updating. Now that the mistake has been proven, I hope Rosetta for Windows can be updated again with Ralph 4.95 ... maybe as Rosetta 4.98.01. At least then I will be able to crunch for Rosetta again. Thanks |
Nuadormrac Send message Joined: 22 Feb 06 Posts: 68 Credit: 11,362 RAC: 0 |
As for the Macs, they have been running all of the Ralph and Rosetta versions without problems. The current situation is mostly a Windows problem. But it is also affecting systems that are marginal on memory, as the protein being tested in the new work units is the largest I have seen to date. Systems with 256 MB of memory are just going to have trouble with the new workunits. Even systems with 512 MB where some of the memory is in use for other purposes may have trouble. Umm, it isn't exactly a lack of memory situation I was running into... I do have a GB of RAM in my system, not 256 or 512 MB, and what's more my computer can quite happily crunch a CPDN combined ocean WU (this is in fact the same WU they distribute in the BBC climate model project, but they're distributing them on CPDN). What's more, these combined ocean models take a lot longer than a sulphur cycle to run, and are also more resource intensive than sulphur cycles according to the people at the CPDN project. Seasonal attribution, which recommends no less than 1 GB of RAM, runs quite happily on my PC also... Unless these models are larger/more intensive than CPDN work units, I think something else might have been going on, at least here. It also isn't that these projects were running, because earlier I had done backups on each, bringing them both to a save point (I have them both scheduled in BOINC), and then suspended them. I shut down BOINC, backed up the folders, and did not resume then. The projects hadn't loaded back up when I was running RALPH, and I have only now resumed CPDN with the coupled oceans model, while it's finishing up some LHC units... |
FluffyChicken Send message Joined: 17 Feb 06 Posts: 54 Credit: 710 RAC: 0 |
Carlos, you can get the older versions, and newer ones of course, from https://ralph.bakerlab.org/download/ |
Carlos_Pfitzner Send message Joined: 16 Feb 06 Posts: 182 Credit: 22,792 RAC: 0 |
|
©2024 University of Washington
http://www.bakerlab.org