21) Message boards : RALPH@home bug list : Bug reports for 5.71 (Message 3264)
Posted 3 Jul 2007 by Rhiju
Post: Thanks for reporting errors!
22) Message boards : RALPH@home bug list : Bug reports for 5.69-5.70 (Message 3235)
Posted 26 Jun 2007 by Rhiju
Post: Great, that's what I was hoping for actually. We're testing a mode of ralph in which we can run an old app version and a new app version at the same time. This lets some of our workunits continue with a stable version, to keep results consistent for publication, while other workunits take advantage of later bug fixes and features. That workunit has a checkpointing issue with 5.68, but works well with 5.69 (now 5.70); I wanted to make sure I could send out one batch of jobs for the old app and one for the newer app!
This WU failed after trying to restart from last checkpoint.
23) Message boards : RALPH@home bug list : Bug reports for 5.69-5.70 (Message 3230)
Posted 25 Jun 2007 by Rhiju
Post: Hi: Thanks, I figured out the problem with this WU! Result ID 566072
24) Message boards : RALPH@home bug list : Bug reports for 5.69-5.70 (Message 3219)
Posted 24 Jun 2007 by Rhiju
Post: Thanks for continuing to post problems...
25) Message boards : RALPH@home bug list : Bug reports for 5.66-5.68 (Message 3215)
Posted 24 Jun 2007 by Rhiju
Post: Hi: We're looking at these now... Result ID 563617
26) Message boards : RALPH@home bug list : Bug reports for 5.66-5.68 (Message 3195)
Posted 12 Jun 2007 by Rhiju
Post: Hi: Yea, about half the workunits failed on all the platforms. I'm looking into this now...
I'm 8.5 hrs into this symm fold dock relax task and still have not completed the second model. Seems significantly higher than the 1 hr/model mentioned previously.
27) Message boards : RALPH@home bug list : Bug reports for 5.66-5.68 (Message 3185)
Posted 30 May 2007 by Rhiju
Post: Hi: Yea I wish I'd seen all these crashes before sending out the same job to Rosetta@home. The first jobs that came back seemed OK -- now I realize that it's because all the adversely affected computers were taking suuuper long and then crashed. We didn't expect those workunits to have such big memory footprints, so we'll have to spend a bit of time debugging.
After a fair run of successes my G5 Mac got a computation error (no crash AFAICT) with exit code 1 (0x1), running v5.68 on gp04__BOINC_SYMM_FOLD_AND_DOCK_RELAX_SUBSYSTEM-gp04_-delC126__2078_10 after a little over six hours of crunching. The system had been running with the screensaver blacked out and the display sleeping for at least twelve hours. The output ends with ERROR:: Exit from: hbonds.cc line: 636
28) Message boards : RALPH@home bug list : Bug reports for 5.66-5.68 (Message 3158)
Posted 25 May 2007 by Rhiju
Post: Thanks for the post. I doubt that it is the graphics start and stop, but it might be. Please do post again if you find your Mac crashing when you play with graphics. Those are tough bugs to fix, because a lot of the graphics stuff is out of our direct control. The good news (well, maybe bad to start with) is that the BOINC infrastructure will be moving to a new way of doing graphics that is apparently more robust, I think by the end of the summer. So after we iron out the kinks, that might help the graphics-related errors... Incidentally, those workunits do take a long time (we have implemented checkpointing so that work should be saved frequently in case of crashes), and Mac G4s are pretty slow for running Rosetta, unfortunately.
My Mac G4/733 crashed after more than ten hours of crunching 1gidA_BOINC_MG_SASAPAIR_ALLRES_RNA_ABINITIO_SAVE_ALL_OUT_BARCODE_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_2068_172; last time I looked it was showing only about ten minutes to go but hadn’t decremented that time for quite a while. (The percent done was over 98% and continuing to increment.) Exit status 1 (0x1), with the all-too-familiar “SIGBUS: bus error” message in the output file. Once again, the crash occurred either while the display was blacked out (having displayed the screensaver for a minute) or when I interrupted it. BTW this system has always been set to work while in use and to keep apps in memory, so I don’t understand why starting and stopping the graphics should be a problem -- if that’s indeed the case.
29) Message boards : RALPH@home bug list : Bug reports for 5.66-5.68 (Message 3157)
Posted 25 May 2007 by Rhiju
Post: Hi feet1st -- yea, it's because Rosetta changes its fold while the graphics thread finishes its drawing. We considered at one point freezing Rosetta until each graphics frame finishes, but were worried about the performance cost! So these large molecules may continue to get rendered in freaky ways!
My sidechains still fall off on 5.67, as they did with 5.65
30) Message boards : RALPH@home bug list : Bug reports for 5.66-5.68 (Message 3146)
Posted 24 May 2007 by Rhiju
Post: Ralph 5.66 fixed a problem where the graphics thread was crashing when sidechains were shown. Ralph 5.67 fixes an issue in the output of symmetric proteins. Thanks in advance for your posts! The posts for 5.65 helped a lot.
31) Message boards : RALPH@home bug list : Bug reports for 5.65 (Message 3131)
Posted 23 May 2007 by Rhiju
Post: Hi everybody: Looks like there are a lot of problems with this version, actually -- a very high error rate. I'll track it down! Thanks for posting. Error:
32) Message boards : RALPH@home bug list : Bug reports for 5.65 (Message 3119)
Posted 22 May 2007 by Rhiju
Post: So far things have been pretty stable with 5.64; thanks to everyone for posting about crashes on ralph, it's helped us fine-tune our workunits. This update just has a small addition to give us more control over the energy function assumed in RNA workunits.
33) Message boards : RALPH@home bug list : Bug reports for 5.63 (Message 3066)
Posted 5 May 2007 by Rhiju
Post: Hi Odyssesus: Thanks much for the detailed post! The good news is that we've fixed the bug you reported. We're glad to hear you can see the graphics, and apologize a little for that "time to complete" weirdness (it happens on Macs, unfortunately) and the occasional lack of work -- welcome to ralph! Soon you'll be able to run on rosetta@home (we'll probably update the application on Sunday), and there will be plenty of work for your computer to do.
Hello, everyone! I seem to have been recruited into my first alpha project (the first that identifies itself as such, at least). I regret that my first posting here has to be negative. Well, mixed …
34) Message boards : RALPH@home bug list : Bug reports for 5.63 (Message 3057)
Posted 4 May 2007 by Rhiju
Post: Thanks to all for posting -- I think we found the source of the error, and we're sending out some more test jobs.
Same for me
35) Message boards : RALPH@home bug list : Bug reports for 5.63 (Message 3048)
Posted 4 May 2007 by Rhiju
Post: Please especially report any problems you might notice with checkpointing, or running on PowerPC Macs.
36) Message boards : RALPH@home bug list : Bug reports for 5.56-5.59 (Message 2986)
Posted 2 Apr 2007 by Rhiju
Post: Thanks, Feet1st, that's a great explanation. We indeed try to keep the avg time per model at less than one hour; actually our ralph runs help us calibrate this!
Maion, I believe your time remaining is working just the way Rhiju intended for it to. Once the remaining time estimate gets below 10 min, time starts moving slower. This is to avoid exceeding 100%. So, basically, once you get below a 10 minute estimated time remaining, the estimate is not on track anymore. The client is unsure exactly when it will finish, but in each case, the 15 and 17 minute estimates were not far from right.
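The slowdown described in the post above can be illustrated with a small sketch. Only the 10-minute threshold comes from the post; the exponential decay, the function names, and the way percent complete is derived are assumptions made for illustration, not the actual Rosetta or BOINC code.

```python
import math

SLOWDOWN_THRESHOLD_S = 600.0  # 10 minutes, from the post; the rest is assumed


def displayed_remaining(elapsed_s, estimated_total_s,
                        threshold_s=SLOWDOWN_THRESHOLD_S):
    """Remaining time shown to the user (hypothetical helper).

    Above the threshold the naive estimate is shown unchanged. Below it,
    the countdown decays exponentially, so it keeps shrinking without
    going negative even when the job overruns its estimate.
    """
    # Short jobs start their slowdown at time zero rather than above it.
    threshold_s = min(threshold_s, estimated_total_s)
    naive = estimated_total_s - elapsed_s
    if naive >= threshold_s:
        return naive
    # Seconds spent since the slowdown began.
    overrun = threshold_s - naive
    return threshold_s * math.exp(-overrun / threshold_s)


def percent_complete(elapsed_s, estimated_total_s):
    """Percent done implied by the displayed remaining time (assumed)."""
    remaining = displayed_remaining(elapsed_s, estimated_total_s)
    return 100.0 * elapsed_s / (elapsed_s + remaining)
```

With a one-hour estimate, the displayed countdown tracks the naive value down to 10 minutes and then slows, so a job that runs to two hours still reports a little under 100% instead of overshooting.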
37) Message boards : RALPH@home bug list : Bug reports for 5.56-5.59 (Message 2980)
Posted 2 Apr 2007 by Rhiju
Post: Updates in 5.59
I think this is the last update. Everything ran pretty smoothly in 5.58. This just has some small updates in the science, to get back some useful scores for each decoy and a small set of fixes for the symmetric FOLD_AND_DOCK workunits.
38) Message boards : RALPH@home bug list : Bug reports for 5.56-5.59 (Message 2976)
Posted 1 Apr 2007 by Rhiju
Post: Anders n, I think the behavior you observe is partly due to an additional "correction" that the BOINC API applies when estimating time to completion -- it should never really be over 4 hours, right? We really don't have any control over that extra "correction". But we do have control over percent complete, and that shouldn't go to zero upon resuming ralph! So I'm still worried. On my Mac Intel machine, I just tried to suspend a ralph WU, and ran einstein@home for a few minutes; then suspended the einstein@home workunit, and resumed the ralph WU. Everything was fine (pct complete never dropped to zero)... when you try this, does pct complete drop to zero? [edit] Another question: you posted that 5.57 was fine; are you seeing an issue only with 5.58? If so, this is totally puzzling, since the small change I made to the Mac app shouldn't affect behavior of pct complete.
OK, just talked to David K about this. Right now we keep track of time crunched based on a call to the BOINC API ... i.e. the BOINC manager keeps track of how much time was spent on each workunit. If you preempt after an hour and resume later, the BOINC manager will tell Rosetta about the hour already spent.
39) Message boards : RALPH@home bug list : Bug reports for 5.56-5.59 (Message 2972)
Posted 31 Mar 2007 by Rhiju
Post: OK, just talked to David K about this. Right now we keep track of time crunched based on a call to the BOINC API ... i.e. the BOINC manager keeps track of how much time was spent on each workunit. If you preempt after an hour and resume later, the BOINC manager will tell Rosetta about the hour already spent. But if you shut BOINC down and restart, that could cause a problem in a lot of estimates... we can try to make the Rosetta app more self-sufficient, keeping track of CPU time spent so far, but that might be a can of worms. Worth the time? I think it's a better use of our time to figure out what's going wrong with the Mac's preempt/resume so that most users will not need to shut down BOINC and restart like Anders n has been doing! And we'll spend time getting in those checkpoints...
...our solution to this problem is to be careful -- we do *not* plan to send workunits to Rosetta@home that take more than an hour per decoy!
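The "more self-sufficient" option mentioned in the post above, where the app keeps its own record of CPU time across restarts, might look roughly like the sketch below. It is a hypothetical illustration in Python, not Rosetta code: the class name, the JSON file, and its `cpu_seconds` field are all assumptions, and no BOINC call is used.

```python
import json
import os
import time


class CpuTimeTracker:
    """Accumulates CPU time across process restarts (hypothetical sketch)."""

    def __init__(self, path):
        self.path = path
        # CPU seconds recorded by earlier sessions, or 0.0 on a fresh start.
        self.saved_s = self._load()
        self._session_start = time.process_time()

    def _load(self):
        try:
            with open(self.path) as f:
                return float(json.load(f)["cpu_seconds"])
        except (OSError, ValueError, KeyError, TypeError):
            # Missing or unreadable file: treat as no prior work.
            return 0.0

    def total_cpu_seconds(self):
        """Earlier sessions plus CPU time used by this process so far."""
        return self.saved_s + (time.process_time() - self._session_start)

    def checkpoint(self):
        """Write the running total atomically so a crash cannot corrupt it."""
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"cpu_seconds": self.total_cpu_seconds()}, f)
        os.replace(tmp, self.path)
```

Calling `checkpoint()` alongside each regular work checkpoint would keep the recorded time in step with the saved work, so time spent after the last checkpoint is discarded together with the work it produced.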
40) Message boards : RALPH@home bug list : Bug reports for 5.56-5.59 (Message 2971)
Posted 31 Mar 2007 by Rhiju
Post: Yes, I did send out some massively long workunits -- just testing out the system! Hmm, I hadn't carefully thought about what would happen if two models were completed on the first pass. Let me see if I can figure out a fix...
...our solution to this problem is to be careful -- we do *not* plan to send workunits to Rosetta@home that take more than an hour per decoy!
©2025 University of Washington
http://www.bakerlab.org