Posts by Chu

1) Message boards : RALPH@home bug list : minirosetta 1.58 (Message 4756)
Posted 24 Mar 2009 by Chu
Post:
Thanks for reporting. Some input and output files were not compressed properly for the WUs ending with "BOINC_MPZN_with_zinc_loop_modeling", which caused premature failures/exits. Sorry about that.

More problems on Mac OS X 10.4.11

WUs 1376869, 1376870, 1376871 failed; see below

ERROR: Conformation: fold_tree nres should match conformation nres. conformation nres: 137 fold_tree nres: 156589050
ERROR:: Exit from: src/core/conformation/Conformation.cc line: 224
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>

2) Message boards : RALPH@home bug list : Bug Reports for Minirosetta v1.40 (Message 4337)
Posted 5 Nov 2008 by Chu
Post:
Just to second Sarel's post, we have also located the graphics problem that occurs when a non-protein ligand is displayed and have implemented a fix for it. So please let us know if you still observe such problems. Thanks.

3) Message boards : RALPH@home bug list : Bug Reports for Minirosetta v1.38 (Message 4288)
Posted 22 Oct 2008 by Chu
Post:
That is the zinc metal ion "ZN2+", modeled together with proteins that naturally bind it for structural stability and/or functional catalysis. I will post a more detailed description to explain the background of this project and the goal we are hoping to achieve.
Chu! Long time no "see"! Welcome back.

What's with the little red ball?

Make this thread sticky. Maybe unsticky some others.

4) Message boards : RALPH@home bug list : Bug Reports for Minirosetta v1.38 (Message 4286)
Posted 22 Oct 2008 by Chu
Post:
Hi everyone,

Long time no see!

Please post issues/bugs relating to minirosetta version 1.38 here. In this version, we added two important applications to minirosetta: docking, and protein folding with an explicit zinc metal ion. Please pay extra attention to some new features of the graphics, such as "color by chain" and "spacefill display of metal ligand", and let us know if you observe any problems or have suggestions or feedback. Thanks.

Here are two screenshots, one for protein folding with zinc metal (red ball) and one for docking of two protein structures (colored by protein chain) to get their complex form.

5) Message boards : RALPH@home bug list : bug report for rosetta 5.47 & 5.48 (Message 2807)
Posted 27 Feb 2007 by Chu
Post:
Ralph has been updated to 5.47. In this update, we mainly include a special docking protocol that handles symmetric systems. We also include another fix for the problem of "false watchdog termination" in many docking workunits. Please continue to report any bugs you catch on your computers here. Thank you.

Ralph has been updated to 5.48. In this update, we mainly fix a problem in the symmetric docking protocol seen in the previous update.

6) Message boards : RALPH@home bug list : Bug Report for 5.46 (Message 2796)
Posted 12 Feb 2007 by Chu
Post:
Don't worry. This batch of docking WUs has a higher memory requirement because it attempts to handle relatively large proteins. When other WUs with a normal memory requirement are put in the queue (I believe the default is 256MB for this project), your computer will keep crunching.
Is this memory usage normal?

12/02/2007 02:41|ralph@home|Message from server: Your computer has 313.84MB of memory, and a job requires 476.84MB
12/02/2007 02:41|ralph@home|Message from server: No work sent
12/02/2007 02:41|ralph@home|Message from server: (there was work but your computer doesn't have enough memory)
12/02/2007 02:41|ralph@home|No work from project

This seems to be new, because this host has already crunched some WUs.

cu,
Micha
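
For anyone curious how the server messages quoted above come about, here is a minimal sketch of the comparison, not the actual BOINC scheduler code. The function name is hypothetical, "MB" is assumed to mean 1024*1024 bytes, and the job's requirement is assumed to be 500,000,000 bytes (which happens to print as 476.84MB).

```python
MIB = 1024 * 1024  # assumption: the "MB" in the server message means 2**20 bytes


def scheduler_reply(host_mem_bytes, job_mem_bytes):
    """Return the messages a server might send for one job (hypothetical helper)."""
    if host_mem_bytes >= job_mem_bytes:
        return ["work sent"]
    return [
        f"Your computer has {host_mem_bytes / MIB:.2f}MB of memory, "
        f"and a job requires {job_mem_bytes / MIB:.2f}MB",
        "No work sent",
    ]


# Values taken from the log above; the byte counts themselves are assumptions.
host = round(313.84 * MIB)   # host reported as 313.84MB
big_docking_job = 500_000_000  # assumed requirement of the large docking batch
normal_job = 256 * MIB       # the default mentioned in the reply above

if __name__ == "__main__":
    for line in scheduler_reply(host, big_docking_job):
        print(line)
    for line in scheduler_reply(host, normal_job):
        print(line)
```

Under those assumptions the large docking job is refused for this host, while a job with the 256MB default would still be sent, which matches the reply above.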

7) Message boards : RALPH@home bug list : Bug Report for 5.46 (Message 2793)
Posted 12 Feb 2007 by Chu
Post:
Ralph has been updated to 5.46. In this update, we mainly fixed the problem of "false watchdog termination" for many docking workunits. For more information on this bug, please see here. Please continue to report any bugs you catch on your computers here. Thank you.

8) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2784)
Posted 8 Feb 2007 by Chu
Post:
In the early stage of some simulations, we carry out a low-resolution search, so sidechains are not shown. Usually the first box will show either "search backbone" (no sidechains) or "search_full_atom" (with sidechains).
I just got one of these WUs:
1who__BOINC_ABINITIO_CONTROL2__1749_26_0 using rosetta_beta version 545
the graphic doesn't show the sidechains.

9) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2783)
Posted 8 Feb 2007 by Chu
Post:
Sounds like a problem interfacing with the BOINC manager. Those WUs themselves are fine, and several of the ones you killed actually showed that they were stuck at score 0, which means this did not happen in the middle of a simulation. Next time, could you please close the BOINC manager and re-open it to see if any of these WUs finish and get reported? If that does not help, then go ahead and kill them. In addition, it seems to be specific to your Linux hosts and not Windows, right?

> Just had 4 Work Units fail, all at 1 hour processing time, I am expecting the other 2 to fail as well.
All the work units got 'stuck' and the Watchdog says it ended the run, but this is not the case.
All 4 work units on the Boinc Manager said that they were still running with NO CPU usage but still using up to 308 MB of RAM for each WU. All 4 got to 1 hour (my preferences are for 6 hours) and then said they were 100% complete but the WU did not release the CPU to go to another task.

http://ralph.bakerlab.org/result.php?resultid=420621
http://ralph.bakerlab.org/result.php?resultid=420709
http://ralph.bakerlab.org/result.php?resultid=420761
http://ralph.bakerlab.org/result.php?resultid=420767

Thanks

10) Message boards : RALPH@home bug list : WUs hang with Fedora (Core6) (Message 2759)
Posted 2 Feb 2007 by Chu
Post:
Not sure why this happens. I suggest you post it again on the Rosetta@Home forum, as there are more people there and you are more likely to get some feedback. This could also be a general problem if other users report similar behavior.
I have 2 computers running Fedora core 6 that seem to get hung up on certain WUs. It's happening with both ralph 5.45 & rosetta 5.45.

The first one is a Pent D 2.8 with 1Gb RAM and the 2nd is an old laptop with a Pent 4M 1.8 with 1Gb RAM.

They both seem to hang on workunits. The WU will get around 1/3 complete and then the CPU in boincview will drop to 0. I've left them running overnight to see if they become unstuck but they never do.

WUs:
http://ralph.bakerlab.org/workunit.php?wuid=365447
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=53704537

Anybody have any ideas?

-J

11) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2748)
Posted 30 Jan 2007 by Chu
Post:
Now it is updated on Rosetta@Home and you will get plenty of WUs to crunch. Just be aware that there is still a minor unsolved problem on Mac platforms. See here.
It isn't possible to test this update on my Mac as there is no work units. I did get 2 work units on one day, but they ran and I didn't notice them, so I couldn't turn on the graphics.

12) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2743)
Posted 29 Jan 2007 by Chu
Post:
If there is any apology to make, that should be from us. Thank you for your time and effort helping us.

I see, you were having problems when a Rosetta job was pre-empted and swapped in and out with other BOINC applications. This is consistent with my earlier speculation that your problem is probably not graphics-related. Honestly, I don't know exactly what has gone wrong either, but it could be related to the BOINC API we used for Rosetta 5.43 (though that would not explain why the problem did not happen on all other clients' machines). The current 5.45 being tested on Ralph has been built with the newest version of the BOINC API, and that might help solve your problem. The plan is to put it on Rosetta@Home either later today or tomorrow. So please give it a try when it is upgraded and see if things improve on your side. Again, thank you for your generous contribution to our project.

The error message you got is certainly one of the symptoms related to graphics, but it is definitely not limited to that. May I ask if you have experienced any stability issues with your machine in general?


Hi Chu. Apologies for the long post.

No, I've never had any stability issue with my machine for any applications I run on it, with the sole exception that it doesn't like running the BOINC manager at the same time as I'm ripping DVDs. Other than that, it's rock solid. It's fairly well overclocked - I'm running a Core2Duo E6700 at 3.46 GHz, and my PC6400-rated RAM is actually running as PC8200 - but it's tested completely stable, and several months of running both cores at 100% capacity 24/7 has never generated a single error for any BOINC application WU except Rosetta. Rosetta, though, became very touchy about running. It would inevitably fail a WU that was pre-empted and swapped out to allow something else to run. I had to leave it running all the time on one core.

We certainly do not want to lose users because of application stability, and that is why we are working on improving it. Maybe you can check whether this is improved in 5.45, and if the failure rate goes down significantly, you may consider attaching back to Rosetta@Home.


I was quite puzzled and a bit disturbed at how the failure rate on Rosetta got more and more pronounced over time without any change to my machine's configuration or any other evidence of instability. I kept going for as long as possible because I liked crunching Rosetta and I'd accumulated a very respectable number of WUs. But the failure rate was becoming alarming, and on the 15th-16th January this year some 75-80% of all WUs aborted prematurely. That's when I regretfully had to call a halt. I joined RALPH to see whether the newer versions were more stable with an eye to going back to Rosetta when they're implemented. It's hard to tell, since the fairly irregular availability of work means I don't have a large WU base to draw conclusions from, but both 5.45 and 5.44 before it seem more stable than 5.43 on my machine; for one thing, they can both be swapped in and out to allow other BOINC applications to run without causing problems.

Out of curiosity, since the beta versions seemed more stable, I allowed my BOINC manager to download some new Rosetta workunits under 5.43 on Jan 27th. Sure enough, the first three it tried to run all failed with access violations, here, here and here. The fourth WU succeeded. By that stage, though, I'd had enough again and shut it down.

I have no idea why this is happening, and the 10% failure rate you mention would have been, if anything, an overestimate of the situation during the first few months I was crunching. The problems really seem to stem from the introduction of 5.43; which is puzzling since I don't use the graphics. I'll certainly try Rosetta again when 5.43 is upgraded, but I'd be a lot happier if I knew what was going wrong.



13) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2741)
Posted 29 Jan 2007 by Chu
Post:
Thanks Anders n, that might be due to a bad trajectory.
1 more stuck WU on my Mac.

http://ralph.bakerlab.org/result.php?resultid=406892

I will set the target time to 4 h to see if the problem disappears.

Anders n

14) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2740)
Posted 29 Jan 2007 by Chu
Post:
Hi Viromancy, I am a little surprised to hear that even with graphics disabled, you got a 75% failure rate for Rosetta@Home; from our current statistics, that number stays below 10% on average for the Windows platform. The error message you got is certainly one of the symptoms related to graphics, but it is definitely not limited to that. May I ask if you have experienced any stability issues with your machine in general? We certainly do not want to lose users because of application stability, and that is why we are working on improving it. Maybe you can check whether this is improved in 5.45, and if the failure rate goes down significantly, you may consider attaching back to Rosetta@Home.

BTW, the last three failures mentioned in your post below were caused by problems in the Rosetta science code, and catching those is exactly the purpose of running the alpha test.
Failed WU here.

Same type of error that forced me to stop crunching Rosetta altogether after decreasing stability for ver 5.43 resulted in around 75% of WUs aborting prematurely. Never had this problem at all with any WUs from other BOINC applications I run (World Community Grid/Malaria Control) and very rare with Rosetta before version 5.43. Had one instance of the same with version 5.44 here. Also, along with others, saw three odd, unrelated WU failures with ver 5.44 just before 5.45 was introduced here, here and here. I know these latter aren't ver 5.45, but for sake of completeness I thought it was worth mentioning.

I don't use graphics, at all. All these errors, and almost all of the constant errors being thrown up by Rosetta ver 5.43, occurred while the application was running in the background and the machine was otherwise idle.

15) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2733)
Posted 28 Jan 2007 by Chu
Post:
The current fix should not have any impact on the performance as compared to before.

We can define a high memory requirement in our job submission script so that the batch is only sent to clients with more memory. For most Rosetta jobs the default value should be fine, but with Rosetta design coming along, it will probably require more memory than usual.


Ya my TFlops comment was optimistically looking forward to the new code rolling out to Rosetta and fewer users there having problems or confusion, or leaving due to failures.

I think just do what you're doing, keep small amounts of work coming at various times of day (think dial-up, each day after work). But I just wanted to point out that this test has enough special circumstances around it that it needs more time than most you've done before here on Ralph.

Speaking of TFlops, were you able to devise thread safety without too much of a performance impact? I've always been curious how many conformations would be showing if the graphic actually showed each and every one of them.

I picked up two DOC WUs last night on the PC that I was trying (and having problems with) previously, running 24hr time pref. so they're 6.5hrs in without any graphics enabled. Then I'll be using my PC most of today and have suspended Rosetta and enabled the ss for tonight.

...2 DOC WUs, one using 204MB the other using 177MB. So, I'll ask again, is there a simple way we can tell that a given WU was designed for high memory systems?

16) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2721)
Posted 28 Jan 2007 by Chu
Post:
Great, one positive data point, thanks for the report. If possible, try to leave the graphics window open even if you do not stay in front of your computer all the time.
I just successfully completed one WU.

Opened the graphics window and played around rotating the protein.

Using BOINC 5.8.6a. P4 2.8, 512 of RAM, XP Pro.

17) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2720)
Posted 28 Jan 2007 by Chu
Post:
Good point. How should we proceed? Right after the update this afternoon, we sent out about 600 WUs, and half of them are already done. However, my guess is that most of them were crunched without graphics at all, as people may not have learned of the update in time to enable their graphics. We do need to send several batches for testing, and I just want to spread the word a little more before doing so. There are two ways people can help with testing:

1. Keep the screensaver disabled but manually enable graphics within the BOINC manager by pushing the "show graphics" button (as reported by KSMarksPsych above). This way Rosetta@Home does not have to be suspended, but more attention from users is required.

2. Suspend Rosetta@Home first and enable the BOINC screensaver. My only concern is that TFlops for Rosetta may drop temporarily and Ralph may not have enough WUs to feed all the testing hosts, so a lot of time would be wasted.

I personally prefer the first option, but if anybody has a better solution, please let us know. Meanwhile, we will send out graphics testing WUs periodically so that we get enough coverage before drawing conclusions.

Yippie!! Project TFlops here we come!

Do you plan to do several batches of Ralph testing? People need time to suspend Rosetta so they can enable the screensaver to test the Ralph tasks, and then time to catch some tasks available on the server etc. etc.

1,000 tasks, twice a day for a few days?

Keep in mind, most users now do not use the screensaver. And most Ralph users also run Rosetta, so we're going to have to do a little jockeying around to do some good tests.

18) Message boards : RALPH@home bug list : Bug Reports for 5.45 (Message 2715)
Posted 27 Jan 2007 by Chu
Post:
Ralph has been updated to 5.45. In this update, we include a fix for the long-known graphics problem, and we would like to test it here on RALPH first. In beta tests on our local Windows and Mac hosts, different Rosetta jobs that used to crash within 5 to 10 minutes with graphics on are now running in a much more stable manner. Given the desirable test results, we have turned the sidechain drawing and mouse-rotation features back on. Please give it a try, either by turning on graphics in the BOINC manager or by enabling the BOINC screensaver. If you spot any problem, please report it to us here (more detailed descriptions of errors are preferred). Thanks.

For Mac users, even with the fix we still see that the graphics frame sometimes freezes suddenly because the graphics thread gets trapped (somewhere in the GLUT library). When this happens, the graphics window can be closed without any problem but cannot be re-opened. The effect is limited to the graphics thread; the worker thread still runs properly (you can see the progress increase) and returns valid results when it finishes. (Before the fix, this used to crash both the graphics thread and the worker thread and trigger a segmentation violation or bus error.) If you see similar behavior for Ralph jobs, please keep the WU crunching and see whether it does produce results properly in the end. Thanks.

For Windows users, we have not seen any problems so far in our local tests and would like to see how it goes on Ralph.

19) Message boards : RALPH@home bug list : No Computers to Merge (Message 2714)
Posted 27 Jan 2007 by Chu
Post:
Sorry for the late reply. We have a link on the front page that goes directly to the bug report list, which everyone uses to check the status of Ralph testing jobs, and so we have not come here often enough to see your message. I will try to pass this message to the responsible party and see what we can do. Thanks.

> Had another Boinc hiccup that somehow created a new host for no reason (created a new host on 3 other projects as well) and it is using the new host not the old one.
I went into the computer and selected "Merge Hosts" but the system tells me there are no hosts to merge. Of the 4 projects I got the same computer duplicated 2 of them (QMC and Einstein) allowed me to merge the computers back into one machine. Docking and Ralph have not allowed me to merge them.

Host 3204 is my host
Host 5474 is the new host that got created.
They both have the same specs but I can't merge them.

I would like the credit to all be on the one machine, Boinc shows the computer to be AuthenticAMD Dual Core AMD Opteron(tm) Processor 2.

Thanks and hope to hear from you soon.


This is my post from the 21/11/06, I would very much like to hear from the project staff.


I have resolved the problem I mentioned in this post on the 7th October as I was able to delete the host seeing as it had no credit (AMD 4800+ host).

However I am still unable to merge my AMD Opteron 275 (shown in Boinc as AuthenticAMD Dual Core AMD Opteron(tm) Processor 2).
I have not heard from the project staff and was wondering if you'd had a chance to see why I am unable to merge these two hosts (3204 and 5474).
They both have credit so I can't just delete one.

Hoping to hear from you soon.


Can you help me Ralph@home project team? I would like these two hosts merged but it does not work from my account.
Still hoping to hear from you soon.

20) Message boards : RALPH@home bug list : Bug Reports for 5.44 (Message 2693)
Posted 21 Jan 2007 by Chu
Post:
Did you get errors like this for all the recent WUs so far?
With last WU on Linux box I'm getting this message right after WU starts:

ralph@home 21.1.07 11:09:15 Task 1c9oA_BOINC_ABRELAX_NEWRELAXFLAGS_frags83__1640_3_0 exited with zero status but no 'finished' file
ralph@home 21.1.07 14:50:47 If this happens repeatedly you may need to reset the project.

Resetting project doesn't help.
Any idea?






©2024 University of Washington
http://www.bakerlab.org