21)
Message boards :
Feedback :
User of the day not changed since new BOINC release
(Message 1576)
Posted 11 May 2006 by Moderator9 Post: ...I'm pretty sure I got the tone right. You have nothing to apologize for. It was my fault. I was in a hurry and gave a shorter answer than I would normally. A new version of the Rosetta application had just been released when I found your message and I was hurrying to get a lot of things done, but did not want you to think you were being ignored. |
22)
Message boards :
Feedback :
User of the day not changed since new BOINC release
(Message 1573)
Posted 11 May 2006 by Moderator9 Post: ...Nice. I must confess I was not expecting this tired old answer. Or attitude! It is clear that you have misunderstood the content or tone of my post. May I direct your attention to this thread and this page of RALPH information. RALPH is NOT a production environment. As such, many things that are considered aesthetics are not priority issues for repair or change. The User of the day function is just such an item. When you are attached to RALPH there will not be a steady stream of work. There will be times that your system may crash. You will not always get credits for any processing that is done. Team scores may be inaccurate, statistics feeds may not work or may be turned off at any time, and the entire database, credits and all, may be purged from time to time. The application and the work units run on RALPH may impact other projects you may be running. None of these things should (or do) occur on Rosetta, as Rosetta is a production environment. Had the User of the day function been damaged on Rosetta, it would be fixed immediately, because that is where the project focuses its support for such functions. I am sorry if this offends you but ALPHA testing is not for everyone. The User of the day problem has been reported to the project administrators (more than once). I am sorry if providing you with an honest answer to your question does not satisfy your needs, but I will not provide any user with false hope for a change, only to have to explain later why nothing happened to fix their problem. |
23)
Message boards :
Feedback :
User of the day not changed since new BOINC release
(Message 1566)
Posted 10 May 2006 by Moderator9 Post: We will also notice how long it takes to get fixed. Actually it has already been a month. But it is not likely to get fixed any time soon, as it has nothing to do with the purpose of the RALPH ALPHA testing. The focus here is on testing new Work Unit types, application versions, and approaches to the science. None of the competitive aspects of the BOINC world are important for RALPH and no energy is spent in addressing problems in those areas. I can assure you that the system administrator currently is working on more important issues. Should he become particularly bored at some point he may try to fix the User of the Day display, but it will be some time before that happens. It is my understanding that the UOTD function is not turned off, but that something is wrong with the implementation of the function in the version of the server software running on RALPH. So it is not as simple as flipping a switch. In any case it has no impact on the functional testing conducted on RALPH. |
24)
Message boards :
RALPH@home bug list :
Bug reports for Ralph 5.11 and 5.12
(Message 1542)
Posted 8 May 2006 by Moderator9 Post: I've suggested that if a given application version proves valid, then go ahead and USE the results for the science, and then they could keep a small but steady stream of WUs on Ralph and allow debts to be worked down and make things run more normally... You can set the share for RALPH to a lower value. This works especially well if you are connected all the time (not through a modem). The system will seek work according to your connection frequency settings, but it will not run up a high debt when there is no work because of the lower share value. When there is work it will have a high enough debt to always get at least a few work units. |
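To illustrate why a lower share keeps the debt from running up, here is a rough toy model in Python. It is not BOINC's actual scheduling code: it simply assumes that a project with no work accrues debt in proportion to its fraction of the total resource share. The function name and the share values are made up for illustration.

```python
def accrued_debt_hours(project_share, total_share, idle_hours):
    """CPU-hours 'owed' to a project that had no work for idle_hours.

    Toy assumption: debt grows in proportion to the project's fraction
    of the total resource share. The real BOINC client is more involved.
    """
    return idle_hours * project_share / total_share


# Two days with no RALPH work, alongside one other project with share 100.
equal_share = accrued_debt_hours(100, 100 + 100, 48)  # RALPH share 100
low_share = accrued_debt_hours(10, 10 + 100, 48)      # RALPH share 10

print(f"equal share: {equal_share:.1f} h owed")
print(f"low share:   {low_share:.1f} h owed")
```

Under this assumption the low-share setup still builds up some debt, so RALPH gets work when it appears, but far less than the equal-share setup.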
25)
Message boards :
RALPH@home bug list :
Bug reports for Ralph 5.09 and 5.10
(Message 1520)
Posted 6 May 2006 by Moderator9 Post: ... and an error box asking if I should send the report to MS. WU 90436 "MS"!! Well there is your problem right there. Send it to the attention of "Bill". ;>) |
26)
Message boards :
RALPH@home bug list :
Bug reports for Ralph 5.09 and 5.10
(Message 1518)
Posted 6 May 2006 by Moderator9 Post: notice the graphics thread on rosetta production v5.07 Those are MAXIMUM settings. If it does not need the whole amount it will not use it all. |
27)
Message boards :
RALPH@home bug list :
Bug reports for Ralph 5.09 and 5.10
(Message 1510)
Posted 6 May 2006 by Moderator9 Post: Rhiju, There is no doubt about it. The display of the "JUMPING/STRAND Break Protocol" is a real cat pleaser. The thing looks like a well baited fish hook. But there does seem to be a problem with the new text display repeating for at least the first three models. With each repeat the overall graphic seems to stretch vertically, distorting the display slightly each time the lines are repeated. It is as if a file or variable that should be overwritten is not being cleared correctly. But you have solved the problem of the graphics not fitting in the boxes. That part looks real good. One thing you might consider for the graphic is to keep the count of steps for the first model and display that as the projected step count for subsequent models. Also, if the RMSD and ENERGY of the target are known, those would be good to include so people can see if they are getting close as the models are running. Obviously this would only be for known structures. So far the only errors I have seen for Mac or Windows on any of these machines have been on the HOMO Work Units, which seem to just be recycles from other systems. |
28)
Message boards :
Cafe RALPH :
SOMEone PLEASE vote SOMEONE for user of the day!!
(Message 1507)
Posted 6 May 2006 by Moderator9 Post: As you may be aware RALPH is a test project. The purpose of the User of the Day on RALPH is to test the tolerance of the user community for consistency in this area of the homepage.... I have brought this to the attention of the folks who can change it. But there are higher priorities right now. So if I understand you correctly the User of the Month idea is not popular? |
29)
Message boards :
RALPH@home bug list :
Bug reports for Ralph 5.09 and 5.10
(Message 1506)
Posted 6 May 2006 by Moderator9 Post: not sure if this is a bug, more a cosmetic thing probably. The large blank area is the result of the resizing to fit the proteins inside the right box. That blank area looks bigger on 16:9 aspect ratio than it does on normal monitors. |
30)
Message boards :
RALPH@home bug list :
Bug reports for Ralph 5.08
(Message 1491)
Posted 5 May 2006 by Moderator9 Post: v5.08 has generated mixed results on my Linux box. Although several WU’s completed successfully, I’ve also had several result in computational errors: These results came from a Work Unit type that had a problem. See this post suggesting they be aborted. |
31)
Message boards :
RALPH@home bug list :
Bug reports for Ralph 5.05 and higher
(Message 1490)
Posted 5 May 2006 by Moderator9 Post: So the short of this is, if the workunit is simply running uninterrupted, it could run forever, or until it hits the Max time setting. This is the risk of running a single project setup. If you don't see movement in the graphic, try suspending the Work unit and letting the system run a different one for 5 min. Then restart the first Work unit again for 5 min. Repeat this process 4-5 times and it should abort the workunit if it was stuck. If it is not stuck it should let it keep running. Either that or we have a watchdog bug. Well, it is really two separate functions that are fallbacks to one another. If the watchdog never has the opportunity to work (i.e. the work unit is never stopped and started for the check to occur) then the Work Unit will hit a wall for maximum time to process. The Max time function is independent of the watchdog and works on a different set of criteria and variables. The Max time is hard coded by the project before the Work unit is sent out. Right now that max time on Rosetta is 24 hours. I think it is the same for Ralph, but Rhiju would have to verify that, because it could be different for each set of Work Units. In any case you are correct. If your system was in EDF mode, the watchdog would not likely have kicked in. Perhaps that is a good reason to revisit how the checking is done. |
32)
Message boards :
RALPH@home bug list :
Bug reports for Ralph 5.05 and higher
(Message 1482)
Posted 5 May 2006 by Moderator9 Post: Version 5.09 has been released. If you have errors in Version 5.09 please report them in the 5.09 Bug thread. |
33)
Message boards :
RALPH@home bug list :
Bug reports for Ralph 5.05 and higher
(Message 1478)
Posted 5 May 2006 by Moderator9 Post: This computer is headless. Remote access only. Hence no screensaver. If it is a BIG protein you may have to wait for some time to see the steps advance, but you may be able to detect the slightest motion in the searching window image. If you see either the steps counting up or the movement in the searching window, it is still processing. On some of the large Work Units, it is possible for them to run very long times past your time setting. I would note however that yours is running way too long over the time setting. I have had a few lately that went 14 hours with a time setting of 2 hours. The point being this. Unless the Workunit is either swapped out for project switching, or BOINC is turned on and off four times, the watchdog will never wake up and abort the work unit. Failing that, the work unit will be aborted when it hits a limit preset by the project, which SHOULD be 24 hours of CPU time. My understanding is that it is designed to look at the Work unit each time it starts to process and determine if progress has been made since the last time it started up. This presupposes that the process was stopped for some reason. It does not just sit there checking the work unit all the time. If it never stops processing the workunit it will not check it. With luck Rhiju will chime in here and correct me if I am wrong about this, but I am going on the last explanation I had for all this. Now let me add a caution here. If you restart BOINC before the workunit reaches a percent complete of greater than 2%, the Work unit WILL START OVER FROM THE BEGINNING AND THE CPU TIME WILL RESET TO ZERO! So if you are going to play with starting and stopping, you should have "keep in memory" set to yes, and then suspend the Work unit or start another project long enough for another process to run for a while. The watchdog is supposed to do 4 of these checks which show no progress before it will abort the workunit.
That is part of how they worked out the "four times your time setting" concept for manual aborts. So the short of this is, if the workunit is simply running uninterrupted, it could run forever, or until it hits the Max time setting. This is the risk of running a single project setup. If you don't see movement in the graphic, try suspending the Work unit and letting the system run a different one for 5 min. Then restart the first Work unit again for 5 min. Repeat this process 4-5 times and it should abort the workunit if it was stuck. If it is not stuck it should let it keep running. Either that or we have a watchdog bug. |
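The watchdog behavior described in this post can be sketched roughly as follows. This is only a model of the explanation above, not the actual Rosetta code: the class and function names are invented, and resetting the counter when progress is seen is an assumption. The two constants come from the post (4 no-progress checks, and a max time that "SHOULD be 24 hours of CPU time").

```python
MAX_CPU_SECONDS = 24 * 3600   # project-set limit, per the post
STALLED_CHECKS_TO_ABORT = 4   # no-progress checks before the watchdog aborts


class WorkUnit:
    """Hypothetical stand-in for a running work unit."""

    def __init__(self):
        self.cpu_seconds = 0.0
        self.steps_done = 0
        self.steps_at_last_start = None  # None until the first start
        self.stalled_checks = 0
        self.aborted = False


def on_start(wu):
    """Watchdog check: runs only when the work unit starts or restarts."""
    if wu.steps_at_last_start is not None and wu.steps_done == wu.steps_at_last_start:
        wu.stalled_checks += 1
    else:
        wu.stalled_checks = 0  # assumption: progress clears the count
    wu.steps_at_last_start = wu.steps_done
    if wu.stalled_checks >= STALLED_CHECKS_TO_ABORT:
        wu.aborted = True


def run_for(wu, seconds, steps):
    """Accumulate CPU time; the max-time limit is independent of the watchdog."""
    if wu.aborted:
        return
    wu.cpu_seconds += seconds
    wu.steps_done += steps
    if wu.cpu_seconds >= MAX_CPU_SECONDS:
        wu.aborted = True


# A stuck work unit that is suspended and resumed every 5 minutes.
stuck = WorkUnit()
on_start(stuck)
restarts = 0
while not stuck.aborted:
    run_for(stuck, 300, 0)   # 5 minutes, no steps completed
    on_start(stuck)
    restarts += 1
print("watchdog aborted after", restarts, "restarts")
```

In this model a stuck work unit that is never stopped and restarted never triggers `on_start` again, so only the max-time limit in `run_for` can end it, which is the single-project risk the post describes.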
34)
Message boards :
Feedback :
User of the day not changed since new BOINC release
(Message 1465)
Posted 3 May 2006 by Moderator9 Post: Seems ever since they took the Ralph servers down and installed the new BOINC server version, the user of the day has not changed. This has already been reported as a result of posts in a separate thread |
35)
Message boards :
Cafe RALPH :
SOMEone PLEASE vote SOMEONE for user of the day!!
(Message 1462)
Posted 3 May 2006 by Moderator9 Post: I was under the impression that UOTD was picked at random each day, As you may be aware RALPH is a test project. The purpose of the User of the Day on RALPH is to test the tolerance of the user community for consistency in this area of the homepage.... Ok, Ok, I'll tell someone about it. But don't be surprised if it takes a while for the thing to change. There are not many profiles to choose from. |
36)
Message boards :
Number crunching :
Checkpointing, more credits? Or more models?
(Message 1460)
Posted 2 May 2006 by Moderator9 Post:
And well, not surprisingly that is precisely what has happened. If you look at the graphs on BOINCStats for Teraflops, and you have been watching the homepage of Rosetta, you can see the effect. You have to ignore Friday because there is a spike caused by failed credit awards on Friday, but the project is showing about 27TF and there is a general trend upward. It rises and falls a little but still the trend is up. The important thing is that only a week ago the project was stalled at about 24TF. That 3 TF gain is all about fixing the errors, and reductions in time lost from checkpointing issues. By my estimates there is about another 1TF that will come from additional error fixing. There could be another 2-4 TF still being lost due to long checkpointing. There is also about 2-3TF available if the Mac version of the application is fixed and optimized using Altivec coding. So there is still about 5 TF that could be squeezed out of the existing attach base of the project. This is all without adding a single system. Now to be fair there have been systems joining and returning every day so some part of the improvements comes from that as well. |
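For anyone who wants to check the arithmetic, the figures in this post add up as in the small Python sketch below. The numbers are the estimates quoted above; the variable names are just for illustration.

```python
current_tf = 27.0    # what the project is showing now
week_ago_tf = 24.0   # where it was stalled a week ago
gain_tf = current_tf - week_ago_tf

# Estimated throughput still to be recovered, as (low, high) ranges in TF.
remaining_tf = {
    "additional error fixing": (1.0, 1.0),
    "long checkpointing": (2.0, 4.0),
    "Mac version fixed and optimized (Altivec)": (2.0, 3.0),
}

low_total = sum(low for low, _ in remaining_tf.values())
high_total = sum(high for _, high in remaining_tf.values())

print(f"gain so far: {gain_tf} TF")
print(f"still available: {low_total} to {high_total} TF")
```

The "about 5 TF" figure corresponds to the low end of each range; taking the high end of each range would give 8 TF.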
37)
Message boards :
Number crunching :
Checkpointing, more credits? Or more models?
(Message 1459)
Posted 2 May 2006 by Moderator9 Post:
And well, not surprisingly that is precisely what has happened. If you look at the graphs on BOINCStats for Teraflops, and you have been watching the homepage of Rosetta, you can see the effect. You have to ignore Friday because there is a spike caused by failed credit awards on Friday, but the project is showing about 27TF and there is a general trend upward. It rises and falls a little but still the trend is up. The important thing is that only a week ago the project was stalled at about 24TF. That 3 TF gain is all about fixing the errors, and reductions in time lost from checkpointing issues. By my estimates there is about another 1TF that will come from additional error fixing. There could be another 2-4 TF still being lost due to long checkpointing. There is also about 2-3TF available if the Mac version of the application is fixed and optimized using Altivec coding. So there is still about 5 TF that could be squeezed out of the existing attach base of the project. This is all without adding a single system. Now to be fair there have been systems joining and returning every day so some part of the improvements comes from that as well. |
38)
Message boards :
Number crunching :
Checkpointing, more credits? Or more models?
(Message 1456)
Posted 2 May 2006 by Moderator9 Post: At one point it was mentioned that we were seeing 3x productivity on clients with the new checkpointing. I haven't tracked things closely enough... when I lose work due to preemption, does the time spent reset back to the checkpoint? And the credits is based on time spent, right? If the Work Unit is removed from memory, it will always roll back to the last checkpoint. When it starts on my systems this will usually result in lost time as well. The clock does not keep rolling forward if the percent resets. This is why it is still a good idea to set keep in memory to yes. All the projects lose some time because of this. CPDN and Rosetta are two of the more lossy in this regard, but all projects lose some time this way. |
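As a rough illustration of the rollback described above, the following Python sketch shows how much CPU time is lost on preemption. It is a simplified model of the behavior in this post, not BOINC or Rosetta code; the function name and the example times are hypothetical.

```python
def cpu_time_after_preemption(cpu_seconds, last_checkpoint_seconds, kept_in_memory):
    """Return (CPU time shown after resuming, seconds of work lost).

    Simplified model: if the work unit stays in memory nothing is lost;
    if it is removed from memory, both the progress and the clock fall
    back to the last checkpoint.
    """
    if kept_in_memory:
        return cpu_seconds, 0.0
    return last_checkpoint_seconds, cpu_seconds - last_checkpoint_seconds


# Preempted at 60 minutes of CPU time, last checkpoint at 45 minutes.
removed = cpu_time_after_preemption(3600.0, 2700.0, kept_in_memory=False)
kept = cpu_time_after_preemption(3600.0, 2700.0, kept_in_memory=True)
print("removed from memory:", removed)
print("kept in memory:     ", kept)
```

In this example, removing the work unit from memory costs the 15 minutes since the last checkpoint, which is why the longer the gap between checkpoints, the more time a project loses.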
39)
Message boards :
Feedback :
need extra target cpu time options
(Message 1453)
Posted 2 May 2006 by Moderator9 Post: the home page says: Please set your ralph settings to match your rosetta@home settings so that we can truly simulate the rosetta@home environment. You are correct. I will bring this to the attention of the project team. |
40)
Message boards :
RALPH@home bug list :
Bug reports for Ralph 5.08
(Message 1446)
Posted 1 May 2006 by Moderator9 Post: Hello again, i have this problem : Rhiju commented on a similar error in This Post , but I am not certain yours is a Watchdog stop. The file errors are the same, and I know Rhiju had a post about that somewhere too, but I can't find it right now. He will pick this thread up again tomorrow if he does not catch it tonight sometime. My recollection is that this will self correct. |
©2023 University of Washington
http://www.bakerlab.org