New crediting system


AuthorMessage
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 1985 - Posted: 12 Aug 2006, 0:43:47 UTC - in response to Message 1983.  

does anyone object to rolling this out to Rosetta@home?


I just crunched a few WUs for beta and I do object.

I have several WUs that did 5 decoys, a couple of WUs that did 6 decoys, and most WUs did 3 or fewer. The WUs were crunched in about the same time frame.

If we were to look at cross-project equalization of credits, Rosetta would be sitting at or very near the bottom, granting the fewest credits per CPU hour out there. This could in fact hurt the project, as many crunchers will seek out a project that grants higher credits per hour, or is at least more consistent in granting credits per CPU hour.

I have some old PIIIs that sometimes will crunch for 2-3 hours while returning only a single decoy. This credit system would make these machines nearly worthless, IMO, for returning any benefit in running them.



2 credits/model is being used right now on Ralph FOR TESTING ONLY. For Rosetta@home, the value will be truer to the actual CPU time used per model, since it will be determined from the Ralph test runs for each work unit.

ID: 1985
Conan
Joined: 16 Feb 06
Posts: 364
Credit: 1,368,421
RAC: 0
Message 1986 - Posted: 12 Aug 2006, 1:02:01 UTC

Would extending the "Time to Completion" from my current 4 hours to 6 or 8 hours generate more decoys and hence more credit? If so, we can extend the time to do the WU models. Or do Ralph (and Rosetta) require results as quickly as possible, so that 4 hours is better for the project?
Two credits for a couple of hours work is not very exciting if only 1 decoy is generated.
By the way, I am unable to check whether your new credit system actually credits anything, as I am unable to send results back to Ralph at the moment. I have been getting "No Schedulers Responded" for over a day now; it uploads the finished WU but does not report the WU results.
ID: 1986
Rollo
Joined: 13 Apr 06
Posts: 4
Credit: 610
RAC: 0
Message 1987 - Posted: 12 Aug 2006, 1:29:30 UTC

How about declaring a 2-week test period when you deploy the new credit system to Rosetta? After that period, the new credits will be erased if there are serious flaws. That way everybody can have a look and start complaining :).
I would announce that all comments on the new system made within the first week will not be read and will be deleted. So everybody is forced to take a one-week break before they can complain. Hopefully that helps to get higher-quality comments.
ID: 1987
Hoelder1in
Joined: 17 Feb 06
Posts: 11
Credit: 46,359
RAC: 0
Message 1988 - Posted: 12 Aug 2006, 1:32:36 UTC - in response to Message 1980.  
Last modified: 12 Aug 2006, 1:44:50 UTC

we can apply a correction factor to account for the over/under claiming hosts or as someone on the boards suggested, remove the top and bottom X percent.
Assuming that you use the median instead of the mean you are already removing the top and bottom 50%, so removing any additional percentages from the distribution will not have any effect on the median. ;-)
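A quick way to check the point about the median is to trim a sample symmetrically and compare. The sketch below uses made-up per-model credit claims; the numbers and the `trimmed` helper are assumptions for illustration only, not project data or project code.

```python
from statistics import mean, median

def trimmed(values, fraction):
    """Return the sorted values with the lowest and highest `fraction` removed."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)  # number of values dropped at each end
    return ordered[k:len(ordered) - k]

# Hypothetical claimed credits per model, including two over-claiming hosts.
claims = [1.2, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 3.0, 9.5, 14.0]

for fraction in (0.0, 0.10, 0.20):
    kept = trimmed(claims, fraction)
    print(f"trim {fraction:.0%}: median={median(kept)}, mean={mean(kept):.2f}")
```

With these numbers the median is the same at every trim level, while the mean drops as the over-claiming hosts are removed. Trimming only matters if the mean is used.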
Perhaps you can 'sell' the new credit system as WU/structure-specific 'Ralph quorum' applied to Rosetta (reminds me of the discussion on 'pseudo-redundancy' we had back in the old days ;-).

ID: 1988
Trog Dog
Joined: 8 Aug 06
Posts: 38
Credit: 41,996
RAC: 0
Message 1989 - Posted: 12 Aug 2006, 1:36:48 UTC - in response to Message 1984.  

Everyone keep in mind that the current standard BOINC crediting system will still be used.

Also, minor modifications to the credit/model values will not make that much of a difference in the long run. The important thing to know is that given any credit/model value, users will be on a level playing field. I think we can all agree that this is the major drive/motivation for coming up with a new method. Making sure it closely matches the BOINC credit values is not as important since we will still use the old system along with the new.


Why keep the old system in place?
ID: 1989
feet1st
Joined: 7 Mar 06
Posts: 313
Credit: 116,623
RAC: 0
Message 1990 - Posted: 12 Aug 2006, 2:43:35 UTC - in response to Message 1986.  

Would extending the "Time to Completion" from my current 4 hours to 6 or 8 hours generate more decoys and hence more credit?

YES.

Two credits for a couple of hours work is not very exciting if only 1 decoy is generated.

You're missing the point. If you are crunching a WU which takes that long to do a single model... then the credit awarded per model of that WU will be larger. And the overall idea is that regardless of which WU you happen to draw from the server, your credits per hour of crunching should be very consistent.

Many of you seem to be caught up on this 2 credit per model EXAMPLE. If they could have granted "X" rather than "2", I think it would have been clearer.

Let me outline how the workflow would be done. I think that might be where people are getting confused.

1) Calibration:
Run a mixture of WUs of various sizes on Ralph and come up with a benchmark. Take the time to crunch one model for each of 100 proteins on a given machine, and compare machines and processor speeds by how many hours of crunching it took to do 1 model for each of the 100 proteins. Now do some statistics work and try to normalize that into FLOPs, because BOINC credits are FLOPs-based. Then divide that number of credits back across the 100 different proteins in proportion to the relative amount of time each protein's model took to crunch.

Example: say we crunch this 100-model benchmark in 24 hours on a PC that is capable of earning 100 credits per day under the old system. If we concur that the 100 credits is in line with the number of FLOPs that machine can do in a day, then we'd take that 100 credits and divide it amongst the proteins. One of the proteins took 1 hour to complete; its models are worth a little more than 4 credits each. 4 other proteins took 15 minutes each; those would each be worth about 1 credit per model.

This 100 model benchmark would be a frame of reference going forward. Only done the one time.

2) Release a new WU to study, on Ralph:
OK, so now we have a new protein to study, or a new approach at studying them (which we hope will take less compute time to resolve!) and we send out some WUs on Ralph. We look at the time various types of hosts report back, and we size up this new WU against the WUs in the benchmark we previously established. We basically need to define how difficult it is to crunch a model with this new protein or using this new approach.

Average out the results from Ralph, calculate credit per model for this WU. This would be done every time a new type of WU is going to be sent out.

3) Release new WU to study, on Rosetta:
"We've found this new protein to be most challenging... its models are awarded 5.31 credits each... but they take about 82 minutes to complete". And so everyone on Rosetta knows exactly how much credit per model a given WU is worth... and this is fair, because you get credits for how many models you complete, i.e. you get paid (in credits) for each piece of production you return.
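The arithmetic in the calibration example under step 1 can be sketched as follows. This only illustrates the example's numbers (100 credits per day, a 24-hour benchmark); the function name and the protein labels are hypothetical, and nothing here is claimed to be how the project actually computes its values.

```python
def credits_per_model(hours_per_model, credits_per_day, hours_per_day=24.0):
    """Share a day's credits among models in proportion to their run time."""
    rate = credits_per_day / hours_per_day  # credits earned per hour of crunching
    return {name: rate * hours for name, hours in hours_per_model.items()}

# Hypothetical subset of the 100-protein benchmark from the example:
# one protein needing 1 hour per model, one needing 15 minutes.
hours_per_model = {"protein_1h": 1.0, "protein_15min": 0.25}

values = credits_per_model(hours_per_model, credits_per_day=100.0)
for name, credit in values.items():
    print(f"{name}: {credit:.2f} credits per model")
```

The one-hour protein comes out at 100/24 credits per model (a little more than 4) and the 15-minute protein at a quarter of that (about 1), which matches the figures in the example.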

An apples to apples comparison:
Some people can pick apples faster than others. But the value to the orchard owner is measured in bushels, not hours. Pickers are paid PER BUSHEL. Now, hand everyone a different sized basket... wait for the complaints "but Johnny's basket is smaller than mine... of COURSE he's going to be able to fill it more times!" But if we pay based on BUSHELS, not baskets, then it's still fair to everyone; right? Fair to the farmer, fair to the pickers. And efficient pickers are paid more per day than inefficient pickers.

Well, during the calibration stage, "bushels" are defined, and the value of filling one with picked apples is determined by the farmer. During the Ralph release of a new "basket", we determine how it compares to a "bushel": is it larger? Smaller? Or the same? Then during Rosetta release, we're all handed this new sized basket and asked to pick apples to the best of our unique abilities, and told this is what filling THIS basket is worth in credits. And visually, everyone can SEE that this new basket is smaller than a bushel, and so the credits for filling it are less... or vice versa. And, in the end, whether you REPORT your picking speed as 75 bushels an hour, or not, you are credited for the number of filled baskets under your trees at the end of the day.
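To put numbers on the analogy, here is a minimal sketch of per-model crediting using the 5.31 credits and 82 minutes quoted above. The host names and the second host's speed are assumptions for illustration.

```python
def granted_credit(models_completed, credit_per_model):
    """Credit depends only on the models returned, not on the host's claim."""
    return models_completed * credit_per_model

def credits_per_hour(credit_per_model, minutes_per_model):
    """Hourly earning rate of a host that needs `minutes_per_model` per model."""
    return credit_per_model * 60.0 / minutes_per_model

CREDIT_PER_MODEL = 5.31  # value quoted in the example above

# Hypothetical hosts: one at the quoted 82 minutes per model, one twice as fast.
for host, minutes in {"average_host": 82.0, "fast_host": 41.0}.items():
    rate = credits_per_hour(CREDIT_PER_MODEL, minutes)
    print(f"{host}: {rate:.2f} credits/hour")

print(f"3 models returned: {granted_credit(3, CREDIT_PER_MODEL):.2f} credits")
```

Both hosts are paid the same amount per model; the faster one simply fills more baskets per hour, so its hourly rate is twice as high.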
ID: 1990
feet1st
Joined: 7 Mar 06
Posts: 313
Credit: 116,623
RAC: 0
Message 1991 - Posted: 12 Aug 2006, 2:54:51 UTC - in response to Message 1989.  

Why keep the old system in place?

In part so everyone can see they're still being credited fairly. Once folks get used to the new system and it is calibrated to BOINC flops and credit values, I presume the "claimed credits" information would eventually be phased out.

The problem people seem to be having is that on Ralph, the arbitrary 2 credits per model number, together with the comparison between the credits claimed under the old system and the credits awarded under the new, makes it too easy to see a disparity and raise questions about the new approach. That 2 credits per model number wasn't a "calibrated" value, and hence it wasn't adequate for the difficulty of the protein that was released. The "TEST" was of the mechanics of tallying and presenting the user data, and running with some predetermined credit value per model established.

I haven't heard of any problems with the mechanics of the new system, and I believe that's why David Kim posed the question about proceeding to Rosetta when he did: the Ralph run has served its purpose and proven the system works. The only thing I think people are still confused by is how many credits should be awarded. And the main reason we're confused by it is simply that we've not seen the full cycle completed. So, to some extent, by releasing on Rosetta, he'd be making it clearer to us what the new system is, because then we could see the 5.4 credits per model award, or whatever the calibrated number should be. And we'll be able to see that in the end, the number of credits for 4 hours of crunching is pretty much what we had under the old system (for the "average" WINDOWS user... Sounds like Linux folks will finally see a well-deserved boost... I don't want to get into it here. But I believe for MOST people, the credit claimed will be in line with the credit awarded).

ID: 1991
Astro
Joined: 16 Feb 06
Posts: 141
Credit: 32,977
RAC: 0
Message 1994 - Posted: 12 Aug 2006, 3:37:42 UTC

So, from dekim's description, we aren't seeing what it will be like at Rosetta. It satisfies part of my curiosity to hear that different WUs will be given credit at different rates. This might fix the "all over the map" issue and possibly even things out. I say if you feel confident, then "Let her roll".
I was under the impression it was going to be 2 cr/model and that's what he wanted to release to Rosetta. I did not understand this part of the process. When I read "start with issuing 2 cr/model in ralph, then make adjustments", I thought they meant change it to something other than 2, but apply it to all WUs, without regard to crunch times or models/hour.
ID: 1994
Conan
Joined: 16 Feb 06
Posts: 364
Credit: 1,368,421
RAC: 0
Message 1996 - Posted: 12 Aug 2006, 3:58:31 UTC

@feet1st, thanks for your response. In relation to 2 credits for a couple of hours work, I was not "missing the point" as you stated. I quite understand what they are trying to do, and I know it is a test to see if it works, which I will go along with. I was referring to a previous post by Kevint, who has an older PIII machine which he believes will only get about 1 credit an hour if the 2 credits per decoy value is used. Extending the WU time may solve his problem by generating more decoys and hence more credit.
I am still having trouble uploading completed WUs to Ralph at the moment, so I have not had any of the new credit awarded to me yet to see how it compares.
ID: 1996
Brian B
Joined: 17 Feb 06
Posts: 9
Credit: 2,632
RAC: 0
Message 1997 - Posted: 12 Aug 2006, 4:13:15 UTC - in response to Message 1996.  

Conan's not the only one having problems uploading. See this and this. Hope this gets the message out that there's something wrong with the servers even though the status page states they're ok...
ID: 1997
kevint
Joined: 24 Feb 06
Posts: 8
Credit: 1,568,696
RAC: 0
Message 1998 - Posted: 12 Aug 2006, 6:26:19 UTC - in response to Message 1996.  
Last modified: 12 Aug 2006, 6:26:48 UTC

which he believes will only get about 1 credit an hour if the 2 credits per decoy value is used. Extending the WU time may solve his problem by generating more decoys and hence more credit.
I am still having trouble uploading completed WUs to Ralph at the moment, so I have not had any of the new credit awarded to me yet to see how it compares.


I understand that this is only a test and that the production WUs may be different. My response was to the question of whether anyone objected to releasing this to the production side. As is, yes, I do have issues with it if the credit per decoy remains the same in production.
However, even upping the WU time would not solve this issue. If the machine can do a decoy every hour, that is what it can do; increasing the WU crunch time would only grant more credits per WU, not more credits per hour.
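The point about WU run time can be checked with a few lines. This is a sketch under stated assumptions: a host that finishes exactly one decoy per hour, the 2 credits per decoy test value, and a hypothetical `credit_summary` helper that counts only whole decoys.

```python
def credit_summary(wu_hours, hours_per_decoy, credit_per_decoy):
    """Return (credits per WU, credits per hour) for a given WU run time."""
    decoys = int(wu_hours // hours_per_decoy)  # whole decoys finished in the WU
    per_wu = decoys * credit_per_decoy
    return per_wu, per_wu / wu_hours

# Assumed host: one decoy per hour, at the 2 credits/decoy test value.
for wu_hours in (2, 8):
    per_wu, per_hour = credit_summary(wu_hours, hours_per_decoy=1, credit_per_decoy=2)
    print(f"{wu_hours} h WU: {per_wu} credits per WU, {per_hour} credits per hour")
```

The longer WU earns four times the credit per WU, but the credit per hour is identical in both cases.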

However, to oblige, I will test and crunch some 6 or 8 hour WUs on the older machines to see if there is a difference. I would like a 24-hour session as is, with 2-hour WUs, for a baseline first; then I will change to 8-hour WUs and let that run for 24 hours to verify the credit per hour.

ID: 1998
suguruhirahara
Joined: 5 Mar 06
Posts: 40
Credit: 11,320
RAC: 0
Message 2000 - Posted: 12 Aug 2006, 7:33:20 UTC

Hello,

I noticed that two kinds of values, "Recent average work credit" and "Total work credit", have been added to the list of my computers. Could anyone explain what they are?

Thanks,
ID: 2000
[B^S] thierry@home
Joined: 15 Feb 06
Posts: 20
Credit: 17,624
RAC: 0
Message 2001 - Posted: 12 Aug 2006, 8:07:28 UTC

https://ralph.bakerlab.org/forum_thread.php?id=233#1973
ID: 2001
suguruhirahara
Joined: 5 Mar 06
Posts: 40
Credit: 11,320
RAC: 0
Message 2002 - Posted: 12 Aug 2006, 8:09:46 UTC

ty
ID: 2002
tralala
Joined: 12 Apr 06
Posts: 52
Credit: 15,257
RAC: 0
Message 2003 - Posted: 12 Aug 2006, 8:24:04 UTC - in response to Message 1984.  

Everyone keep in mind that the current standard BOINC crediting system will still be used.

Also, minor modifications to the credit/model values will not make that much of a difference in the long run. The important thing to know is that given any credit/model value, users will be on a level playing field. I think we can all agree that this is the major drive/motivation for coming up with a new method. Making sure it closely matches the BOINC credit values is not as important since we will still use the old system along with the new.


I think it is a smart idea to introduce the new crediting system to Rosetta first as an alternative. In the long run, however, you need to decide on one system, which can only be the credit/model system, since it will be much fairer. The exact transition can be determined later, and indeed the credit/model scheme allows quick and smooth adjusting in any case. Nevertheless, I suggest introducing this on Rosetta only after it has been seen in effect here on RALPH with real data, not the fixed test ratio of 2 credits/model. If this is too complicated, then roll it out on Rosetta, but I strongly support feet1st's idea of putting together documentation, which should be placed prominently (news entry and sticky in the message board).
ID: 2003
Hoelder1in
Joined: 17 Feb 06
Posts: 11
Credit: 46,359
RAC: 0
Message 2005 - Posted: 12 Aug 2006, 9:25:28 UTC - in response to Message 1990.  
Last modified: 12 Aug 2006, 9:25:46 UTC

1) Calibration:

...

2) Release a new WU to study, on Ralph:

...

Feet1st, I don't think what you say under 1) and 2) is correct. I liked your original explanation (id 1954) much better. The beauty of the new system is that it does NOT require any calibration or server-side benchmark to be performed. Also, it is not necessary to compare each new WU to benchmark WUs, as you seem to imply under 2). On Ralph, one simply determines the median (or some other kind of average) of the old-style credit that the Ralph participants claim per crunched structure. No other calibration/benchmark/comparison is required! A good way to get this concept across might be to think of this as a BOINC-style quorum of the (classical BOINC) credit claimed by the Ralph participants per returned structure. Just an idea; of course you are the one who is good at explaining...
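As a rough sketch of the median approach described here: take each Ralph result's old-style claimed credit, divide by the number of structures it returned, and use the median of those values. The result tuples and the function name are made up for illustration; the project's actual code and data layout are not shown in this thread.

```python
from statistics import median

def credit_per_model_from_claims(results):
    """Median of claimed credit per model over (claimed_credit, models) results."""
    per_model = [claimed / models for claimed, models in results if models > 0]
    return median(per_model)

# Hypothetical Ralph results: (old-style claimed credit, models returned).
# The fourth host claims far more per model than the others.
ralph_results = [(10.0, 5), (12.0, 4), (9.0, 3), (40.0, 4), (6.0, 4)]

print(credit_per_model_from_claims(ralph_results))
```

With this sample the per-model claims are 2.0, 3.0, 3.0, 10.0 and 1.5, so the median is 3.0 and the single over-claiming host has no influence on it.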

I absolutely love the 'apples to apples comparison'. ;-)
ID: 2005
MikeMarsUK
Joined: 8 Aug 06
Posts: 5
Credit: 0
RAC: 0
Message 2007 - Posted: 12 Aug 2006, 9:41:45 UTC - in response to Message 2003.  
Last modified: 12 Aug 2006, 9:43:02 UTC

...
I think it is a smart idea to introduce the new crediting system to Rosetta first as an alternative. In the long run however you need to decide on one system, which can only be the credit/model system, since it will be much fairer. The exact transition can be determined later and indeed the credit/model scheme allows quick and smooth adjusting in any case.
...


I would mostly agree with this, but the problem with having two sets of statistics is that it doesn't stop the 'cheating' flame wars, and could possibly even make them worse. If it weren't for that, I'd quite like to have the two to compare.

Is there any chance of calculating the historical 'work credit' based on average claimed credit per work unit? (i.e., for previous models crunched over the years).

If Rosetta publishes the user.xml stats etc, I think it should be the work credits stats which are published, but if this is to be done, it would be best to grant work credit for all decoys that the participant has processed. My guess is that Rosetta does have the data to do this (although perhaps not the time).

ID: 2007
Spare_Cycles
Joined: 16 Feb 06
Posts: 17
Credit: 12,942
RAC: 0
Message 2010 - Posted: 12 Aug 2006, 14:09:04 UTC

I am eager to see the fair credit system on Rosetta, but first we need to make sure that the current server problems on RALPH are not somehow being caused by a bug in the new credit system.

It's not possible to have two official stat systems. The stats that are exported to all the third party stat sites are the official stats. To attract crunchers who want to compete with fair stats, these exported stats must be the fair stats.
ID: 2010
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 2016 - Posted: 12 Aug 2006, 18:19:24 UTC

Okay, I hope I fixed the recent bugs. Let me know if there are still issues with uploading and the team info. I had to update the CGI program with the new database code.

FOR TESTING: I also changed the credit/model value for each work unit so that it is determined from the most recent average, so the work credits should be more accurate for different work units (particularly as more results are returned) than with the fixed 2 credit/model value. This way, we can get an idea of how the credit granting will be for different sized work units. This is just for testing though. For R@h, we will use the average value from the Ralph runs so everyone will get the same credit/model for a given work unit rather than a value that may change a bit initially.
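For readers who want a concrete picture of a per-work-unit average that updates as results come back, here is a minimal sketch. The exact averaging used on Ralph is not stated in this post, so the class below (a plain running mean of claimed credit per model, with hypothetical names) is an assumption rather than the project's implementation.

```python
from collections import defaultdict

class CreditPerModelTracker:
    """Keep a running average of claimed credit per model for each work unit."""

    def __init__(self):
        self._claimed = defaultdict(float)  # total claimed credit per work unit
        self._models = defaultdict(int)     # total models returned per work unit

    def add_result(self, work_unit, claimed_credit, models):
        """Record one returned result for the given work unit."""
        self._claimed[work_unit] += claimed_credit
        self._models[work_unit] += models

    def credit_per_model(self, work_unit):
        """Current average, or None if no models have been returned yet."""
        if self._models[work_unit] == 0:
            return None
        return self._claimed[work_unit] / self._models[work_unit]

tracker = CreditPerModelTracker()
tracker.add_result("wu_a", claimed_credit=10.0, models=5)
first = tracker.credit_per_model("wu_a")
tracker.add_result("wu_a", claimed_credit=14.0, models=3)
second = tracker.credit_per_model("wu_a")
print(first, second)
```

The value shifts as more results arrive, which is the "may change a bit initially" behaviour mentioned above; freezing the Ralph average before release to R@h avoids that.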
ID: 2016
NJMHoffmann
Joined: 17 Feb 06
Posts: 8
Credit: 1,270
RAC: 0
Message 2017 - Posted: 12 Aug 2006, 18:56:23 UTC - in response to Message 2016.  

For R@h, we will use the average value from the Ralph runs so everyone will get the same credit/model for a given work unit rather than a value that may change a bit initially.
I propose not only using Ralph values for Rosetta, but also including the values received so far from Rosetta (per WU type). You'll need fewer Ralph units of a new WU type to start on Rosetta, because the credit per WU granted from Ralph estimations is only a starting point that will be calibrated further with the returned Rosetta WUs.
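One simple way to picture this proposal is a count-weighted average of the Ralph estimate and the values returned so far on Rosetta. The weighting scheme, the function name and the numbers are assumptions for illustration; the post does not specify how the two sources would be combined.

```python
def pooled_credit_per_model(ralph_mean, ralph_count, rosetta_mean, rosetta_count):
    """Count-weighted average of the Ralph estimate and the Rosetta returns."""
    total = ralph_count + rosetta_count
    if total == 0:
        raise ValueError("no results to average")
    return (ralph_mean * ralph_count + rosetta_mean * rosetta_count) / total

# Hypothetical WU type: 20 Ralph results averaging 5.0 credits/model,
# later joined by 180 Rosetta results averaging 5.5 credits/model.
start = pooled_credit_per_model(5.0, 20, 0.0, 0)
later = pooled_credit_per_model(5.0, 20, 5.5, 180)
print(f"at release: {start:.2f}, after Rosetta returns: {later:.2f}")
```

The small Ralph sample sets the starting value, and the estimate then moves toward the Rosetta average as the larger sample accumulates.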

Norbert
ID: 2017



©2024 University of Washington
http://www.bakerlab.org