Posts by Hoelder1in

1) Message boards : Current tests : New crediting system (Message 2130)
Posted 16 Aug 2006 by Hoelder1in
Post:
With what I've seen so far in one day, with up to a 168% difference from lowest to highest credits for a single computer, it's nowhere near ready to roll out on Rosetta.

You're moving to a cherry-picking heaven at the moment, I would guess.
It wouldn't be hard for some of the larger teams (or a bored individual) to create a program to grab the stats, see what the initial claimed credits are for that WU type, and tell the team.
Fluffy, dcdc, tralala, I think you are mistaken: cherry-picking is NOT possible!
The variability you are seeing in the credits is not between different WU types but comes from the different completion times of the models _within_ the WUs. Even if, say, the first model of a WU takes a long time to complete, this doesn't tell you anything about how long the following models will take. This is a completely random process. So terminating WUs that start with a 'slow' model won't help you either.

Fluffy, instead of a 168% difference you could also say +/-45% with respect to the average. Example: average = 10, 10 - 45% = 5.5, 10 + 45% = 14.5, which is a 164% difference between lowest and highest. I think this is acceptable, considering that most values will be much closer to the average.
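(A minimal Python sketch of that arithmetic, using only the example numbers from above:)

    # Convert a +/-45% spread around the average back into a
    # lowest-to-highest percentage difference.
    average = 10.0
    deviation = 0.45                      # +/-45% around the average
    lowest = average * (1 - deviation)    # ~5.5
    highest = average * (1 + deviation)   # ~14.5
    low_to_high = (highest - lowest) / lowest * 100
    print(f"lowest={lowest:.1f}, highest={highest:.1f}, difference={low_to_high:.0f}%")
    # -> lowest=5.5, highest=14.5, difference=164%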
2) Message boards : Current tests : New crediting system (Message 2116)
Posted 16 Aug 2006 by Hoelder1in
Post:
I don't have a problem with a credit scheme that grants 10 credits/hour for one WU and 5 for another, as long as over time they average out to cross-project parity. After all, everyone would be getting a similar mix and that would be fair. Now, if it were known that "Xyz" WUs got you more credits/hour than "Abc" WUs, then I can guarantee a small population of the community WILL abort the "Abc" WUs until they get a full cache of "Xyz" WUs.
tralala & mmciastro, due to the random nature of the Rosetta algorithm each model takes a somewhat different time to complete, even for models of the same WU type. I think this is what causes the +/-30% or so variability in credits/hour that you are observing. Once you take averages over 10 or more crunched WUs you will get pretty stable average credits/hour values. In fact, looking at the numbers I have seen so far, Windows/standard client users should see about a 10% increase in their RAC or credits/hour compared to the old system. That seems acceptable to me. Cherry-picking will not be possible because you will only know how many models you completed _after_ you have crunched your WU.
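(To illustrate why the averages stabilize, here is a toy Python simulation; the uniform +/-30% scatter is just an assumption for illustration, not the real Rosetta run-time distribution:)

    import random

    random.seed(0)
    true_rate = 10.0
    # Each WU's credits/hour drawn with roughly +/-30% scatter around the true rate.
    per_wu = [true_rate * random.uniform(0.7, 1.3) for _ in range(1000)]

    # Spread of single WUs vs. averages over batches of 10 WUs.
    batches = [sum(per_wu[i:i + 10]) / 10 for i in range(0, len(per_wu), 10)]
    print(f"single WUs : {min(per_wu):.1f} - {max(per_wu):.1f} credits/hour")
    print(f"10-WU means: {min(batches):.1f} - {max(batches):.1f} credits/hour")
    # The 10-WU averages cluster much more tightly around 10 credits/hour.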
3) Message boards : Current tests : New crediting system (Message 2045)
Posted 15 Aug 2006 by Hoelder1in
Post:
Currently, I am just using the quotient of the claimed_credit and model totals for each work unit batch, which are saved in a project specific table that we created. I wanted to avoid having to query the result table, which would be necessary to get the median. I'd also have to add a project specific column to the result table which would hold the model count, and I'm trying to avoid modifying the BOINC tables. I could use a correction factor if the discrepancies are significant enough. Can you point me to the results that you are talking about?
OK, I understand; if you only have these two numbers available, that's of course the only thing you can do. I simply looked at the granted-to-claimed credit ratios on some of the recent results pages that looked like they were using the standard client. Here are a few examples: 1, 2, 3 (first six).
So you already did apply a correction factor, didn't you? :-) I just randomly picked 13 standard client results that were sent out after midnight (UTC) and calculated the granted-to-claimed credit ratios. I get a mean of 1.16 with a standard deviation of 0.35 and an error of the mean of 0.10. So within the sampling error of this small sample your correction factor seems to be dead on. Presumably you calculated it from a larger sample, so it probably is even more accurate. I guess the correction factor would have to be reviewed occasionally to correct for changes in the composition of the Ralph participants (Linux/Windows, standard/optimized client).
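(For reference, a small Python sketch of that calculation; the ratios below are placeholder values, not the actual 13 results I sampled:)

    import math
    import statistics as st

    # Thirteen made-up granted/claimed credit ratios (placeholders only).
    ratios = [0.8, 1.0, 1.1, 0.9, 1.5, 1.2, 1.0, 1.7, 0.7, 1.3, 1.6, 1.1, 1.2]

    mean = st.mean(ratios)
    stdev = st.stdev(ratios)               # sample standard deviation
    sem = stdev / math.sqrt(len(ratios))   # error (standard error) of the mean
    print(f"mean={mean:.2f}  std dev={stdev:.2f}  error of mean={sem:.2f}")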
4) Message boards : Current tests : New crediting system (Message 2043)
Posted 14 Aug 2006 by Hoelder1in
Post:
Currently, I am just using the quotient of the claimed_credit and model totals for each work unit batch, which are saved in a project specific table that we created. I wanted to avoid having to query the result table, which would be necessary to get the median. I'd also have to add a project specific column to the result table which would hold the model count, and I'm trying to avoid modifying the BOINC tables. I could use a correction factor if the discrepancies are significant enough. Can you point me to the results that you are talking about?
OK, I understand; if you only have these two numbers available, that's of course the only thing you can do. I simply looked at the granted-to-claimed credit ratios on some of the recent results pages that looked like they were using the standard client. Here are a few examples: 1, 2, 3 (first six).
5) Message boards : Current tests : New crediting system (Message 2038)
Posted 13 Aug 2006 by Hoelder1in
Post:
FOR TESTING: I also made the credit/model value for each work unit be determined from the most recent average, so the work credits should be more accurate for different work units (particularly as more results are returned) rather than using the fixed 2 credits/model value. This way, we can get an idea of how the credit granting will behave for different sized work units. This is just for testing, though. For R@h, we will use the average value from the Ralph runs, so everyone will get the same credit/model for a given work unit rather than a value that may change a bit initially.
I clicked through a number of recent results pages and it seems that the current credit/model averages are significantly higher than what the old crediting system would have assigned for Windows/standard client computers. I wonder why this is the case. Are you perhaps using the mean instead of the median to do the credits/model averaging and are thus affected by outliers on the high side? I really think you should use the median, which will make your averages largely independent of any low and high outliers and will automatically 'select' the (Windows/standard client) majority population. Another option would be a weighted mean (weighting factors 1/value, i.e. mean = n / Sum_{i=1..n}(1/v_i), which is the harmonic mean) that de-emphasizes the high values. Here is an example to demonstrate the effect of the different averaging methods: for the 10 values 2, 3, 5, 5, 5, 5, 5, 10, 15, 20 the mean is 7.5, the median is exactly 5, and the weighted average would be 4.88. Would be interesting to hear your thinking on that.
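(A quick Python check of the three averaging methods on those ten values:)

    import statistics as st

    values = [2, 3, 5, 5, 5, 5, 5, 10, 15, 20]

    mean = st.mean(values)                               # 7.5
    median = st.median(values)                           # 5
    # 'weighted mean' with weights 1/value, i.e. the harmonic mean:
    weighted = len(values) / sum(1 / v for v in values)  # ~4.88
    print(f"mean={mean}, median={median}, weighted mean={weighted:.2f}")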
6) Message boards : Current tests : New crediting system (Message 2005)
Posted 12 Aug 2006 by Hoelder1in
Post:
1) Calibration:

...

2) Release a new WU to study, on Ralph:

...

Feet1st, I don't think what you say under 1) and 2) is correct - I liked your original explanation (id 1954) much better. The beauty of the new system is that it does NOT require any calibration or server-side benchmark to be performed. Also, it is not necessary to compare each new WU to benchmark WUs, as you seem to imply under 2). On Ralph, one simply determines the median (or some other kind of average) of the old-style credit that the Ralph participants claim per crunched structure. No other calibration/benchmark/comparison is required! A good way to get this concept across might be to think of this as a BOINC-style quorum of the (classical BOINC) credit claimed by the Ralph participants per returned structure. Just an idea - of course you are the one who is good at explaining...
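(Here is a minimal Python sketch of how I understand the scheme; the field names and numbers are invented for illustration and this is of course not the actual server code:)

    import statistics as st

    # Each returned Ralph result contributes one classical-BOINC
    # claimed-credit-per-model value for its WU type.
    ralph_results = [
        {"claimed_credit": 42.0, "models": 20},
        {"claimed_credit": 55.0, "models": 22},
        {"claimed_credit": 30.0, "models": 18},
        {"claimed_credit": 90.0, "models": 21},   # an over-claiming host
    ]

    per_model_claims = [r["claimed_credit"] / r["models"] for r in ralph_results]
    credit_per_model = st.median(per_model_claims)   # robust against the outlier

    # On Rosetta, each result of this WU type would then be granted
    # credit_per_model * number_of_models_returned.
    print(f"credit per model for this WU type: {credit_per_model:.2f}")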

I absolutely love the 'apples to apples comparison'. ;-)
7) Message boards : Current tests : New crediting system (Message 1988)
Posted 12 Aug 2006 by Hoelder1in
Post:
we can apply a correction factor to account for the over-/under-claiming hosts or, as someone on the boards suggested, remove the top and bottom X percent.
Assuming that you use the median instead of the mean, you are in effect already removing the top and bottom 50%, so removing any additional percentage from the distribution will not have any effect on the median. ;-)
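(A quick Python check of that point, with arbitrary values:)

    import statistics as st

    # Trimming the same fraction from both ends of a sorted sample
    # leaves the median unchanged.
    claims = sorted([1.2, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 3.0, 9.9])

    def trimmed(data, frac):
        k = int(len(data) * frac)          # number of values cut from each end
        return data[k:len(data) - k] if k else data

    print(f"{st.median(claims):.2f}")                # 2.35
    print(f"{st.median(trimmed(claims, 0.2)):.2f}")  # 2.35 -- unchanged after a 20%/20% trim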
Perhaps you can 'sell' the new credit system as a WU/structure-specific 'Ralph quorum' applied to Rosetta (this reminds me of the discussion on 'pseudo-redundancy' we had back in the old days ;-).
8) Message boards : Current tests : New crediting system (Message 1980)
Posted 12 Aug 2006 by Hoelder1in
Post:
we can apply a correction factor to account for the over-/under-claiming hosts or, as someone on the boards suggested, remove the top and bottom X percent.
Assuming that you use the median instead of the mean, you are in effect already removing the top and bottom 50%, so removing any additional percentage from the distribution will not have any effect on the median. ;-)
9) Message boards : Current tests : New crediting system (Message 1977)
Posted 11 Aug 2006 by Hoelder1in
Post:
The new crediting system is pretty simple.
It finally dawned on me that people didn't understand David Kim's original explanation of the new credit system that I reposted over at Rosetta (perhaps that was a mistake). Thanks to Feet1st and David Kim for the 'extended' explanations! :-)
does anyone object to rolling this out to Rosetta@home?
So what's the plan: which of the two credit systems, the old or the new one, will be exported to the stats sites and listed on the account pages (e.g. this one)?
10) Message boards : Current tests : New crediting system (Message 1921)
Posted 8 Aug 2006 by Hoelder1in
Post:
I took it from a result from one of my computers as just a very rough value to test with. The version that will eventually run on Rosetta@home will have work-unit-specific credit-per-model values that are determined from test runs on Ralph. It will be a requirement for lab members not only to test new work units on Ralph but also to determine the average credit-per-model value from their test runs for the production runs. The credits should remain somewhat consistent with other projects since the average values will be based on the standard BOINC crediting scheme. If things look okay on Ralph, Rosetta@home will use the credit-per-model crediting method while Ralph will switch back to the standard method.

OK, great - in fact this seems to be very much along the lines I was thinking myself... -H. :-)
11) Message boards : Current tests : New crediting system (Message 1919)
Posted 8 Aug 2006 by Hoelder1in
Post:
...and adding to mcciastro's questions: will the two credits per structure apply to one particular WU type or across the board, independently of how long the structures take to complete? It would be nice if you could let us know what you actually have in mind. Thanks, -H.





