Message boards : Current tests : New crediting system
dekim (Volunteer moderator, Project administrator, Project developer, Project scientist) Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0
Please post comments and suggestions regarding the new crediting system here.
zombie67 [MM] Joined: 8 Aug 06 Posts: 75 Credit: 2,396,363 RAC: 6,299
> Please post comments and suggestions regarding the new crediting system here.
Are there any jobs to run? I keep getting "communication deferred", and no jobs downloaded. TIA
Reno, NV Team: SETI.USA
zombie67 [MM] Joined: 8 Aug 06 Posts: 75 Credit: 2,396,363 RAC: 6,299
> Please post comments and suggestions regarding the new crediting system here.
Oh sure.... As soon as I posted this, the jobs downloaded. Never mind. =;^)
Reno, NV Team: SETI.USA
Astro Joined: 16 Feb 06 Posts: 141 Credit: 32,977 RAC: 0
Is there some basis for the initial estimate of 2 cobblestones/model, or was it a "swag" (scientific wild a** guess), LOL? Are you now collecting data on the average decoys per WU-hour and comparing it to the credit currently issued (with the standard BOINC client)? Have you tried to identify a "golden" (average) machine and base credit issuance on that? What weight does Rosetta place on credit parity across all projects? Is there something we could do? tony
zombie67 [MM] Joined: 8 Aug 06 Posts: 75 Credit: 2,396,363 RAC: 6,299
> Please post comments and suggestions regarding the new crediting system here.
Okay, *now* I am getting the message "No work from project". This is both on my Intel Mac and my AMD X2 Linux machine, if that makes any difference.
Reno, NV Team: SETI.USA
Hoelder1in Joined: 17 Feb 06 Posts: 11 Credit: 46,359 RAC: 0
...and adding to mcciastro's questions: will the two credits per structure be for one particular WU type or across the board, independent of how long the structures take to complete? It would be nice to let us know what you actually have in mind. Thanks, -H.
dekim (Volunteer moderator, Project administrator, Project developer, Project scientist) Joined: 20 Jan 06 Posts: 250 Credit: 543,579 RAC: 0
> Is there some basis for the initial estimate of 2 cobblestones/model, or was it a "swag" (scientific wild a** guess), LOL?
I took it from a result from one of my computers, as just a very rough value to test with. The version that will eventually run on Rosetta@home will have work-unit-specific credit-per-model values determined from test runs on Ralph. It will be a requirement for lab members not only to test new work units on Ralph but also to determine the average credit-per-model value from their test runs before production runs. The credits should remain roughly consistent with other projects, since the average values will be based on the standard BOINC crediting scheme. If things look okay on Ralph, Rosetta@home will use the credit-per-model crediting method while Ralph switches back to the standard method.
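To make the mechanics concrete, here is a minimal sketch of how a fixed credit-per-model value could be derived from Ralph test results. It assumes each test result reports its claimed credit (from the standard BOINC scheme) and the number of models completed; the function and sample numbers are hypothetical, not the project's actual code.

```python
# Hypothetical sketch: derive a fixed credit/model value for one WU type
# from Ralph test results. Field names and sample numbers are invented.

def credit_per_model(results):
    """results: list of (claimed_credit, models_completed) pairs."""
    total_credit = sum(credit for credit, _ in results)
    total_models = sum(models for _, models in results)
    return total_credit / total_models

# Three made-up test results for one work unit type:
ralph_results = [(24.0, 12), (31.5, 14), (19.8, 10)]
print(round(credit_per_model(ralph_results), 2))  # -> 2.09 credits/model
```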
Hoelder1in Joined: 17 Feb 06 Posts: 11 Credit: 46,359 RAC: 0
> I took it from a result from one of my computers, as just a very rough value to test with. The version that will eventually run on Rosetta@home will have work-unit-specific credit-per-model values determined from test runs on Ralph. [...]
OK, great - in fact this seems to be very much along the lines I was thinking myself... -H. :-)
tralala Joined: 12 Apr 06 Posts: 52 Credit: 15,257 RAC: 0
How do you want to determine credit/model from the Ralph runs? As an average of claimed credit over X runs? As a median? Keep in mind that the reported specs and benchmarks on RALPH can be manipulated for some hosts, just as on Rosetta. Some people (including myself) use special BOINC versions which claim about 3x the credit the standard client claims. An average of claimed credit/model over 10 or more results is possibly reliable enough to be used for Rosetta. If you use fewer results, distorted credit reports have a bigger impact on some WUs, so one WU will grant more credit and another less (which is not good, since people start cherry-picking "good" WUs by aborting "bad" ones).
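As a quick, made-up illustration of why the choice of average matters here: a single host claiming about 3x credit pulls the mean of a small sample noticeably, while the median barely moves.

```python
# Made-up numbers: one ~3x-inflated claim among nine honest ones.
import statistics

honest = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9]  # credit/model claims
claims = honest + [6.0]                                   # the inflated claim

print(statistics.mean(claims))    # 2.4 - pulled up by the cheater
print(statistics.median(claims))  # 2.0 - essentially unaffected
```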
Honza Joined: 16 Feb 06 Posts: 9 Credit: 1,962 RAC: 0
tralala - the aim is to abandon the ill-behaved benchmarks once and for all (or at least I hope and pray). You can simply take the CPU type and divide by CPU frequency - or use a golden machine, as tony suggested. I know it's not perfect; RAM speed plays a role, etc. But you should land within an acceptable ±10%, not something like the 500% you get with benchmarks. Just try to maintain inter-project parity; that will do...
tralala Joined: 12 Apr 06 Posts: 52 Credit: 15,257 RAC: 0
> tralala - the aim is to abandon the ill-behaved benchmarks once and for all.
Well, so far they have revealed no details about how they plan to establish the credit/model ratio. As I see it, there are two approaches. What you describe would require a set of trustworthy computers, preferably locally available, with different processors and OSs. 10-20 assorted machines would probably be sufficient (one golden machine will not work, since it would favor one CPU type and one OS over the others). That is not a task for Ralph, I'd say, since the computers on RALPH are not trustworthy. The second approach instead relies on the force of numbers, where everything balances out with more sampling. This would mean sending WUs on Ralph to "untrustworthy" hosts, but with enough results that the values are correct on average. The bottom line is that it is a tricky task which requires some testing and tuning until it works flawlessly. I hope they will discuss it with us before deploying it on Rosetta, to avoid having to fix it while it is in use there.
tralala Joined: 12 Apr 06 Posts: 52 Credit: 15,257 RAC: 0
Okay, here is one proposal: send each type of WU to at least 60 hosts. Discard the results with the lowest and highest 10% of claimed credit and take the arithmetic average of the remaining hosts as the fixed credit/model for production runs on Rosetta.
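That proposal is essentially a 10% trimmed mean. A small sketch of it (the claimed values below are invented):

```python
# Sketch of the proposal: drop the lowest and highest 10% of claimed
# credit/model values, then average the rest. Sample claims are invented.

def trimmed_credit_per_model(claims, trim=0.10):
    ordered = sorted(claims)
    k = int(len(ordered) * trim)          # how many to drop from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# 60 hosts, including some deflated (0.4) and inflated (6.3) claims:
claims = [2.0, 2.1, 1.9, 6.3, 2.2, 1.8, 2.0, 0.4, 2.1, 1.9] * 6
print(trimmed_credit_per_model(claims))   # -> 2.0 credits/model
```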
suguruhirahara Joined: 5 Mar 06 Posts: 40 Credit: 11,320 RAC: 0
> Okay, here is one proposal:
But doesn't this approach also cut fair highest and lowest results at a high rate? Some people may be running R@H on a Pentium (I, II), and a few users are already testing on a genuine Kentsfield. If this approach is applied, how about distributing WUs to machines of the same or similar specs? That could help the server distinguish honestly claimed results from modified ones, using coefficients derived from the machine's specs, etc. I fear, however, that this might be an issue for the client framework. Realistically, changing the quorum from 1 to 3 is the easiest and most effective change overall; changing it to 5 isn't good :(
Ethan Joined: 11 Feb 06 Posts: 18 Credit: 25,579 RAC: 0
Why not just use Ralph to determine the average crunch time per simulation for a given WU? I don't believe there's a way to manipulate the amount of time it takes to process. That average time can then be compared to a 'golden' ratio of credits per CPU-hour for an average machine. The ratio would have to be revisited every couple of months, since computers get faster over time, but this way the benchmark-based credit system (and its inherent problems) is bypassed completely. -E
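A rough sketch of that time-based idea, assuming the project picked some fixed credits-per-CPU-hour rate for an average machine (the rate and the timings below are invented):

```python
# Hypothetical sketch: average the measured seconds/model from Ralph,
# then convert with a fixed 'golden' credits-per-CPU-hour rate.

CREDITS_PER_CPU_HOUR = 10.0   # invented rate for an 'average' machine

def credit_per_model_from_times(model_times_sec):
    avg_hours = sum(model_times_sec) / len(model_times_sec) / 3600.0
    return avg_hours * CREDITS_PER_CPU_HOUR

times = [640.0, 702.0, 655.0, 618.0]  # invented seconds/model from Ralph
print(round(credit_per_model_from_times(times), 2))  # -> 1.82 credits/model
```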
Tom Philippart Joined: 24 Jun 06 Posts: 4 Credit: 883 RAC: 0
no work from project :(
Astro Joined: 16 Feb 06 Posts: 141 Credit: 32,977 RAC: 0
> Why not just use Ralph to determine the average crunch time per simulation for a given WU? I don't believe there's a way to manipulate the amount of time it takes to process. [...]
The time can be manipulated by trux's client, 5.3.12tx36.
JKeck {pirate} Joined: 16 Feb 06 Posts: 14 Credit: 153,095 RAC: 0
> Okay, here is one proposal:
I don't think that idea would unfairly cut any honest machine. It may not be the best way, though, since it would tend to give higher numbers than expected if there are many cheating hosts. Maybe it would work better to throw out the 10% of results with the largest deviations from the mean. In that case, all the discarded results could be high ones if there is a concentration of cheaters, but they would come from both ends if all hosts ran the standard client on the same task. Any attempt to change the quorum would not work, since the number of structures is user-configurable.
BOINC WIKI BOINCing since 2002/12/8
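One way to read that variant, sketched below: keep the claims closest to the mean and discard the 10% that deviate most from it (the sample numbers are invented).

```python
# Hypothetical sketch of the deviation-based variant: drop the 10% of
# claims farthest from the mean, wherever they fall. Data is invented.

def drop_largest_deviations(claims, frac=0.10):
    mean = sum(claims) / len(claims)
    keep = int(len(claims) * (1 - frac))
    kept = sorted(claims, key=lambda c: abs(c - mean))[:keep]
    return sum(kept) / len(kept)

claims = [2.0, 2.1, 1.9, 6.3, 2.2, 1.8, 2.0, 2.1, 1.9, 2.0]
print(drop_largest_deviations(claims))  # -> 2.0; the 6.3 claim is dropped
```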
Ethan Joined: 11 Feb 06 Posts: 18 Credit: 25,579 RAC: 0
Is that a BOINC client? What if the time were kept within the Rosetta code (which is compiled and, as far as I know, can't be manipulated)?
Astro Joined: 16 Feb 06 Posts: 141 Credit: 32,977 RAC: 0
Ethan, welcome to Ralph, by the way. Trux 5.3.12tx36 is an optimized BOINC core client. The claimed credit formula is currently (Whetstone + Dhrystone) * CPU time (in seconds) / 172800. Most optimized BOINC core clients change the benchmarks, but trux's alters both the reported time and the benchmarks to get more credit. Does this answer your question? tony
Here is a result from a trux client:
stderr out
<core_client_version>5.3.12.tx36</core_client_version>
<real_cpu_time>2503</real_cpu_time>
<corrected_cpu_time>3930</corrected_cpu_time>
<corrected_Mfpops>11126.2</corrected_Mfpops>
See how it has "corrected" the time and benchmark?
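For reference, here is the quoted formula in code form, showing how an inflated CPU time scales the claim. The benchmark units are assumed to be whatever the client reports; the benchmark score below is invented, while the two times come from the trux result above.

```python
# Claimed-credit formula as quoted above:
#   credit = (whetstone + dhrystone) * cpu_seconds / 172800
# Benchmark units are assumed to match what the BOINC client reports.

def claimed_credit(whetstone, dhrystone, cpu_seconds):
    return (whetstone + dhrystone) * cpu_seconds / 172800.0

bench = 3000.0  # invented benchmark score (same for both, for simplicity)
print(claimed_credit(bench, bench, 2503))  # real_cpu_time    -> ~86.9
print(claimed_credit(bench, bench, 3930))  # 'corrected' time -> ~136.5
```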
Ethan Joined: 11 Feb 06 Posts: 18 Credit: 25,579 RAC: 0
Hi Tony, it partially does, yes. Since both the Whetstone/Dhrystone benchmarks and, apparently, the runtime in the trux client can be faked, I'm curious whether it would be better to put a CPU-time capture in the Rosetta code itself. Since Rosetta@home code is compiled in the lab, its results shouldn't be fakeable. I don't know if it's the way to go, but it should get rid of any appearance of people getting a higher score due to 3rd-party BOINC clients. Oh, and thanks for the hello, I've been around awhile - ID: 2 :)