New crediting system

Message boards : Current tests : New crediting system

Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 1914 - Posted: 7 Aug 2006, 23:03:05 UTC

Please post comments and suggestions regarding the new crediting system here.
zombie67 [MM]
Joined: 8 Aug 06
Posts: 75
Credit: 2,396,363
RAC: 6,299
Message 1915 - Posted: 8 Aug 2006, 0:40:01 UTC - in response to Message 1914.  

Please post comments and suggestions regarding the new crediting system here.


Are there any jobs to run? I keep getting "communication deferred", and no jobs downloaded.

TIA


Reno, NV
Team: SETI.USA
zombie67 [MM]
Joined: 8 Aug 06
Posts: 75
Credit: 2,396,363
RAC: 6,299
Message 1916 - Posted: 8 Aug 2006, 1:07:16 UTC - in response to Message 1915.  

Please post comments and suggestions regarding the new crediting system here.


Are there any jobs to run? I keep getting "communication deferred", and no jobs downloaded.

TIA



Oh sure.... As soon as I posted this, the jobs downloaded.

Nevermind. =;^)


Reno, NV
Team: SETI.USA
Profile Astro

Joined: 16 Feb 06
Posts: 141
Credit: 32,977
RAC: 0
Message 1917 - Posted: 8 Aug 2006, 2:57:44 UTC
Last modified: 8 Aug 2006, 3:02:31 UTC

Is there some basis used to come up with this initial estimate of 2 cobblestones/model, or was it a "swag" (scientific wild a** guess) LOL??

Are you now collecting data on the average decoys per WU per hour and comparing it to the credit currently issued (with the standard BOINC client)?

Have you tried to identify a "Golden" machine (an average machine) and base credit issuance on that?

What weight does Rosetta place on credit parity across all projects?

Is there something we could do?

tony
zombie67 [MM]
Joined: 8 Aug 06
Posts: 75
Credit: 2,396,363
RAC: 6,299
Message 1918 - Posted: 8 Aug 2006, 3:12:13 UTC - in response to Message 1916.  

Please post comments and suggestions regarding the new crediting system here.


Are there any jobs to run? I keep getting "communication deferred", and no jobs downloaded.

TIA



Oh sure.... As soon as I posted this, the jobs downloaded.

Nevermind. =;^)



Okay, *now* I am getting the message "No work from project". This happens on both my Intel Mac and my AMD X2 Linux machine, if that makes any difference.

Reno, NV
Team: SETI.USA
Hoelder1in

Joined: 17 Feb 06
Posts: 11
Credit: 46,359
RAC: 0
Message 1919 - Posted: 8 Aug 2006, 4:29:07 UTC

...and adding to mcciastro's questions: Will the two credits per structure apply to one particular WU type or across the board, independently of how long the structures take to complete? It would be nice to let us know what you actually have in mind. Thanks, -H.
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 1920 - Posted: 8 Aug 2006, 4:34:00 UTC - in response to Message 1917.  

Is there some basis used to come up with this initial estimate of 2 cobblestones/model, or was it a "swag" (scientific wild a** guess) LOL??

Are you now collecting data about the Avg decoys/WU per hour and comparing it to current values issued (with standard boinc client)?

Have you tried to identify a "Golden" machine (avg machine) and base credit issuance quantities based on that?

What weight does Rosetta place on credit parity across all projects?

Is there something we could do?

tony



I took it from a result from one of my computers as just a very rough value to test with. The version that will eventually run on Rosetta@home will have work unit specific credit per model values that are determined from test runs on Ralph. It will be a requirement for lab members to not only test new work units on Ralph but to also determine the average credit per model value from their test runs for production runs. The credits should remain somewhat consistent with other projects since the average values will be based on the standard boinc crediting scheme. If things look okay on Ralph, Rosetta@home will use the credit per model crediting method while Ralph will switch back to the standard method.
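As a rough illustration of the scheme described above (illustrative Python only, not project code; the function and variable names are made up):

    def credit_per_model_from_ralph(ralph_results):
        """ralph_results: (claimed_credit, models_completed) pairs from Ralph test runs."""
        total_claimed = sum(credit for credit, models in ralph_results)
        total_models = sum(models for credit, models in ralph_results)
        return total_claimed / total_models   # average claimed credit per model

    def granted_credit(models_completed, credit_per_model):
        # On Rosetta@home, granted credit would then depend only on how many
        # models a host returns, not on its reported benchmarks.
        return models_completed * credit_per_model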


Hoelder1in

Joined: 17 Feb 06
Posts: 11
Credit: 46,359
RAC: 0
Message 1921 - Posted: 8 Aug 2006, 4:59:25 UTC - in response to Message 1920.  
Last modified: 8 Aug 2006, 5:17:14 UTC

I took it from a result from one of my computers as just a very rough value to test with. The version that will eventually run on Rosetta@home will have work unit specific credit per model values that are determined from test runs on Ralph. It will be a requirement for lab members to not only test new work units on Ralph but to also determine the average credit per model value from their test runs for production runs. The credits should remain somewhat consistent with other projects since the average values will be based on the standard boinc crediting scheme. If things look okay on Ralph, Rosetta@home will use the credit per model crediting method while Ralph will switch back to the standard method.

OK, great - in fact this seems to be very much along the lines I was thinking myself... -H. :-)
tralala

Joined: 12 Apr 06
Posts: 52
Credit: 15,257
RAC: 0
Message 1922 - Posted: 8 Aug 2006, 7:38:50 UTC
Last modified: 8 Aug 2006, 7:47:25 UTC

How do you want to determine credit/model from Ralph runs? As an average of claimed credit over X runs? As a median? Keep in mind that the reported specs and benchmarks on RALPH can be just as manipulated on some hosts as they are on Rosetta. Some people (myself included) use special BOINC versions which claim about 3x the credit the standard client claims.

Possibly an average of claimed credit/model over 10 or more results is reliable enough to be used for Rosetta. If you take fewer results, distorted credit reports have a larger impact on some WUs, so some WUs will grant more credit and others less (which is not good, since people will start cherry-picking "good" WUs by aborting "bad" ones).
Honza

Joined: 16 Feb 06
Posts: 9
Credit: 1,962
RAC: 0
Message 1923 - Posted: 8 Aug 2006, 9:20:21 UTC

tralala - the aim is to avoid and abandon, once and for all, the benchmarks and their unreliable numbers
(or at least I hope and pray).

You can simply go by CPU type, scaled by CPU frequency - or by a golden machine, as tony suggested. I know it's not perfect; RAM speed plays a role, etc.
But you should stay within an acceptable +-10%, not something like 500% as with the benchmarks.

Just try to maintain inter-project parity, that will do...

tralala

Joined: 12 Apr 06
Posts: 52
Credit: 15,257
RAC: 0
Message 1924 - Posted: 8 Aug 2006, 14:12:17 UTC - in response to Message 1923.  

tralala - the aim is to avoid and abandon, once and for all, the benchmarks and their unreliable numbers
(or at least I hope and pray).

You can simply go by CPU type, scaled by CPU frequency - or by a golden machine, as tony suggested. I know it's not perfect; RAM speed plays a role, etc.
But you should stay within an acceptable +-10%, not something like 500% as with the benchmarks.

Just try to maintain inter-project parity, that will do...


Well, so far they have revealed no details on how they plan to establish the credit/model ratio. As I see it, there are two approaches. What you describe would require a set of trusted computers, preferably locally available, with different processors and OSs. 10-20 assorted machines would probably be sufficient for that (one golden machine will not work, since it would favor one CPU type and one OS over the others). That is not a task for Ralph, I'd say, since the computers on RALPH are not trustworthy.

The second approach is different and just relies on strength of numbers, where everything balances out with more sampling. This would mean sending WUs on Ralph to "untrusted" hosts, but collecting a lot of results, so that on average the values come out correct.

The bottom line is that it is a tricky task which will require some testing and tuning until it works flawlessly. I hope they will discuss it with us before they use it over at Rosetta, to avoid having to fix it while it is already in use on Rosetta.


tralala

Joined: 12 Apr 06
Posts: 52
Credit: 15,257
RAC: 0
Message 1925 - Posted: 8 Aug 2006, 14:17:02 UTC

Okay here is one proposal:

Send each type of WU to at least 60 hosts. Discard the results with the lowest and highest 10% claimed credit and take the arithmetic average of the remaining hosts as fixed credit/model for production runs on Rosetta.
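As a rough illustration of this proposal (illustrative Python only; a 10% trim at each end of at least 60 results):

    def trimmed_credit_per_model(claimed_per_model, trim=0.10):
        """claimed_per_model: claimed credit per model reported by each Ralph host."""
        values = sorted(claimed_per_model)
        cut = int(len(values) * trim)            # results to discard at each end
        kept = values[cut:len(values) - cut]     # drop lowest and highest 10%
        return sum(kept) / len(kept)             # arithmetic average of the rest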
suguruhirahara

Joined: 5 Mar 06
Posts: 40
Credit: 11,320
RAC: 0
Message 1926 - Posted: 8 Aug 2006, 14:38:25 UTC - in response to Message 1925.  
Last modified: 8 Aug 2006, 14:39:11 UTC

Okay here is one proposal:

Send each type of WU to at least 60 hosts. Discard the results with the lowest and highest 10% claimed credit and take the arithmetic average of the remaining hosts as fixed credit/model for production runs on Rosetta.

But doesn't this approach cut off fair results at the high and low ends at a high rate? There may be people who run R@H on a Pentium (I, II) or a Kentsfield now. In particular, a few users are already testing with genuine Kentsfield chips.

If this approach were applied, how about distributing WUs to machines with the same or similar specs? That could help the server distinguish original claimed results from modified ones, using coefficients derived from the machines' specs, etc. I fear, however, that this might be an issue for the client framework.

Realistically, changing the "quorum" from 1 to 3 is the easiest and most effective way overall; changing it to 5 isn't good :(
Ethan

Joined: 11 Feb 06
Posts: 18
Credit: 25,579
RAC: 0
Message 1927 - Posted: 8 Aug 2006, 15:21:15 UTC - in response to Message 1926.  

Why not just use Ralph to determine the average crunch time per simulation for a given WU? I don't believe there's a way to manipulate the amount of time it takes to process; that average time can then be compared to a 'golden' ratio of credits per CPU hour for an average machine. The ratio would have to be revisited every couple of months since computers get faster over time, but this way the credit system, and its inherent problems, is bypassed completely.
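As a rough illustration of this idea (illustrative Python only; the credits-per-CPU-hour rate below is a made-up placeholder, not a project value):

    GOLDEN_CREDITS_PER_CPU_HOUR = 10.0   # hypothetical reference rate for an "average" machine

    def credit_per_model(avg_seconds_per_model):
        # average crunch time per model measured on Ralph, converted to credit
        return (avg_seconds_per_model / 3600.0) * GOLDEN_CREDITS_PER_CPU_HOUR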

-E
Tom Philippart

Joined: 24 Jun 06
Posts: 4
Credit: 883
RAC: 0
Message 1928 - Posted: 8 Aug 2006, 17:04:19 UTC

no work from project :(
Profile Astro

Joined: 16 Feb 06
Posts: 141
Credit: 32,977
RAC: 0
Message 1929 - Posted: 8 Aug 2006, 17:12:19 UTC - in response to Message 1927.  

Why not just use Ralph to determine the average crunch time per simulation for a given WU? I don't believe there's a way to manipulate the amount of time it takes to process; that average time can then be compared to a 'golden' ratio of credits per CPU hour for an average machine. The ratio would have to be revisited every couple of months since computers get faster over time, but this way the credit system, and its inherent problems, is bypassed completely.

-E

The time can be manipulated by trux's client, 5.3.12tx36.
Profile JKeck {pirate}
Joined: 16 Feb 06
Posts: 14
Credit: 153,095
RAC: 0
Message 1930 - Posted: 8 Aug 2006, 17:24:30 UTC - in response to Message 1926.  

Okay here is one proposal:

Send each type of WU to at least 60 hosts. Discard the results with the lowest and highest 10% claimed credit and take the arithmetic average of the remaining hosts as fixed credit/model for production runs on Rosetta.

But doesn't this approach cut off fair results at the high and low ends at a high rate? There may be people who run R@H on a Pentium (I, II) or a Kentsfield now. In particular, a few users are already testing with genuine Kentsfield chips.

If this approach were applied, how about distributing WUs to machines with the same or similar specs? That could help the server distinguish original claimed results from modified ones, using coefficients derived from the machines' specs, etc. I fear, however, that this might be an issue for the client framework.

Realistically, changing the "quorum" from 1 to 3 is the easiest and most effective way overall; changing it to 5 isn't good :(

I don't think that idea would unfairly cut off any honest machine. It may not be the best way, though, since it would tend to give higher numbers than expected if there are many cheating hosts. Maybe it would work better to throw out the 10% of results that deviate most from the mean. In that case the results thrown out could all be high ones if there is a concentration of cheaters, but they would come from both ends if everyone ran the standard client on the same task.
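As a rough illustration of this alternative (illustrative Python only):

    def drop_largest_deviations(values, fraction=0.10):
        """Discard the given fraction of results that deviate most from the mean."""
        mean = sum(values) / len(values)
        n_drop = int(len(values) * fraction)
        # sort by absolute deviation from the mean, largest first
        ranked = sorted(values, key=lambda v: abs(v - mean), reverse=True)
        return ranked[n_drop:]   # keep everything except the largest outliers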

Any attempt to change the quorum would not work since the number of structures is user configurable.
BOINC WIKI

BOINCing since 2002/12/8
Ethan

Joined: 11 Feb 06
Posts: 18
Credit: 25,579
RAC: 0
Message 1931 - Posted: 8 Aug 2006, 17:31:26 UTC - in response to Message 1929.  
Last modified: 8 Aug 2006, 17:31:36 UTC


The time can be manipulated by trux's client, 5.3.12tx36.


Is that a BOINC client? What if the time were kept within the Rosetta code (which is compiled and can't be manipulated, as far as I know)?

Profile Astro

Joined: 16 Feb 06
Posts: 141
Credit: 32,977
RAC: 0
Message 1932 - Posted: 8 Aug 2006, 17:42:03 UTC - in response to Message 1931.  
Last modified: 8 Aug 2006, 17:45:25 UTC


The time can be manipulated by trux's client, 5.3.12tx36.


Is that a Boinc client? What if the time was kept within the Rosetta code (which is compiled and can't be manipulated as far as I know)?

Ethan, welcome to ralph by the way.

Trux's 5.3.12tx36 is an optimized BOINC core client. The claimed credit formula is currently (Whetstone + Dhrystone) * CPU time (in seconds) / 172800. Most optimized BOINC core clients change the benchmarks, but trux's client alters both the reported time and the benchmarks to get more credit.
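Written out as an illustrative Python sketch (with the constant exactly as quoted above), the formula is:

    def claimed_credit(whetstone, dhrystone, cpu_time_seconds):
        # inflating either the benchmarks or the reported CPU time inflates the claim
        return (whetstone + dhrystone) * cpu_time_seconds / 172800.0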

does this answer your question?

tony

here is a Result ID from a trux client:
stderr out:
<core_client_version>5.3.12.tx36</core_client_version>
<real_cpu_time>2503</real_cpu_time>
<corrected_cpu_time>3930</corrected_cpu_time>
<corrected_Mfpops>11126.2</corrected_Mfpops>

see how it's "corrected" the time and benchmark?
Ethan

Joined: 11 Feb 06
Posts: 18
Credit: 25,579
RAC: 0
Message 1933 - Posted: 8 Aug 2006, 17:49:08 UTC - in response to Message 1932.  


Ethan, welcome to ralph by the way.


Hi Tony,
It partially does, yes. I'm more curious whether, since the Whetstone/Dhrystone benchmarks can be faked, and apparently the runtime can be too with the trux client, it would be better to put a CPU time capture in the Rosetta code itself. Since the Rosetta@home code is compiled in the lab, the results shouldn't be fakeable.

I don't know if it's the way to go, but it should get rid of any appearance of people getting a higher score due to 3rd-party BOINC clients.
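As a rough illustration of the idea (illustrative Python only; Rosetta itself is C++, and the function names below are made up):

    import time

    def run_models(n_models, make_model):
        start = time.process_time()              # CPU time used by this process
        for _ in range(n_models):
            make_model()
        cpu_seconds = time.process_time() - start
        # reported with the result, independent of whatever the core client claims
        return {"models": n_models, "app_measured_cpu_seconds": cpu_seconds}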

Oh, and thanks for the hello, I've been around awhile - ID: 2 :)
