New crediting system

Message boards : Current tests : New crediting system

Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 1914 - Posted: 7 Aug 2006, 23:03:05 UTC

Please post comments and suggestions regarding the new crediting system here.
zombie67 [MM]
Joined: 8 Aug 06
Posts: 75
Credit: 2,396,363
RAC: 6,299
Message 1915 - Posted: 8 Aug 2006, 0:40:01 UTC - in response to Message 1914.  

Please post comments and suggestions regarding the new crediting system here.


Are there any jobs to run? I keep getting "communication deferred", and no jobs downloaded.

TIA


Reno, NV
Team: SETI.USA
zombie67 [MM]
Joined: 8 Aug 06
Posts: 75
Credit: 2,396,363
RAC: 6,299
Message 1916 - Posted: 8 Aug 2006, 1:07:16 UTC - in response to Message 1915.  

Please post comments and suggestions regarding the new crediting system here.


Are there any jobs to run? I keep getting "communication deferred", and no jobs downloaded.

TIA



Oh sure.... As soon as I posted this, the jobs downloaded.

Nevermind. =;^)


Reno, NV
Team: SETI.USA
Profile Astro

Joined: 16 Feb 06
Posts: 141
Credit: 32,977
RAC: 0
Message 1917 - Posted: 8 Aug 2006, 2:57:44 UTC
Last modified: 8 Aug 2006, 3:02:31 UTC

Is there some basis used to come up with this initial estimate of 2 cobblestones/model, or was it a "swag" (scientific wild a** guess) LOL??

Are you now collecting data on the average decoys per WU per hour and comparing it to the credit currently issued (with the standard BOINC client)?

Have you tried to identify a "Golden" machine (an average machine) and base credit issuance on that?

What weight does Rosetta place on credit parity across all projects?

Is there something we could do?

tony
zombie67 [MM]
Joined: 8 Aug 06
Posts: 75
Credit: 2,396,363
RAC: 6,299
Message 1918 - Posted: 8 Aug 2006, 3:12:13 UTC - in response to Message 1916.  

Please post comments and suggestions regarding the new crediting system here.


Are there any jobs to run? I keep getting "communication deferred", and no jobs downloaded.

TIA



Oh sure.... As soon as I posted this, the jobs downloaded.

Nevermind. =;^)



Okay, *now* I am getting the message "No work from project". This happens on both my Intel Mac and my AMD X2 Linux machine, if that makes any difference.

Reno, NV
Team: SETI.USA
Hoelder1in

Joined: 17 Feb 06
Posts: 11
Credit: 46,359
RAC: 0
Message 1919 - Posted: 8 Aug 2006, 4:29:07 UTC

...and adding to mcciastro's questions: Will the two credits per structure apply to one particular WU type or across the board, independently of how long the structures take to complete? It would be nice to let us know what you actually have in mind. Thanks, -H.
Profile dekim
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 1920 - Posted: 8 Aug 2006, 4:34:00 UTC - in response to Message 1917.  

Is there some basis used to come up with this initial estimate of 2 cobblestones/model, or was it a "swag" (scientific wild a** guess) LOL??

Are you now collecting data about the Avg decoys/WU per hour and comparing it to current values issued (with standard boinc client)?

Have you tried to identify a "Golden" machine (avg machine) and base credit issuance quantities based on that?

What weight does Rosetta place on credit parity across all projects?

Is there something we could do?

tony



I took it from a result from one of my computers as just a very rough value to test with. The version that will eventually run on Rosetta@home will have work unit specific credit per model values that are determined from test runs on Ralph. It will be a requirement for lab members to not only test new work units on Ralph but to also determine the average credit per model value from their test runs for production runs. The credits should remain somewhat consistent with other projects since the average values will be based on the standard boinc crediting scheme. If things look okay on Ralph, Rosetta@home will use the credit per model crediting method while Ralph will switch back to the standard method.
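As a rough illustration of the scheme described above (illustrative Python only, not project code; the function and variable names are made up):

    def credit_per_model_from_ralph(ralph_results):
        """ralph_results: (claimed_credit, models_completed) pairs from Ralph test runs."""
        total_claimed = sum(credit for credit, models in ralph_results)
        total_models = sum(models for credit, models in ralph_results)
        return total_claimed / total_models   # average claimed credit per model

    def granted_credit(models_completed, credit_per_model):
        # On Rosetta@home, granted credit would then depend only on how many
        # models a host returns, not on its reported benchmarks.
        return models_completed * credit_per_model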


Hoelder1in

Joined: 17 Feb 06
Posts: 11
Credit: 46,359
RAC: 0
Message 1921 - Posted: 8 Aug 2006, 4:59:25 UTC - in response to Message 1920.  
Last modified: 8 Aug 2006, 5:17:14 UTC

I took it from a result from one of my computers as just a very rough value to test with. The version that will eventually run on Rosetta@home will have work unit specific credit per model values that are determined from test runs on Ralph. It will be a requirement for lab members to not only test new work units on Ralph but to also determine the average credit per model value from their test runs for production runs. The credits should remain somewhat consistent with other projects since the average values will be based on the standard boinc crediting scheme. If things look okay on Ralph, Rosetta@home will use the credit per model crediting method while Ralph will switch back to the standard method.

OK, great - in fact this seems to be very much along the lines I was thinking myself... -H. :-)
tralala

Joined: 12 Apr 06
Posts: 52
Credit: 15,257
RAC: 0
Message 1922 - Posted: 8 Aug 2006, 7:38:50 UTC
Last modified: 8 Aug 2006, 7:47:25 UTC

How do you want to determine credit/model from Ralph runs? As an average of claimed credit over X runs? As a median? Keep in mind that the reported specs and benchmarks on RALPH can be just as manipulated on some hosts as they are on Rosetta. Some people (myself included) use special BOINC versions which claim about 3x the credit the standard client claims.

Possibly an average of claimed credit/model over 10 or more results is reliable enough to be used for Rosetta. If you take fewer results, distorted credit reports have a larger impact on some WUs, so some WUs will grant more credit and others less (which is not good, since people will start cherry-picking "good" WUs by aborting "bad" ones).
Honza

Joined: 16 Feb 06
Posts: 9
Credit: 1,962
RAC: 0
Message 1923 - Posted: 8 Aug 2006, 9:20:21 UTC

tralala - the aim is to avoid and abandon, once and for all, the benchmarks and their unreliable numbers
(or at least I hope and pray).

You can simply go by CPU type, scaled by CPU frequency - or by a golden machine, as tony suggested. I know it's not perfect; RAM speed plays a role, etc.
But you should stay within an acceptable +-10%, not something like 500% as with the benchmarks.

Just try to maintain inter-project parity, that will do...

tralala

Joined: 12 Apr 06
Posts: 52
Credit: 15,257
RAC: 0
Message 1924 - Posted: 8 Aug 2006, 14:12:17 UTC - in response to Message 1923.  

tralala - the aim is to avoid and abandon, once and for all, the benchmarks and their unreliable numbers
(or at least I hope and pray).

You can simply go by CPU type, scaled by CPU frequency - or by a golden machine, as tony suggested. I know it's not perfect; RAM speed plays a role, etc.
But you should stay within an acceptable +-10%, not something like 500% as with the benchmarks.

Just try to maintain inter-project parity, that will do...


Well, so far they have revealed no details on how they plan to establish the credit/model ratio. As I see it, there are two approaches. What you describe would require a set of trusted computers, preferably locally available, with different processors and OSs. 10-20 assorted machines would probably be sufficient for that (one golden machine will not work, since it would favor one CPU type and one OS over the others). That is not a task for Ralph, I'd say, since the computers on RALPH are not trustworthy.

The second approach is different and just relies on strength of numbers, where everything balances out with more sampling. This would mean sending WUs on Ralph to "untrusted" hosts, but collecting a lot of results, so that on average the values come out correct.

The bottom line is that it is a tricky task which will require some testing and tuning until it works flawlessly. I hope they will discuss it with us before they use it over at Rosetta, to avoid having to fix it while it is already in use on Rosetta.


tralala

Joined: 12 Apr 06
Posts: 52
Credit: 15,257
RAC: 0
Message 1925 - Posted: 8 Aug 2006, 14:17:02 UTC

Okay here is one proposal:

Send each type of WU to at least 60 hosts. Discard the results with the lowest and highest 10% claimed credit and take the arithmetic average of the remaining hosts as fixed credit/model for production runs on Rosetta.
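As a rough illustration of this proposal (illustrative Python only; a 10% trim at each end of at least 60 results):

    def trimmed_credit_per_model(claimed_per_model, trim=0.10):
        """claimed_per_model: claimed credit per model reported by each Ralph host."""
        values = sorted(claimed_per_model)
        cut = int(len(values) * trim)            # results to discard at each end
        kept = values[cut:len(values) - cut]     # drop lowest and highest 10%
        return sum(kept) / len(kept)             # arithmetic average of the rest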
suguruhirahara

Joined: 5 Mar 06
Posts: 40
Credit: 11,320
RAC: 0
Message 1926 - Posted: 8 Aug 2006, 14:38:25 UTC - in response to Message 1925.  
Last modified: 8 Aug 2006, 14:39:11 UTC

Okay here is one proposal:

Send each type of WU to at least 60 hosts. Discard the results with the lowest and highest 10% claimed credit and take the arithmetic average of the remaining hosts as fixed credit/model for production runs on Rosetta.

But doesn't this approach cut off fair results at the high and low ends at a high rate? There may be people who run R@H on a Pentium (I, II) or a Kentsfield now. In particular, a few users are already testing with genuine Kentsfield chips.

If this approach were applied, how about distributing WUs to machines with the same or similar specs? That could help the server distinguish original claimed results from modified ones, using coefficients derived from the machines' specs, etc. I fear, however, that this might be an issue for the client framework.

Realistically, changing the "quorum" from 1 to 3 is the easiest and most effective way overall; changing it to 5 isn't good :(
Ethan

Joined: 11 Feb 06
Posts: 18
Credit: 25,579
RAC: 0
Message 1927 - Posted: 8 Aug 2006, 15:21:15 UTC - in response to Message 1926.  

Why not just use Ralph to determine the average crunch time per simulation for a given WU? I don't believe there's a way to manipulate the amount of time it takes to process; that average time can then be compared to a 'golden' ratio of credits per CPU hour for an average machine. The ratio would have to be revisited every couple of months since computers get faster over time, but this way the credit system, and its inherent problems, is bypassed completely.
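As a rough illustration of this idea (illustrative Python only; the credits-per-CPU-hour rate below is a made-up placeholder, not a project value):

    GOLDEN_CREDITS_PER_CPU_HOUR = 10.0   # hypothetical reference rate for an "average" machine

    def credit_per_model(avg_seconds_per_model):
        # average crunch time per model measured on Ralph, converted to credit
        return (avg_seconds_per_model / 3600.0) * GOLDEN_CREDITS_PER_CPU_HOUR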

-E
Tom Philippart

Joined: 24 Jun 06
Posts: 4
Credit: 883
RAC: 0
Message 1928 - Posted: 8 Aug 2006, 17:04:19 UTC

no work from project :(
Profile Astro

Joined: 16 Feb 06
Posts: 141
Credit: 32,977
RAC: 0
Message 1929 - Posted: 8 Aug 2006, 17:12:19 UTC - in response to Message 1927.  

Why not just use Ralph to determine the average crunch time per simulation for a given WU? I don't believe there's a way to manipulate the amount of time it takes to process; that average time can then be compared to a 'golden' ratio of credits per CPU hour for an average machine. The ratio would have to be revisited every couple of months since computers get faster over time, but this way the credit system, and its inherent problems, is bypassed completely.

-E

The time can be manipulated by trux's client, 5.3.12tx36.
Profile JKeck {pirate}
Joined: 16 Feb 06
Posts: 14
Credit: 153,095
RAC: 0
Message 1930 - Posted: 8 Aug 2006, 17:24:30 UTC - in response to Message 1926.  

Okay here is one proposal:

Send each type of WU to at least 60 hosts. Discard the results with the lowest and highest 10% claimed credit and take the arithmetic average of the remaining hosts as fixed credit/model for production runs on Rosetta.

But doesn't this approach cut off fair results at the high and low ends at a high rate? There may be people who run R@H on a Pentium (I, II) or a Kentsfield now. In particular, a few users are already testing with genuine Kentsfield chips.

If this approach were applied, how about distributing WUs to machines with the same or similar specs? That could help the server distinguish original claimed results from modified ones, using coefficients derived from the machines' specs, etc. I fear, however, that this might be an issue for the client framework.

Realistically, changing the "quorum" from 1 to 3 is the easiest and most effective way overall; changing it to 5 isn't good :(

I don't think that idea would unfairly cut off any honest machine. It may not be the best way, though, since it would tend to give higher numbers than expected if there are many cheating hosts. Maybe it would work better to throw out the 10% of results that deviate most from the mean. In that case the results thrown out could all be high ones if there is a concentration of cheaters, but they would come from both ends if everyone ran the standard client on the same task.
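As a rough illustration of this alternative (illustrative Python only):

    def drop_largest_deviations(values, fraction=0.10):
        """Discard the given fraction of results that deviate most from the mean."""
        mean = sum(values) / len(values)
        n_drop = int(len(values) * fraction)
        # sort by absolute deviation from the mean, largest first
        ranked = sorted(values, key=lambda v: abs(v - mean), reverse=True)
        return ranked[n_drop:]   # keep everything except the largest outliers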

Any attempt to change the quorum would not work since the number of structures is user configurable.
BOINC WIKI

BOINCing since 2002/12/8
Ethan

Joined: 11 Feb 06
Posts: 18
Credit: 25,579
RAC: 0
Message 1931 - Posted: 8 Aug 2006, 17:31:26 UTC - in response to Message 1929.  
Last modified: 8 Aug 2006, 17:31:36 UTC


The time can be manipulated by trux's client, 5.3.12tx36.


Is that a BOINC client? What if the time were kept within the Rosetta code (which is compiled and can't be manipulated, as far as I know)?

Profile Astro

Joined: 16 Feb 06
Posts: 141
Credit: 32,977
RAC: 0
Message 1932 - Posted: 8 Aug 2006, 17:42:03 UTC - in response to Message 1931.  
Last modified: 8 Aug 2006, 17:45:25 UTC


The time can be manipulated by trux's client, 5.3.12tx36.


Is that a Boinc client? What if the time was kept within the Rosetta code (which is compiled and can't be manipulated as far as I know)?

Ethan, welcome to ralph by the way.

Trux's 5.3.12tx36 is an optimized BOINC core client. The claimed credit formula is currently (Whetstone + Dhrystone) * CPU time (in seconds) / 172800. Most optimized BOINC core clients change the benchmarks, but trux's client alters both the reported time and the benchmarks to get more credit.
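Written out as an illustrative Python sketch (with the constant exactly as quoted above), the formula is:

    def claimed_credit(whetstone, dhrystone, cpu_time_seconds):
        # inflating either the benchmarks or the reported CPU time inflates the claim
        return (whetstone + dhrystone) * cpu_time_seconds / 172800.0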

does this answer your question?

tony

here is a Result ID from a trux client:
stderr out:
<core_client_version>5.3.12.tx36</core_client_version>
<real_cpu_time>2503</real_cpu_time>
<corrected_cpu_time>3930</corrected_cpu_time>
<corrected_Mfpops>11126.2</corrected_Mfpops>

see how it's "corrected" the time and benchmark?
Ethan

Joined: 11 Feb 06
Posts: 18
Credit: 25,579
RAC: 0
Message 1933 - Posted: 8 Aug 2006, 17:49:08 UTC - in response to Message 1932.  


Ethan, welcome to ralph by the way.


Hi Tony,
It partially does, yes. I'm more curious whether, since the Whetstone/Dhrystone benchmarks can be faked, and apparently the runtime can be too with the trux client, it would be better to put a CPU time capture in the Rosetta code itself. Since the Rosetta@home code is compiled in the lab, the results shouldn't be fakeable.

I don't know if it's the way to go, but it should get rid of any appearance of people getting a higher score due to 3rd-party BOINC clients.
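As a rough illustration of the idea (illustrative Python only; Rosetta itself is C++, and the function names below are made up):

    import time

    def run_models(n_models, make_model):
        start = time.process_time()              # CPU time used by this process
        for _ in range(n_models):
            make_model()
        cpu_seconds = time.process_time() - start
        # reported with the result, independent of whatever the core client claims
        return {"models": n_models, "app_measured_cpu_seconds": cpu_seconds}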

Oh, and thanks for the hello, I've been around awhile - ID: 2 :)
