Posts by Astro

21) Message boards : RALPH@home bug list : Internal server error? (Message 2008)
Posted 12 Aug 2006 by Astro
Post:
Since I can't traceroute (thank you Alltel, grumble grumble), I've used a website to do it for me, and a trace from their systems looks like this:

22) Message boards : RALPH@home bug list : Internal server error? (Message 2006)
Posted 12 Aug 2006 by Astro
Post:
I suspect their server is overwhelmed. I saw the following transaction take place when I tried to upload a WU:

8/12/2006 4:57:25 AM|ralph@home|Started upload of file BENCH_ABRELAX_SAVE_ALL_OUT_1scjB_1123_33_0_0
8/12/2006 4:57:25 AM||[ID#234] info: About to connect() to ralph.bakerlab.org port 80
8/12/2006 4:57:25 AM||[ID#234] info: Trying 140.142.20.204...
8/12/2006 4:57:25 AM||[ID#234] info: Connected to ralph.bakerlab.org (140.142.20.204) port 80
8/12/2006 4:57:25 AM||[ID#234] Sent header to server: POST /rosetta_cgi/file_upload_handler HTTP/1.1
User-Agent: BOINC client (windows_intelx86 5.5.11)
Host: ralph.bakerlab.org
Accept: */*
Accept-Encoding: deflate, gzip
Content-Type: application/x-www-form-urlencoded
Content-Length: 297


8/12/2006 4:57:25 AM||[ID#234] Received header from server: HTTP/1.1 200 OK

8/12/2006 4:57:25 AM||[ID#234] Received header from server: Date: Sat, 12 Aug 2006 08:46:24 GMT

8/12/2006 4:57:25 AM||[ID#234] Received header from server: Server: Apache/2.0.54 (Fedora)

8/12/2006 4:57:25 AM||[ID#234] Received header from server: Content-Length: 93

8/12/2006 4:57:25 AM||[ID#234] Received header from server: Connection: close

8/12/2006 4:57:25 AM||[ID#234] Received header from server: Content-Type: text/plain; charset=UTF-8

8/12/2006 4:57:25 AM||[ID#234] info: Closing connection #0
8/12/2006 4:57:26 AM||[ID#235] info: About to connect() to ralph.bakerlab.org port 80
8/12/2006 4:57:26 AM||[ID#235] info: Trying 140.142.20.204...
8/12/2006 4:57:26 AM||[ID#235] info: Connected to ralph.bakerlab.org (140.142.20.204) port 80
8/12/2006 4:57:26 AM||[ID#235] Sent header to server: POST /rosetta_cgi/file_upload_handler HTTP/1.1
User-Agent: BOINC client (windows_intelx86 5.5.11)
Host: ralph.bakerlab.org
Accept: */*
Accept-Encoding: deflate, gzip
Content-Type: application/x-www-form-urlencoded
Content-Length: 89011


8/12/2006 4:57:30 AM||[ID#235] Received header from server: HTTP/1.1 200 OK

8/12/2006 4:57:30 AM||[ID#235] Received header from server: Date: Sat, 12 Aug 2006 08:46:25 GMT

8/12/2006 4:57:30 AM||[ID#235] Received header from server: Server: Apache/2.0.54 (Fedora)

8/12/2006 4:57:30 AM||[ID#235] Received header from server: Connection: close

8/12/2006 4:57:30 AM||[ID#235] Received header from server: Transfer-Encoding: chunked

8/12/2006 4:57:30 AM||[ID#235] Received header from server: Content-Type: text/plain; charset=UTF-8

8/12/2006 4:57:30 AM||[ID#235] info: Closing connection #0
8/12/2006 4:57:31 AM|ralph@home|Finished upload of file BENCH_ABRELAX_SAVE_ALL_OUT_1scjB_1123_33_0_0
8/12/2006 4:57:31 AM|ralph@home|Throughput 21678 bytes/sec

Notice how it failed once, but succeeded on the second effort? 106 is an error returned by the Ralph server (apache) when it fails for some reason to secure a connection session for you.
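To compare the two attempts side by side, the HTTP status line for each connection can be pulled out of a log like the one above. This is only a rough sketch: the sample lines are copied from the log, while the function name and the regular expression are my own and assume the "[ID#n] Received header from server:" layout shown there.

```python
import re

# A few lines copied from the log above (only the ones that matter here).
SAMPLE_LOG = """\
8/12/2006 4:57:25 AM||[ID#234] Sent header to server: POST /rosetta_cgi/file_upload_handler HTTP/1.1
8/12/2006 4:57:25 AM||[ID#234] Received header from server: HTTP/1.1 200 OK
8/12/2006 4:57:25 AM||[ID#234] Received header from server: Content-Length: 93
8/12/2006 4:57:26 AM||[ID#235] Sent header to server: POST /rosetta_cgi/file_upload_handler HTTP/1.1
8/12/2006 4:57:30 AM||[ID#235] Received header from server: HTTP/1.1 200 OK
8/12/2006 4:57:30 AM||[ID#235] Received header from server: Transfer-Encoding: chunked
"""

# Matches the status line of a response and captures the connection ID and status code.
STATUS_RE = re.compile(
    r"\[ID#(\d+)\] Received header from server: HTTP/\d\.\d (\d{3})"
)


def status_by_connection(log_text):
    """Map each connection ID to the HTTP status code the server returned."""
    statuses = {}
    for line in log_text.splitlines():
        match = STATUS_RE.search(line)
        if match:
            statuses[int(match.group(1))] = int(match.group(2))
    return statuses


print(status_by_connection(SAMPLE_LOG))
```

Running it over a longer log gives one entry per connection, which makes a non-200 reply easy to spot.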

To make sure you can reach Ralph, and that the problem isn't between you and Ralph or a firewall issue, open a command prompt window (Start, All Programs, Accessories, Command Prompt). Type in "tracert 140.142.20.204" (without quotation marks) and hit Enter. What you should see is it making a number of hops across the web until it gets to Ralph. If it stops at the first hop outside your network and you have an ADSL connection, then it's likely your ISP has ICMP pings turned off at the CO (central office, a box by the roadside that your copper wires go to). I can't show you a "good" version of what it looks like since my ISP blocks ICMP. But if it gets there, then you know it's likely the Ralph server that's dropping your connections.

Note: this is the address of the upload server.
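
If ICMP is blocked the way it is for me, a plain TCP connection test to port 80 is another way to check whether the upload server can be reached at all. The sketch below is not a traceroute (it shows no hops), and the function name and timeout are just my choices for illustration.

```python
import socket


def can_connect(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call (address taken from the log above):
#   can_connect("140.142.20.204", 80)
```

A True result only says the server accepted the connection. It says nothing about whether the upload handler behind it is keeping up.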
23) Message boards : Current tests : New crediting system (Message 1994)
Posted 12 Aug 2006 by Astro
Post:
So, from Dekim's description, we aren't seeing what it will be like at Rosetta. It satisfies part of my curiosity to hear that different WUs will be given credit at a different rate. This would/might make the adjustments to the "all over the map" issue and possibly even things out. I say if you feel confident, then "Let her roll".
I was under the impression it was going to be 2 cr/model and that's what he wanted to release to Rosetta. I was not understanding this part of the process. When I read "start with issuing 2 cr/model in ralph, then make adjustments", I thought they meant change it to something other than 2, but apply it to all WUs, without regard to crunch times or models/hour.
24) Message boards : Current tests : New crediting system (Message 1972)
Posted 11 Aug 2006 by Astro
Post:
It's my opinion that I don't have enough data to say one way or the other. We don't know what you know about what happened behind the scenes to determine the number. I'm apologizing up front for the following pic, but I had no other way of linking the two together into one pic, as both are needed to complete the picture and to be able to refer between them. What I'm seeing is models/time all over the map. Perhaps if you upped the number of Ralph WUs distributed per day to more than the two or three I'm getting now, I could form an opinion. Perhaps with enough data the low numbers and high numbers will offset each other, so that the average granted credit over a long time period will be closer to cross-project equality. At one point all I had were 18 cr/hour WUs, then I got one that only came in at 2/hr, and together they came out at nearly what I would have gotten with just a standard BOINC client.

Below is how it shapes up vs. other projects. It also shows every new Ralph WU I have on record. Remember Einstein and SETI are still working on theirs. Which way that goes is unknown, other than that SETI plans to up the "load store adjustment" (fpops multiplier) from 3.35 to 3.51 in the SETI app 5.17 (now in beta). If you look at the few results per computer you'll see what I mean by "all over the place".

I just think I need more time.

tony


25) Message boards : Current tests : New crediting system (Message 1951)
Posted 11 Aug 2006 by Astro
Post:
Is Ralph planning on sending more work? My results so far have been all over the map, and I don't have enough data to draw any conclusions one way or the other.

tony
26) Message boards : RALPH@home bug list : Host venues corrupted? (Message 1950)
Posted 10 Aug 2006 by Astro
Post:
Ditto, and my team has been removed from the "projects" tab and replaced with a big ole empty spot. LOL
27) Message boards : Current tests : New crediting system (Message 1948)
Posted 9 Aug 2006 by Astro
Post:
Here's a project comparison using what LITTLE ralph data I have. The other projects haven't been brought up to date yet and I have many more data points to add to them, so I wouldn't use this as anything more than a general idea of where you're at. I've set other projects to NNW until I can get a decent sample quantity. I highlighted the "ralph-new" values in red. Since Rosetta and OLD Ralph used the same credit formula, you can compare the new ralph to the old ralph and even to Rosetta on each puter. Note: all the numbers here are from stock boinc core clients.

28) Message boards : Current tests : New crediting system (Message 1932)
Posted 8 Aug 2006 by Astro
Post:

The time can be manipulated by Trux's client, 5.3.12tx36.


Is that a Boinc client? What if the time was kept within the Rosetta code (which is compiled and can't be manipulated as far as I know)?

Ethan, welcome to ralph by the way.

Trux 5.3.12tx36 is an optimized BOINC core client. The claimed credit formula currently is (Whetstone + Dhrystone) * CPU time (in seconds) / 1,728,000. Most optimized BOINC core clients change the benchmarks, but Trux's alters both the reported time and the benchmarks to get more credit.
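
As a rough illustration of why the reported time matters, here is the formula written out. The benchmark scores are made-up numbers, and the divisor is an assumption based on the cobblestone definition (100 credits per day of CPU on a host benchmarking 1000 + 1000); it is passed in explicitly so a different constant can be plugged in.

```python
def claimed_credit(whetstone, dhrystone, cpu_seconds, divisor):
    """Claimed credit: (Whetstone + Dhrystone) * CPU seconds / divisor."""
    return (whetstone + dhrystone) * cpu_seconds / divisor


# Assumption: 100 credits per 86400 s on a host scoring 1000 + 1000,
# which gives 2000 * 86400 / 100 = 1,728,000.
DIVISOR = 1_728_000

# Hypothetical benchmark scores, chosen only for illustration.
WHETSTONE, DHRYSTONE = 1500.0, 3000.0

honest = claimed_credit(WHETSTONE, DHRYSTONE, 3600, DIVISOR)
inflated = claimed_credit(WHETSTONE, DHRYSTONE, 3600 * 1.5, DIVISOR)

# Credit scales linearly with reported time, whatever the divisor is.
print(honest, inflated, inflated / honest)
```

Because both the benchmarks and the time are reported by the client, inflating either one raises the claim by the same factor.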

does this answer your question?

tony

here is a Result ID from a trux client:
stderr out <core_client_version>5.3.12.tx36</core_client_version>
<real_cpu_time>2503</real_cpu_time>
<corrected_cpu_time>3930</corrected_cpu_time>
<corrected_Mfpops>11126.2</corrected_Mfpops>

see how it's "corrected" the time and benchmark?
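
To put a number on the "correction", the two time values can be read straight out of that stderr text. This is a small sketch using only the tags shown above; the helper name is my own.

```python
import re

# The stderr fragment from the Trux result quoted above.
STDERR_OUT = """\
<core_client_version>5.3.12.tx36</core_client_version>
<real_cpu_time>2503</real_cpu_time>
<corrected_cpu_time>3930</corrected_cpu_time>
<corrected_Mfpops>11126.2</corrected_Mfpops>
"""


def tag_value(text, tag):
    """Return the text between <tag> and </tag>, or None if the tag is absent."""
    match = re.search(r"<{0}>(.*?)</{0}>".format(re.escape(tag)), text)
    return match.group(1) if match else None


real = float(tag_value(STDERR_OUT, "real_cpu_time"))
corrected = float(tag_value(STDERR_OUT, "corrected_cpu_time"))

# Ratio of reported ("corrected") time to real CPU time.
print(round(corrected / real, 2))  # 1.57
```

So for this result the time fed into the credit claim is about 57% higher than the CPU time actually spent.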
29) Message boards : Current tests : New crediting system (Message 1929)
Posted 8 Aug 2006 by Astro
Post:
Why not just use Ralph to determine the average crunch time for each simulation for a given WU? I don't believe there's a way to manipulate the amount of time it takes to process, then that average time can be compared to a 'golden' ratio of credits/cpu hour for an average machine. The ratio would have to be revisited every couple months since computers will get faster over time, but this way, the credit system is completely bypassed (and its inherent problems).

-E

The time can be manipulated by Trux's client, 5.3.12tx36.
30) Message boards : Current tests : New crediting system (Message 1917)
Posted 8 Aug 2006 by Astro
Post:
Is there some basis used to come up with this initial estimate of 2 cobblestones/model, or was it a "swag" (scientific wild a** guess) LOL??

Are you now collecting data about the Avg decoys/WU per hour and comparing it to current values issued (with standard boinc client)?

Have you tried to identify a "Golden" machine (avg machine) and base credit issuance quantities based on that?

What weight does Rosetta place on credit parity across all projects?

Is there something we could do?

tony
31) Message boards : RALPH@home bug list : Bug reports for Ralph 5.25 (Message 1886)
Posted 5 Jul 2006 by Astro
Post:
I just got the third one of these. The three WUs were:

FRA_t329_CASP7_hom001_8_858_3
FRA_t329_CASP7_hom001_8_858_4
FRA_t329_CASP7_hom001_8_858_5

I think I see a pattern. lol

Each WU I did had previously failed for one other user, prior to them failing for me.

tony
32) Message boards : RALPH@home bug list : Bug reports for Ralph 5.25 (Message 1884)
Posted 5 Jul 2006 by Astro
Post:
Woke up to a screensaver and a "runtime" error box on my AMD64 3700. It's the first time I've seen this one. It looked like this:



So I hit Print Screen and pasted it into Photoshop. Before I could finish editing the photo, it happened again, as can be seen below.



Here's what my Boinc Manager looked like:



And the WUs were wuid=185158 and wuid=185157. I noticed one other user had an error with these same WUs before they were issued to me. Here are the result IDs:


Result ID 209014
Name FRA_t329_CASP7_hom001_8_858_3_1
Workunit 185157
Created 5 Jul 2006 4:25:27 UTC
Sent 5 Jul 2006 4:25:38 UTC
Received 5 Jul 2006 11:00:31 UTC
Server state Over
Outcome Client error
Client state Computing
Exit status 3 (0x3)
Computer ID 2172
Report deadline 9 Jul 2006 4:25:38 UTC
CPU time 99.3125
stderr out <core_client_version>5.5.4</core_client_version>
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 2970327

</stderr_txt>


Validate state Invalid
Claimed credit 0.389839542825192
Granted credit 0
application version 5.25

and

Result ID 209015
Name FRA_t329_CASP7_hom001_8_858_4_1
Workunit 185158
Created 5 Jul 2006 4:25:27 UTC
Sent 5 Jul 2006 4:25:38 UTC
Received 5 Jul 2006 11:00:31 UTC
Server state Over
Outcome Client error
Client state Computing
Exit status 3 (0x3)
Computer ID 2172
Report deadline 9 Jul 2006 4:25:38 UTC
CPU time 72.921875
stderr out <core_client_version>5.5.4</core_client_version>
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# random seed: 2970326

</stderr_txt>


Validate state Invalid
Claimed credit 0.286246247068151
Granted credit 0
application version 5.25
33) Message boards : RALPH@home bug list : Bug reports for Ralph 5.23 (Message 1840)
Posted 18 Jun 2006 by Astro
Post:

[deleted]
34) Message boards : RALPH@home bug list : Bug reports for Ralph 5.23 (Message 1834)
Posted 17 Jun 2006 by Astro
Post:
Rom had mentioned there might be a fix to the fatal windows errors in 5.23. When it was released, I set the box I usually got these errors with to NNW/NNT for all other projects and suspended them, so I'd run nothing but 5.23. I'm not ready to say "it's Fixed", but so far it sure looks good.
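(Columns: Result ID, Workunit, Sent, Received, Server state, Outcome, Client state, CPU time in seconds, Claimed credit, Granted credit)
177824 158140 14 Jun 2006 19:37:15 UTC 15 Jun 2006 10:43:14 UTC Over Success Done 13,545.38 53.30 53.30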
177438 157770 14 Jun 2006 15:16:32 UTC 15 Jun 2006 7:36:07 UTC Over Success Done 14,102.19 55.50 55.50
176725 156687 14 Jun 2006 7:37:56 UTC 15 Jun 2006 1:10:40 UTC Over Success Done 13,275.25 52.24 52.24
176356 156712 14 Jun 2006 3:52:59 UTC 14 Jun 2006 20:28:26 UTC Over Success Done 13,985.28 53.74 53.74
175612 151093 13 Jun 2006 20:01:42 UTC 14 Jun 2006 16:47:14 UTC Over Success Done 14,084.34 54.12 54.12
174950 155410 13 Jun 2006 13:25:22 UTC 14 Jun 2006 15:16:32 UTC Over Success Done 14,111.06 54.22 54.22
174529 155065 13 Jun 2006 9:17:55 UTC 14 Jun 2006 7:37:56 UTC Over Success Done 14,101.56 54.18 54.18
174341 154879 13 Jun 2006 6:09:29 UTC 14 Jun 2006 3:52:59 UTC Over Success Done 14,346.30 55.12 55.12
173772 154359 12 Jun 2006 22:33:06 UTC 13 Jun 2006 20:01:42 UTC Over Success Done 14,103.70 54.19 54.19
173541 154155 12 Jun 2006 19:11:15 UTC 13 Jun 2006 10:39:56 UTC Over Success Done 14,441.13 55.49 55.49
170677 146450 11 Jun 2006 22:52:42 UTC 13 Jun 2006 9:17:55 UTC Over Success Done 13,161.97 50.57 50.57
170660 146482 11 Jun 2006 22:52:42 UTC 13 Jun 2006 3:03:42 UTC Over Success Done 14,275.66 54.85 54.85
170659 146481 11 Jun 2006 22:52:42 UTC 12 Jun 2006 8:30:20 UTC Over Success Done 13,858.98 53.25 53.25

OK, OK, I'm convinced. I haven't had any errors of any kind with 5.23. The following WUs can be added to the list of successes for my error-prone puter:

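(Columns: Result ID, Workunit, Sent, Received, Server state, Outcome, Client state, CPU time in seconds, Claimed credit, Granted credit)
181942 162058 16 Jun 2006 18:44:02 UTC 17 Jun 2006 10:47:32 UTC Over Success Done 14,364.88 56.53 56.53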
181682 161805 16 Jun 2006 14:44:34 UTC 17 Jun 2006 4:16:03 UTC Over Success Done 13,954.83 54.92 54.92
181002 161170 16 Jun 2006 8:41:29 UTC 17 Jun 2006 0:34:09 UTC Over Success Done 13,478.86 53.04 53.04
180751 160956 16 Jun 2006 4:39:34 UTC 16 Jun 2006 20:44:26 UTC Over Success Done 13,756.77 54.14 54.14
180432 151974 15 Jun 2006 23:29:54 UTC 16 Jun 2006 15:04:45 UTC Over Success Done 14,181.53 55.81 55.81
179254 159529 15 Jun 2006 11:09:29 UTC 16 Jun 2006 11:14:50 UTC Over Success Done 14,375.77 56.57 56.57
178840 159127 15 Jun 2006 7:36:07 UTC 16 Jun 2006 8:41:29 UTC Over Success Done 14,242.95 56.05 56.05
178558 158860 15 Jun 2006 4:15:02 UTC 15 Jun 2006 23:29:54 UTC Over Success Done 13,769.56 54.19 54.19
178127 158438 14 Jun 2006 23:29:32 UTC 15 Jun 2006 20:00:12 UTC Over Success Done 14,343.39 56.44 56.44

I've set the other projects back to "allow new work" and "resumed" them. Thanks for fixing this.
35) Message boards : RALPH@home bug list : Bug reports for Ralph 5.23 (Message 1833)
Posted 16 Jun 2006 by Astro
Post:
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...


This is normal. It just means the WU completed successfully and didn't need the watchdog, so it shut it down.

If you're talking about wuid=160649, then it completed successfully and has been credited. See the "result ID" for that WU below.

Result ID 180410
Name FRA_t301_hom001_1_LOOPRLX_IGNORE_THE_REST__hom001_1_1bwzA__100_701_23_0
Workunit 160649
Created 15 Jun 2006 22:54:54 UTC
Sent 15 Jun 2006 23:51:21 UTC
Received 16 Jun 2006 14:42:46 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 2910
Report deadline 19 Jun 2006 23:51:21 UTC
CPU time 5847.390625
stderr out <core_client_version>5.4.9</core_client_version>
<stderr_txt>
# random seed: 2998884
# cpu_run_time_pref: 3600
# DONE :: 1 starting structures built 0 (nstruct) times
# This process generated 1 decoys from 1 attempts
# 0 starting pdbs were skipped


BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...

</stderr_txt>


Validate state Valid
Claimed credit 13.6983915054668
Granted credit 13.6983915054668
application version 5.23
If the protein is huge, your puter old, or your runtime is set low, then this is what you should be seeing with your future WUs. It's helping.

tony
36) Message boards : RALPH@home bug list : Bug reports for Ralph 5.23 (Message 1831)
Posted 16 Jun 2006 by Astro
Post:
is this WU a troubled wu? FRA_t301_hom001_1_LOOPRLX_IGNORE_THE_REST__hom001_1_1bwzA__100_701_23_0
I have left it running now for 1:34:10 and it is only at 1.623% and finish time is up to 3333:00:01 now and still climbing

The progress indicator isn't linear. What you'll see are jumps in percentage. All WUs start at 1% and slowly proceed higher until one model is done. Then it jumps to another percentage, and the digits to the right of the decimal slowly climb again until the next model is done. I.e., you might see this if you checked the status every 10 min: 1.000, 1.001, 1.002, 1.003, 12.000, 12.001, 12.002, 12.003, 24.000, 24.001, etc., until the time runs out, where it jumps to 100%, uploads and reports.

The number of models depends on protein size and puter speed (for the most part). Every WU will run at least ONE model regardless of time (except where terminated by the "watchdog timer").

does this help?

tony
37) Message boards : RALPH@home bug list : Bug reports for Ralph 5.23 (Message 1828)
Posted 15 Jun 2006 by Astro
Post:
Rom had mentioned there might be a fix to the fatal windows errors in 5.23. When it was released, I set the box I usually got these errors with to NNW/NNT for all other projects and suspended them, so I'd run nothing but 5.23. I'm not ready to say "it's Fixed", but so far it sure looks good.
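(Columns: Result ID, Workunit, Sent, Received, Server state, Outcome, Client state, CPU time in seconds, Claimed credit, Granted credit)
177824 158140 14 Jun 2006 19:37:15 UTC 15 Jun 2006 10:43:14 UTC Over Success Done 13,545.38 53.30 53.30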
177438 157770 14 Jun 2006 15:16:32 UTC 15 Jun 2006 7:36:07 UTC Over Success Done 14,102.19 55.50 55.50
176725 156687 14 Jun 2006 7:37:56 UTC 15 Jun 2006 1:10:40 UTC Over Success Done 13,275.25 52.24 52.24
176356 156712 14 Jun 2006 3:52:59 UTC 14 Jun 2006 20:28:26 UTC Over Success Done 13,985.28 53.74 53.74
175612 151093 13 Jun 2006 20:01:42 UTC 14 Jun 2006 16:47:14 UTC Over Success Done 14,084.34 54.12 54.12
174950 155410 13 Jun 2006 13:25:22 UTC 14 Jun 2006 15:16:32 UTC Over Success Done 14,111.06 54.22 54.22
174529 155065 13 Jun 2006 9:17:55 UTC 14 Jun 2006 7:37:56 UTC Over Success Done 14,101.56 54.18 54.18
174341 154879 13 Jun 2006 6:09:29 UTC 14 Jun 2006 3:52:59 UTC Over Success Done 14,346.30 55.12 55.12
173772 154359 12 Jun 2006 22:33:06 UTC 13 Jun 2006 20:01:42 UTC Over Success Done 14,103.70 54.19 54.19
173541 154155 12 Jun 2006 19:11:15 UTC 13 Jun 2006 10:39:56 UTC Over Success Done 14,441.13 55.49 55.49
170677 146450 11 Jun 2006 22:52:42 UTC 13 Jun 2006 9:17:55 UTC Over Success Done 13,161.97 50.57 50.57
170660 146482 11 Jun 2006 22:52:42 UTC 13 Jun 2006 3:03:42 UTC Over Success Done 14,275.66 54.85 54.85
170659 146481 11 Jun 2006 22:52:42 UTC 12 Jun 2006 8:30:20 UTC Over Success Done 13,858.98 53.25 53.25
38) Message boards : RALPH@home bug list : Bug reports for Ralph 5.23 (Message 1825)
Posted 13 Jun 2006 by Astro
Post:
I have a funny (interesting) one. On my laptop (which has been pretty much flawless at Ralph, as opposed to my AMD64 3700 San Diego, which experiences the "fatal windows" error) I've seen something happen twice in 24 hours. Either the Rosetta 5.22 screensaver or the Ralph 5.23 screensaver is showing on my screen when I return from some personal task. The graphic will NOT go away by moving the mouse or pressing a key. I had another window open but couldn't see it. The mouse would still work behind the graphic: if I just clicked all over I could hear it interacting, but the Rosetta graphic would not release my screen. I ended up pressing the power button on both occasions, only to see the HD activity light blink and hear the Windows log-off WAV, but the Rosetta graphic was still on the screen all the way to shutdown, when the screen went dead.

Since mine is the only report of this, it was on both Rosetta and Ralph, and hasn't happened with the laptop before, I will be doing some adware/malware/virus/others scans to see if the problem is on my end.

tony
39) Message boards : RALPH@home bug list : Bug reports for Ralph 5.21 (Message 1790)
Posted 6 Jun 2006 by Astro
Post:
Tony,

What kind of graphics adapter do you have on that machine?

AMD64 3700 San Diego processor, Asus A8N-E mobo, Asus EN6200TC256/TD/64M/A PCI Express video card, 1 GB OCZ Gold RAM. This is my only machine giving this fatal windows error. It has Nvidia chipsets in both the mobo and video card.

tony

Display: "Plug and Play Monitor on NVIDIA GeForce 6200 TurboCache(TM)"

Says ASUS OSD provide you the access to dynamically adjust parameters in D3D or OpenGL games by hotkeys.

Graphics card info
GeForce 6200TurboCache
Video Bios Version, 5.44.02.11
IRQ 18
PCI Express X16
256 MB memory
ForceWare Version 71.24
TV Encoder Type: Nvidia integrated
40) Message boards : RALPH@home bug list : Bug reports for Ralph 5.21 (Message 1785)
Posted 6 Jun 2006 by Astro
Post:
OOPS, spoke too soon. On my AMD64 3700 I got excited about a graphics fix, so I turned ON the screensaver.

wuid=133453

got another fatal windows error.


Result ID 151135
Name t307__CASP7_ABRELAX_SAVE_ALL_OUT_CONTACT_hom001__649_260_0
Workunit 133453
Created 5 Jun 2006 5:31:35 UTC
Sent 5 Jun 2006 7:04:09 UTC
Received 6 Jun 2006 20:16:34 UTC
Server state Over
Outcome Client error
Client state Computing
Exit status -1073741811 (0xffffffffc000000d)
Computer ID 2172
Report deadline 9 Jun 2006 7:04:09 UTC
CPU time 13127.953125
stderr out <core_client_version>5.4.9</core_client_version>
<message>
- exit code -1073741811 (0xc000000d)
</message>
<stderr_txt>
# random seed: 3033888
# cpu_run_time_pref: 14400

</stderr_txt>


Validate state Invalid
Claimed credit 51.7612201395287
Granted credit 0
application version 5.21
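
For anyone puzzled by the three different spellings of the exit status in that result, they are the same 32-bit value shown as signed decimal, as 64-bit sign-extended hex, and as 32-bit hex. A quick sketch of the conversion (function names are mine):

```python
def to_unsigned32(code):
    """Reinterpret a signed exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed exit code."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


exit_status = -1073741811  # as shown in the "Exit status" line

print(hex(to_unsigned32(exit_status)))            # 0xc000000d
print(to_signed32(0xFFFFFFFFC000000D))            # -1073741811
```

0xC000000D is the value Windows defines as STATUS_INVALID_PARAMETER, which may help when searching for similar reports.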





©2024 University of Washington
http://www.bakerlab.org