| Author | Message |
Ralph 5.66 fixed a problem where the graphics thread was crashing when sidechains were shown.
Ralph 5.67 fixes an issue in the output of symmetric proteins.
Thanks in advance for your posts! The posts for 5.65 helped a lot.
____________
My sidechains still fall off on 5.67, as they did with 5.65
Example screenshot
____________
My Mac G4/733 crashed after more than ten hours of crunching 1gidA_BOINC_MG_SASAPAIR_ALLRES_RNA_ABINITIO_SAVE_ALL_OUT_BARCODE_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_2068_172; last time I looked it was showing only about ten minutes to go but hadn’t decremented that time for quite a while. (The percent done was over 98% and continuing to increment.) Exit status 1 (0x1), with the all-too-familiar “SIGBUS: bus error” message in the output file. Once again, the crash occurred either while the display was blacked out (having displayed the screensaver for a minute) or when I interrupted it. BTW this system has always been set to work while in use and to keep apps in memory, so I don’t understand why starting and stopping the graphics should be a problem—if that’s indeed the case.
____________
Hi feet1st -- yeah, it's because Rosetta changes its fold while the graphics thread is finishing its drawing. We considered at one point freezing Rosetta until each graphics frame finishes, but were worried about the performance cost! So these large molecules may continue to get rendered in freaky ways!
My sidechains still fall off on 5.67, as they did with 5.65
Example screenshot
____________
Thanks for the post. I doubt that it is the graphics start and stop, but it might be. Please do post again if you find your Mac crashing when you play with graphics. Those are tough bugs to fix, because a lot of the graphics stuff is out of our direct control. The good news (well, maybe bad to start with) is that the BOINC infrastructure will be moving to a new way of doing graphics that is apparently more robust, I think by the end of the summer. So after we iron out the kinks, that might help the graphics-related errors...
Incidentally, those workunits do take a long time (we have implemented checkpointing so that work should be saved frequently in case of crashes), and Mac G4s are pretty slow for running Rosetta, unfortunately.
My Mac G4/733 crashed after more than ten hours of crunching 1gidA_BOINC_MG_SASAPAIR_ALLRES_RNA_ABINITIO_SAVE_ALL_OUT_BARCODE_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_2068_172; last time I looked it was showing only about ten minutes to go but hadn’t decremented that time for quite a while. (The percent done was over 98% and continuing to increment.) Exit status 1 (0x1), with the all-too-familiar “SIGBUS: bus error” message in the output file. Once again, the crash occurred either while the display was blacked out (having displayed the screensaver for a minute) or when I interrupted it. BTW this system has always been set to work while in use and to keep apps in memory, so I don’t understand why starting and stopping the graphics should be a problem—if that’s indeed the case.
____________
http://ralph.bakerlab.org/result.php?resultid=529432
____________
http://ralph.bakerlab.org/result.php?resultid=530490
errored at 0.71%
____________
http://ralph.bakerlab.org/result.php?resultid=530509
____________
Hi feet1st -- yeah, it's because Rosetta changes its fold while the graphics thread is finishing its drawing. We considered at one point freezing Rosetta until each graphics frame finishes, but were worried about the performance cost! So these large molecules may continue to get rendered in freaky ways!
...and so it's not JUST the large ones? But they take longer to render, and so I'm more likely to spot it there?
So, I am seeing the backbone of the next contortion, and the sidechains from the last? ...or perhaps vice versa.
I personally am a computer programmer. I understand the challenge and the performance concern, and personally feel that the graphic is just a nice thing for curious participants to keep us interested and involved. ...but I fear that many confuse the graphic with the science. They see a graphic that "doesn't work right", and they start to question the integrity of the science being done as well.
I already know the science is top-notch. But others do not take that for granted. So, I hope your mind will continue crunching on this issue until you can find a happy compromise that will yield proper graphics, as well as efficient crunching.
I take it that when you resolved the thread-safety issues with the graphics thread, you devised a means of sharing the same memory, which is more efficient than the double-buffer approach I had envisioned. But... have you tested at all how MUCH more efficient? It may be just a 1% cost to push stuff out to a new memory area for the graphics thread.
Would it be possible to have the graphics thread grab a semaphore once it has rendered a frame? Then, if that semaphore is in use, a new frame of data gets pushed out to the isolated "graphic-only" memory area, and the semaphore is freed.
I'm thinking that the graphics thread probably regulates itself as far as frames per second, etc. Such an approach would allow you to push bytes around only once per frame rendered. So you might be able to crunch 100 steps and have the overhead of only one frame of memory copy.
I note that my approach actually renders a "stale" frame, rather than the one actually in progress at this microsecond, because it pushes out the current model at the end of the rendering of the last frame, rather than just before rendering the next frame; but I don't think anyone would mind. The result would be similar to how the football game you see on your television has a satellite delay from the ACTUAL game being played 2,000 miles away.
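The hand-off proposed above might be sketched roughly as follows. This is only an illustration under stated assumptions: the model is a plain Python dict, and the class `FrameHandoff` and its method names are hypothetical, not Rosetta or BOINC code. The renderer raises a flag when it has finished a frame, and the compute loop copies the model into a graphics-only snapshot only when that flag is up.

```python
import copy
import threading


class FrameHandoff:
    """Hypothetical hand-off of model snapshots to a graphics thread."""

    def __init__(self, initial_model):
        self._lock = threading.Lock()
        # Plays the role of the "semaphore" in the post: set by the renderer
        # when it has finished a frame and is ready for fresh data.
        self._frame_requested = threading.Event()
        # Graphics-only copy; the compute thread never mutates it in place.
        self._snapshot = copy.deepcopy(initial_model)
        self.copies_made = 0

    def frame_done(self):
        """Called by the graphics thread after it finishes drawing a frame."""
        self._frame_requested.set()

    def publish_if_requested(self, model):
        """Called by the compute thread after each step; copies only on request."""
        if not self._frame_requested.is_set():
            return False  # cheap check, no copy made
        fresh = copy.deepcopy(model)
        with self._lock:
            self._snapshot = fresh
            self.copies_made += 1
        self._frame_requested.clear()
        return True

    def latest_snapshot(self):
        """Called by the graphics thread before drawing."""
        with self._lock:
            return self._snapshot


# Single-threaded walk-through so the behaviour is deterministic.
model = {"backbone": [0.0], "sidechains": [0.0]}
handoff = FrameHandoff(model)
handoff.frame_done()  # the renderer has finished one frame
for step in range(100):
    model["backbone"][0] = float(step)
    model["sidechains"][0] = float(step)
    handoff.publish_if_requested(model)
frame = handoff.latest_snapshot()
```

In this walk-through, 100 compute steps cost a single copy, and the snapshot holds backbone and sidechains from the same step. As the post says, the frame drawn is a slightly stale one.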
____________
This is on an Intel Mac (MacBook Pro). 5.68 seems to use far more memory than earlier versions. The VM size is 1.6 GB, and the working set is over 600 MB. This is having a big impact on the machine.
____________
A very weird one from my Mac G4/733: when I came in to work this morning I saw that a Ralph task appeared frozen at 13.586% done, although its status showed as Running. Opening the graphics window, I saw this:

Note that the window says it’s 54.35% complete, contradicting BOINC Manager (although the times agree exactly), and that it seems to have lost track of my account—and even its own version number: “rosetta@home v0”!
Looking in the Messages tab I found:
Thu May 24 17:37:41 2007|ralph@home|Starting 1eyvA_BOINC_NOFILTERS_ABRELAX_SAVE_ALL_OUT_NEWRELAXFLAGS-1eyvA-frags83__2069_6_0
Thu May 24 17:37:42 2007|ralph@home|Starting task 1eyvA_BOINC_NOFILTERS_ABRELAX_SAVE_ALL_OUT_NEWRELAXFLAGS-1eyvA-frags83__2069_6_0 using rosetta_beta version 567
Thu May 24 17:44:37 2007|ralph@home|Sending scheduler request: Requested by user
Thu May 24 17:44:37 2007|ralph@home|Reporting 1 tasks
Thu May 24 17:44:42 2007|ralph@home|Scheduler RPC succeeded [server version 509]
Thu May 24 17:44:42 2007|ralph@home|Deferring communication for 4 min 2 sec
Thu May 24 17:44:42 2007|ralph@home|Reason: requested by project
That last bit was from my Updating to get yesterday’s crash reported. Then there was nothing for the remaining fifteen hours or so, aside from a SETI@home download a few minutes after the above messages were logged, so apparently it had prevented BOINC from crunching all night. I suspended the task; other projects resumed OK. A little while later I tried resuming the task, and it still seemed stuck, so I quit and relaunched BOINC. The WU seemed to have disappeared without a trace: no log entries indicating an upload or a report. Just to top the strangeness off, the result doesn’t seem to be on the website; I can’t find it anywhere in my account, under that host or elsewhere.
____________
5.9 MB task downloads? THAT's new! People will want to be aware of that in the new release notes on Rosetta.
____________
3 failures
Work Unit http://ralph.bakerlab.org/result.php?resultid=529839 failed with Exit code 1, Error exit from: hbonds.cc line: 648
Work Unit http://ralph.bakerlab.org/result.php?resultid=531669 failed with Exit code 1, Error exit from: hbonds.cc line: 624
Work Unit http://ralph.bakerlab.org/result.php?resultid=531753 failed with Exit code 193, SIGSEGV, Segmentation Violation.
Hope this helps.
____________
This is on an Intel Mac (MacBook Pro). 5.68 seems to use far more memory than earlier versions. The VM size is 1.6 GB, and the working set is over 600 MB. This is having a big impact on the machine.
I have this running on XP and it takes at least 1.2 GB of VM.
Anders n
EDIT
It took 5 h 20 min to do 1 model on a P4 2.8 GHz.
____________
This WU uses A LOT of memory. My laptop has only 512 MB of RAM, so the WU uses above 95% of the pagefile (1.5 to 1.6 GB). Now, after running for 3 hours 27 minutes, the WU pauses and the status field in BOINC shows the message "Waiting for memory". BOINC then just switched to another WU.
What should I do?
____________
This WU uses A LOT of memory. My laptop has only 512 MB of RAM, so the WU uses above 95% of the pagefile (1.5 to 1.6 GB). Now, after running for 3 hours 27 minutes, the WU pauses and the status field in BOINC shows the message "Waiting for memory". BOINC then just switched to another WU.
What should I do?
Hi Bjarke
I'm not on the team, "just" a tester like you :)
Is there a chance for you to increase the VM?
Maybe it would get the WU kicking again.
Anders n
____________
3 failures
Work Unit http://ralph.bakerlab.org/result.php?resultid=529839 failed with Exit code 1, Error exit from: hbonds.cc line: 648
Work Unit http://ralph.bakerlab.org/result.php?resultid=531669 failed with Exit code 1, Error exit from: hbonds.cc line: 624
Work Unit http://ralph.bakerlab.org/result.php?resultid=531753 failed with Exit code 193, SIGSEGV, Segmentation Violation.
Hope this helps.
Another 2 failed after only 3 minutes
http://ralph.bakerlab.org/result.php?resultid=532538
http://ralph.bakerlab.org/result.php?resultid=532562
both failed with this error
process exited with code 1 (0x1)
trouble finding jump_templates_RNA_basepairs_v2.dat
ERROR:: Exit from: read_paths.cc line: 360
____________
This WU uses A LOT of memory. My laptop has only 512 MB of RAM, so the WU uses above 95% of the pagefile (1.5 to 1.6 GB). Now, after running for 3 hours 27 minutes, the WU pauses and the status field in BOINC shows the message "Waiting for memory". BOINC then just switched to another WU.
What should I do?
Hi Bjarke
I'm not on the team, "just" a tester like you :)
Is there a chance for you to increase the VM?
Maybe it would get the WU kicking again.
Anders n
Thanks for the tip, though I've already tried that without luck. Anyway, it seems that my computer resumed working on that WU after a while; unfortunately it came out with a failure.
The result, for anyone interested: 531974
____________
Got a SIGSEGV error on this WU.
____________
Error on workunit 460745 and 463219:
<core_client_version>5.8.15</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 14400
trouble finding jump_templates_RNA_basepairs_v2.dat
ERROR:: Exit from: read_paths.cc line: 360
</stderr_txt>
]]>
Same error in both cases. Both workunits failed for other users as well.
____________
3 failures
Work Unit http://ralph.bakerlab.org/result.php?resultid=529839 failed with Exit code 1, Error exit from: hbonds.cc line: 648
Work Unit http://ralph.bakerlab.org/result.php?resultid=531669 failed with Exit code 1, Error exit from: hbonds.cc line: 624
Work Unit http://ralph.bakerlab.org/result.php?resultid=531753 failed with Exit code 193, SIGSEGV, Segmentation Violation.
Hope this helps.
Another 2 failed after only 3 minutes
http://ralph.bakerlab.org/result.php?resultid=532538
http://ralph.bakerlab.org/result.php?resultid=532562
both failed with this error
process exited with code 1 (0x1)
trouble finding jump_templates_RNA_basepairs_v2.dat
ERROR:: Exit from: read_paths.cc line: 360
Another one with the same error as above
http://ralph.bakerlab.org/result.php?resultid=532003
____________
These two results crashed, after I upgraded BOINC from 5.8.16 to 5.9.12:
http://ralph.bakerlab.org/result.php?resultid=530801
http://ralph.bakerlab.org/result.php?resultid=530709
Not sure if they were actually running while upgrading, but the option "Leave in memory while suspended" is enabled...
Cheers, Shai
____________

Computation error: 533221.
____________
This WU did a crash and burn when the computer ran out of VM.
Anders n
____________
After a fair run of successes my G5 Mac got a computation error (no crash AFAICT) with exit code 1 (0x1), running v5.68 on gp04__BOINC_SYMM_FOLD_AND_DOCK_RELAX_SUBSYSTEM-gp04_-delC126__2078_10 after a little over six hours of crunching. The system had been running with the screensaver blacked out and the display sleeping for at least twelve hours. The output ends with ERROR:: Exit from: hbonds.cc line: 636
____________
Hi: Yeah, I wish I'd seen all these crashes before sending out the same job to Rosetta@home. The first jobs that came back seemed OK -- now I realize that it's because all the adversely affected computers were taking suuuper long and then crashed. We didn't expect those workunits to have such big memory footprints, so we'll have to spend a bit of time debugging.
After a fair run of successes my G5 Mac got a computation error (no crash AFAICT) with exit code 1 (0x1), running v5.68 on gp04__BOINC_SYMM_FOLD_AND_DOCK_RELAX_SUBSYSTEM-gp04_-delC126__2078_10 after a little over six hours of crunching. The system had been running with the screensaver blacked out and the display sleeping for at least twelve hours. The output ends with ERROR:: Exit from: hbonds.cc line: 636
____________
Work Unit http://ralph.bakerlab.org/result.php?resultid=534823 failed with Exit code 1, ERROR:: Exit from: barcode_classes.cc line: 576
Work Unit http://ralph.bakerlab.org/result.php?resultid=535524 failed with Exit code 1, ERROR:: Exit from: fragments.cc line: 691
____________
Have got a WU running at present
1acf__TREEJUMP_ABRELAX_TJTOP3_SAVE_ALL_OUT_BARCODE__2095_19_0
rosetta_beta version 568
Windows XP,
Processor: AuthenticAMD Unknown CPU Type [x86 Family 6 Model 8 Stepping 1] [fpu tsc sse 3dnow mmx]
Memory: 751.49 MB physical, 3.43 GB virtual
Same problem as mentioned above. Permanently 10 mins (exactly) to run. Currently 92% complete; progress still incrementing but 'to complete' static at 10 minutes.
BOINC running as a service so no graphics, but Windows had blanked screen as screensaver kicked in overnight.
Should I abort?
--
Rod Ellery
____________
Have got a WU running at present
1acf__TREEJUMP_ABRELAX_TJTOP3_SAVE_ALL_OUT_BARCODE__2095_19_0
rosetta_beta version 568
Windows XP,
Processor: AuthenticAMD Unknown CPU Type [x86 Family 6 Model 8 Stepping 1] [fpu tsc sse 3dnow mmx]
Memory: 751.49 MB physical, 3.43 GB virtual
Same problem as mentioned above. Permanently 10 mins (exactly) to run. Currently 92% complete; progress still incrementing but 'to complete' static at 10 minutes.
BOINC running as a service so no graphics, but Windows had blanked screen as screensaver kicked in overnight.
Should I abort?
--
Rod Ellery
How long has it been running, and what is your preferred run time?
____________
It was running at about 2 hrs with preferred at 1 hr. It seems to have completed sometime in the last 3/4 hr. Just checking result status.
Seems to have generated a successful result (Result 536713).
Rod
____________
Same problem as mentioned above. Permanently 10 mins (exactly) to run. Currently 92% complete; progress still incrementing but 'to complete' static at 10 minutes.
--
Rod Ellery
When a WU takes longer than your preferred run time, it looks like what you describe.
As long as the % complete keeps increasing, things should be OK :)
Anders n
PS
Some of the WUs take up to 4 h to make 1 model on my computers.
____________
Segmentation Violation on this WU
This WU
____________
I'm 8.5 hrs into this symm fold dock relax task and still have not completed the second model. Seems significantly higher than the 1 hr/model mentioned previously.
____________
Hi: Yeah, about half the workunits failed on all the platforms. I'm looking into this now...
I'm 8.5 hrs into this symm fold dock relax task and still have not completed the second model. Seems significantly higher than the 1 hr/model mentioned previously.
____________
A compute error: 547575.
____________
This WU failed. I had a Blue Screen of Death at this point. Cannot determine if the WU caused the computer to crash or the computer crashing caused the WU to fail.
____________
Hi: Yeah, about half the workunits failed on all the platforms. I'm looking into this now...
I'm 8.5 hrs into this symm fold dock relax task and still have not completed the second model. Seems significantly higher than the 1 hr/model mentioned previously.
Looks like mine completed normally, but took just shy of 20hrs to complete 4 models.
____________
Another compute error from my Mac G4/733: 2oo2A_BOINC_SYMM_FOLD_AND_DOCK_RELAX-2oo2A-dimer__2117_156, exit code 1 (0x1).
____________
This WU failed after 4 seconds and apparently tried to access the internet (according to ZoneAlarm).
____________
Yes, the failures do attempt to connect directly back to the project to report additional traces. And unfortunately with ZA, each version of Ralph AND Rosetta must be enabled. Please enable the current v5.68 by clicking the program control, then the add button, and selecting the .exe from your BOINC/projects folder.
On Windows, the default path would be:
/Program Files/BOINC/Ralph.bakerlab.org/rosetta_beta_5.68_windows_intelx86
____________
Yes, the failures do attempt to connect directly back to the project to report additional traces. And unfortunately with ZA, each version of Ralph AND Rosetta must be enabled. Please enable the current v5.68 by clicking the program control, then the add button, and selecting the .exe from your BOINC/projects folder.
On Windows, the default path would be:
/Program Files/BOINC/Ralph.bakerlab.org/rosetta_beta_5.68_windows_intelx86
ZoneAlarm used to have a 'changes frequently' option that you could give to a program so it was always allowed internet access. Don't know if that was just ZA Pro or not, though...
____________
ZoneAlarm used to have a 'changes frequently' option that you could give to a program so it was always allowed internet access. Don't know if that was just ZA Pro or not, though...
With each new application release, the Rosetta application's executable file name changes -- the version number is part of the filename. Thus, there's no way a firewall would be able to keep track of the changes automatically.
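To illustrate the point about the version number being part of the filename, here is a small sketch. It assumes the name layout (application, version, platform) that can be read off the one file name quoted in this thread; the helper names and the 5.67 file name are made up for illustration.

```python
import re

# Layout inferred from the single file name quoted in this thread
# (app name, version, platform); other releases may be named differently.
EXE_NAME = re.compile(
    r"^(?P<app>[a-z_]+?)_(?P<version>\d+\.\d+)_(?P<platform>\w+)$"
)


def split_exe_name(filename):
    """Return (app, version, platform), or None if the name does not fit."""
    match = EXE_NAME.match(filename)
    if match is None:
        return None
    return match.group("app"), match.group("version"), match.group("platform")


def is_new_release_of(old_name, new_name):
    """True when two names differ only in the version part."""
    old, new = split_exe_name(old_name), split_exe_name(new_name)
    if old is None or new is None:
        return False
    return old[0] == new[0] and old[2] == new[2] and old[1] != new[1]


current = "rosetta_beta_5.68_windows_intelx86"
earlier = "rosetta_beta_5.67_windows_intelx86"  # assumed name, for illustration
parts = split_exe_name(current)
changed = is_new_release_of(earlier, current)
```

A firewall rule keyed on the exact file name would treat `earlier` and `current` as two unrelated programs, which matches the behaviour described above.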
____________
Any idea why these 2 WUs were OK on my computers but not on the others?
http://ralph.bakerlab.org/workunit.php?wuid=484660
http://ralph.bakerlab.org/workunit.php?wuid=488628
Anders n
____________
hello,
2 WUs the same day with Outcome: "client error" :(
http://ralph.bakerlab.org/result.php?resultid=563532
http://ralph.bakerlab.org/result.php?resultid=563505
application version 5.68
BOINC manager : 5.8.15
____________
This one failed after restarting the computer.
http://ralph.bakerlab.org/result.php?resultid=562489
Anders n
____________
2 Compute error Work Units, completed but failed file transfer, lost 12 hours of processing and get no credit for it.
this WU completed 24 decoys
and this one completed 12 decoys
CPU time 21377.666099
stderr out
<core_client_version>5.8.16</core_client_version>
<
____________
Got 2 WUs; both got a compute error, one after almost an hour, the other after approx. 37 minutes...
http://ralph.bakerlab.org/result.php?resultid=564016
http://ralph.bakerlab.org/result.php?resultid=564014
____________

12 failures today - see below.

____________
Failed result 564114 from WU 499793 1IL4_BOINC_MFR_ABRELAX_2147_49 for unknown reason - all 3 machines exited with error -161.
Peter
____________
Result ID 564092
Name 1FAB_BOINC_MFR_ABRELAX_2143_14_1
Workunit 499439
Created 23 Jun 2007 17:50:22 UTC
Sent 23 Jun 2007 17:50:30 UTC
Received 23 Jun 2007 21:04:33 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 0 (0x0)
Computer ID 8764
Report deadline 27 Jun 2007 17:50:30 UTC
CPU time 6160.23699
stderr out
<core_client_version>5.9.11</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 7200
# random seed: 2605429
======================================================
DONE :: 1 starting structures 6160.24 cpu seconds
This process generated 3 decoys from 3 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1FAB_BOINC_MFR_ABRELAX_2143_14_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
Validate state Invalid
Claimed credit 20.2124723374978
Granted credit 0
application version 5.68
____________
Result ID 563167
Name 1HPR_BOINC_MFR_ABRELAX_2146_25_0
Workunit 499552
Created 23 Jun 2007 1:49:29 UTC
Sent 23 Jun 2007 2:02:31 UTC
Received 23 Jun 2007 8:58:28 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 0 (0x0)
Computer ID 7181
Report deadline 27 Jun 2007 2:02:31 UTC
CPU time 6826.246613
stderr out
<core_client_version>5.10.2</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 7200
# random seed: 2605268
======================================================
DONE :: 1 starting structures 6826.25 cpu seconds
This process generated 8 decoys from 8 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1HPR_BOINC_MFR_ABRELAX_2146_25_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
Validate state Invalid
Claimed credit 14.7995571907133
Granted credit 0
application version 5.68
____________
Result ID 563321
Name 1HPR_BOINC_MFR_ABRELAX_2146_39_0
Workunit 499692
Created 23 Jun 2007 2:29:49 UTC
Sent 23 Jun 2007 2:38:59 UTC
Received 23 Jun 2007 8:58:33 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 0 (0x0)
Computer ID 8763
Report deadline 27 Jun 2007 2:38:59 UTC
CPU time 7121.640625
stderr out
<core_client_version>5.10.2</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 7200
# random seed: 2605254
======================================================
DONE :: 1 starting structures 7120.73 cpu seconds
This process generated 8 decoys from 8 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1HPR_BOINC_MFR_ABRELAX_2146_39_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
Validate state Invalid
Claimed credit 29.4299883127254
Granted credit 0
application version 5.68
____________
Result ID 563617
Name 1FAB_BOINC_MFR_ABRELAX_2144_24_1
Workunit 499540
Created 23 Jun 2007 6:10:18 UTC
Sent 23 Jun 2007 6:10:25 UTC
Received 23 Jun 2007 17:42:55 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 0 (0x0)
Computer ID 8763
Report deadline 27 Jun 2007 6:10:25 UTC
CPU time 6887.484375
stderr out
<core_client_version>5.10.2</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 7200
# random seed: 2605369
======================================================
DONE :: 1 starting structures 6886.77 cpu seconds
This process generated 4 decoys from 4 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1FAB_BOINC_MFR_ABRELAX_2144_24_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
Validate state Invalid
Claimed credit 28.4623439083363
Granted credit 0
application version 5.68
____________
Hi:
We're looking at these now...
Result ID 563617
Name 1FAB_BOINC_MFR_ABRELAX_2144_24_1
Workunit 499540
Created 23 Jun 2007 6:10:18 UTC
Sent 23 Jun 2007 6:10:25 UTC
Received 23 Jun 2007 17:42:55 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 0 (0x0)
Computer ID 8763
Report deadline 27 Jun 2007 6:10:25 UTC
CPU time 6887.484375
stderr out
<core_client_version>5.10.2</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 7200
# random seed: 2605369
======================================================
DONE :: 1 starting structures 6886.77 cpu seconds
This process generated 4 decoys from 4 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1FAB_BOINC_MFR_ABRELAX_2144_24_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
Validate state Invalid
Claimed credit 28.4623439083363
Granted credit 0
application version 5.68
____________
Hi:
We're looking at these now...
I got some of those too. Here is what it looks like on the BOINC console:
2007-06-23 17:05:39 [ralph@home] Computation for task 1FAB_BOINC_MFR_ABRELAX_2144_38_1 finished
2007-06-23 17:05:39 [ralph@home] Output file 1FAB_BOINC_MFR_ABRELAX_2144_38_1_0 for task 1FAB_BOINC_MFR_ABRELAX_2144_38_1 absent
2007-06-23 17:05:39 [rosetta@home] Resuming task BENCH_051207_ABRELAX_SAVE_ALL_OUT_-1a19A-_BARCODE_R55_filters_1804_1116_0 using rosetta version 568
2007-06-23 17:05:40 [ralph@home] Deferring communication for 2 hr 27 min 5 sec
2007-06-23 17:05:40 [ralph@home] Reason: Unrecoverable error for result 1FAB_BOINC_MFR_ABRELAX_2144_38_1 (<file_xfer_error>
<file_name>1FAB_BOINC_MFR_ABRELAX_2144_38_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
)
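For anyone collecting these reports, the `<file_xfer_error>` fragments can be pulled out of pasted console text with a short script. This is a minimal sketch that assumes the fragment always has the shape shown in this thread (a `file_name` followed by an `error_code`); the function name `find_xfer_errors` is made up for illustration.

```python
import re

# Matches the fragment exactly as it appears in the posts in this thread.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)


def find_xfer_errors(text):
    """Return a list of (file_name, error_code) pairs found in the text."""
    return [
        (m.group("name"), int(m.group("code")))
        for m in XFER_ERROR.finditer(text)
    ]


# Console excerpt copied from the post above.
sample = """2007-06-23 17:05:40 [ralph@home] Reason: Unrecoverable error for result 1FAB_BOINC_MFR_ABRELAX_2144_38_1 (<file_xfer_error>
<file_name>1FAB_BOINC_MFR_ABRELAX_2144_38_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
)"""

errors = find_xfer_errors(sample)
```

Running the same function over several pasted logs and counting the codes would show quickly whether every failure in a batch is the same -161 case.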
____________
2 Compute error Work Units, completed but failed file transfer, lost 12 hours of processing and get no credit for it.
this WU completed 24 decoys
and this one completed 12 decoys
CPU time 21377.666099
stderr out
<core_client_version>5.8.16</core_client_version>
<
____________
Reporting 2 WUs with the following error:
<core_client_version>5.10.7</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 3600
# random seed: 2605148
======================================================
DONE :: 1 starting structures 3151.71 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1IL4_BOINC_MFR_ABRELAX_2148_45_2_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
Result ID - 564740, 563618
WorkUnit ID - 499754,499086
____________
Result ID 564743
Name 1HPR_BOINC_MFR_ABRELAX_2146_19_2
Workunit 499492
Created 24 Jun 2007 7:43:55 UTC
Sent 24 Jun 2007 7:43:59 UTC
Received 24 Jun 2007 16:00:28 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 0 (0x0)
Computer ID 8761
Report deadline 28 Jun 2007 7:43:59 UTC
CPU time 6552.40625
stderr out
<core_client_version>5.10.8</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 7200
# random seed: 2605274
======================================================
DONE :: 1 starting structures 6552.2 cpu seconds
This process generated 7 decoys from 7 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1HPR_BOINC_MFR_ABRELAX_2146_19_2_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
Validate state Invalid
Claimed credit 26.7637185373818
Granted credit 0
application version 5.68
____________
Result ID 564148
Name SSH1_BOINC_MFR_ABRELAX_2149_50_1
Workunit 499805
Created 23 Jun 2007 20:32:06 UTC
Sent 23 Jun 2007 20:32:12 UTC
Received 23 Jun 2007 22:43:35 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 0 (0x0)
Computer ID 7181
Report deadline 27 Jun 2007 20:32:12 UTC
CPU time 6779.167671
stderr out
<core_client_version>5.10.2</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 7200
# random seed: 2605093
======================================================
DONE :: 1 starting structures 6778.17 cpu seconds
This process generated 6 decoys from 6 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>SSH1_BOINC_MFR_ABRELAX_2149_50_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
Validate state Invalid
Claimed credit 14.6974882890008
Granted credit 0
application version 5.68
____________
Result ID 564147
Name 1IL4_BOINC_MFR_ABRELAX_2147_24_2
Workunit 499543
Created 23 Jun 2007 20:32:06 UTC
Sent 23 Jun 2007 20:32:12 UTC
Received 24 Jun 2007 5:58:34 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 0 (0x0)
Computer ID 7181
Report deadline 27 Jun 2007 20:32:12 UTC
CPU time 6906.935656
stderr out
<core_client_version>5.10.2</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 7200
# random seed: 2605219
======================================================
DONE :: 1 starting structures 6905.52 cpu seconds
This process generated 3 decoys from 3 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1IL4_BOINC_MFR_ABRELAX_2147_24_2_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
Validate state Invalid
Claimed credit 14.9744940446306
Granted credit 0
application version 5.68
____________
My G5 Mac got an error with “exit status 1 (0x1)” on CNTRL_01RELAXNATIVE_SAVE_ALL_OUT_-1mkyA-_2137_5, as soon as it started:
ERROR:: Unable to determine sequence length from starting structure coordinate file
ERROR:: Exit from: input_pdb.cc line: 2967
After a few successful runs, 1tol_BOINC_MFR_ABRELAX_2151_40 failed with “exit status 0 (0x0)”; the output file says 13/13 decoys were generated (in 3.7 CPU-hours), but then:
<file_xfer_error>
<file_name>1tol_BOINC_MFR_ABRELAX_2151_40_2_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
____________
Confirming message 3222, the "absent output file" error is among us once more.
Result.
Rosetta Beta 5.68
Boinc 5.10.7 (which like 5.10.4 has performed without faults with Ralph and Rosetta except for this task)
CPU limit preference: 4 h.; used CPU time: 4:08:58, 2 models
MacOS 10.3.9
From local message file:
2007-06-23 18:47:07 [ralph@home] Resuming task SSH1_BOINC_MFR_ABRELAX_2150_29_1 using rosetta_beta version 568
2007-06-23 19:17:02 [ralph@home] Computation for task SSH1_BOINC_MFR_ABRELAX_2150_29_1 finished
2007-06-23 19:17:02 [ralph@home] Output file SSH1_BOINC_MFR_ABRELAX_2150_29_1_0 for task SSH1_BOINC_MFR_ABRELAX_2150_29_1 absent
2007-06-23 19:17:03 [ralph@home] Deferring communication for 1 min 0 sec
2007-06-23 19:17:03 [ralph@home] Reason: Unrecoverable error for result SSH1_BOINC_MFR_ABRELAX_2150_29_1 (<file_xfer_error>
<file_name>SSH1_BOINC_MFR_ABRELAX_2150_29_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
R. A. Mostol