64bit app just added.

Message boards : RALPH@home bug list : 64bit app just added.

FluffyChicken
Joined: 17 Feb 06
Posts: 54
Credit: 710
RAC: 0
Message 3040 - Posted: 2 May 2007, 19:42:07 UTC

I don't see a post about this, but how are you implementing it?

I know it's a copy of the 32-bit app, but what platform identifier does it use? The official (or soon-to-be official) one is x86_64. I know some projects do not use that and will need to change with the newer server code.
ID: 3040
FluffyChicken
Joined: 17 Feb 06
Posts: 54
Credit: 710
RAC: 0
Message 3041 - Posted: 2 May 2007, 19:44:31 UTC

OK, I just checked the download page and it's x86_64 :)

Both Linux and Windows, I see.
ID: 3041
PieBandit
Joined: 3 May 07
Posts: 9
Credit: 7,592
RAC: 0
Message 3042 - Posted: 3 May 2007, 1:40:16 UTC

Has anyone with a 64-bit Linux machine gotten any work units? I just get messages saying 'No Work From Project'.

ID: 3042
JKeck {pirate}
Joined: 16 Feb 06
Posts: 14
Credit: 153,095
RAC: 0
Message 3043 - Posted: 3 May 2007, 10:43:00 UTC

I had several RPCs that gave no message and no work. The most recent RPC gave the message "no work from project" and has a 2-hour deferral. This is on WinXP 64.
BOINC WIKI

BOINCing since 2002/12/8
ID: 3043
Trog Dog
Joined: 8 Aug 06
Posts: 38
Credit: 41,996
RAC: 0
Message 3044 - Posted: 3 May 2007, 13:46:19 UTC - in response to Message 3042.  

Has anyone with a 64-bit Linux machine gotten any work units? I just get messages saying 'No Work From Project'.


Yep, each of my 64-bit boxes grabbed a WU. Remember, RALPH doesn't always have work.
ID: 3044
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 3046 - Posted: 3 May 2007, 20:39:04 UTC

So are there any issues with the 64bit apps or shall I go ahead and release them on Rosetta@home?
ID: 3046
PieBandit
Joined: 3 May 07
Posts: 9
Credit: 7,592
RAC: 0
Message 3047 - Posted: 3 May 2007, 21:57:54 UTC
Last modified: 3 May 2007, 22:02:07 UTC

I was able to download the app, but never got work units. I had wanted to test because right now I am attempting to run Rosetta with an anonymous client (the 32-bit app) and it's failing mid work unit. Time spent working on the unit does not increase and % complete does not increase, but the state goes from waiting to running to waiting as other jobs take their share of my processor. Would it be possible for you to release more work units for the 64-bit app, if you suspect this is an issue that would affect the 64-bit clients?

You can look at my results here if they might provide some insight: https://boinc.bakerlab.org/rosetta/results.php?userid=165650

My setup: AMD64 X2, Kubuntu 7.04, 2 GB RAM
ID: 3047
JKeck {pirate}
Joined: 16 Feb 06
Posts: 14
Credit: 153,095
RAC: 0
Message 3049 - Posted: 4 May 2007, 11:38:27 UTC - in response to Message 3046.  

So are there any issues with the 64bit apps or shall I go ahead and release them on Rosetta@home?

I have a few on win x86_64 that returned successfully and were granted credit. So it is probably OK.

One point though: if the app is simply the 32-bit app renamed, then it may be a better solution to update the server instead of trying to build and release a separate app. Server versions later than 4/30/7 should automatically send the 32-bit app to official 64-bit clients. This does not help the *nix boxes for now, though; there is not yet an official 64-bit test app for *nix. I would expect there to be one by the time it is actually released as 5.10.x.
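
To illustrate the fallback idea described above, here is a minimal sketch in Python. It is not BOINC server code: the function name, the `FALLBACKS` table, and the way available builds are passed in are all assumptions made for illustration. The idea is only that a 64-bit host is offered a native build when one exists and the 32-bit build otherwise.

```python
# Hypothetical sketch of platform fallback; not taken from the BOINC scheduler.
# Maps a 64-bit platform name to the 32-bit platforms it can also run.
FALLBACKS = {
    "windows_x86_64": ["windows_intelx86"],
    "x86_64-pc-linux-gnu": ["i686-pc-linux-gnu"],
}


def pick_app_platform(host_platform, available):
    """Return the platform whose app build should be sent, or None.

    host_platform: platform string reported by the client.
    available: set of platforms the project has an app build for.
    """
    # Prefer a native build, then try each compatible fallback in order.
    for candidate in [host_platform] + FALLBACKS.get(host_platform, []):
        if candidate in available:
            return candidate
    return None
```

With a table like this, a project that only ships a 32-bit build would not need a renamed copy registered under the 64-bit platform name.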
BOINC WIKI

BOINCing since 2002/12/8
ID: 3049
Stefan Ledwina
Joined: 16 Feb 06
Posts: 2
Credit: 155,221
RAC: 0
Message 3050 - Posted: 4 May 2007, 11:56:02 UTC
Last modified: 4 May 2007, 11:56:32 UTC

Got 4 WUs today and all are running fine. One of them is already crunched and is valid...

This is my 64bit Linux Host:
https://ralph.bakerlab.org/show_host_detail.php?hostid=7960

ID: 3050
PieBandit
Joined: 3 May 07
Posts: 9
Credit: 7,592
RAC: 0
Message 3051 - Posted: 4 May 2007, 12:25:41 UTC

I got 3, and one of them has apparently completed successfully <https://ralph.bakerlab.org/result.php?resultid=502025>. So, looking forward to 64-bit Rosetta!
ID: 3051
dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 3058 - Posted: 4 May 2007, 21:42:50 UTC

Great, sounds like we can release them on R@h. We do realize the next BOINC client update will be smart enough to get the 32-bit app if a 64-bit one doesn't exist, but what we are currently doing is simple enough. We will update our server soon, though, for other scheduling benefits.
ID: 3058
JKeck {pirate}
Joined: 16 Feb 06
Posts: 14
Credit: 153,095
RAC: 0
Message 3059 - Posted: 4 May 2007, 22:26:15 UTC

I may have spoken too soon. The host has gotten compute errors on about half its results. One of them can be ignored since it occurred during a core client upgrade; however, they all show the same error messages.
BOINC WIKI

BOINCing since 2002/12/8
ID: 3059
PieBandit
Joined: 3 May 07
Posts: 9
Credit: 7,592
RAC: 0
Message 3061 - Posted: 5 May 2007, 1:37:57 UTC - in response to Message 3059.  

I may have spoken too soon. The host has gotten compute errors on about half its results. One of them can be ignored since it occurred during a core client upgrade; however, they all show the same error messages.


If you look at them, it happened to all computers on those work units. Some were not on 64-bit.

My other two work units are still running; no info yet.
ID: 3061
Trog Dog
Joined: 8 Aug 06
Posts: 38
Credit: 41,996
RAC: 0
Message 3064 - Posted: 5 May 2007, 3:02:05 UTC

No probs here with the 64bit app - haven't checked graphics though.
ID: 3064
JKeck {pirate}
Joined: 16 Feb 06
Posts: 14
Credit: 153,095
RAC: 0
Message 3069 - Posted: 5 May 2007, 9:15:42 UTC

@PieBandit
Thanks for noticing that. I did not look at what the other hosts did.

@Trog Dog
I have run the graphics, no related problems observed.
BOINC WIKI

BOINCing since 2002/12/8
ID: 3069
PieBandit
Joined: 3 May 07
Posts: 9
Credit: 7,592
RAC: 0
Message 3071 - Posted: 5 May 2007, 17:30:21 UTC
Last modified: 5 May 2007, 17:31:27 UTC

My initial 3 work units completed, though they have this in the result: https://ralph.bakerlab.org/result.php?resultid=502647
<core_client_version>5.4.11</core_client_version>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 3600
# random seed: 2680278
SIGSEGV: segmentation violation
Graphics are disabled due to configuration...
# cpu_run_time_pref: 3600
SIGABRT: abort called
Stack trace (14 frames):
[0x8cbe15f]
[0x8cb8fac]
[0xffffe500]
[0x8d29014]
[0x8d3dece]
[0x8d430f1]
[0x8d432c7]
[0x8d134f5]
[0x83d66c0]
[0x8d2951f]
[0x8cbac23]
[0x8cbc1f3]
[0x8cb522d]
[0x8d5567a]

Exiting...
Graphics are disabled due to configuration...
# cpu_run_time_pref: 3600
======================================================
DONE :: 1 starting structures 2883.67 cpu seconds
This process generated 4 decoys from 4 attempts
======================================================


BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...

</stderr_txt>

Should that really be valid?

I also received this error on a new WU:

https://ralph.bakerlab.org/result.php?resultid=506041
ID: 3071
Thomas Leibold
Joined: 25 Feb 07
Posts: 27
Credit: 77,464
RAC: 0
Message 3076 - Posted: 6 May 2007, 1:01:56 UTC - in response to Message 3071.  


======================================================
DONE :: 1 starting structures 2883.67 cpu seconds
This process generated 4 decoys from 4 attempts
======================================================

should that really be valid?


It looks like your client produced 4 models (possibly before the error?) and therefore should get credit for those.
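
As a rough illustration of that reasoning, the sketch below pulls the signal names and the decoy count out of a stderr dump like the one quoted above. It is only an assumption about how one might inspect such output by hand; the function name and the returned dictionary are hypothetical, and this is not how the project's validator decides validity.

```python
import re

# Matches the summary line Rosetta prints at the end of a run, as seen in the
# stderr output quoted in this thread.
DECOY_RE = re.compile(r"This process generated (\d+) decoys from (\d+) attempts")


def summarize_stderr(stderr_txt):
    """Collect signal names and the decoy/attempt counts from stderr text."""
    # Lines such as "SIGSEGV: segmentation violation" start with the signal name.
    signals = re.findall(r"^(SIG[A-Z]+):", stderr_txt, flags=re.MULTILINE)
    match = DECOY_RE.search(stderr_txt)
    decoys = int(match.group(1)) if match else 0
    attempts = int(match.group(2)) if match else 0
    return {"signals": signals, "decoys": decoys, "attempts": attempts}
```

A result that reports signals but also a non-zero decoy count is the case discussed here: the task crashed and restarted, yet still finished with usable models.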

The issue you mention in your previous post ('running' state, but CPU time not increasing) sounds like something I see when the preference "leave application in memory" is set to "no" and BOINC switches tasks between different projects or suspends a task because of other activity on the system.
ID: 3076
PieBandit
Joined: 3 May 07
Posts: 9
Credit: 7,592
RAC: 0
Message 3084 - Posted: 7 May 2007, 14:18:49 UTC

I've set the property to yes and will see if that fixes it. I've found a temporary workaround of manually suspending the task for several hours. It usually picks up again after I resume it.
ID: 3084
PieBandit
Joined: 3 May 07
Posts: 9
Credit: 7,592
RAC: 0
Message 3085 - Posted: 8 May 2007, 4:27:12 UTC

This doesn't seem to be the case. The process isn't even running: when Rosetta claims to be running, top doesn't show it, though it shows all other tasks. I'm very confused now.
ID: 3085
Thomas Leibold
Joined: 25 Feb 07
Posts: 27
Credit: 77,464
RAC: 0
Message 3118 - Posted: 20 May 2007, 19:51:10 UTC - in response to Message 3085.  

This doesn't seem to be the case. The process isn't even running: when Rosetta claims to be running, top doesn't show it, though it shows all other tasks. I'm very confused now.


With the 'keep application in memory' option on, this is rare but still seems to happen occasionally. I recently had one workunit in that state with the 32-bit Linux client. What appears to be happening is that during an abnormal shutdown, the communication between the processes for the task gets into a state where they wait for each other indefinitely. (You may have seen that each Rosetta task uses 4 processes, of which one does all the computation and therefore consumes most of the CPU time.) Suspending and resuming the task in that situation does not help, except that BOINC will run another task while the stuck one is suspended. The Rosetta watchdog timer doesn't help either, since it is one of the processes involved in the communications deadlock. BOINC itself is unaware of the problem since the processes still exist. The fact that they don't actually consume CPU cycles isn't something the BOINC client monitors, nor could it do that without introducing dependencies on the various project clients.

Stopping and restarting the entire boinc client with all project tasks will restart such a workunit at the last checkpoint before it encountered the problem. If the problem is repeatable the workunit will be processed until it once again reaches the trouble spot and will again stop consuming cpu time. In such a case the only remedy is to abort the workunit.
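
For anyone who wants to check for this condition by hand on Linux, here is a minimal sketch that samples a process's accumulated CPU time from `/proc/<pid>/stat` twice and reports whether it advanced. The function names and the 60-second interval are assumptions chosen for illustration; nothing like this is built into BOINC or Rosetta as far as this thread indicates. Note that a task BOINC has legitimately suspended will also show no CPU growth, so this only means something for a task reported as running.

```python
import time


def cpu_ticks_from_stat(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split only after the last ')'.
    rest = stat_line[stat_line.rfind(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def read_cpu_ticks(pid):
    """Read the current CPU tick count for a process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return cpu_ticks_from_stat(f.read())


def is_stalled(pid, interval=60.0):
    """Return True if the process used no CPU time during the interval."""
    before = read_cpu_ticks(pid)
    time.sleep(interval)
    return read_cpu_ticks(pid) == before
```

Calling `is_stalled(pid)` for each of the task's processes while BOINC lists it as running would show whether all of them are idle, which matches the deadlock described above.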
ID: 3118


©2024 University of Washington
http://www.bakerlab.org