Rosetta mini beta and/or android 3.61-3.83

Message boards : RALPH@home bug list : Rosetta mini beta and/or android 3.61-3.83

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 910
Credit: 1,892,541
RAC: 294
Message 6013 - Posted: 20 Jan 2016, 11:34:50 UTC - in response to Message 6010.  

"Rosetta Mini for Android is not available for your type of computer"

??


Edit: SOLVED. I restarted the BOINC client and received new work.


Re-edit: NOT solved.
The client continues to request Rosetta Mini for Android on my Windows 10 machine with an AMD FX.
ID: 6013

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6014 - Posted: 20 Jan 2016, 18:16:50 UTC - in response to Message 6013.  

That is odd. I also updated the Android application, but the BOINC client should take care of app management for the different platforms.
ID: 6014

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 910
Credit: 1,892,541
RAC: 294
Message 6015 - Posted: 20 Jan 2016, 19:06:28 UTC - in response to Message 6014.  

Same problem on my Intel notebook...
ID: 6015

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 910
Credit: 1,892,541
RAC: 294
Message 6016 - Posted: 20 Jan 2016, 19:07:53 UTC

Some errors:
3716117

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Out Of Memory (C++ Exception) (0xe06d7363) at address 0x73162ED2

Engaging BOINC Windows Runtime Debugger...

ID: 6016

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6017 - Posted: 20 Jan 2016, 19:54:24 UTC - in response to Message 6016.  

This error is due to the large memory requirement of the test job. I resubmitted the job with a memory bound which hopefully should avoid such errors.
ID: 6017

Etienne Guyot
Joined: 28 Apr 06
Posts: 3
Credit: 64,332
RAC: 0
Message 6018 - Posted: 21 Jan 2016, 1:45:42 UTC - in response to Message 6006.  

Sorry, I was too busy to reply, and my system was down today...

I restarted it now and checked again: no more problems!

The only difference was a reboot after the usual Windows 7 64-bit KB patches were applied...

When I got the error, I used Process Explorer (PE) and found one thread in ntdll.dll calling an RtlSomething function that never returned (sorry, I didn't note the full name at the time). It was the only thread started or remaining after a kill attempt (the main app was killed; only the graphics part stayed in memory).

So I will keep checking regularly whether that behavior happens again and report it if it does...

FYI, Graphic driver: Intel HD Graphics 4600, version 10.18.14.4264

Process Explorer when working as expected:


ID: 6018

BlisteringSheep
Joined: 3 Nov 15
Posts: 4
Credit: 2,231,667
RAC: 8
Message 6019 - Posted: 21 Jan 2016, 15:27:25 UTC

Two Android phones:

Newer unit is a Motorola Droid RAZR MAXX XT912 with Android 4.1.2, host 36214
It seems to be working mostly okay with 3.71 tasks, but 9 of 34 tasks report validation errors where the task report shows a file_xfer_error, e.g., task 3718881:
upload failure: <file_xfer_error>
<file_name>simple_cycpep_predict_example_20280_97_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

Older unit is a Motorola Droid X2 with Android 2.3.5, host 36252
All tasks error out with a signal 11, e.g., task 3718646.

Both devices are using the older BOINC version 7.0.36 from NativeBOINC.

boinccmd --host ed-droidmaxx --get_host_info
timezone: -18000
domain name: droidmaxx
IP addr: 68e4:ffbe::d93d:540
#CPUS: 2
CPU vendor: ARM
CPU model: ARMv7 Processor rev 3 (v7l) @1200MHz
CPU FP OPS: 944491591.072010
CPU int OPS: 2013178288.104389
CPU mem BW: 1000000000.000000
OS name: Android
OS version: 3.0.8-gbacb1cf
mem size: 881475584.000000
cache size: -1.000000
swap size: 0.000000
disk size: 3228372992.000000
disk free: 1205657600.000000

boinccmd --host ed-droidx2 --get_host_info
timezone: -18000
domain name: ed-droidx2
IP addr: 804:0:8822:87be:400:0:4:0
#CPUS: 2
CPU vendor: ARM
CPU model: ARMv7 Processor rev 0 (v7l) @1000MHz
CPU FP OPS: 719713184.931507
CPU int OPS: 1417059093.091529
CPU mem BW: 1000000000.000000
OS name: Android
OS version: 2.6.32.9-00008-gc406305
mem size: 423706624.000000
cache size: -1.000000
swap size: 0.000000
disk size: 2113744896.000000
disk free: 1497886720.000000
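
The two boinccmd dumps above are plain "key: value" lines, so they can be compared across hosts with a few lines of Python. This is only a minimal sketch, assuming the output format shown in this post; parse_host_info is a hypothetical helper name, not part of BOINC.

```python
def parse_host_info(text):
    """Parse 'key: value' lines (as printed by boinccmd --get_host_info) into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so IPv6 addresses stay intact in the value.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that carry no key/value pair
        info[key.strip()] = value.strip()
    return info


# Excerpt copied from the Droid X2 dump in this post.
SAMPLE = """\
timezone: -18000
domain name: ed-droidx2
IP addr: 804:0:8822:87be:400:0:4:0
#CPUS: 2
OS name: Android
OS version: 2.6.32.9-00008-gc406305
mem size: 423706624.000000
disk free: 1497886720.000000
"""

host = parse_host_info(SAMPLE)
mem_mib = float(host["mem size"]) / (1024 * 1024)
print(f"{host['domain name']}: {mem_mib:.0f} MiB RAM, {host['#CPUS']} CPUs")
```

All values stay strings until converted, which keeps the parser tolerant of fields it does not know about.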
ID: 6019

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6020 - Posted: 21 Jan 2016, 18:48:08 UTC

The Android app requires Android version 4.1+. I'm not sure what is going on with your Motorola. It seems to run OK. Are there any network issues?
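
The 4.1+ requirement can be checked mechanically before attaching a device. A minimal sketch, assuming the Android release is available as a dotted string; the function names are hypothetical and the device list is taken from the report above.

```python
MIN_ANDROID = (4, 1)  # from the post above: the app requires Android 4.1+


def parse_version(version):
    """Turn a dotted version such as '4.1.2' into a tuple of ints: (4, 1, 2)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(version, minimum=MIN_ANDROID):
    # Tuples compare element by element, so (4, 1, 2) >= (4, 1) is True
    # and (2, 3, 5) >= (4, 1) is False.
    return parse_version(version) >= minimum


for device, version in [("Droid RAZR MAXX XT912", "4.1.2"), ("Droid X2", "2.3.5")]:
    status = "supported" if meets_minimum(version) else "too old"
    print(f"{device} (Android {version}): {status}")
```

Note that the "OS version" field in the boinccmd output earlier in the thread looks like a kernel release (3.0.8-gbacb1cf on the Android 4.1.2 phone), so it probably cannot be fed to a check like this directly.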
ID: 6020

BlisteringSheep
Joined: 3 Nov 15
Posts: 4
Credit: 2,231,667
RAC: 8
Message 6021 - Posted: 21 Jan 2016, 22:14:54 UTC - in response to Message 6020.  

I will remove it from the older device then (I missed the 4.1+ version requirement).

For the XT912, I also don't understand the error. I've never had any network issues (I monitor it with a remote boinc-gui, and also ping it every 5 seconds). The device also has plenty of free storage.
ID: 6021

Dr Who Fan
Joined: 2 Sep 06
Posts: 76
Credit: 107,857
RAC: 0
Message 6022 - Posted: 27 Jan 2016, 17:10:22 UTC

Ver 3.71 FAILED for both of us who ran this task:

Task ID 3722270

Name gaurav_design13_SAVE_ALL_OUT_20284_27_1
======================================================
DONE ::     2 starting structures  28399.3 cpu seconds
This process generated     56 decoys from      56 attempts
======================================================
BOINC :: WS_max 2.1257e+008

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>gaurav_design13_SAVE_ALL_OUT_20284_27_1_0</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
Validate state: Invalid
Claimed credit: 93.8624553758775
Granted credit: 0
Application version: 3.71
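
For anyone collecting these upload failures across many task reports, the file_xfer_error fragment can be pulled out of the message text with the standard library. This is a sketch, assuming the fragment looks like the ones pasted in this thread (the error code appears both as "-161" and as "-161 (not found)"); parse_upload_failures is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET


def parse_upload_failures(message):
    """Return a list of (file_name, error_code, note) for each <file_xfer_error> block."""
    failures = []
    blocks = re.findall(r"<file_xfer_error>.*?</file_xfer_error>", message, re.DOTALL)
    for block in blocks:
        node = ET.fromstring(block)
        file_name = node.findtext("file_name", default="").strip()
        raw_code = node.findtext("error_code", default="").strip()
        # The code may be bare ("-161") or annotated ("-161 (not found)").
        match = re.match(r"(-?\d+)\s*(?:\((.*)\))?", raw_code)
        code = int(match.group(1)) if match else None
        note = match.group(2) if match and match.group(2) else ""
        failures.append((file_name, code, note))
    return failures


# Fragment copied from the task report in this post.
MESSAGE = """\
upload failure: <file_xfer_error>
  <file_name>gaurav_design13_SAVE_ALL_OUT_20284_27_1_0</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
"""

for file_name, code, note in parse_upload_failures(MESSAGE):
    print(f"{file_name}: error {code} {note}".rstrip())
```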


ID: 6022

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6023 - Posted: 28 Jan 2016, 19:17:26 UTC

Thanks, there is an error in the job submission and it will get fixed. I alerted the researcher who submitted these tests about the issue and the fix.
ID: 6023

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 910
Credit: 1,892,541
RAC: 294
Message 6033 - Posted: 4 Feb 2016, 14:14:34 UTC - in response to Message 5986.  

"The graphics application was also updated to include new colors and a light source for spacefill rendering used for the new cyclic peptide modeling protocol. Spacefill rendering is only used as default for this protocol since the additional graphics load is minimal due to the small size of the proteins to be modeled."


Is this the new graph??
Screen

ID: 6033

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 910
Credit: 1,892,541
RAC: 294
Message 6034 - Posted: 4 Feb 2016, 17:43:57 UTC - in response to Message 6033.  


And the error for this WU:

<message>
upload failure: <file_xfer_error>
<file_name>gaurav_rsmn_0161_65_daa2_SAVE_ALL_OUT_20295_39_1_0</file_name>
<error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
ID: 6034

Trotador
Joined: 7 May 10
Posts: 33
Credit: 14,751,677
RAC: 0
Message 6035 - Posted: 4 Feb 2016, 19:40:30 UTC

The current Ralph WUs use huge amounts of RAM; I've seen up to 4 GB per unit. Is this on purpose? Is it a new kind of simulation?

Thanks for the info.

ID: 6035

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6036 - Posted: 4 Feb 2016, 20:12:24 UTC - in response to Message 6033.  
Last modified: 4 Feb 2016, 20:13:01 UTC

"Is this the new graph??"

Yes.
ID: 6036

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6037 - Posted: 4 Feb 2016, 20:15:57 UTC - in response to Message 6035.  

Yes, I'm running a test of a new type of job that runs small perturbations of the protein backbone and then does a round of design. The design protocol can use a lot of memory. I realize that this will be problematic and will see if we can distribute these jobs to high memory machines. We may just not be able to run these on R@h.
ID: 6037

Trotador
Joined: 7 May 10
Posts: 33
Credit: 14,751,677
RAC: 0
Message 6038 - Posted: 4 Feb 2016, 21:00:04 UTC - in response to Message 6037.  

R@H means Rosetta, doesn't it? It would be good for the research if you could find a way to distribute these units. The only effective way I can think of is limiting the number of units downloaded, either by the user (as in the CEP project at WCG) or by the project. Distributing them only to hosts with a lot of memory might not be enough if the hosts also have a lot of available threads (like mine).

A good thing I'm seeing with these units is that BOINC/Ralph seems to take into account the amount of available system memory and limits the memory used, even limiting the number of units in execution to fewer than the number of available threads, so the system does not stall and hang as used to happen in these cases. Is that correct?
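
The behavior described here (fewer running units than threads when memory is tight) amounts to taking the smaller of two limits. The following is only an illustration of that arithmetic, not the actual BOINC client logic; the function name and the example host are hypothetical.

```python
def max_concurrent_tasks(threads, available_mem_mb, mem_per_task_mb):
    """Number of tasks that fit at once: capped by threads and by available memory."""
    if mem_per_task_mb <= 0:
        raise ValueError("mem_per_task_mb must be positive")
    fit_in_memory = int(available_mem_mb // mem_per_task_mb)
    return min(threads, fit_in_memory)


# Hypothetical host: 16 threads, 32 GB usable RAM, tasks peaking near 4 GB each.
print(max_concurrent_tasks(threads=16, available_mem_mb=32768, mem_per_task_mb=4096))
```

With these numbers memory is the tighter limit, so only half of the threads would be busy with such units.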

ID: 6038

dekim
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 20 Jan 06
Posts: 250
Credit: 543,579
RAC: 0
Message 6039 - Posted: 5 Feb 2016, 0:59:20 UTC - in response to Message 6038.  

I'm not sure. There haven't been that many reports of stalling/hanging etc...

I think I was able to debug our scheduler so that it should not send out these high memory jobs to low memory clients. Hopefully it can now be a matter of setting the limits.

ID: 6039

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 910
Credit: 1,892,541
RAC: 294
Message 6040 - Posted: 5 Feb 2016, 6:40:43 UTC - in response to Message 6039.  

The message is clear:
05/02/2016 07:38:34 | ralph@home | Rosetta Mini needs 7629.39 MB RAM but only 2302.58 MB is available for use.
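
The two figures in that notice can be extracted and compared programmatically when scanning a client log. A minimal sketch, assuming the wording of the single log line quoted above (other BOINC versions may phrase it differently); parse_memory_notice is a hypothetical helper name.

```python
import re

# Pattern built from the one notice quoted above; treat the wording as an assumption.
NOTICE = re.compile(
    r"needs (?P<needed>\d+(?:\.\d+)?) MB RAM "
    r"but only (?P<available>\d+(?:\.\d+)?) MB is available"
)


def parse_memory_notice(line):
    """Return (needed_mb, available_mb) from the notice, or None if it does not match."""
    match = NOTICE.search(line)
    if match is None:
        return None
    return float(match.group("needed")), float(match.group("available"))


LOG_LINE = (
    "05/02/2016 07:38:34 | ralph@home | Rosetta Mini needs 7629.39 MB RAM "
    "but only 2302.58 MB is available for use."
)

needed_mb, available_mb = parse_memory_notice(LOG_LINE)
needed_bytes = needed_mb * 1024 * 1024  # assumes "MB" here means 2**20 bytes
print(f"shortfall: {needed_mb - available_mb:.2f} MB")
print(f"requirement in bytes: {needed_bytes:.3e}")
```

Under that assumption, 7629.39 MB works out to roughly 8e9 bytes, which suggests (but does not prove) that the job was submitted with a memory bound of about 8 GB.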
ID: 6040

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 910
Credit: 1,892,541
RAC: 294
Message 6041 - Posted: 5 Feb 2016, 8:57:46 UTC

Obviously all errors:

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Out Of Memory (C++ Exception) (0xe06d7363) at address 0x73682ED2
ID: 6041



©2024 University of Washington
http://www.bakerlab.org