Message boards : RALPH@home bug list : Rosetta 4.12+
Author | Message |
---|---|
yoerik Joined: 28 Mar 20 Posts: 9 Credit: 2,536 RAC: 0 |
"I hope they will test version 4.15 widely before releasing it to production on Rosetta@home." Posts from the admin give me hope - but they'll need more volunteers here in order to ensure that, from what I understand. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"Posts from the admin give me hope - but they'll need more volunteers here in order to ensure that, from what I understand." It's not a problem. If you release work, the volunteers will arrive. |
robertmiles Joined: 13 Jan 09 Posts: 103 Credit: 331,865 RAC: 0 |
"I hope they will test version 4.15 widely before releasing it to production on Rosetta@home." That's what Ralph@home is for - testing new versions before they are released on Rosetta@home. |
yoerik Joined: 28 Mar 20 Posts: 9 Credit: 2,536 RAC: 0 |
"Posts from the admin give me hope - but they'll need more volunteers here in order to ensure that, from what I understand." From the admin's post earlier: "4.12 was tested on Ralph but not thoroughly enough. We wanted to get it out anyway so that we can start working on the scaffolds. Time is important. We've been trying our best to get this next app version pushed out. But want it thoroughly tested now since we are still able to get important COVID-19 work done on R@h with 4.12." I'm inferring that they wanted to test it further, but time constraints forced them to release it to the public build sooner: they didn't have enough volunteers here to test it thoroughly without delaying their research. They have time now, so there's no urgent rush at the moment. It implies that they do need to get 4.15 out in order to do the next stage of work, but that 4.12+ on the public release can do important work for now. It's all inferred, but given that there are only 269 active users and 502 active hosts here, I sincerely doubt they have enough volunteers. |
nastasache Joined: 6 Apr 20 Posts: 2 Credit: 2,754 RAC: 0 |
Maybe they are not promoting the test stage enough. I heard about Ralph almost by accident. |
Tom Rinehart Joined: 31 Mar 20 Posts: 4 Credit: 0 RAC: 0 |
"I went ahead and posted the OSX update on R@h. We plan to update the rest of the platforms in the next day or so."
On Rosetta@home, the Mac 4.15 app is working well, although I have had 3 tasks end in a computation error at the end of processing. I've had trouble with the Rosetta Mini 3.78 app: its tasks all fail immediately, like those of the Rosetta 4.12 Mac app. It is giving errors like:
<core_client_version>7.14.4</core_client_version>
<![CDATA[
<message> process exited with code 255 (0xff, -1)</message>
<stderr_txt>
[2020- 4- 7 19:41:47:] :: BOINC:: Initializing ... ok.
[2020- 4- 7 19:41:47:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
command: minirosetta_3.78_x86_64-apple-darwin -abinitio::fastrelax 1 -ex2aro 1 -frag3 00001.200.3mers -in:file:native 00001.pdb -corrections::beta_nov16 -silent_gz 1 -frag9 00001.200.9mers -out:file:silent default.out -ex1 1 -abinitio::rsd_wt_loop 0.5 -relax::default_repeats 15 -abinitio::use_filters false -abinitio::increase_cycles 10 -abinitio::rsd_wt_helix 0.5 -abinitio::rg_reweight 0.5 -in:file:boinc_wu_zip CF_monomer_03_data.zip -out:file:silent default.out -silent_gz -mute all -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 2413101
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
ERROR: Option matching -corrections:beta_nov16 not found in command line top-level context</stderr_txt>
]]>
The errors I get on the Mac 4.15 app are mostly like this one:
<core_client_version>7.14.4</core_client_version>
<![CDATA[
<stderr_txt>
command: rosetta_4.15_x86_64-apple-darwin -run:protocol jd2_scripting -parser:protocol predictor_v11_boinc--fuse--il1r_design_boinc_v1.xml @flags_il1r2 -in:file:silent 8er4nd4m_Mini_Protein_binds_IL1R_COVID-19_design5.silent -in:file:silent_struct_type binary -silent_gz -mute all -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip 8er4nd4m_Mini_Protein_binds_IL1R_COVID-19_design5.zip @8er4nd4m_Mini_Protein_binds_IL1R_COVID-19_design5.flags -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 3296593
Starting watchdog...
Watchdog active.
======================================================
DONE :: 283 starting structures 29066.9 cpu seconds
This process generated 283 decoys from 283 attempts
======================================================
BOINC :: WS_max 8.82074e+08
BOINC :: Watchdog shutting down...
06:57:38 (55517): called boinc_finish(0)
</stderr_txt>
<message> finish file present too long</message>
]]>
It looks like I also got a few of these on the Mac 4.09 app. |
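When comparing many results like the two above, the end-of-run summary can be pulled out of the stderr text mechanically. The following is a minimal Python sketch, assuming the stderr text is available as a string. The function name `summarize_stderr` and the dictionary keys are made up for illustration, and the patterns are based only on the two summary lines quoted in this post, so other app versions may print something different.

```python
import re

# Patterns modelled on the summary lines quoted in the post above.
DONE_RE = re.compile(r"DONE ::\s+(\d+) starting structures\s+([\d.]+) cpu seconds")
DECOYS_RE = re.compile(r"This process generated\s+(\d+) decoys from\s+(\d+) attempts")


def summarize_stderr(stderr_text):
    """Return run statistics as a dict, or None if no summary block is found."""
    done = DONE_RE.search(stderr_text)
    decoys = DECOYS_RE.search(stderr_text)
    if not (done and decoys):
        return None  # e.g. the task failed before producing any structure
    return {
        "starting_structures": int(done.group(1)),
        "cpu_seconds": float(done.group(2)),
        "decoys": int(decoys.group(1)),
        "attempts": int(decoys.group(2)),
    }


# Text copied from the Mac 4.15 log above.
mac_415 = (
    "DONE :: 283 starting structures 29066.9 cpu seconds\n"
    "This process generated 283 decoys from 283 attempts\n"
)
summary = summarize_stderr(mac_415)
```

A log that stops at an option error, like the Rosetta Mini 3.78 one, has no summary block, so the function returns `None` for it.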
Plomos Joined: 8 Jul 12 Posts: 4 Credit: 226 RAC: 0 |
So I had the same error again on two more units that I pulled from the server only a few hours ago:
<core_client_version>7.16.1</core_client_version>
<![CDATA[
<stderr_txt>
command: ../../projects/ralph.bakerlab.org/rosetta_4.15_i686-pc-linux-gnu -run:protocol jd2_scripting -parser:protocol predictor_v11_boinc--fuse--covid_spike_design_boinc_v1.xml @flags_Junior_HalfRoid_vs_COVID-19_test1 -in:file:silent 6np3ll6z_Junior_HalfRoid_vs_COVID-19_test1.silent -in:file:silent_struct_type binary -silent_gz -mute all -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip 6np3ll6z_Junior_HalfRoid_vs_COVID-19_test1.zip @6np3ll6z_Junior_HalfRoid_vs_COVID-19_test1.flags -nstruct 10000 -cpu_run_time 3600 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 3983671
Starting watchdog...
Watchdog active.
Starting watchdog...
Watchdog active.
Starting watchdog...
Watchdog active.
BOINC:: CPU time: 18299.9s, 14400s + 3600s
[2020- 4- 8 0:53:20:] :: BOINC WARNING! cannot get file size for default.out.gz: could not open file.
Output exists: default.out.gz Size: -1
InternalDecoyCount: 0 (GZ)
----- 0 -----
Stream information inconsistent.
Writing W_0000001
======================================================
DONE :: 1 starting structures 18299.9 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
00:53:20 (8724): called boinc_finish(0)
</stderr_txt>
]]>
It seems that this happens both here at Ralph and at main Rosetta when the system sends me 32-bit tasks instead of 64-bit ones. On Rosetta the 64-bit tasks run as they should, but the 32-bit 4.12 tasks, as well as the 32-bit 4.15 tasks here, do not run right and produce only one decoy after hours of work. Hopefully this can be fixed. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"I'm inferring that they wanted to test it further, but time constraints forced them to release it to the public build sooner: they didn't have enough volunteers here to test it thoroughly without delaying their research." I know, I've read the admin's post. I also know that version 4.15 contains not only bug fixes but also some new science ("some new code related to COVID-19 interface design that we would like to push out to R@h soon."). So it is important to test it. "It's all inferred, but given that there are only 269 active users and 502 active hosts here, I sincerely doubt they have enough volunteers." After months and months of no work and no news, the volunteers left (look at the registration dates on the first page of top users: a lot of them are new users, because the old users got tired of waiting). But if you give work and news, people will arrive (see, for example, the forum and the WUs of Rosetta). Support for Raspberry Pi would also give more platforms to test. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"Maybe they are not promoting the test stage enough." For sure! The link to this beta project on the Rosetta@home home page is very recent. |
Mad_Max Joined: 15 Nov 12 Posts: 15 Credit: 404,700 RAC: 0 |
"I aborted some of the older test batches. I'm not sure why your client is getting confused and running the wrong app. It should be running the 64bit version on your 64bit computer." I was writing about getting the 64-bit "wrapper" app on 32-bit machines, including old ones running WinXP. Of course all such WUs fail, because Win32 systems cannot execute any 64-bit apps; they produce the error "Application is not a valid Win32 app" right at the start. I don't currently have any problems on 64-bit Windows systems. The latest problem was download failures for small files, but it looks resolved now, as I haven't seen such errors for about a week. If older systems are no longer supported by the project, you should adjust the server scheduler accordingly, so that it does not send tasks to such machines and instead responds with an error or warning. Sending work to a host doomed to a 100% error rate wastes internet bandwidth and adds server load. |
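The scheduler behaviour asked for here amounts to a platform filter. The sketch below illustrates that idea in Python and is not BOINC's actual scheduler code: the dictionaries, `eligible_versions`, `schedule`, and the refusal message are all hypothetical. Only the platform strings `windows_intelx86` and `windows_x86_64` follow BOINC's platform naming.

```python
def eligible_versions(host_platforms, app_versions):
    """Keep only the app versions built for a platform the host reports."""
    supported = set(host_platforms)
    return [v for v in app_versions if v["platform"] in supported]


def schedule(host_platforms, app_versions):
    """Return (version, None) if work can be sent, else (None, warning text)."""
    usable = eligible_versions(host_platforms, app_versions)
    if not usable:
        # Refuse up front instead of sending a task that is certain to fail.
        return None, "No application version available for your platform"
    return usable[0], None


# Hypothetical catalogue in which only a 64-bit Windows build exists.
versions = [{"app": "rosetta", "version": "4.15", "platform": "windows_x86_64"}]

host_32bit = ["windows_intelx86"]                    # e.g. an old WinXP machine
host_64bit = ["windows_x86_64", "windows_intelx86"]  # 64-bit hosts run both

refused = schedule(host_32bit, versions)
accepted = schedule(host_64bit, versions)
```

With this catalogue the 32-bit host gets a warning and no download, while the 64-bit host is offered the 4.15 build.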
xii5ku Joined: 8 Apr 20 Posts: 2 Credit: 23,307 RAC: 0 |
Linux i686 application version problem in v4.12 + v4.15 (100% reproducible on my Linux EM64T hosts; not reproducible with the Linux x86-64 application version). On April 7 at Rosetta@home, I reported that all "Rosetta v4.12 i686-pc-linux-gnu" tasks got stuck at 1 decoy and finished after target CPU time + 4 h watchdog overtime, whereas all "Rosetta v4.12 x86_64-pc-linux-gnu" tasks ran normally on the same hosts. (Rosetta forum thread "Rosetta v4.12 i686-pc-linux-gnu": fixed 20 h CPU time, fixed 20 credits.) Last night 4 of the same computers received a bunch of tasks from Ralph. I had the default target CPU time configured at Ralph, which is 1 hour. I have 257 valid results, of 257 tasks received:
So there is slight progress from v4.12 to v4.15 on my hosts, but not a breakthrough yet. |
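The symptom described in this post (a task that produces a single decoy and only ends once the target CPU time plus the 4 h watchdog overtime has passed) can be written as a simple check. This is a sketch under assumptions: `looks_stuck` is a made-up helper, and the rule "at most one decoy and CPU time at or beyond target plus grace" is one reading of the reports in this thread, not a project-defined criterion.

```python
WATCHDOG_GRACE_SECONDS = 4 * 3600  # the 4 h watchdog overtime mentioned above


def looks_stuck(cpu_seconds, decoys, target_seconds, grace=WATCHDOG_GRACE_SECONDS):
    """True if a task made at most one decoy and ran until the watchdog limit."""
    return decoys <= 1 and cpu_seconds >= target_seconds + grace


# Figures taken from logs posted earlier in this thread.
i686_task = looks_stuck(cpu_seconds=18299.9, decoys=1, target_seconds=3600)
mac_task = looks_stuck(cpu_seconds=29066.9, decoys=283, target_seconds=28800)
```

The i686 task (18299.9 s against a limit of 3600 s + 14400 s) is flagged, while the Mac task with 283 decoys is not.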
robertmiles Joined: 13 Jan 09 Posts: 103 Credit: 331,865 RAC: 0 |
I just increased the resource share for Rosetta@home on my computer. Too soon to see the results from that yet. My Ralph account shows no 4.15 tasks yet. Can you tell which, if any, of these possible causes explains this? 1. They aren't testing 4.15 for Windows yet. 2. The tasks they show don't include any from the last few days, probably because the list was read from a rather obsolete copy of their database. 3. Their list of tasks for a user shows them only for a day or so. For one-decoy tasks, note that the first decoy is usually only for testing how well your computer runs the software. That means that its output is seldom useful for any other purpose, and it might not even be sent back. I looked at TN-Grid. They are currently not accepting new users. They are thinking of starting some COVID-19 work, which would probably start a flood of new users if they don't keep limiting them. On another subject, can't the wrapper for 32-bit tasks be recompiled or rewritten so that it runs in 32 bits, at least under 32-bit operating systems? Or maybe a script that tries the 64-bit wrapper first, and if that fails quickly with certain errors, tries the 32-bit wrapper instead? Does this need extra testing to handle a 32-bit version of BOINC running under a 64-bit operating system? |
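The "try the 64-bit wrapper first, then fall back" idea could look roughly like the Python sketch below. It is only an illustration: `run_with_fallback` is a hypothetical helper, it covers only the case where the operating system cannot start a candidate at all (Python reports that as an `OSError`), and it does not try to classify tasks that start and then fail quickly. The demo substitutes a nonexistent path for the unusable wrapper and the Python interpreter for the working one.

```python
import subprocess
import sys


def run_with_fallback(candidates):
    """Run the first command the OS is able to start and return its result."""
    last_error = None
    for cmd in candidates:
        try:
            return subprocess.run(cmd, capture_output=True, text=True)
        except OSError as exc:  # missing file or unsupported executable format
            last_error = exc
    raise RuntimeError(f"no candidate could be started: {last_error}")


# Stand-ins: the first path does not exist, the second is a working program.
demo = run_with_fallback([
    ["/nonexistent/wrapper_64"],
    [sys.executable, "-c", "print('32-bit wrapper ran')"],
])
```

Whether such a script would fit into how BOINC launches applications is a separate question that this sketch does not answer.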
xii5ku Joined: 8 Apr 20 Posts: 2 Credit: 23,307 RAC: 0 |
@robertmiles, I can't respond to your Ralph@home/Rosetta@home related points, because I am new to Ralph and lack the insight. But a quick response to this unrelated item: robertmiles wrote: This is not correct. New users can join at any time. They only need to create the account via the web site and enter the invitation code from the main page. AFAIK this is a measure to reduce spam, not to hinder new contributors from joining. That said, it is true that their work generator always had and still has a limited pace. But my experience during the last few days was that my hosts remained saturated. robertmiles wrote: "They are thinking of starting some COVID-19 work, which would probably start a flood of new users if they don't keep limiting them." They already started such work. They just don't communicate this widely to BOINC contributors because of the limited pace of the work generator. /end-offtopic |
robertmiles Joined: 13 Jan 09 Posts: 103 Credit: 331,865 RAC: 0 |
@robertmiles, I can't respond to your Ralph@home/ Rosetta@home related points, because I am new to Ralph and lack the insight. But a quick response to this unrelated item: [snip] I think I was able to create an account. I'll finally try to add the project in a few hours, after I upgrade BOINC to 7.16.5. Thank you. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
160 valid, only 3 errors <message> |
Rainer Baumeister Joined: 7 Apr 20 Posts: 2 Credit: 437,267 RAC: 14 |
Hello, sorry, my English is very poor. With v4.15, a Ryzen 3700X (default) with 32 GB RAM gives 30 tasks OK and 2 errors. A Ryzen 1700 (default) with 32 GB causes massive problems: 4 OK, 66 errors! Why? Both computers run VERY reliably in all other projects. But for Rosetta I have to use Win10. :-( Under Mint, the regular Rosetta produces errors anyway. https://ralph.bakerlab.org/show_user.php?userid=58871 Greetings, Rainer. Translated with www.DeepL.com/Translator (free version) |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
"160 valid, only 3 errors" Another 9 with this error (after a few seconds). |
robertmiles Joined: 13 Jan 09 Posts: 103 Credit: 331,865 RAC: 0 |
[snip] robertmiles wrote: This is not correct. New users can join at any time. They only need to create the account via the web site and enter the invitation code from the main page. AFAIK this is a measure to reduce spam, not to hinder new contributors from joining. That said, it is true that their work generator always had and still has a limited pace. But my experience during the last few days was that my hosts remained saturated. [snip] I created the account, and have started running tasks. They have finished creating all of the workunits for their planned COVID-19 work, and expect to have the rest of them downloaded soon. |
[VENETO] boboviz Joined: 9 Apr 08 Posts: 910 Credit: 1,892,541 RAC: 294 |
Even with 4.17, after a few seconds, I get the same errors as with 4.15 (only two WUs, however): <message> |
Ivaylo Bonev Joined: 30 Mar 20 Posts: 3 Credit: 3,702 RAC: 0 |
Same on 4.18: https://ralph.bakerlab.org/result.php?resultid=5034587 <message> upload failure: <file_xfer_error> <file_name>Mini_Protein_binds_IL6R_COVID-19_test3_SAVE_ALL_OUT_IGNORE_THE_REST_0cj9pv7f_32_749_0_r1202734223_0</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> </message> |
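To tally upload failures like this one across many results, the `<message>` block can be parsed as XML. The sketch below is a minimal example using Python's standard `xml.etree.ElementTree`. The helper name `parse_upload_failures` is hypothetical, and it assumes the block is well-formed XML, as the one quoted above is.

```python
import xml.etree.ElementTree as ET


def parse_upload_failures(message_xml):
    """Return one dict per <file_xfer_error> element found in the message."""
    root = ET.fromstring(message_xml)
    failures = []
    for err in root.iter("file_xfer_error"):
        failures.append({
            "file_name": err.findtext("file_name", default="").strip(),
            "error_code": err.findtext("error_code", default="").strip(),
        })
    return failures


# The message quoted in the post above.
sample = (
    "<message> upload failure: <file_xfer_error> "
    "<file_name>Mini_Protein_binds_IL6R_COVID-19_test3_SAVE_ALL_OUT_IGNORE_THE_REST_"
    "0cj9pv7f_32_749_0_r1202734223_0</file_name> "
    "<error_code>-240 (stat() failed)</error_code> "
    "</file_xfer_error> </message>"
)
failures = parse_upload_failures(sample)
```

Each entry keeps the error code as the raw string from the log, here `-240 (stat() failed)`.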
©2024 University of Washington
http://www.bakerlab.org