Message boards : RALPH@home bug list : Rosetta 4.12+
Author | Message |
---|---|
Mad_Max Send message Joined: 15 Nov 12 Posts: 15 Credit: 404,700 RAC: 0 |
I aborted some of the older test batches. I'm not sure why your client is getting confused and running the wrong app. It should be running the 64bit version on your 64bit computer. I was writing about getting the 64-bit "wrapper" app on 32-bit machines, including old ones running WinXP. Of course all such WUs fail, as Win32 systems cannot execute any 64-bit apps; they produce the error "Application is not a valid Win32 app" right at the start. I don't have any problems on 64-bit Windows systems currently. The latest problem was download failures of small files, but that looks resolved now, as I haven't seen such errors for about a week. If older systems are no longer supported by the project, you should adjust the server scheduler accordingly, so that it does not send tasks to such machines and responds with an error/warning instead of sending work to hosts doomed to a 100% error rate, which wastes internet bandwidth and adds server load. |
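A minimal sketch of the kind of server-side check suggested above, assuming the scheduler knows which platforms a host reports. This is not BOINC scheduler code: the function `pick_app_version`, the `APP_VERSIONS` table, and the file names are hypothetical. Only the platform strings follow BOINC's usual platform naming.

```python
from typing import List, Optional, Tuple

# Hypothetical catalogue of available app versions, keyed by platform name.
# The file names are invented for illustration.
APP_VERSIONS = {
    "windows_x86_64": "rosetta_wrapper_win64.exe",
    "x86_64-pc-linux-gnu": "rosetta_linux64",
    "i686-pc-linux-gnu": "rosetta_linux32",
}


def pick_app_version(host_platforms: List[str]) -> Tuple[Optional[str], str]:
    """Return (app file, message) for the first host platform with an app version.

    If no platform matches, return (None, warning) so that no work is sent.
    """
    for platform in host_platforms:
        app = APP_VERSIONS.get(platform)
        if app is not None:
            return app, "sending {} for {}".format(app, platform)
    return None, "no app version available for: " + ", ".join(host_platforms)


# A 32-bit Windows host reports only windows_intelx86, so it gets a warning
# instead of a 64-bit task it cannot run.
app, message = pick_app_version(["windows_intelx86"])
print(app, "|", message)
```

With a check like this, a host that cannot run any available version receives a message rather than a task that is certain to fail.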
xii5ku Send message Joined: 8 Apr 20 Posts: 2 Credit: 23,307 RAC: 0 |
Linux i686 application version problem in v4.12 + v4.15 (100% reproducible on my Linux EMT64 hosts; not reproducible with the Linux x86-64 application version). On April 7 at Rosetta@home, I reported that all "Rosetta v4.12 i686-pc-linux-gnu" tasks got stuck at 1 decoy and finished after target CPU time + 4 h watchdog overtime, whereas all "Rosetta v4.12 x86_64-pc-linux-gnu" tasks ran normally on the same hosts. (Rosetta forum thread: "Rosetta v4.12 i686-pc-linux-gnu" : fixed 20 h CPU time, fixed 20 credits.) Last night I received a bunch of tasks from Ralph on 4 of the same set of computers. I had the default target CPU time configured at Ralph, which is 1 hour. I have 257 valid results out of 257 tasks received:
So there is slight progress from v4.12 to v4.15 on my hosts, but not a breakthrough yet. |
robertmiles Send message Joined: 13 Jan 09 Posts: 103 Credit: 331,865 RAC: 0 |
I just increased the resource share for Rosetta@Home on my computer. Too soon to see the results from that yet. My Ralph account shows no 4.15 tasks yet. Can you tell which, if any, of these possible causes explains this? 1. They aren't testing 4.15 for Windows yet. 2. The tasks they show don't include any from the last few days, probably because the list was read from a rather obsolete copy of their database. 3. Their list of tasks for a user shows them only for a day or so. For one-decoy tasks, note that the first decoy is usually only for testing how well your computer runs the software. That means that its output is seldom useful for any other purpose, and it might not even be sent back. I looked at TN-Grid. They are currently not accepting new users. They are thinking of starting some COVID-19 work, which would probably start a flood of new users if they don't keep limiting them. On another subject, can't the wrapper for 32-bit tasks be recompiled or rewritten so that it runs in 32 bits, at least under 32-bit operating systems? Or maybe a script that tries the 64-bit wrapper first, and if that fails quickly with certain errors, tries the 32-bit wrapper instead? Does this need extra testing to handle a 32-bit version of BOINC running under a 64-bit operating system? |
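The "try the 64-bit wrapper first, then the 32-bit one" idea above could look roughly like the sketch below. It assumes that a wrapper the OS cannot execute fails at launch with an `OSError`, and it falls back only in that case, not when a wrapper starts and later exits with an error. The names `run_with_fallback`, `default_runner`, and the wrapper file names in the comment are hypothetical, not part of BOINC or Rosetta.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple


def default_runner(cmd: Sequence[str]) -> int:
    """Start the command and return its exit code.

    subprocess raises OSError if the OS refuses to start the binary at all.
    """
    return subprocess.call(list(cmd))


def run_with_fallback(
    candidates: List[List[str]],
    runner: Callable[[Sequence[str]], int] = default_runner,
) -> Tuple[List[str], int]:
    """Try each candidate command in order; return (command used, exit code).

    Only a launch failure (OSError) triggers the next candidate. A wrapper
    that starts and then exits non-zero is reported as-is.
    """
    for cmd in candidates:
        try:
            return cmd, runner(cmd)
        except OSError:
            continue  # this build could not be started here; try the next one
    raise RuntimeError("no wrapper could be started")


# Hypothetical usage: prefer the 64-bit build, fall back to the 32-bit one.
# used, code = run_with_fallback([["wrapper_64.exe"], ["wrapper_32.exe"]])
```

Passing the runner in as a parameter keeps the fallback logic testable without real wrapper binaries. Whether a 32-bit BOINC client on a 64-bit OS needs different handling remains an open question here.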
xii5ku Send message Joined: 8 Apr 20 Posts: 2 Credit: 23,307 RAC: 0 |
@robertmiles, I can't respond to your Ralph@home/ Rosetta@home related points, because I am new to Ralph and lack the insight. But a quick response to this unrelated item: robertmiles wrote: This is not correct. New users can join any time. They only need to create the account via the web site and enter the invitation code from the main page. AFAIK this is a measure to reduce spam, not to hinder new contributors from joining. That said, it is true that their work generator always had and still has a limited pace. But my experience during the last few days was that my hosts remained saturated. robertmiles wrote: They are thinking of starting some COVID-19 work, which would probably start a flood of new users if they don't keep limiting them. They already started such work. They just don't communicate this widely to BOINC contributors because of the limited pace of the work generator. /end-offtopic |
robertmiles Send message Joined: 13 Jan 09 Posts: 103 Credit: 331,865 RAC: 0 |
@robertmiles, I can't respond to your Ralph@home/ Rosetta@home related points, because I am new to Ralph and lack the insight. But a quick response to this unrelated item: [snip] I think I was able to create an account. I'll finally try to add the project in a few hours, after I upgrade BOINC to 7.16.5. Thank you. |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 913 Credit: 1,892,541 RAC: 294 |
160 valid, only 3 errors <message> |
Rainer Baumeister Send message Joined: 7 Apr 20 Posts: 2 Credit: 437,267 RAC: 14 |
Hello, sorry, my English is very poor. v4.15: I use a Ryzen 3700X (default) with 32GB RAM: 30 tasks OK, 2 errors. A Ryzen 1700 (default) with 32GB causes massive problems: 4 OK, 66 errors! Why? Both computers run VERY reliably in all other projects. But with Rosetta I have to use Win10. :-( With Mint the normal Rosetta produces errors anyway. https://ralph.bakerlab.org/show_user.php?userid=58871 Greetings, Rainer Translated with www.DeepL.com/Translator (free version) |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 913 Credit: 1,892,541 RAC: 294 |
160 valid, only 3 errors Again 9 with this error (after a few seconds). |
robertmiles Send message Joined: 13 Jan 09 Posts: 103 Credit: 331,865 RAC: 0 |
[snip] robertmiles wrote: This is not correct. New users can join any time. They only need to create the account via the web site and enter the invitation code from the main page. AFAIK this is a measure to reduce spam, not to hinder new contributors from joining. That said, it is true that their work generator always had and still has a limited pace. But my experience during the last few days was that my hosts remained saturated. [snip] I created the account, and have started running tasks. They have finished creating all of the workunits for their planned COVID-19 work, and expect to have the rest of them downloaded soon. |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 913 Credit: 1,892,541 RAC: 294 |
Even with 4.17, after a few seconds, I have these errors, like 4.15 (only two WUs, however): <message> |
Ivaylo Bonev Send message Joined: 30 Mar 20 Posts: 3 Credit: 3,702 RAC: 0 |
Same on 4.18: https://ralph.bakerlab.org/result.php?resultid=5034587 <message> upload failure: <file_xfer_error> <file_name>Mini_Protein_binds_IL6R_COVID-19_test3_SAVE_ALL_OUT_IGNORE_THE_REST_0cj9pv7f_32_749_0_r1202734223_0</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> </message> |
WezH Send message Joined: 24 Apr 20 Posts: 6 Credit: 181,771 RAC: 0 |
Even with 4.17, after a few seconds, I have these errors, like 4.15 (only two WUs, however): Same here, 9 errors from 184 tasks. |
Trotador Send message Joined: 7 May 10 Posts: 33 Credit: 14,751,677 RAC: 0 |
4.20 still failing this way https://ralph.bakerlab.org/workunit.php?wuid=4518120 <core_client_version>7.14.2</core_client_version> <![CDATA[ <stderr_txt> command: ../../projects/ralph.bakerlab.org/rosetta_4.20_x86_64-pc-linux-gnu -run:protocol jd2_scripting -parser:protocol predictor_v11_boinc--fuse--il1r_design_boinc_v1_mod.xml @flags_il6r2 -in:file:silent Mini_Protein_binds_IL6R_COVID-19_test3_SAVE_ALL_OUT_IGNORE_THE_REST_0cj9pv7f.silent -in:file:silent_struct_type binary -silent_gz -mute all -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip Mini_Protein_binds_IL6R_COVID-19_test3_SAVE_ALL_OUT_IGNORE_THE_REST_0cj9pv7f.zip @Mini_Protein_binds_IL6R_COVID-19_test3_SAVE_ALL_OUT_IGNORE_THE_REST_0cj9pv7f.flags -nstruct 10000 -cpu_run_time 3600 -boinc:max_nstruct 5000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 3976176 Using database: database_357d5d93529_n_methyl/minirosetta_database ====================================================== DONE :: 1 starting structures 1201 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== BOINC :: WS_max 0 06:39:06 (90949): called boinc_finish(0) </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>Mini_Protein_binds_IL6R_COVID-19_test3_SAVE_ALL_OUT_IGNORE_THE_REST_0cj9pv7f_32_771_1_r543859209_0</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> ]]> |
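For anyone collecting these reports, the `<file_xfer_error>` blocks in the task output above can be pulled out with a short script. This is a rough sketch that assumes the error always appears in the same layout as in the posts in this thread (file name, then a numeric code followed by a reason in parentheses). The `UploadError` type and `parse_upload_errors` function are made up for illustration, and the file name in the sample is shortened.

```python
import re
from typing import List, NamedTuple


class UploadError(NamedTuple):
    file_name: str
    code: int
    reason: str


# Matches one <file_xfer_error> block as it appears in the stderr excerpts.
_XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>.*?)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)\s*\((?P<reason>.*?)\)</error_code>\s*"
    r"</file_xfer_error>",
    re.DOTALL,
)


def parse_upload_errors(text: str) -> List[UploadError]:
    """Return every upload failure found in a task's output text."""
    return [
        UploadError(m.group("name").strip(), int(m.group("code")), m.group("reason"))
        for m in _XFER_ERROR.finditer(text)
    ]


# Sample in the same layout as the reports above (file name shortened).
sample = (
    "<message> upload failure: <file_xfer_error> "
    "<file_name>example_task_0</file_name> "
    "<error_code>-161 (not found)</error_code> "
    "</file_xfer_error> </message>"
)
for err in parse_upload_errors(sample):
    print(err.file_name, err.code, err.reason)
```

Grouping the parsed results by error code would show at a glance whether the 4.18 report (-240, stat() failed) and the 4.20 report (-161, not found) keep occurring across hosts.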
Conan Send message Joined: 16 Feb 06 Posts: 364 Credit: 1,368,421 RAC: 0 |
Tried running a couple of 4.20 work units, but they both failed after less than 1 1/2 minutes with the error "Process got Signal 11". This also happened with my wingman, who got the same error. Conan |
[VENETO] boboviz Send message Joined: 9 Apr 08 Posts: 913 Credit: 1,892,541 RAC: 294 |
All errors after a few seconds: 5154586 5154504 5154616 - Unhandled Exception Record - - Unhandled Exception Record - - Unhandled Exception Record - etc... |
PDW Send message Joined: 30 Aug 14 Posts: 6 Credit: 1,832,794 RAC: 0 |
All of the ones just released, like these: test_ff_sym_c3_21res_c.127.43_0001_I_21_3_hit_CYS_GLU_4_5_4_cell033_0001_SAVE_ALL_OUT_47_105_1 Access Violation, even just a single WU running on its own with plenty of memory. |
PDW Send message Joined: 30 Aug 14 Posts: 6 Credit: 1,832,794 RAC: 0 |
More of the same: test_ff_sym_c3_21res_c.127.43_0001_I_21_3_hit_CYS_GLU_4_5_4_cell033_0001_SAVE_ALL_OUT_47_357_1 All Access Violation again. |
PDW Send message Joined: 30 Aug 14 Posts: 6 Credit: 1,832,794 RAC: 0 |
Today's tasks fail with a file upload error, on Windows at least. |
CIA Send message Joined: 5 Apr 20 Posts: 13 Credit: 111,953 RAC: 0 |
I only got one recent Ralph WU but it didn't fare so well on my OSX machine, or on another Linux machine that also tried. https://ralph.bakerlab.org/workunit.php?wuid=4615810 |
Dr Who Fan Send message Joined: 2 Sep 06 Posts: 76 Credit: 107,857 RAC: 0 |
Failed for BOTH me and the wingman Task 5155194 Outcome Computation error Client state Compute error Exit status 0 (0x00000000) Stderr output <core_client_version>7.16.3</core_client_version> |
©2024 University of Washington
http://www.bakerlab.org