Minirosetta beta 3.43

Message boards : RALPH@home bug list : Minirosetta beta 3.43

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 5591 - Posted: 10 Nov 2012, 9:31:14 UTC

The new version is out, but there are no WUs to crunch!

skgiven
Joined: 15 Dec 07
Posts: 8
Credit: 221,922
RAC: 0
Message 5592 - Posted: 13 Nov 2012, 21:48:35 UTC - in response to Message 5591.  

These tasks all pop up in their own windows.

If I close the window, I think it closes the task and starts another. LAIM (leave applications in memory) is on, so they might restart from 16 min into the run...
They don't appear to checkpoint, going by the BOINC Manager Properties.
Progress remains at zero.
Tasks go past their estimated run time (of 76 min, now at 90 min).

W7x64, 8GB, HD5850, second drive for Boinc.


John Lewis Highsmith
Joined: 1 Mar 06
Posts: 7
Credit: 49,423
RAC: 0
Message 5593 - Posted: 14 Nov 2012, 2:31:51 UTC
Last modified: 14 Nov 2012, 2:43:57 UTC

On each of my 6 cores there was the same problem. (Two of the w/u are labeled Rosetta Mini Beta 3.43 and 4 are Rosetta Mini 3.43.) None of the other projects could run. "System Idle" was at 99%. When I suspended R@h other projects started. After suspending all the projects except R@h it took over. There is no progress indicated after 11 minutes of running on 240 minutes of estimated run time. On resuming the other w/u the R@h did not yield the floor. My final step was to suspend R@h again, so the other projects could run.

After posting this I thought of one additional step: resuming the R@h. When I did, the w/u from other projects continued running. I'll check what happens when one of them completes.

Polian
Joined: 17 Feb 06
Posts: 3
Credit: 13,404
RAC: 0
Message 5594 - Posted: 14 Nov 2012, 4:13:51 UTC

I'm experiencing the same as the two users above on my Windows box at home. On my Linux box at home they appear to be running normally (so far).

Jacob Klein
Joined: 9 Oct 09
Posts: 3
Credit: 216,796
RAC: 0
Message 5596 - Posted: 14 Nov 2012, 4:19:36 UTC - in response to Message 5592.  
Last modified: 14 Nov 2012, 4:23:49 UTC

I am also having the problem where each task is opening up an additional popup window erroneously.

My computer is working on 2 tasks from this project using application "Rosetta Mini 3.43". When the tasks start, they also open up their own windows, with titles that say "minirosetta version 3.43 [workunit: .....]"

Additionally, it doesn't look like any processing is happening on these work units. CPU is idle.

Quite annoying.
I assume this is a bug with the project's application?

I guess it's our job to test, but this test failed hard. It didn't even gracefully exit - It just hung the computer out to dry, unable to process other projects even.

Windows 8 Pro x64
BOINC 7.0.38

John Lewis Highsmith
Joined: 1 Mar 06
Posts: 7
Credit: 49,423
RAC: 0
Message 5597 - Posted: 14 Nov 2012, 4:23:48 UTC

The 6 "other" tasks were wcgrids. 2 finished approx. 30 minutes ago, and 2 R@h tasks picked up where they had left off. Now only 4 of the 6 cores have wcgrid tasks running on them, and nothing happening on the 2 that were taken over by the R@h. I am suspending the R@h until a solution is found.

[WHGT]Cyberman
Joined: 29 Sep 12
Posts: 2
Credit: 3,789
RAC: 0
Message 5598 - Posted: 14 Nov 2012, 7:30:45 UTC
Last modified: 14 Nov 2012, 7:34:26 UTC

Same here.
A separate DOS window for each task, all from Ralph, Minirosetta 3.43.

No actual work done as far as I can see.

It seems to refresh after a while, redrawing (recreating?) the windows.

Win 7, 64Bit - it just installed some updates and wants to restart - perhaps this is a reason?

[edit]Restarted my PC - same situation. Still, it only started after that last update was installed.

skgiven
Joined: 15 Dec 07
Posts: 8
Credit: 221,922
RAC: 0
Message 5599 - Posted: 14 Nov 2012, 7:34:31 UTC - in response to Message 5597.  
Last modified: 14 Nov 2012, 8:22:30 UTC

I let 4 Ralph tasks 'run' for 10h, but there was still no progress. Going by CPU usage in Task Manager, it looks like the tasks are not actually running: 25% CPU usage from 4 Ralph tasks, 2 Asteroids and MW WUs doesn't add up. When I resumed the other projects and suspended Ralph, CPU usage rose to 80%.
Is this a bad wrapper or just the app not working correctly?
I would suggest you abort this run.
- BTW if you just close the pop-up window, the same task reopens and starts again...

I slowly downloaded Ralph onto a Dotschux VM. The estimated time to completion of a WU that had not yet started was 2h51min.
After half an hour it looked like it was working normally: 35min into a run, last checkpoint at 34min, elapsed time 35min, estimated remaining 38min, fraction done 55%, Virtual Mem 651MB, Working set 572MB.
However, when I suspended the task it immediately failed with a 'compute error' and went to 100%.
A new task started, and when I selected Suspend Project and then Resume it didn't fail. That said, it was only a minute into the run. Project Suspend and Resume would need to be tested later into the run.

skgiven
Joined: 15 Dec 07
Posts: 8
Credit: 221,922
RAC: 0
Message 5600 - Posted: 14 Nov 2012, 11:01:24 UTC - in response to Message 5599.  

On the Dotschux VM, CPU usage was low and the VM became increasingly sluggish, barely responsive (click and wait 5min). I started getting 'Communicating with Boinc Client Please Wait' pop-up messages. At least one task reached 85% complete, but it's now unresponsive.

skgiven
Joined: 15 Dec 07
Posts: 8
Credit: 221,922
RAC: 0
Message 5601 - Posted: 14 Nov 2012, 12:39:10 UTC - in response to Message 5600.  
Last modified: 14 Nov 2012, 12:55:09 UTC

After restarting the VM I managed to report a completed WU:
2521489 | 14 Nov 2012 6:38:00 UTC | 14 Nov 2012 11:30:14 UTC | Over | Success | Done | 3,606.31 | 34.71 | 34.71

This is the one that failed on suspend:
2853653 | 2521461 | 14 Nov 2012 6:33:50 UTC | 14 Nov 2012 8:09:19 UTC | Over | Client error | Compute error | 2,215.91 | 21.33 | ---

Process got signal 11
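
For reference, signal 11 on Linux is SIGSEGV (a segmentation fault), which would be consistent with the task crashing on an invalid memory access when it was suspended rather than exiting cleanly. A minimal sketch for looking up the name behind a signal number with Python's standard library follows; the helper name `signal_name` is made up for illustration and is not part of BOINC or Rosetta.

```python
import signal


def signal_name(number: int) -> str:
    """Return the symbolic name for a signal number on this platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not a signal known to this platform.
        return f"unknown signal {number}"


print(signal_name(11))
```

The mapping is platform dependent, so the lookup is best run on the same kind of system that produced the error message.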

I don't see many scientists/researchers/developers posting in this forum:
214 days since Rocco Moretti made a post,
516 days since dekim posted.

[WHGT]Cyberman
Joined: 29 Sep 12
Posts: 2
Credit: 3,789
RAC: 0
Message 5602 - Posted: 14 Nov 2012, 13:43:46 UTC

Two Rosetta WUs just started - same symptoms. I don't think it's an issue with Ralph, but rather with Windows.

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 5603 - Posted: 14 Nov 2012, 13:47:26 UTC - in response to Message 5601.  

"I don't see many scientists/researchers/developers posting in this forum:
214 days since Rocco Moretti made a post,
516 days since dekim posted."

It's a long-term problem.
The bug list threads are opened by me and conan,
but we aren't researchers...

LEONARI
Joined: 12 Mar 06
Posts: 5
Credit: 108,342
RAC: 0
Message 5604 - Posted: 14 Nov 2012, 14:02:52 UTC

I am seeing all of the above so nothing new to add other than to give some more clues to assist diagnosis.

1. I have tested this problem against BOINC manager versions 7.0.25/.31/.36 and .38; I see the same problem on all of those builds.

2. I saw the problem start when my machine began downloading forty-two RALPH tasks at 19:02 13/11/12 GMT (that is last night for those of you who are USA citizens), most of which were over 10 MB, with some over 100 MB.

3. Between then and 07:30 this morning Rosetta/RALPH downloaded an excessive amount of data, over 3 GB, when the normal amount would be only 50 MB in 24 hours. The "Remaining" time for each of the forty-two tasks, before they started, was only 5 hours 38 minutes (evidence available), although they all subsequently changed to 10 hours 38 minutes (evidence available).
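
To put the download figures in point 3 in proportion, the observed rate can be compared with the usual 50 MB per 24 hours. The sketch below uses only the numbers given in the post; treating "over 3 Gigs" as exactly 3 GB and 1 GB as 1024 MB are assumptions, so the result is a lower bound.

```python
from datetime import datetime

# Figures taken from the post. "Over 3 Gigs" is treated as exactly 3 GB
# and 1 GB as 1024 MB (both assumptions), so the ratio is a lower bound.
start = datetime(2012, 11, 13, 19, 2)
end = datetime(2012, 11, 14, 7, 30)
downloaded_mb = 3 * 1024
normal_mb_per_day = 50

hours = (end - start).total_seconds() / 3600
observed_mb_per_day = downloaded_mb / hours * 24
ratio = observed_mb_per_day / normal_mb_per_day

print(f"window: {hours:.2f} h")
print(f"observed rate: {observed_mb_per_day:.0f} MB per 24 h")
print(f"ratio to normal: {ratio:.0f}x")
```

Under these assumptions the download rate comes out at more than a hundred times the normal figure.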



Polian
Joined: 17 Feb 06
Posts: 3
Credit: 13,404
RAC: 0
Message 5605 - Posted: 14 Nov 2012, 15:25:44 UTC

3.43 should not have been pushed out; I'm seeing the same things with Windows clients over on production Rosetta now.

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 5606 - Posted: 14 Nov 2012, 15:45:54 UTC - in response to Message 5605.  

"3.43 should not have been pushed out; I'm seeing the same things with Windows clients over on production Rosetta now."

+1

Only a single batch of fewer than 3000 WUs (with problems), and you pass this version to production? You are crazy....

LEONARI
Joined: 12 Mar 06
Posts: 5
Credit: 108,342
RAC: 0
Message 5608 - Posted: 14 Nov 2012, 18:13:32 UTC

Questions: I presume I should abort all of the Rosetta/RALPH 3.43 tasks, correct? If so, how long should I stop taking new Rosetta/RALPH work: until 3.44 appears or the build falls back to 3.42?

For what it is worth, for completeness and to assist any diagnosis the following are my machine's CPU benchmarks:

14/11/2012 17:15:15 | | No config file found - using defaults
14/11/2012 17:15:15 | | Starting BOINC client version 7.0.38 for windows_x86_64
14/11/2012 17:15:15 | | log flags: file_xfer, sched_ops, task
14/11/2012 17:15:15 | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
14/11/2012 17:15:15 | | Data directory: C:\ProgramData\BOINC
14/11/2012 17:15:15 | | Running under account leonari
14/11/2012 17:15:15 | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz [Family 6 Model 42 Stepping 7]
14/11/2012 17:15:15 | | Processor: 256.00 KB cache
14/11/2012 17:15:15 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm vmx tm2 popcnt aes pbe
14/11/2012 17:15:15 | | OS: Microsoft Windows 7: Home Premium x64 Edition, Service Pack 1, (06.01.7601.00)
14/11/2012 17:15:15 | | Memory: 5.91 GB physical, 11.82 GB virtual
14/11/2012 17:15:15 | | Disk: 580.21 GB total, 450.14 GB free
14/11/2012 17:15:15 | | Local time is UTC +0 hours
14/11/2012 17:15:15 | | NVIDIA GPU 0: GeForce GT 540M (driver version 296.31, CUDA version 4.20, compute capability 2.1, 1024MB, 961MB available, 258 GFLOPS peak)
14/11/2012 17:15:15 | | OpenCL: NVIDIA GPU 0: GeForce GT 540M (driver version 296.31, device version OpenCL 1.1 CUDA, 1024MB, 961MB available)
14/11/2012 17:15:15 | | Version change (7.0.25 -> 7.0.38)
14/11/2012 17:15:15 | rosetta@home | URL https://boinc.bakerlab.org/rosetta/; Computer ID 1474581; resource share 20
14/11/2012 17:15:15 | ralph@home | URL https://ralph.bakerlab.org/; Computer ID 26114; resource share 5
14/11/2012 17:15:15 | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6156595; resource share 75
14/11/2012 17:15:15 | SETI@home | General prefs: from SETI@home (last modified 16-Jul-2012 00:54:28)
14/11/2012 17:15:15 | SETI@home | Computer location: home
14/11/2012 17:15:15 | SETI@home | General prefs: no separate prefs for home; using your defaults
14/11/2012 17:15:15 | | Reading preferences override file
14/11/2012 17:15:15 | | Preferences:
14/11/2012 17:15:15 | | max memory usage when active: 5445.62MB
14/11/2012 17:15:15 | | max memory usage when idle: 5990.18MB
14/11/2012 17:15:15 | | max disk usage: 20.00GB
14/11/2012 17:15:15 | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
14/11/2012 17:15:15 | | Not using a proxy
14/11/2012 17:15:15 | | Running CPU benchmarks
14/11/2012 17:15:15 | | Suspending computation - CPU benchmarks in progress
14/11/2012 17:15:15 | | Suspending network activity - time of day
14/11/2012 17:15:46 | | Benchmark results:
14/11/2012 17:15:46 | | Number of CPUs: 8
14/11/2012 17:15:46 | | 2437 floating point MIPS (Whetstone) per CPU
14/11/2012 17:15:46 | | 7803 integer MIPS (Dhrystone) per CPU
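
When comparing machines across reports like this one, it can help to pull the benchmark figures out of the log mechanically. The following is a minimal sketch; the parsing rules are an assumption based on the visible "date time | project | message" layout of the lines above, not a documented BOINC format, and `parse_benchmarks` is a made-up helper name.

```python
import re

# Two lines copied from the log above. The parsing rules are an assumption
# based on the visible "date time | project | message" layout.
LOG = """\
14/11/2012 17:15:46 | | 2437 floating point MIPS (Whetstone) per CPU
14/11/2012 17:15:46 | | 7803 integer MIPS (Dhrystone) per CPU
"""

BENCH = re.compile(r"(\d+) (floating point|integer) MIPS \((\w+)\) per CPU")


def parse_benchmarks(text: str) -> dict[str, int]:
    """Map each benchmark name (e.g. 'Whetstone') to its MIPS figure."""
    results = {}
    for line in text.splitlines():
        # Split into timestamp, project and message; skip malformed lines.
        parts = [part.strip() for part in line.split("|", 2)]
        if len(parts) != 3:
            continue
        match = BENCH.search(parts[2])
        if match:
            results[match.group(3)] = int(match.group(1))
    return results


print(parse_benchmarks(LOG))
```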


Jacob Klein
Joined: 9 Oct 09
Posts: 3
Credit: 216,796
RAC: 0
Message 5609 - Posted: 14 Nov 2012, 18:37:57 UTC - in response to Message 5608.  
Last modified: 14 Nov 2012, 18:38:27 UTC

This is quite a mess!

For the machines I can easily change settings on, I've aborted any Ralph/Rosetta tasks and have set the projects to No New Tasks until we get confirmation that it has been fixed.
For the machines that I cannot easily change settings on, I guess they're stuck and unable to process any work? I think they won't even be able to work on other non-Ralph, non-Rosetta projects!

I hope this gets fixed quickly - VERY NASTY!

Polian
Joined: 17 Feb 06
Posts: 3
Credit: 13,404
RAC: 0
Message 5610 - Posted: 14 Nov 2012, 18:41:05 UTC - in response to Message 5608.  

Someone over on the Rosetta boards noticed that both the science and screensaver applications have the same size and CRC, lol. Sounds like the files just got confused when they were placing them for dissemination. See this post.
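
Anyone who wants to repeat that check on their own download can compare the two executables by size and CRC-32. This is a minimal sketch using Python's standard library; the function names are made up for illustration, and a matching size and CRC-32 is a strong hint that two files are the same, not a proof.

```python
import zlib
from pathlib import Path


def size_and_crc32(path: Path) -> tuple[int, int]:
    """Return (size in bytes, CRC-32) of a file, reading it in chunks."""
    crc = 0
    size = 0
    with path.open("rb") as handle:
        # Read 1 MiB at a time so large executables do not fill memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            crc = zlib.crc32(chunk, crc)
            size += len(chunk)
    return size, crc


def look_identical(first: Path, second: Path) -> bool:
    """True when both files share the same size and CRC-32."""
    return size_and_crc32(first) == size_and_crc32(second)
```

Calling `look_identical(Path("science_app.exe"), Path("graphics_app.exe"))` would then report whether the two files match; both file names are placeholders, not the project's real file names.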

[VENETO] boboviz
Joined: 9 Apr 08
Posts: 840
Credit: 1,888,960
RAC: 0
Message 5611 - Posted: 14 Nov 2012, 20:10:15 UTC

Just now I downloaded 8 WUs on my Win7 x64 machine.
I killed all of them.
Please stop this.

AMDave
Joined: 15 Jul 06
Posts: 6
Credit: 100,163
RAC: 0
Message 5612 - Posted: 15 Nov 2012, 0:00:47 UTC

I had the same problems that are mentioned above, Win7_AMD64, BOINC 7.0.28.

But, the problem appears with the Windows client only.

On Linux the WUs all ran ok, linux_AMD64, BOINC 7.0.28.

©2024 University of Washington
http://www.bakerlab.org