Message boards : RALPH@home bug list : Report "stuck at 1%" bugs here
Author | Message |
---|---|
Rom Walton (BOINC) Volunteer moderator Project developer Send message Joined: 10 Mar 06 Posts: 21 Credit: 5,515 RAC: 0 |
Oops, I forgot to ask you to do one additional thing. In Process Explorer there is an Options menu with a Configure Symbols item. Can you set the Dbghelp.dll path to: C:\Program Files\BOINC\DbgHelp.dll After that, could you rerun the tests? When things are working right you'll get something that looks like this: rosetta_beta_4.93_windows_intelx86.exe!pairenergy+0x126 rosetta_beta_4.93_windows_intelx86.exe!fullatom_energy+0x1979 rosetta_beta_4.93_windows_intelx86.exe!scorefxn+0xb4e TIA. ----- Rom |
Mike Gelvin Send message Joined: 17 Feb 06 Posts: 50 Credit: 55,397 RAC: 0 |
Rom, Data with Symbols:

Pass 1
for CSwitchDelta approx. 90
StartAddress rosetta_beta_4.93_windows_intelx86.exe+0x1de550
Stack:
ntoskrnl.exe!KiDispatchInterrupt+0x7b
ntoskrnl.exe!PsSetLegoNotifyRoutine+0x83a
rosetta_beta_4.93_windows_intelx86.exe+0x32f6b6
for CSwitchDelta 31
StartAddress rosetta_beta_4.93_windows_intelx86.exe+0x49fcf
Stack:
ntoskrnl.exe!KiUnexpectedInterrupt+0x183
win32k.sys+0x19c2
win32k.sys+0xb72
win32k.sys!EngGetCurrentCodePage+0x3654
ntoskrnl.exe!KiReleaseSpinLock+0xae4
!local_unwind2+0x5fe830bb
ntoskrnl.exe!PsSetLegoNotifyRoutine+0x83a
USER32.DLL!DispatchMessageW+0x40
rosetta_beta_4.93_windows_intelx86.exe+0x47b2fb
rosetta_beta_4.93_windows_intelx86.exe+0x26c504
KERNEL32.dll!ProcessIdToSessionId+0x17d
for CSwitchDelta 1
StartAddress WINMM.dll!timeSetEvent+0x2b0
Stack:
ntoskrnl.exe!KiUnexpectedInterrupt+0x183
ntoskrnl.exe!ObSetSecurityDescriptorInfo+0x62c
ntoskrnl.exe!KiReleaseSpinLock+0xae4
ntdll.dll!ZwWaitForMultipleObjects+0xb

Pass 2
for CSwitchDelta approx. 90
StartAddress rosetta_beta_4.93_windows_intelx86.exe+0x1de550
Stack:
ntoskrnl.exe!KiDispatchInterrupt+0x7b
!local_unwind2+0x5fe830bb
ntoskrnl.exe!PsSetLegoNotifyRoutine+0x83a
rosetta_beta_4.93_windows_intelx86.exe+0x49aeda
rosetta_beta_4.93_windows_intelx86.exe+0x256bb5
for CSwitchDelta 31
StartAddress rosetta_beta_4.93_windows_intelx86.exe+0x49fcf
Stack:
ntoskrnl.exe!KiUnexpectedInterrupt+0x183
win32k.sys+0x19c2
win32k.sys+0xb72
win32k.sys!EngGetCurrentCodePage+0x3654
ntoskrnl.exe!KiReleaseSpinLock+0xae4
!local_unwind2+0x5fe830bb
ntoskrnl.exe!PsSetLegoNotifyRoutine+0x83a
USER32.DLL!DispatchMessageW+0x40
rosetta_beta_4.93_windows_intelx86.exe+0x47b2fb
rosetta_beta_4.93_windows_intelx86.exe+0x26c504
KERNEL32.dll!ProcessIdToSessionId+0x17d
for CSwitchDelta 1
StartAddress WINMM.dll!timeSetEvent+0x2b0
Stack:
ntoskrnl.exe!KiUnexpectedInterrupt+0x183
ntoskrnl.exe!ZwYieldExecution+0x35f
ntoskrnl.exe!KiUnexpectedInterrupt+0x1ba
ntdll.dll!ZwWaitForMultipleObjects+0xb

Pass 3
for CSwitchDelta approx. 90
StartAddress rosetta_beta_4.93_windows_intelx86.exe+0x1de550
Stack:
ntoskrnl.exe!KiDispatchInterrupt+0x7b
!local_unwind2+0x5fe830bb
ntoskrnl.exe!PsSetLegoNotifyRoutine+0x83a
rosetta_beta_4.93_windows_intelx86.exe+0x256b92
for CSwitchDelta 31
StartAddress rosetta_beta_4.93_windows_intelx86.exe+0x49fcf
Stack:
ntoskrnl.exe!KiUnexpectedInterrupt+0x183
win32k.sys+0x19c2
win32k.sys+0xb72
win32k.sys!EngGetCurrentCodePage+0x3654
ntoskrnl.exe!KiReleaseSpinLock+0xae4
!local_unwind2+0x5fe830bb
ntoskrnl.exe!PsSetLegoNotifyRoutine+0x83a
USER32.DLL!DispatchMessageW+0x40
rosetta_beta_4.93_windows_intelx86.exe+0x47b2fb
rosetta_beta_4.93_windows_intelx86.exe+0x26c504
KERNEL32.dll!ProcessIdToSessionId+0x17d
for CSwitchDelta 1
StartAddress WINMM.dll!timeSetEvent+0x2b0
Stack:
ntoskrnl.exe!KiUnexpectedInterrupt+0x183
ntoskrnl.exe!ZwYieldExecution+0x35f
ntoskrnl.exe!KiUnexpectedInterrupt+0x1ba
ntdll.dll!ZwWaitForMultipleObjects+0xb

Good luck with this! Mike |
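For anyone comparing the three passes by hand, here is a minimal Python sketch (not from the thread) that splits Process Explorer style frames of the form `module!symbol+0xoffset` into their parts. The names `Frame`, `parse_frame`, and `unresolved` are hypothetical, and the pattern only assumes the frame shapes that appear in the posts above: `module!symbol+0x..`, `module+0x..`, and `!symbol+0x..`.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    module: str            # e.g. "ntoskrnl.exe"; empty string when none was printed
    symbol: Optional[str]  # None when only module+offset was resolved
    offset: int            # byte offset from the symbol (or from the module base)


# Module and symbol are both optional; the hex offset is required.
_FRAME_RE = re.compile(
    r"^(?P<module>[^!+\s]*)(?:!(?P<symbol>[^+\s]+))?\+0x(?P<offset>[0-9a-fA-F]+)$"
)


def parse_frame(text: str) -> Frame:
    """Parse one frame such as 'ntoskrnl.exe!KiDispatchInterrupt+0x7b'."""
    match = _FRAME_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognizable stack frame: {text!r}")
    return Frame(match["module"], match["symbol"], int(match["offset"], 16))


def unresolved(frames: List[Frame]) -> List[Frame]:
    """Return the frames for which no symbol name was resolved."""
    return [frame for frame in frames if frame.symbol is None]


if __name__ == "__main__":
    sample = (
        "ntoskrnl.exe!KiDispatchInterrupt+0x7b "
        "ntoskrnl.exe!PsSetLegoNotifyRoutine+0x83a "
        "rosetta_beta_4.93_windows_intelx86.exe+0x32f6b6"
    )
    frames = [parse_frame(part) for part in sample.split()]
    for frame in unresolved(frames):
        print(f"no symbol for {frame.module} at offset {frame.offset:#x}")
```

Run over the dumps above, a filter like `unresolved` makes it easy to see that the rosetta_beta frames carry only module offsets, unlike the `pairenergy`/`fullatom_energy` style names in Rom's example.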
Rom Walton (BOINC) Volunteer moderator Project developer Send message Joined: 10 Mar 06 Posts: 21 Credit: 5,515 RAC: 0 |
Mike, Using Process Explorer again, can you look at the thread state for each thread? What is the base priority and dynamic priority for each thread in your list? It should be visible on the Threads tab on the process properties dialog box. TIA. ----- Rom |
Mike Gelvin Send message Joined: 17 Feb 06 Posts: 50 Credit: 55,397 RAC: 0 |
Mike, More Info:

for CSwitchDelta approx. 90
StartAddress rosetta_beta_4.93_windows_intelx86.exe+0x1de550
ThreadID 2716
State Ready
Kernel Time 0:00:01.131 (not moving)
User Time 18:34:50.250 (and climbing fast)
Base Priority 1
Dynamic Priority 1

for CSwitchDelta 31
StartAddress rosetta_beta_4.93_windows_intelx86.exe+0x49fcf
ThreadID 2680
State Ready
Kernel Time 0:00:00.828 (not moving)
User Time 0:00:00.187 (not moving)
Base Priority 4
Dynamic Priority 6

for CSwitchDelta 1
StartAddress WINMM.dll!timeSetEvent+0x2b0
ThreadID 2720
State Wait:UserRequest
Kernel Time 0:00:00.000 (not moving)
User Time 0:00:00.000 (not moving)
Base Priority 15
Dynamic Priority 15 |
Rom Walton (BOINC) Volunteer moderator Project developer Send message Joined: 10 Mar 06 Posts: 21 Credit: 5,515 RAC: 0 |
Mike, Are you familiar with the Windows debugging tools? The reason I ask is that if I could get a dump of the process, this might go quite a bit quicker. Would you be game for trying to get me a dump? |
BennyRop Send message Joined: 11 Mar 06 Posts: 14 Credit: 674 RAC: 0 |
Or temporarily opening two holes in your firewall/router so that the system could be taken over through RealVNC? (emailing Rom the ip#, RealVNC name and password) Granted, it's something I'd only do with someone I trusted. :) |
Mike Gelvin Send message Joined: 17 Feb 06 Posts: 50 Credit: 55,397 RAC: 0 |
Mike, This is why I was suggesting direct contact. I am familiar with VS tools for remote debugging, but I always have the source where I can attach to a remote process and set breakpoints and such. How to debug without source is something I'm not sure about. (Never had to, so I never figured it out.) |
Mike Gelvin Send message Joined: 17 Feb 06 Posts: 50 Credit: 55,397 RAC: 0 |
Or temporarily opening two holes in your firewall/router so that the system could be taken over through RealVNC? (emailing Rom the ip#, RealVNC name and password) Granted, it's something I'd only do with someone I trusted. :) I'm sorry, direct access is not possible. I'm stretching the rules just running foreign code. |
Rom Walton (BOINC) Volunteer moderator Project developer Send message Joined: 10 Mar 06 Posts: 21 Credit: 5,515 RAC: 0 |
Mike, Sweet. Attach to the process with Visual Studio. Break on all threads. From the Debug menu select Save Dump As. Be sure to change the dump type to dump with heap, and give it some sort of name. With WinZip compression the file should shrink to 20MB or so. Do you have a web server I would be able to download it from? Or should we try email? ----- Rom |
Mike Gelvin Send message Joined: 17 Feb 06 Posts: 50 Credit: 55,397 RAC: 0 |
Rom, OK, the latest. Like I said, I'm unfamiliar with debugging without source code. So I attached to the process and broke all threads. I looked for the Save Dump As item. It wasn't in the Debug menu, so I did some checking in Help and discovered a passage that essentially said the symbols had to be loaded to allow a dump. So I did a "Continue" and detached from the process to investigate how to load the symbols. After figuring that out, I looked at the run time for the Rosetta Beta process and discovered it had started over at 0 CPU time. Do you know if this represents a true restart? If so, I may no longer be stuck at 0. Anyway, I now have the dump file; it's zipped and its size is under 13 meg, easy enough for me to email. 1) Is it possible this is of no more value because I might no longer be stuck? 2) Should I allow it to keep running and see? (I have it swapped out at the moment with 11 minutes of run time according to Task Manager.) 3) Do you still want the file? 4) Where to? Mike |
Mike Gelvin Send message Joined: 17 Feb 06 Posts: 50 Credit: 55,397 RAC: 0 |
Looking at the stdout file, it appears that it did indeed restart due to a failed heartbeat. It is, however, using the exact same command line, including the seed. So I am going to let it run and see if it's still stuck at 0. |
Rom Walton (BOINC) Volunteer moderator Project developer Send message Joined: 10 Mar 06 Posts: 21 Credit: 5,515 RAC: 0 |
Ah, okay... Well hopefully it'll do it again... Let me know how it goes... |
Mike Gelvin Send message Joined: 17 Feb 06 Posts: 50 Credit: 55,397 RAC: 0 |
Ah, okay... OK, I'm 10+ hours in and still stuck at 1%. I think it will stay stuck. If you concur I will gather the info. In the meantime, I am going to preempt it. |
Rom Walton (BOINC) Volunteer moderator Project developer Send message Joined: 10 Mar 06 Posts: 21 Credit: 5,515 RAC: 0 |
well go ahead and get a dump of it. I'm glad it at least repro'ed for you. ----- Rom |
Mike Gelvin Send message Joined: 17 Feb 06 Posts: 50 Credit: 55,397 RAC: 0 |
well go ahead and get a dump of it. I'm glad it at least repro'ed for you. Got it... where to? |
Rom Walton (BOINC) Volunteer moderator Project developer Send message Joined: 10 Mar 06 Posts: 21 Credit: 5,515 RAC: 0 |
Could you send it to this address: romw at romwnet.org It is currently set up with unrestricted sizes for sending and receiving email. ----- Rom |
Mike Gelvin Send message Joined: 17 Feb 06 Posts: 50 Credit: 55,397 RAC: 0 |
Could you send it to this address: I sent you an email with the following content... did you get it? "Looks like I'm having trouble getting the 12 meg out of the gate here. My main email ISP has a 5 meg limit, another has a 10 meg limit (both I have direct access to). Yet another ISP I have an account with is unlimited, but I have no direct connection with them and they don't allow relaying. So it looks like I am going to have to carve the files up. Do you have a preferred method? I can create segmented Zips, or there is a shareware program I have used in the past called EZSplit. Or I could just write a short program to cut it up." Mike |
Rom Walton (BOINC) Volunteer moderator Project developer Send message Joined: 10 Mar 06 Posts: 21 Credit: 5,515 RAC: 0 |
Could you send it to this address: I didn't get it. Go ahead and create mini RARs then; WinRAR can break up the dump file and reassemble it without too much grief. ----- Rom |
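Rom's suggestion in the thread is WinRAR. Purely as an illustration of the "short program to cut it up" that Mike mentions, here is a minimal Python sketch that splits a file into fixed-size parts and joins them again, checking a SHA-256 digest afterwards. The function names and the `.partNNN` naming scheme are assumptions made for this example, not anything the posters used.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def split_file(src: Path, part_size: int) -> List[Path]:
    """Write src as src.part000, src.part001, ... of at most part_size bytes each."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    parts: List[Path] = []
    with src.open("rb") as handle:
        while True:
            chunk = handle.read(part_size)
            if not chunk:
                break
            part = src.with_name(f"{src.name}.part{len(parts):03d}")
            part.write_bytes(chunk)
            parts.append(part)
    return parts


def join_files(parts: Iterable[Path], dest: Path) -> None:
    """Concatenate the parts in name order into dest."""
    with dest.open("wb") as out:
        for part in sorted(parts):
            out.write(part.read_bytes())
```

A sender would call `split_file(Path("dump.zip"), 3 * 1024 * 1024)` (the file name is hypothetical) and the receiver would call `join_files(...)` and compare `sha256_of` against the sender's digest. Since email attachments are typically base64-encoded, which grows them by roughly a third, parts should be sized comfortably below the mailbox limit.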
Mike Gelvin Send message Joined: 17 Feb 06 Posts: 50 Credit: 55,397 RAC: 0 |
Elvis has left the building. |
UBT - Timbo Send message Joined: 16 Feb 06 Posts: 3 Credit: 3,924 RAC: 0 |
Hi Rom, As per instructions in the other thread, I have aborted the following RALPH 4.93 WUs as they were stuck at 1%:
22/03/2006 14:53:43|ralph@home|Unrecoverable error for result HB_BARCODE_30_1a19A_352_138_0 (aborted via GUI RPC)
22/03/2006 14:53:48|ralph@home|Unrecoverable error for result HB_BARCODE_30_1a68__352_138_0 (aborted via GUI RPC)
22/03/2006 14:53:55|ralph@home|Unrecoverable error for result HB_BARCODE_30_1ctf__352_137_0 (aborted via GUI RPC)
22/03/2006 14:54:00|ralph@home|Unrecoverable error for result HB_BARCODE_30_1ctf__352_136_0 (aborted via GUI RPC)
22/03/2006 14:54:11|ralph@home|Unrecoverable error for result HB_BARCODE_30_4ubpA_352_135_0 (aborted via GUI RPC)
Have 2 more that are progressing:
22/03/2006 14:54:23|ralph@home|Pausing result HB_BARCODE_30_5croA_352_136_0 (left in memory)
22/03/2006 14:56:02|ralph@home|Pausing result HB_BARCODE_30_1bk2__352_137_0 (left in memory)
and now both are at around 37% at: Stage: "Ab initio". Model: 95 Step: 325,000+ - had to change the CPU resource to 2 days (from 4 days), as these 2 WUs are preventing me crunching for any other project - but happy to help with 48 hours of solid RALPH crunching if it helps figure out the problem. Now have some 4.94 WUs. Regards, Tim |
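When collecting reports like Tim's, it can help to pull the aborted result names out of the pasted log text. The following Python sketch assumes only the `timestamp|project|message` layout and the day/month/year timestamps visible in the lines above. `LogEntry`, `parse_line`, and `aborted_results` are hypothetical helper names, not part of BOINC.

```python
import re
from datetime import datetime
from typing import Iterable, List, NamedTuple


class LogEntry(NamedTuple):
    timestamp: datetime
    project: str
    message: str


# The result name is the token that follows the word "result" in the message.
_RESULT_RE = re.compile(r"\bresult (\S+)")


def parse_line(line: str) -> LogEntry:
    """Split one 'dd/mm/yyyy hh:mm:ss|project|message' line into its fields."""
    stamp, project, message = line.strip().split("|", 2)
    return LogEntry(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message)


def aborted_results(lines: Iterable[str]) -> List[str]:
    """Names of results whose message says they were aborted via GUI RPC."""
    names: List[str] = []
    for line in lines:
        entry = parse_line(line)
        if "aborted via GUI RPC" in entry.message:
            match = _RESULT_RE.search(entry.message)
            if match:
                names.append(match.group(1))
    return names
```

Calling `aborted_results` on the seven lines in the post would return the five `HB_BARCODE_30_...` names from the "Unrecoverable error" lines and skip the two "Pausing result" lines.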
©2024 University of Washington
http://www.bakerlab.org