1)
Message boards :
Number crunching :
no credit
(Message 6603)
Posted 25 May 2019 by TPCBF Post: I know this is a fairly old post, but it is good to see that I am not the only one who has been wasting days of runtime on R@H WUs for nothing. And I am glad to see that on the latest batch of 8 WUs which arrived some time yesterday, the first four validated and "paid" quite nicely. Just the weird "bouncing" memory usage (I have seen as little as 100MB and as much as 1GB on four WUs running) is a bit disconcerting... Ralf |
2)
Message boards :
RALPH@home bug list :
Rosetta mini beta and/or android 3.61-3.83
(Message 6220)
Posted 19 Oct 2017 by TPCBF Post: One task (#4084226) stalled at 83.144%. Got about a dozen or so the last couple of days. They all would end up in a compute error or stall out at various percentages in the 70-90% range, blocking any other useful WU from running. I noticed that pretty much all WUs checkpoint after about 13 secs, then won't show another checkpoint for hours until they crap out. Also, I had to manually remove a dozen or so dead tasks in the task manager to get my machine responsive again... Have the last two running right now, which show a slightly different behaviour (at least while I can watch them). Started about 15 min ago, they show 12% done, 12 min of CPU time vs 15 min of clock time and the checkpoint increasing each time I check by about 10 secs, but still a fraction of the indicated CPU time used. I am using BOINC agent 7.8.3 on an 8GB/i3/Windows 8.1 host... |
3)
Message boards :
Current tests :
Is this thing on?
(Message 6088)
Posted 1 Nov 2016 by TPCBF Post: Looks like there are more WUs that just get stuck like this. A total of 6 WUs finished and reported, but the remaining 8 (or at least the 3 other ones that have been started overnight) are stuck right now. Will likely abort those shortly so as not to block my laptop for other stuff to crunch... Ralf |
4)
Message boards :
Current tests :
Is this thing on?
(Message 6087)
Posted 1 Nov 2016 by TPCBF Post: Well, strange dates on all the posts, months if not years old... Anyway, a new batch of WUs made it to this machine this afternoon and while the first one finished without a hitch and reported, the second one, crunching right now for 55+ min, shows 1.994% done, with the last checkpoint at 5:30 min, with 6 min CPU time and an estimated remaining runtime of 45 min which keeps decreasing without any obvious progress. Something's rotten in the state of Berkeley... :? Ralf |
5)
Message boards :
RALPH@home bug list :
Rosetta mini beta and/or android 3.61-3.83
(Message 5912)
Posted 12 Oct 2015 by TPCBF Post: I did not get a chance to do that, at least this time around. In the past, I mentioned a few times that I had the same issue, but with each batch of WUs, it's the same spiel. Out of a dozen or so beta WUs, taking up days of compute time, blocking other projects, only two WUs finished and got credited. I really don't know if any of the developers are actually paying attention, at least none of them seems to be posting in here... :-( Ralf |
6)
Message boards :
RALPH@home bug list :
Rosetta mini beta and/or android 3.61-3.83
(Message 5904)
Posted 10 Oct 2015 by TPCBF Post: Got a bunch of WUs today (Beta 3.63, on Windows 8.1/64) and while the first one finished fine, with the rest it seems the "same old same old" starts: They will run for a while, then CPU time will stop increasing, at some point the job still shows "running" but no ETA time (just "-----") until they crap out with a "Computation error" after blocking anything else on the host for hours, and no credit given either. Again, these are Beta 3.63 WUs, on Windows 8.1/64, 8GB of RAM, BOINC agent v7.6.9... Ralf |
7)
Message boards :
Current tests :
New test batch-Anybody out there?????
(Message 5832)
Posted 14 Mar 2015 by TPCBF Post: Hi TPCBF, Well, didn't look like it, that's why I posted in the meantime in the Rosetta forum. WUs cb_mar11_dock_placestub_EEEH_1035_vegf_ProteinInterfaceDesign_20241_91_0_0 and cb_mar11_dock_placestub_EEEH_1038_vegf_ProteinInterfaceDesign_20241_91_0_0 finally finished earlier, after I had suspended them since last night and resumed once some other project WUs finished before their deadline. WU cb_mar11_dock_placestub_EEEH_1037_vegf_ProteinInterfaceDesign_20241_91_0_0 finished just a couple of minutes ago (will report it in another) while cb_mar11_dock_placestub_EEEH_1036_vegf_ProteinInterfaceDesign_20241_91_0_0 now sits for a short while at 76.478% with no estimated time remaining... Ralf |
8)
Message boards :
Current tests :
New test batch-Anybody out there?????
(Message 5830)
Posted 13 Mar 2015 by TPCBF Post: Well, earlier today I again got 4 WUs of what appears to be a new series of tests. The problem, however, is that all 4 WUs only run to a certain percentage and then simply seem to be stuck (the lowest at about 12%, the highest got to 8x%), blocking any other work on that host. Is anyone on the Baker team actually keeping an eye on what is happening and actually looking for feedback? Or is this just all one large waste of time, on both ends? |
9)
Message boards :
RALPH@home bug list :
minirosetta beta 3.50-3.52 apps
(Message 5813)
Posted 25 Feb 2015 by TPCBF Post: Received this error on Task 3335547. Same here. Got about a dozen or so WUs and they crap out faster than you can shake a stick at... :-( Ralf |
10)
Message boards :
RALPH@home bug list :
minirosetta beta 3.50-3.52 apps
(Message 5769)
Posted 25 Jul 2014 by TPCBF Post: To roughly check works of checkpoints not necessarily to restart. The problem with the current checkpoint setting in the WUs is that the recent batch of WUs seems to reset itself a lot, always starting from scratch instead of being able to continue from the last checkpoint. That's the purpose of checkpoints. As it is currently, a lot of processing power gets wasted this way... Ralf |
11)
Message boards :
RALPH@home bug list :
minirosetta beta 3.50-3.52 apps
(Message 5759)
Posted 20 Jul 2014 by TPCBF Post: Same here, the 4 WUs I p/u on the 17th just keep restarting from 0% over and over again and each time, at least during the initial time, are thrashing the hard drive like crazy... Is anyone from the project actually around to monitor any responses? Or is Mr. Baker & Cie only available when there's a chance to bask in the limelight? Ralf |
12)
Message boards :
RALPH@home bug list :
Rosetta Mini Beta 3.53
(Message 5750)
Posted 19 Jul 2014 by TPCBF Post: OK, the 4 WUs that I got have by now repeatedly restarted from scratch, even though at one point showing almost 30% done. And I noticed that those WUs are thrashing the hard drive like crazy... Ralf |
13)
Message boards :
RALPH@home bug list :
Rosetta Mini Beta 3.53
(Message 5746)
Posted 19 Jul 2014 by TPCBF Post: Strange behaviour of the last set of (4) WUs. They are running now for a bit more than 2h, showing around 24% done but only a few odd minutes of elapsed runtime, and only a few (23) seconds of CPU time and supposedly no checkpoint reached yet... :? Running on Windows 8.1/64, with 6GB of RAM and not much else going on on the machine right now... Anyone there and care to explain? Ralf |
14)
Message boards :
RALPH@home bug list :
Rosetta Mini Beta 3.53
(Message 5743)
Posted 16 Jul 2014 by TPCBF Post: Yeah, those WUs seem to have a fairly large (low) restart threshold. Got two jobs on my laptop, running for about an hour and when I restarted my laptop today, they both started at 0% again. Theoretically they should finish before I have to leave and hit the road again, hope that they don't start all over again when I get back. Ralf |
15)
Message boards :
RALPH@home bug list :
MiniRosetta Beta 3.41
(Message 5566)
Posted 25 Aug 2012 by TPCBF Post: And the crap continues, nothing but compute and validate errors... :( Ralf |
16)
Message boards :
RALPH@home bug list :
MiniRosetta Beta 3.41
(Message 5565)
Posted 24 Aug 2012 by TPCBF Post: Yup, have now had the last 4 3.41 tasks bomb out with the same error, though with runtimes between 388 and 18100 secs... Hope we get some response from the techs this time around (oh well, one can dream) Ralf |
17)
Message boards :
Number crunching :
WARNING, Tasks Cancelled.
(Message 5560)
Posted 12 Jul 2012 by TPCBF Post: And I don't think this applies to the current issue at all. I checked a couple dozen WUs that validated just fine before this nonsense started and they all showed "cancelled" in the WU info... RALPH and Rosetta don't display the server version, but there has been discussion in the past on Rosetta about old server code. But looks like the admins just don't f'ing care either way... :-( |
18)
Message boards :
RALPH@home bug list :
MiniRosetta Beta 3.26
(Message 5557)
Posted 11 Jul 2012 by TPCBF Post: The silence of the project admins is deafening... :-( Ralf |
19)
Message boards :
RALPH@home bug list :
MiniRosetta Beta 3.26
(Message 5552)
Posted 10 Jul 2012 by TPCBF Post: Just checked and every Work Unit still in progress has been cancelled, so if I process them they will get Validate errors. Well, just checked as a second WU of the latest batch(es) just finished and it did the same thing: - processing along just fine, uploading, reporting - results in a "validate error" The "canceled" under "errors" in the WU info shows up for a few days back at least, with all but one error WU reporting and validating just fine, so I am not sure that is of any relevance in this case. Looks like something is askew here and it would really be nice if one of the admins would respond, at least to let us know they are looking into this... My WUs with validate errors are 2465839 and 2464581, with 2461750 from the same batch sent and returned validated just fine... Haven't aborted those WUs just yet, but suspended the project in the hope to hear from the project admins about this first... :? Ralf |
20)
Message boards :
RALPH@home bug list :
MiniRosetta Beta 3.26
(Message 5546)
Posted 29 Jun 2012 by TPCBF Post: Since the latest batch started the other day, I get roughly one compute error for each dozen or so WUs that go through just fine... Ralf |
©2024 University of Washington
http://www.bakerlab.org