Re: How many users do a work unit.
Posted: Wed May 19, 2010 12:48 am
by road-runner
Well, the way I knew I had 2 of the same was from FahMon, which is what I used a while back: I was sitting there looking at the same WU on 2 different machines, and I didn't have an EUE. There used to be a thread around here somewhere about people getting duplicates. I have not really checked in a while. I got to where I just don't really care anymore; I know what to expect, and when it gets too bad I just move to a different project for a while. I run what is given to me and hope it does some good. It's too much trouble to try and keep track of it anymore.
Re: How many users do a work unit.
Posted: Wed May 19, 2010 1:37 am
by patonb
There's said to be a bug where, if two machines both call for a unit, both could get the same unit.
Re: How many users do a work unit.
Posted: Wed May 19, 2010 7:32 am
by 7im
RR, you don't have to be the one to have an EUE to have gotten dupes.
Someone else got 3 almost instant EUEs (or not quite instant), and Stanford ticked off 3 more copies. You received 2 of them. See how that works?
Re: How many users do a work unit.
Posted: Wed May 19, 2010 8:21 am
by Rattledagger
7im wrote: RR, you don't have to be the one to have an EUE to have gotten dupes.
Someone else got 3 almost instant EUEs (or not quite instant), and Stanford ticked off 3 more copies. You received 2 of them. See how that works?
Hmm, if you start by sending out a single copy of a wu, you shouldn't suddenly have 3 copies in progress unless you're doing something like this:
a: Send out copy #1 to user-A, it errors out.
b: Generate 2 new copies, send copy #2 to user-B and copy #3 to user-C.
c1: Both error out, generate 4 new copies, send copy #4 & #5 to user-D (different clients), copy #6 to user-E and copy #7 to user-F.
c2: One errors out, generate 2 new copies, send copy #4 & #5 to user-D (different clients).
Sending out 2 copies for each error would be an option, but an inefficient one.
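To put numbers on why that is inefficient, here is a rough Python sketch of the a/b/c1 scenario. It is only an illustration: the function name and the assumption that every copy in a round errors out are made up here, not taken from how the FAH servers actually behave.

```python
def copies_in_flight(rounds, copies_per_error):
    """Copies outstanding per round if every copy errors out each round.

    Hypothetical model: each failed copy is replaced by
    `copies_per_error` new copies.
    """
    in_flight = 1          # copy #1 sent to user-A
    history = [in_flight]
    for _ in range(rounds):
        in_flight *= copies_per_error
        history.append(in_flight)
    return history


print(copies_in_flight(2, 2))       # [1, 2, 4]  (steps a, b, c1)
print(sum(copies_in_flight(2, 2)))  # 7 copies generated in total
print(copies_in_flight(2, 1))       # [1, 1, 1]  (one replacement per error)
```

With two replacements per error the number of copies doubles every round, while one replacement per error keeps a single copy in progress.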
Re: How many users do a work unit.
Posted: Wed May 19, 2010 12:59 pm
by road-runner
7im wrote: RR, you don't have to be the one to have an EUE to have gotten dupes.
Someone else got 3 almost instant EUEs (or not quite instant), and Stanford ticked off 3 more copies. You received 2 of them. See how that works?
Why keep sending out the bad WUs that EUE? I have had this exact thing happen: a WU EUEs 3 times in a row, then the server sends it to one of my other machines, only for it to do the same thing. I then report it in the WU thread, only to see others reporting the same one. Seems after 3 times it would be automatically removed and checked, no? Do they still have to wait for several of us to report it?
Re: How many users do a work unit.
Posted: Wed May 19, 2010 1:31 pm
by Wrish
It's not only bad WUs that EUE. Many people have bad (temporary) configurations or unstable machines.
Since we have orders of magnitude more computing resources than Stanford does, they send the WU in question to several of us so as to get "statistical" data on whether the WU can be completed by some other setup. All the EUEs get logged, so they have data on whether the problem resides in the configuration or the work unit.
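As a rough idea of how logged EUEs could separate the two cases, here is a small Python sketch. The log format, the function name, the verdict labels and the `min_machines` threshold are all assumptions for illustration; the real server-side analysis is not described anywhere in this thread.

```python
from collections import defaultdict


def suspect_from_eue_log(log, min_machines=3):
    """Guess whether failures point at the WU or at the machines.

    log: iterable of (wu_id, machine_id, ok) tuples (hypothetical format).
    """
    failed_on = defaultdict(set)
    passed_on = defaultdict(set)
    for wu, machine, ok in log:
        (passed_on if ok else failed_on)[wu].add(machine)

    verdict = {}
    for wu, machines in failed_on.items():
        if passed_on.get(wu):
            # Someone else finished it, so the WU itself is probably fine.
            verdict[wu] = "likely machine/config"
        elif len(machines) >= min_machines:
            # Several different machines failed and nobody finished it.
            verdict[wu] = "likely bad WU"
        else:
            verdict[wu] = "undecided"
    return verdict
```

The point is only that one failure says very little, while the same WU failing on several different setups (or succeeding on one of them) says a lot.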
Re: How many users do a work unit.
Posted: Wed May 19, 2010 2:47 pm
by bruce
Rattledagger wrote: 7im wrote: RR, you don't have to be the one to have an EUE to have gotten dupes.
Someone else got 3 almost instant EUEs (or not quite instant), and Stanford ticked off 3 more copies. You received 2 of them. See how that works?
Hmm, if you start by sending out a single copy of a wu, you shouldn't suddenly have 3 copies in progress unless you're doing something like this:
a: Send out copy #1 to user-A, it errors out.
b: Generate 2 new copies, send copy #2 to user-B and copy #3 to user-C.
c1: Both error out, generate 4 new copies, send copy #4 & #5 to user-D (different clients), copy #6 to user-E and copy #7 to user-F.
c2: One errors out, generate 2 new copies, send copy #4 & #5 to user-D (different clients).
Sending out 2 copies for each error would be an option, but an inefficient one.
I don't think it works that way, but as I said earlier, it's very difficult to prove. I believe it's more like the scenario 7im mentioned where a single EUE is reassigned to user-A and a single extra copy is sent to user-B.
The other side of the problem is that a single user-A can have a series of rapid EUEs. (Those might be the same WU or might be a variety of WUs.) FAH attempts to keep that single user-A from depleting the WUs on the servers by shutting down the offending machine for 24 hours. Unfortunately, the owner often becomes very indignant and probably restarts his machine, probably leading to the duplication of more and more WUs. (Of course that owner blames FAH, never believing that his machine might be causing a problem -- and at that point, nobody really knows which it might be.)
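A minimal Python sketch of that kind of 24-hour shutdown, just to make the idea concrete. Only the 24-hour pause comes from the post above; the class name, the `max_eues` count and the one-hour window are guesses for illustration and not the real client or server settings.

```python
from datetime import datetime, timedelta


class EueThrottle:
    """Hypothetical per-machine throttle; thresholds are illustrative."""

    def __init__(self, max_eues=5, window=timedelta(hours=1),
                 pause=timedelta(hours=24)):
        self.max_eues = max_eues
        self.window = window
        self.pause = pause
        self.recent = []          # timestamps of recent EUEs
        self.paused_until = None  # None means the machine is not paused

    def report_eue(self, now):
        # Keep only EUEs inside the sliding window, then record this one.
        self.recent = [t for t in self.recent if now - t <= self.window]
        self.recent.append(now)
        if len(self.recent) >= self.max_eues:
            self.paused_until = now + self.pause
            self.recent.clear()

    def may_fetch_work(self, now):
        return self.paused_until is None or now >= self.paused_until


start = datetime(2010, 5, 19, 12, 0)
throttle = EueThrottle(max_eues=3)
for minute in range(3):
    throttle.report_eue(start + timedelta(minutes=minute))
print(throttle.may_fetch_work(start + timedelta(hours=1)))   # False
print(throttle.may_fetch_work(start + timedelta(hours=25)))  # True
```

Note that a state like `paused_until` only helps if it survives a restart, which is exactly the weak spot when the owner restarts the machine.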
Re: How many users do a work unit.
Posted: Wed May 19, 2010 3:34 pm
by 7im
Just speculating, but those WU dumpers and cherry pickers may also contribute to this problem. Not only do they delay the one work unit they dump, but the work server may see this like an EUE, and reassign the dumped WU multiple times like an EUE'd WU. Not only do they screw the WU, they screw several of us with copies of the dumped WU. Cherry pickers blow!
Re: How many users do a work unit.
Posted: Wed May 19, 2010 5:00 pm
by Rattledagger
bruce wrote: I don't think it works that way, but as I said earlier, it's very difficult to prove. I believe it's more like the scenario 7im mentioned where a single EUE is reassigned to user-A and a single extra copy is sent to user-B.
Ah, it wasn't clear whether he meant the same user generated all the EUEs or not.
Still, even if user-A continues erroring out on the same wu, there's no reason for the server to start generating more copies to be sent to user-C, user-D and so on, as long as user-B hasn't also errored out on the same wu.
Also, I would guess that in the majority of instances sending the same wu back to user-A is a waste, since the probability it will error out again is likely high: most errors are probably due to either a buggy wu or an unstable computer.
bruce wrote: The other side of the problem is that a single user-A can have a series of rapid EUEs. (Those might be the same WU or might be a variety of WUs.) FAH attempts to keep that single user-A from depleting the WUs on the servers by shutting down the offending machine for 24 hours. Unfortunately, the owner often becomes very indignant and probably restarts his machine, probably leading to the duplication of more and more WUs. (Of course that owner blames FAH, never believing that his machine might be causing a problem -- and at that point, nobody really knows which it might be.)
Well, if a user has been re-assigned the same buggy wu multiple times in a row and this is the reason for the 24-hour deferral, it's not surprising if the user gets indignant...
If, on the other hand, every wu that errors out is different, it's much less likely that all of them are buggy.
Hmm, cherry picking?
This would be another reason it's a bad idea to send out a copy to more than user-B, unless user-B also generates an error...
Re: How many users do a work unit.
Posted: Wed May 19, 2010 9:27 pm
by 7im
Bad Ideas? No reasons?
Let's think this through a bit before assuming too much and making an... er, well, you know. Stanford thinks through their decisions, and sets policy within the capabilities of their systems BEFORE acting. Again, without much imagination, I can speculate on a few good ideas and good reasons...
1. There is a damn good reason (though not specified) for sending out an extra copy to User A and B and C... and/or
2. The system isn't fine-grained enough to decide if an EUE is a single event so the default of multiple copies is better than no copies... and/or
3. The WU has been delayed so much already, they send an extra 2 copies out to more users seeking a very quick return. The odds of 1 of the 3 extras going to a fast system is higher than 1 of 1.
Again, I speculate on the actual process, but still...
#2 is an okay idea, considering the alternative.
#3 sounds like a very good reason, in my opinion.
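The odds argument in #3 can be checked with a couple of lines of Python. This is only a sketch under two loud assumptions: each copy lands on a fast machine independently of the others, and the 20% share of fast machines is a made-up number, not a project statistic.

```python
def p_at_least_one_fast(p_fast, copies):
    """Chance that at least one of `copies` lands on a fast machine.

    Assumes independent assignments; p_fast is a hypothetical share
    of fast machines among the donors.
    """
    return 1 - (1 - p_fast) ** copies


print(round(p_at_least_one_fast(0.2, 1), 3))  # 0.2
print(round(p_at_least_one_fast(0.2, 3), 3))  # 0.488
```

So with those made-up numbers, three copies more than double the chance of a quick return compared with one.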
Re: How many users do a work unit.
Posted: Thu May 20, 2010 12:48 am
by road-runner
Aww, yea makes sense now, they need to ban the cherry pickers also...
Re: How many users do a work unit.
Posted: Thu May 20, 2010 7:42 pm
by bruce
road-runner wrote: Aww, yea makes sense now, they need to ban the cherry pickers also...
Great idea. How do you propose they do that?
(. . . other than removing bonuses when the success rate falls below 80% -- I'd say that's a big step in the right direction.)
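As a sketch of what such a rule might look like in Python: the 80% figure is from the line above, but how the success rate is actually counted (completed divided by assigned here) and what happens with no history yet are assumptions for illustration only.

```python
def bonus_eligible(completed, assigned, threshold=0.80):
    """Hypothetical check: bonus only if the success rate is >= threshold."""
    if assigned == 0:
        return True  # assumption: no history yet, so no penalty
    return completed / assigned >= threshold


print(bonus_eligible(9, 10))  # True
print(bonus_eligible(7, 10))  # False
```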
Re: How many users do a work unit.
Posted: Thu May 20, 2010 9:46 pm
by Rattledagger
7im wrote: Bad Ideas? No reasons?
Let's think this through a bit before assuming too much and making an... er, well, you know. Stanford thinks through their decisions, and sets policy within the capabilities of their systems BEFORE acting. Again, without much imagination, I can speculate on a few good ideas and good reasons...
Yes, it's a valid point, so it would be more accurate to say "there's no good reason, except to work around limitations in the current client or server".
One of these limitations falls under the "bad ideas" part: the cherry-pickers.
If a user is picking cherries and has just aborted a wu, the server re-assigning this wu one minute later will very likely just lead to the wu being aborted again, so for FAH the re-issue in this instance only gives higher server load.
But the problem is, the server side doesn't know whether the user has aborted a wu or is just having a download problem and the wu has gotten lost in transit. This is due to a client limitation: the client doesn't report download errors, wu abortions, or "unknown" errors.
So, until the client limitation is removed, FAH must treat all these cases the same, and either treat them all as a "lost" wu and re-issue it to the same client up to a max number of times, or treat them all as "errored-out" and give the user another wu instead.
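Those two choices can be written down as a tiny Python sketch. The function name, the policy labels and the `max_reissues` default are invented for illustration; nothing here reflects the actual assignment-server code.

```python
def next_assignment(attempts_so_far, policy, max_reissues=3):
    """Return "same" to re-send the wu to the same client, "new" otherwise.

    policy "lost":    assume the wu was lost in transit, retry a few times.
    policy "errored": assume the wu errored out, hand out a different wu.
    """
    if policy == "lost":
        return "same" if attempts_so_far < max_reissues else "new"
    if policy == "errored":
        return "new"
    raise ValueError(f"unknown policy: {policy!r}")


print(next_assignment(1, "lost"))     # same
print(next_assignment(3, "lost"))     # new
print(next_assignment(1, "errored"))  # new
```

Because the server can't tell an abort from a lost download, whichever policy is picked gets applied to the cherry-picker and the honest user alike.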
Re: How many users do a work unit.
Posted: Thu May 20, 2010 10:26 pm
by 7im
Rat, you're so closed minded and have such negative beliefs about the project, I'm not going to waste any more of my time explaining how that is a positive feature instead of your narrow and incorrect interpretation of the process. Because no amount of reasoning will change a belief. Yes, the earth is flat.
Re: How many users do a work unit.
Posted: Thu May 20, 2010 11:48 pm
by road-runner
bruce wrote: road-runner wrote: Aww, yea makes sense now, they need to ban the cherry pickers also...
Great idea. How do you propose they do that?
(. . . other than removing bonuses when the success rate falls below 80% -- I'd say that's a big step in the right direction.)
Can't they ban their IP address and username if they keep cherry picking?