How many users do a work unit.

Moderators: Site Moderators, FAHC Science Team

road-runner
Posts: 227
Joined: Sun Dec 02, 2007 4:01 am
Location: Willis, Texas

Re: How many users do a work unit.

Post by road-runner »

Well, the way I knew I had 2 of the same was from FAHMon, which I used a while back: I was sitting there looking at the same WU on 2 different machines, and I didn't have an EUE. There used to be a thread around here somewhere about people getting duplicates. I have not really checked in a while; I got to where I just don't really care anymore. I know what to expect, and when it gets too bad I just move to a different project for a while. I run what is given to me and hope it does some good; it's too much trouble to try and keep track of it anymore.
patonb
Posts: 348
Joined: Thu Oct 23, 2008 2:42 am
Hardware configuration: WooHoo= SR-2 -- L5639 @ ?? -- Evga 560ti FPB -- 12Gig Corsair XMS3 -- Corsair 1050hx -- Blackhawk Ultra

Foldie = @3.2Ghz -- Noctua NH-U12 -- BFG GTX 260-216 -- 6Gig OCZ Gold -- x58a-ud3r -- 6Gig OCZ Gold -- hx520

Re: How many users do a work unit.

Post by patonb »

There's said to be a bug where, if two machines both request a unit at the same time, both can get the same unit.
WooHoo = L5639 @ 3.3Ghz Evga SR-2 6x2gb Corsair XMS3 CM 212+ Corsair 1050hx Blackhawk Ultra EVGA 560ti

Foldie = i7 950@ 4.0Ghz x58a-ud3r 216-216 @ 850/2000 3x2gb OCZ Gold NH-u12 Heatsink Corsair hx520 Antec 900
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: How many users do a work unit.

Post by 7im »

RR, you don't have to be the one to have an EUE to have gotten dupes.

Someone else got 3 almost instant EUEs (or not quite instant), and Stanford ticked off 3 more copies. You received 2 of them. See how that works?
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Rattledagger
Posts: 128
Joined: Thu Dec 06, 2007 9:48 pm
Location: Norway

Re: How many users do a work unit.

Post by Rattledagger »

7im wrote:RR, you don't have to be the one to have an EUE to have gotten dupes.

Someone else got 3 almost instant EUEs (or not quite instant), and Stanford ticked off 3 more copies. You received 2 of them. See how that works?
Hmm, if you start by sending out a single copy of a WU, you shouldn't suddenly have 3 copies in progress unless you're doing something like this:

a: Send out copy #1 to user-A; it errors out.
b: Generate 2 new copies; send copy #2 to user-B and copy #3 to user-C.
c1: Both error out: generate 4 new copies; send copies #4 & #5 to user-D (different clients), copy #6 to user-E and copy #7 to user-F.
c2: One errors out: generate 2 new copies; send copies #4 & #5 to user-D (different clients).

Sending out 2 copies for each error would be an option, but an inefficient one.
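
To put numbers on the a/b/c1 chain above, here is a minimal sketch of how fast the copy count grows if every errored copy really did trigger 2 replacements. This is only a model of the hypothetical scheme described in this post, not FAH's actual server logic; the function name and the `copies_per_error` parameter are made up for illustration.

```python
def copies_issued(failed_rounds, copies_per_error=2):
    """Model the worst case where every copy in each round errors out.

    Hypothetical scheme: one copy goes out first (step a), and each errored
    copy is replaced by `copies_per_error` new copies (steps b, c1, ...).
    Returns (total copies ever issued, copies currently in progress).
    """
    in_flight = 1  # step a: a single copy goes to user-A
    total = 1
    for _ in range(failed_rounds):
        in_flight *= copies_per_error  # every failed copy spawns replacements
        total += in_flight
    return total, in_flight


for rounds in range(4):
    total, in_flight = copies_issued(rounds)
    print(f"{rounds} failed rounds: {in_flight} in progress, {total} issued")
```

Two fully failed rounds give 4 copies in progress and 7 issued in total, which matches copies #1 through #7 in step c1. With `copies_per_error=1` the count in progress never grows, which is the more efficient alternative.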
road-runner
Posts: 227
Joined: Sun Dec 02, 2007 4:01 am
Location: Willis, Texas

Re: How many users do a work unit.

Post by road-runner »

7im wrote:RR, you don't have to be the one to have an EUE to have gotten dupes.

Someone else got 3 almost instant EUEs (or not quite instant), and Stanford ticked off 3 more copies. You received 2 of them. See how that works?
Why keep sending out the bad WUs that EUE? I have had this exact thing happen: a WU EUEs 3 times in a row, then the server sends it to one of my other machines, only for it to do the same thing. I then report it in the WU thread, only to see others reporting the same one. It seems that after 3 times it would be automatically removed and checked. No? They still have to wait on several of us to report it?
Wrish
Posts: 74
Joined: Thu Jan 28, 2010 5:09 am

Re: How many users do a work unit.

Post by Wrish »

It's not only bad WUs that EUE. Many people have bad (temporary) configurations or unstable machines.

Since we have orders of magnitude more computing resources than Stanford does, they send the WU in question to several of us so as to get "statistical" data on whether the WU can be completed by some other setup. All the EUEs get logged, so they have data on whether the problem resides in the configuration or the work unit.
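
As a rough illustration of how logged EUEs could separate the two cases, here is a small sketch. It assumes a log of (machine, WU) pairs and an arbitrary threshold of 3; the function name, the log format and the threshold are all hypothetical, and nothing here is claimed to be what Stanford actually runs.

```python
from collections import defaultdict


def suspects(eue_log, min_distinct=3):
    """Guess whether EUEs point at a bad WU or at a bad machine.

    eue_log: iterable of (machine_id, wu_id) pairs, one per logged EUE.
    A WU is suspect if it failed on at least `min_distinct` different
    machines; a machine is suspect if it failed at least `min_distinct`
    different WUs. The threshold is an assumed value.
    """
    machines_per_wu = defaultdict(set)
    wus_per_machine = defaultdict(set)
    for machine, wu in eue_log:
        machines_per_wu[wu].add(machine)
        wus_per_machine[machine].add(wu)
    bad_wus = {wu for wu, ms in machines_per_wu.items() if len(ms) >= min_distinct}
    bad_machines = {m for m, ws in wus_per_machine.items() if len(ws) >= min_distinct}
    return bad_wus, bad_machines


log = [
    ("A", "wu-1"), ("B", "wu-1"), ("C", "wu-1"),  # one WU fails everywhere
    ("D", "wu-2"), ("D", "wu-3"), ("D", "wu-4"),  # one machine fails everything
]
print(suspects(log))
```

Because sets are used, the same machine failing the same WU repeatedly counts only once, so a single unstable machine cannot make a WU look bad on its own.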
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: How many users do a work unit.

Post by bruce »

Rattledagger wrote:
7im wrote:RR, you don't have to be the one to have an EUE to have gotten dupes.

Someone else got 3 almost instant EUEs (or not quite instant), and Stanford ticked off 3 more copies. You received 2 of them. See how that works?
Hmm, if you start by sending out a single copy of a WU, you shouldn't suddenly have 3 copies in progress unless you're doing something like this:

a: Send out copy #1 to user-A; it errors out.
b: Generate 2 new copies; send copy #2 to user-B and copy #3 to user-C.
c1: Both error out: generate 4 new copies; send copies #4 & #5 to user-D (different clients), copy #6 to user-E and copy #7 to user-F.
c2: One errors out: generate 2 new copies; send copies #4 & #5 to user-D (different clients).

Sending out 2 copies for each error would be an option, but an inefficient one.
I don't think it works that way, but as I said earlier, it's very difficult to prove. I believe it's more like the scenario 7im mentioned where a single EUE is reassigned to user-A and a single extra copy is sent to user-B.

The other side of the problem is that a single user-A can have a series of rapid EUEs. (Those might be the same WU or might be a variety of WUs.) FAH attempts to keep that single user-A from depleting the WUs on the servers by shutting down the offending machine for 24 hours. Unfortunately, the owner often becomes very indignant and probably restarts his machine, probably leading to the duplication of more and more WUs. (Of course that owner blames FAH, never believing that his machine might be causing a problem -- and at that point, nobody really knows which it might be.)
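
The 24-hour shutdown described above can be pictured as a simple per-client throttle. The sketch below is only an illustration: the 24-hour pause comes from this post, but the trigger (5 EUEs within 10 minutes), the class name and its methods are assumptions, and the real mechanism may look nothing like this.

```python
from datetime import datetime, timedelta

LOCKOUT = timedelta(hours=24)          # pause length mentioned in the post
MAX_RAPID_EUES = 5                     # assumed trigger count
RAPID_WINDOW = timedelta(minutes=10)   # assumed meaning of "rapid"


class ClientThrottle:
    """Tracks recent EUEs for one client and pauses it after a rapid series."""

    def __init__(self):
        self.recent_eues = []
        self.locked_until = None

    def record_eue(self, now):
        # Keep only EUEs that are still inside the rapid window.
        self.recent_eues = [t for t in self.recent_eues if now - t <= RAPID_WINDOW]
        self.recent_eues.append(now)
        if len(self.recent_eues) >= MAX_RAPID_EUES:
            self.locked_until = now + LOCKOUT
            self.recent_eues.clear()

    def may_assign(self, now):
        # A client gets work again once the pause has expired.
        return self.locked_until is None or now >= self.locked_until


throttle = ClientThrottle()
start = datetime(2010, 1, 1, 12, 0)
for minute in range(5):
    throttle.record_eue(start + timedelta(minutes=minute))
print(throttle.may_assign(start + timedelta(minutes=5)))  # paused after 5 rapid EUEs
```

A pause like this only holds if the state cannot be reset from the client side, which is exactly the restart problem described above.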
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: How many users do a work unit.

Post by 7im »

Just speculating, but those WU dumpers and cherry pickers may also contribute to this problem. Not only do they delay the one work unit they dump, but the work server may see this like an EUE, and reassign the dumped WU multiple times like an EUE'd WU. Not only do they screw the WU, they screw several of us with copies of the dumped WU. Cherry pickers blow!
Rattledagger
Posts: 128
Joined: Thu Dec 06, 2007 9:48 pm
Location: Norway

Re: How many users do a work unit.

Post by Rattledagger »

bruce wrote:I don't think it works that way, but as I said earlier, it's very difficult to prove. I believe it's more like the scenario 7im mentioned where a single EUE is reassigned to user-A and a single extra copy is sent to user-B.
Ah, it wasn't clear whether he meant the same user generated all the EUEs or not.

Still, even if user-A continues to error out on the same WU, there's no reason for the server to start generating more copies to send to user-C, user-D and so on, as long as user-B hasn't also errored out on the same WU.

Also, I would guess that in the majority of instances sending the same WU to user-A again is a waste, since the probability that it will error out again is likely high: most errors are likely due to either a buggy WU or an unstable computer.
The other side of the problem is that a single user-A can have a series of rapid EUEs. (Those might be the same WU or might be a variety of WUs.) FAH attempts to keep that single user-A from depleting the WUs on the servers by shutting down the offending machine for 24 hours. Unfortunately, the owner often becomes very indignant and probably restarts his machine, probably leading to the duplication of more and more WUs. (Of course that owner blames FAH, never believing that his machine might be causing a problem -- and at that point, nobody really knows which it might be.)
Well, if a user has been reassigned the same buggy WU multiple times in a row and this is the reason for the 24-hour deferral, it's not surprising if the user gets indignant...
If, on the other hand, every WU that errors out is different, it's much less likely that all of them are buggy.

Hmm, cherry picking?

This would be another reason it's a bad idea to send out a copy to anyone beyond user-B, unless user-B also generates an error...
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: How many users do a work unit.

Post by 7im »

Bad Ideas? No reasons? :roll: Let's think this through a bit before assuming too much and making an... er, well, you know. Stanford thinks through their decisions, and sets policy within the capabilities of their systems BEFORE acting. Again, without much imagination, I can speculate on a few good ideas and good reasons...

1. There is a damn good reason (though not specified) for sending out an extra copy to User A and B and C... and/or
2. The system isn't fine-grained enough to decide if an EUE is a single event so the default of multiple copies is better than no copies... and/or
3. The WU has been delayed so much already, they send an extra 2 copies out to more users seeking a very quick return. The odds of 1 of the 3 extras going to a fast system is higher than 1 of 1.

Again, I speculate on the actual process, but still...
#2 is an okay idea, considering the alternative
#3 sounds like a very good reason in my opinion
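
The odds argument in #3 can be checked with a quick calculation. This is a sketch under two assumptions that are not from the thread: copies are assigned to machines independently, and some fraction `p_fast` of machines counts as "fast" (the 25% used below is purely illustrative).

```python
def p_at_least_one_fast(copies, p_fast):
    """Chance that at least one of `copies` independent assignments
    lands on a fast machine, if a fraction `p_fast` of machines is fast."""
    return 1 - (1 - p_fast) ** copies


p_fast = 0.25  # illustrative value only, not a measured figure
for copies in (1, 3):
    print(f"{copies} copies: {p_at_least_one_fast(copies, p_fast):.3f}")
```

With these made-up numbers, 1 copy has a 0.250 chance of reaching a fast machine and 3 copies have about 0.578, so extra copies do raise the odds of a quick return, at the price of duplicated work.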
road-runner
Posts: 227
Joined: Sun Dec 02, 2007 4:01 am
Location: Willis, Texas

Re: How many users do a work unit.

Post by road-runner »

Aww, yea makes sense now, they need to ban the cherry pickers also...
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: How many users do a work unit.

Post by bruce »

road-runner wrote:Aww, yea makes sense now, they need to ban the cherry pickers also...
Great idea. How do you propose they do that?

(. . . other than removing bonuses when the success rate falls below 80% -- I'd say that's a big step in the right direction.)
Rattledagger
Posts: 128
Joined: Thu Dec 06, 2007 9:48 pm
Location: Norway

Re: How many users do a work unit.

Post by Rattledagger »

7im wrote:Bad Ideas? No reasons? :roll: Let's think this through a bit before assuming too much and making an... er, well, you know. Stanford thinks through their decisions, and sets policy within the capabilities of their systems BEFORE acting. Again, without much imagination, I can speculate on a few good ideas and good reasons...
Yes, it's a valid point, so it would be more accurate to say "there's no good reason, except to work around limitations in the current client or server". :oops:

One of these limitations falls under the "bad ideas" part: the cherry pickers.
If a user is picking cherries and has just aborted a WU, the server reassigning this WU one minute later will very likely just lead to the WU being aborted again, so for FAH the reissue in this instance only gives higher server load.
But the problem is that the server side doesn't know whether the user has aborted a WU or is just having a download problem and the WU has gotten lost in transit. This is due to a client limitation: the client doesn't report download errors, WU abortions, or "unknown" errors.

So, until the client limitation is removed, FAH must treat all these cases the same, and either treat them all as a "lost" WU and reissue it to the same client up to a maximum number of times, or treat them all as "errored out" and give the user another WU instead.
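
The two options in the last paragraph can be written out as a small decision function. This is a sketch of the choice as described here, not of any real server code: the function name, the policy labels, the default of 3 reissues, and the fallback to a different WU once the limit is reached are all assumptions.

```python
def choose_next_wu(wu_id, reissue_count, policy, next_in_queue, max_reissues=3):
    """Pick the WU to send after a result went missing for an unknown reason.

    policy "lost":    assume the WU was lost in transit and resend the same
                      one, up to `max_reissues` times (assumed limit).
    policy "errored": assume the WU errored out and hand over a different one.
    """
    if policy == "lost":
        if reissue_count < max_reissues:
            return wu_id
        return next_in_queue  # assumed fallback once the limit is hit
    if policy == "errored":
        return next_in_queue
    raise ValueError(f"unknown policy: {policy!r}")


print(choose_next_wu("wu-7", 0, "lost", "wu-8"))     # same WU is resent
print(choose_next_wu("wu-7", 3, "lost", "wu-8"))     # limit reached
print(choose_next_wu("wu-7", 0, "errored", "wu-8"))  # different WU
```

Under the "lost" policy a cherry picker just receives the aborted WU again, while under the "errored" policy a user with a flaky download is moved on to different work; the server cannot tell the two apart without better client reporting.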
7im
Posts: 10179
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: How many users do a work unit.

Post by 7im »

Rat, you're so closed minded and have such negative beliefs about the project, I'm not going to waste any more of my time explaining how that is a positive feature instead of your narrow and incorrect interpretation of the process. Because no amount of reasoning will change a belief. Yes, the earth is flat. :roll:
road-runner
Posts: 227
Joined: Sun Dec 02, 2007 4:01 am
Location: Willis, Texas

Re: How many users do a work unit.

Post by road-runner »

bruce wrote:
road-runner wrote:Aww, yea makes sense now, they need to ban the cherry pickers also...
Great idea. How do you propose they do that?

(. . . other than removing bonuses when the success rate falls below 80% -- I'd say that's a big step in the right direction.)
Can't they ban their IP address and username if they keep cherry picking?