PCI-e bandwidth/capacity limitations
Re: PCI-e bandwidth/capacity limitations
The x4 slot slows things down too; according to FAHBench results, by 2-10%.
Using real folding results for comparison is not reliable: depending on the core and the work unit, performance can vary by ±10% or more. In addition, PPD is affected by network conditions, server load, and many other factors.
Re: PCI-e bandwidth/capacity limitations
I decided to go for a pair of headless octa-core X99 systems with four GTX 1080s each, plus another system with one 1080 that would be for personal use. They should be able to handle whatever I throw at them. At some point, I'll see if I can get some flexible risers and test a board at its maximum capacity of 7 cards. If the performance drop is not significant enough, I'll max out the total combined capacity at 14 cards.
I should hopefully have my parts picked up at the end of the month, then flown back home. Assuming nothing breaks, the systems should be up and running early next month.
Re: PCI-e bandwidth/capacity limitations
According to my calculations, the cost of a new motherboard with four full-fledged PCIe x16 slots and a powerful processor will exceed the cost of two budget motherboards with two PCIe x16 slots each plus two cheap dual-core processors. Boards with two cards do not require special risers or a custom housing.
It is only worth doing if the four-slot motherboard and CPU are already on hand, or are available at half the market price.
Re: PCI-e bandwidth/capacity limitations
yalexey wrote: According to my calculations, the cost of a new motherboard with four full-fledged PCIe x16 slots and a powerful processor will exceed the cost of two budget motherboards with two PCIe x16 slots each plus two cheap dual-core processors. Boards with two cards do not require special risers or a custom housing. It is only worth doing if the four-slot motherboard and CPU are already on hand, or are available at half the market price.
True, but you also need two sets of RAM, two power supplies, two cases and the space for them; that soon starts adding up.
Re: PCI-e bandwidth/capacity limitations
And two operating systems to manage independently. But still cheaper than one all-in-one system.
Re: PCI-e bandwidth/capacity limitations
Two 500-600 watt PSUs are also cheaper than a single 1000-1200 watt unit. Two cases, if any are needed at all, cost less than ordering a special frame for graphics cards on risers. And it is cheaper than four liquid-cooled graphics cards plus an external heat sink.
We can talk about a slight increase in space and some additional administrative tasks, but if you set up an image server for network boot, administration becomes even simpler.
Re: PCI-e bandwidth/capacity limitations
foldy wrote: And two operating systems to manage independently. But still cheaper than one all-in-one system.
Linux is free.......
Re: PCI-e bandwidth/capacity limitations
Nathan_P wrote: Possible, did you leave a cpu core free to feed the gpu? My testing never went down as far as x1 but others did see a dip, not as severe as yours though.
I beg your pardon for the late answer. I always keep 1/4 or more of my quad-core CPU for folding.
b4441b29 wrote: In the 16x lane slot with Nvidia driver version 367.44 it averaged 659240 PPD one week and 629492 PPD the next. In the 4x lane slot with Nvidia driver version 370.28 it averaged 662106 PPD over a week.
Thanks for the info. I currently use 373.06 on Windows 8.1 and don't know whether your post relates to this OS. If it does, 373.06 should include the speed advantages of 370.28, I think.
The maximum I saw on my Palit GTX 1070 @ 2050/8016 was 810,216.8 PPD for p9194. My CPU is an A8 [email protected] and the card sits in an x16 2.0 slot.
My comrade, using the same driver version on Win 10 x64, achieved 900,750.6 PPD for the same project even though his GTX 1070 was less overclocked. His CPU is an i3 [email protected] with PCI-E 3.0 x16.
Re: PCI-e bandwidth/capacity limitations
Okay, sorry for the delay. Video cards were held at customs, because it looked rather suspicious that someone tried to take 10 cards from the US to Dubai in a cabin bag. Anyway...
First system is up and running, and the second is currently in the process of having everything installed. I can finally contribute to my questions in this thread. Specs are as follows:
4x GTX 1080
1x Xeon E5-2609 v4
8GB 2133MHz RAM
1600W PSU
Windows 10
Driver version 373.06
Am I aware the PSU is overkill? Yep. There were sales at the time I was buying the parts (made it by mere minutes, actually). Parts are all running stock for now. EVGA Precision monitor data shows that during folding, the temps hit 82 degrees on the hottest card at 100% fan speed, and 58 degrees on the coolest (which is obviously the one right at the end with unrestricted air flow), with just short of 3 million PPD. And this is just one of two identical systems.
Now for the important part: while folding, PCI-e bus usage varies, but has never exceeded 23%. The average appears to be below 20% across all four cards, which suggests a fifth 1080 could easily be added. Of course, I'm assuming that by 'bus usage' the software means a share of the total bandwidth available on the entire bus, rather than of the x16 slot's capacity. If it actually meant the latter, it would imply a 1080 could fold on x4 3.0 without any performance hit. As awesome as that sounds, I think it's too good to be true.
I think that should answer the question. Anyone need more info?
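As an illustration of that back-of-the-envelope reasoning, here is a minimal sketch, assuming the reported 'bus usage' is a share of a single x16 3.0 link and using the usual ~985 MB/s-per-lane theoretical rate; the driver may well report against a different baseline.
[code]
# Illustrative only: convert the observed "bus usage" percentage into an
# absolute bandwidth figure, assuming it is reported against an x16 3.0 link.
PCIE3_PER_LANE_GBS = 0.985   # PCIe 3.0: ~985 MB/s per lane, one direction

def link_bandwidth_gbs(lanes):
    """Theoretical one-direction bandwidth of a PCIe 3.0 link, in GB/s."""
    return lanes * PCIE3_PER_LANE_GBS

peak_usage = 0.23                                        # highest usage seen while folding
estimated_demand = peak_usage * link_bandwidth_gbs(16)   # roughly 3.6 GB/s

print(f"Estimated peak demand: {estimated_demand:.1f} GB/s")
print(f"x4 3.0 link supplies:  {link_bandwidth_gbs(4):.1f} GB/s")
[/code]
On paper that is why x4 looks plausible; everything hinges on what baseline the driver actually measures against.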
Re: PCI-e bandwidth/capacity limitations
The Xeon E5-2609 v4 has only 40 PCIe lanes. If there is no PCIe switch on that motherboard, the video cards get at most eight PCIe lanes each, and the bus load you see is typical for that number of lanes.
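A minimal sketch of that lane budget, assuming no PLX/PCIe switch, the usual x16/x8/x4 link widths, and ignoring lanes the board reserves for other devices:
[code]
# Sketch: the largest uniform link width N GPUs can get from a CPU's lane
# budget when there is no PCIe switch on the board (x16 -> x8 -> x4 -> x1).
def lanes_per_gpu(cpu_lanes, gpus, widths=(16, 8, 4, 1)):
    for width in widths:
        if width * gpus <= cpu_lanes:
            return width
    return 0

print(lanes_per_gpu(40, 4))   # Xeon E5-2609 v4: 40 lanes, 4 cards -> 8
print(lanes_per_gpu(40, 7))   # a 7-card build would drop to x4 per card -> 4
[/code]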
Re: PCI-e bandwidth/capacity limitations
hiigaran wrote: Now for the important part: while folding, PCI-e bus usage varies, but has never exceeded 23%. The average appears to be below 20% across all four cards, which suggests a fifth 1080 could easily be added. Of course, I'm assuming that by 'bus usage' the software means a share of the total bandwidth available on the entire bus, rather than of the x16 slot's capacity.
Hiigaran, thanks for sharing your data.
I see about 40-45% bus load and 58-65 °C temps (depending on the WU) with a single open-stand GTX 1070@2050 [email protected] V in a 16x 2.0 slot on an X58 board. That is roughly twice your 8x 3.0 figures, maybe due to latencies of the X58 northbridge, which X99 boards do not have.
The temperatures of the two cards in the middle of your setup look rather high for long-term endurance, because the VRMs may run hotter than the GPU chip.
I would recommend watercooling for multi-GPU setups, but neither I nor my friends have any experience with it; one mistake and everything burns out.
From what I have read in recent weeks, the Alphacool Eiswolf GPX Pro 120 AiO looks pretty good for a GTX 1080: it gives about 51 °C under load with only ~37 dBA of fan noise, an additional ~60 MHz of boost, and VRMs below 50 °C. But I don't have any data on how it would hold up over a 2-3 year period.
Here is a review in Russian: http://www.hardwareluxx.ru/index.php/ar ... ml?start=3; the pictures from top to bottom show idle temperature, load temperature, VRM load temperature, max boost MHz with a 100% power limit, fan RPMs, noise in dBA at idle, noise in dBA under load, and max boost with a 110% power limit.
I can also give a recommendation for builders of multi-GPU systems: to minimize the power load and heat on the motherboard's VRMs, look for cards that draw as little as possible from the 75-watt-per-slot PCIe budget, and point an additional fan at the motherboard's VRM heatsinks.
According to Tom's Hardware power draw measurements at 4K in Metro: Last Light, power consumption from the motherboard slot averages roughly:
- GTX 1070 FE: ~75 watts, with small spikes above that,
- GTX 1080 FE: ~45-50 watts,
- custom [spoiler]MSI GamingX[/spoiler] GTX 1070: ~35-40 watts,
- custom [spoiler]MSI GamingX[/spoiler] GTX 1080: ~45-50 watts,
- reference 980 Ti: only 10-15 watts (but its overall power efficiency is only ~0.55-0.60 compared to 1.00 for the Pascal cards).
I hope other custom Pascal cards also have reduced consumption from the PCIe slot compared to the GTX 1070 FE.
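To put those figures in context for a multi-GPU board, here is a small sketch totalling the draw routed through the motherboard's slots (and therefore its VRMs) for a four-card build, using the approximate numbers quoted above:
[code]
# Approximate power pulled through the motherboard's PCIe slots for a
# four-card build, using the slot-draw figures quoted above (gaming
# measurements; folding loads may differ).
slot_draw_watts = {
    "GTX 1070 FE":      75,
    "GTX 1080 FE":      50,
    "custom GTX 1070":  40,
    "custom GTX 1080":  50,
    "reference 980 Ti": 15,
}

cards = 4
for model, watts in slot_draw_watts.items():
    print(f"{cards} x {model:<16} -> ~{cards * watts:3d} W through the board's slots")
[/code]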
Re: PCI-e bandwidth/capacity limitations
GTX 970 on PCIe x8 2.0: 60% bus usage. I guess with PCIe x4 2.0 it would lose 20% performance. Edit: Windows 7 64bit
Re: PCI-e bandwidth/capacity limitations
foldy wrote: GTX 970 pcie x8 2.0 60% bus usage. I guess with pcie x4 2.0 it would lose 20% performance.
Is this Windows? I've only seen that sort of bus usage value on my Win10 work machines, where even a lowly GT 730 will show 20%. I think it is either being misreported or is a driver 'feature' (bug?) similar to the CPU polling using a full core; I suspect the usage value is proportional to the CPU speed, not the GPU speed (something to experiment with?).
With Linux, on pairs of similar GTX 1080 cards, I see 1% usage on the 16x slot and 2-3% on the 4x slot. PPD often favours the lower (slower) slot to the tune of 4% (measured over 6000 WUs), probably due to lower temps and higher boosts. I can't see any detriment from running at 4x at all. Indeed, experiments at 1x showed maybe a 5-10% drop in PPD, but probably within experimental error.
Thinking about it, I fail to see how a well-designed core/driver/project needs that much bandwidth. Even PCIe 3.0 x1 gives roughly a gigabyte per second. Projects rarely use 500 MB of graphics memory, and every system I run has more graphics memory than actual system RAM, so it's probably not paging.
In summary, I have no evidence that anything over PCIe 2.0 x4 is needed for current GPUs when running Linux, and I probably wouldn't lose sleep if some were running at x1.
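For reference, a quick sketch of the theoretical one-direction link bandwidths behind that reasoning, using the standard per-lane rates after encoding overhead:
[code]
# Theoretical one-direction PCIe bandwidth per link, after encoding overhead
# (2.0 uses 8b/10b -> ~500 MB/s per lane; 3.0 uses 128b/130b -> ~985 MB/s).
per_lane_gbs = {"2.0": 0.5, "3.0": 0.985}

for gen, rate in per_lane_gbs.items():
    for lanes in (1, 4, 8, 16):
        print(f"PCIe {gen} x{lanes:>2}: ~{rate * lanes:5.1f} GB/s")
[/code]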
Re: PCI-e bandwidth/capacity limitations
Yes, it is Windows 7. The bus speed in the folding use case is not about how many MB can be transferred per second in total, but about latency: how long it takes to transfer the data needed to continue calculations. For example, if the GPU needs 10 MB of data now and PCIe x1 can transfer 1 GB/s, the transfer takes 10 ms, during which the GPU may sit idle. With 16 times the bandwidth on PCIe x16 it takes only about 0.6 ms. I guess the Nvidia OpenCL driver is not really optimized on Windows; on Linux the problem was never seen.
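The same arithmetic as a tiny sketch; the 10 MB payload is an assumed, illustrative figure rather than anything measured from a work unit:
[code]
# How long the GPU might sit idle waiting for a fixed-size transfer over
# links of different widths (per-lane rate rounded to 1 GB/s, as in the post).
PER_LANE_GB_PER_S = 1.0      # 1 GB/s == 1 MB per millisecond
payload_mb = 10              # assumed, illustrative transfer size

for lanes in (1, 4, 8, 16):
    mb_per_ms = lanes * PER_LANE_GB_PER_S
    print(f"x{lanes:>2}: ~{payload_mb / mb_per_ms:.2f} ms of potential GPU idle time")
[/code]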
Re: PCI-e bandwidth/capacity limitations
foldy wrote: I guess the nvidia opencl driver is not really optimized on Windows.
IMHO, that's probably true for several reasons.
1) NVidia's prime goal is to sell CUDA. OpenCL will never compete with it at a fundamental level.
2) Windows (particularly Win10) is optimized for the "experience" (visual performance, not computing performance), and OpenCL effort probably goes to Android phones in particular, with the PC desktop a secondary consideration.
Posting FAH's log:
How to provide enough info to get helpful support.