NAVI No Longer MCM

Hapatingjaky

https://www.pcgamesn.com/amd-navi-monolithic-gpu-design

Seems that, with multi-GPU a la CrossFire and SLI being killed off, AMD plans to ditch the MCM design and remain a "monolithic GPU".

“We are looking at the MCM type of approach,” says Wang, “but we’ve yet to conclude that this is something that can be used for traditional gaming graphics type of application.”

Game developers adoption of Multi-GPU support:

That infrastructure doesn’t exist with graphics cards outside of CrossFire and Nvidia’s SLI. And even that kind of multi-GPU support is dwindling to the point where it’s practically dead. Game developers don’t want to spend the necessary resources to code their games specifically to work with a multi-GPU array with a miniscule install base, and that would be the same with an MCM design.

Now, if a console had an MCM CPU/APU/GPU, whatever you want to call it, then we might see multi-GPU being supported better.
 
:hmm:
It’s definitely something AMD’s engineering teams are investigating, but it still looks a long way from being workable for gaming GPUs, and definitely not in time for the AMD Navi release next year. “We are looking at the MCM type of approach,” says Wang, “but we’ve yet to conclude that this is something that can be used for traditional gaming graphics type of application.”



http://www.rage3d.com/board/showpost.php?p=1338074389&postcount=64

welcome to 3 days ago :bleh:




…..

Navi was always iffy for MCM - too soon after Raja.

Maybe the next one.
 
What are the users here doing with their SLI setups? Do the newer AAA games support them?
 
If they had used HBM in Navi, I think it could've worked with two GPUs on one card. Probably one of the issues is that GDDR6 will be the RAM of choice for speed and cost on gaming cards, and that makes a dual-GPU card harder to make.
 
If they had used HBM in Navi, I think it could've worked with two GPUs on one card. Probably one of the issues is that GDDR6 will be the RAM of choice for speed and cost on gaming cards, and that makes a dual-GPU card harder to make.

No, the way CrossFire and SLI work, it wouldn't really matter if you used Infinity Fabric or not; it's still the same issue. As long as both AMD and Nvidia use an AFR method over SFR (which is what 3dfx used back in the day), we are screwed. HBM has nothing to do with it. GDDR6 is fast enough, if not faster, and cheaper to produce than HBM.
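
For readers who haven't seen the terms, here is a rough conceptual sketch of the difference between AFR and SFR work splitting. It is illustrative only, not AMD's or Nvidia's actual driver logic; the types and function names are made up.

[CODE]
// Conceptual illustration only -- not AMD/Nvidia driver code. "Gpu",
// "Region" and the helper names are made-up stand-ins.
#include <cstdint>
#include <cstdio>
#include <vector>

struct Region { int y0, y1; };   // horizontal slice of one frame
struct Gpu    { int id;     };

// AFR: whole frames alternate between GPUs, so GPU1 is always working on a
// frame *after* the one GPU0 renders -> extra latency plus the need to copy
// or sync anything that persists across frames (shadow maps, etc.).
Gpu& pick_gpu_afr(std::vector<Gpu>& gpus, uint64_t frame_index) {
    return gpus[frame_index % gpus.size()];
}

// SFR (what 3dfx-style band splitting amounts to): every GPU renders a slice
// of the *same* frame, avoiding the frame-to-frame dependency problem but
// needing careful load balancing of the slices.
std::vector<Region> split_frame_sfr(int height, int gpu_count) {
    std::vector<Region> bands;
    int band = height / gpu_count;
    for (int i = 0; i < gpu_count; ++i)
        bands.push_back({i * band, (i + 1 == gpu_count) ? height : (i + 1) * band});
    return bands;
}

int main() {
    std::vector<Gpu> gpus{{0}, {1}};
    Gpu& g = pick_gpu_afr(gpus, 7);          // frame 7 -> GPU 1
    auto bands = split_frame_sfr(2160, 2);   // 4K frame split into 2 bands
    std::printf("AFR picked GPU %d; SFR band 0 = [%d,%d)\n", g.id, bands[0].y0, bands[0].y1);
}
[/CODE]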

This comes down to the fact that there is no developer support for multi-GPU in today's games anymore. You have DX12 and Vulkan 1.1 (the 2018 release) that support multi-GPU natively, but developers have to enable and support the feature. Gone are the days of CrossFire/SLI profiles from AMD or Nvidia.

Even if AMD or Nvidia were to release a multi-core GPU, it would still come down to developer support. And that won't happen unless a console gets the multi-GPU treatment.
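
As a rough illustration of what that "native but opt-in" support looks like, here is a minimal Vulkan 1.1 device-group sketch: the application, not a driver profile, has to find the group and create a device spanning it. Error handling is omitted and queue family 0 is assumed for brevity.

[CODE]
// Minimal sketch of Vulkan 1.1 "device group" enumeration -- the explicit
// multi-GPU path a game has to opt into itself. Error handling omitted.
#include <vulkan/vulkan.h>
#include <vector>

int main() {
    VkApplicationInfo app{VK_STRUCTURE_TYPE_APPLICATION_INFO};
    app.apiVersion = VK_API_VERSION_1_1;                 // device groups are core in 1.1
    VkInstanceCreateInfo ici{VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO};
    ici.pApplicationInfo = &app;
    VkInstance instance{};
    vkCreateInstance(&ici, nullptr, &instance);

    uint32_t groupCount = 0;
    vkEnumeratePhysicalDeviceGroups(instance, &groupCount, nullptr);
    std::vector<VkPhysicalDeviceGroupProperties> groups(
        groupCount, {VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES});
    vkEnumeratePhysicalDeviceGroups(instance, &groupCount, groups.data());

    for (const auto& g : groups) {
        if (g.physicalDeviceCount < 2) continue;         // single-GPU group, skip

        // Create one logical device that spans every GPU in the group.
        float priority = 1.0f;
        VkDeviceQueueCreateInfo q{VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO};
        q.queueFamilyIndex = 0;                          // assume family 0 for brevity
        q.queueCount = 1;
        q.pQueuePriorities = &priority;

        VkDeviceGroupDeviceCreateInfo groupInfo{VK_STRUCTURE_TYPE_DEVICE_GROUP_DEVICE_CREATE_INFO};
        groupInfo.physicalDeviceCount = g.physicalDeviceCount;
        groupInfo.pPhysicalDevices    = g.physicalDevices;

        VkDeviceCreateInfo dci{VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO};
        dci.pNext = &groupInfo;
        dci.queueCreateInfoCount = 1;
        dci.pQueueCreateInfos = &q;

        VkDevice device{};
        vkCreateDevice(g.physicalDevices[0], &dci, nullptr, &device);
        // From here the app -- not a driver profile -- decides how work and
        // memory are split across the GPUs (device masks on commands, etc.).
        vkDestroyDevice(device, nullptr);
        break;
    }
    vkDestroyInstance(instance, nullptr);
}
[/CODE]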
 
Very few developers will waste time and money on optimizations for the tiny fraction of gamers that run multi-GPU. Makes more sense to tweak your code to run well on consoles. For mGPU to work, it needs to be transparent to the OS.
 
Windows might need driver balancing, but the game would not.
A little like Ryzen.

So, firstly, games don't ignore CPU topology - they know what they are running on (and make a best guess for forward compatibility) so that resources can be spread as needed - so already it's not that simple.

Secondly, trying to pretend that two GPUs are one is what DX11 and OpenGL did, and that led to driver profiles... so, I mean, if you want some kind of driver profile provided by MS in order to load balance your games correctly for each GPU, then knock yourself out, but that's the reality of what you are suggesting.

But then I keep pointing this out and apparently no one pays attention so... *shrug*
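
As a small, hedged example of the kind of topology query meant above: a Windows-only sketch using GetLogicalProcessorInformationEx to count physical cores versus logical processors before deciding how to spread work. It is generic API usage, not any particular engine's code.

[CODE]
// Windows-only sketch: the sort of CPU topology query a game/engine does
// before deciding how many worker threads to spawn and where to put them.
#include <windows.h>
#include <cstdio>
#include <vector>

int main() {
    DWORD bytes = 0;
    GetLogicalProcessorInformationEx(RelationProcessorCore, nullptr, &bytes);   // query size
    std::vector<char> buffer(bytes);
    GetLogicalProcessorInformationEx(
        RelationProcessorCore,
        reinterpret_cast<SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX*>(buffer.data()),
        &bytes);

    int physicalCores = 0;
    for (DWORD offset = 0; offset < bytes; ) {
        auto* info = reinterpret_cast<SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX*>(buffer.data() + offset);
        if (info->Relationship == RelationProcessorCore)
            ++physicalCores;                   // one entry per physical core (SMT or not)
        offset += info->Size;                  // entries are variable length
    }

    SYSTEM_INFO si;
    GetSystemInfo(&si);
    std::printf("physical cores: %d, logical processors: %lu\n",
                physicalCores, si.dwNumberOfProcessors);
}
[/CODE]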
 
No, the way CrossFire and SLI work, it wouldn't really matter if you used Infinity Fabric or not; it's still the same issue. As long as both AMD and Nvidia use an AFR method over SFR (which is what 3dfx used back in the day), we are screwed. HBM has nothing to do with it. GDDR6 is fast enough, if not faster, and cheaper to produce than HBM.

This comes down to the fact that there is no developer support for multi-GPU in today's games anymore. You have DX12 and Vulkan 1.1 (the 2018 release) that support multi-GPU natively, but developers have to enable and support the feature. Gone are the days of CrossFire/SLI profiles from AMD or Nvidia.

Even if AMD or Nvidia were to release a multi-core GPU, it would still come down to developer support. And that won't happen unless a console gets the multi-GPU treatment.


I meant the space needed to put two GPUs with GDDR6 on one card; it makes for a big PCB. As for DX12 and Infinity Fabric, it was always deemed to be developer agnostic. Mind you, I'd always like a user-side switch in case something doesn't work well.
 
So, firstly, games don't ignore CPU topology - they know what they are running on (and make a best guess for forward compatibility) so that resources can be spread as needed - so already it's not that simple.

Secondly, trying to pretend that two GPUs are one is what DX11 and OpenGL did, and that led to driver profiles... so, I mean, if you want some kind of driver profile provided by MS in order to load balance your games correctly for each GPU, then knock yourself out, but that's the reality of what you are suggesting.

But then I keep pointing this out and apparently no one pays attention so... *shrug*

Then we are s*it out of luck.

Because it is not getting much smaller than 7nm; soon we are going to hit the wall on die shrinks.
What is next, 3nm? Where to after that, and how many years will it take?

So I sure hope they can get MCM working.
 
So, firstly, games don't ignore CPU topology - they know what they are running on (and best guess forward compatibility) so that resources can be spread as they need - so already it's not that simple.

Secondly, trying to pretend that 2 GPUs are one is what DX11 and OpenGL did and that lead to driver profiles... so, I mean, if you want some kind of driver profile provided by MS in order to load balance your games correctly for each GPU then knock yourself out but that's the reality of what you are suggesting.

But then I keep pointing this out and apparently no one pays attention so... *shrug*


The latest version of DX12 has multi-GPU support built in, without resorting to driver profiles from either the GPU vendors or MS itself..... It's all up to developers to use it, but it is there.
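
A minimal, hedged sketch of what that looks like from the application side on Windows: enumerate adapters through DXGI, create a D3D12 device, and check GetNodeCount() for linked adapters. What happens with a second GPU after that is entirely up to the developer.

[CODE]
// Windows-only sketch: enumerating GPUs the DX12 way. No driver profile is
// involved -- the application decides what (if anything) to do with GPU #2.
#include <d3d12.h>
#include <dxgi1_4.h>
#include <wrl/client.h>
#include <cstdio>

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<IDXGIFactory4> factory;
    CreateDXGIFactory1(IID_PPV_ARGS(&factory));

    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
        DXGI_ADAPTER_DESC1 desc;
        adapter->GetDesc1(&desc);
        if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) continue;   // skip WARP

        ComPtr<ID3D12Device> device;
        if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                        IID_PPV_ARGS(&device)))) {
            // GetNodeCount() > 1 means a "linked node" adapter (an SLI/CFX-style
            // link exposed explicitly); separate cards instead show up as
            // separate iterations of this loop.
            std::wprintf(L"adapter %u: %s, nodes: %u\n",
                         i, desc.Description, device->GetNodeCount());
        }
    }
}
[/CODE]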



It still makes sense to provide support for at least 2-way multi-GPU, since the performance scaling going from one to two GPUs is definitely there in the vast majority of cases, with the best-case scenarios going over 80% scaling, which isn't peanuts by any standard. That is, unless said user doesn't even bother with at least a 4K display and sticks to a 1080p or 1440p display, where one GPU is enough for what is these days considered a low resolution, and where one gets CPU-limited quite often, especially at 1080p (almost all the time in that case).
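
For concreteness, a back-of-the-envelope of what "80% scaling" buys you; the 45 fps baseline is a hypothetical single-GPU 4K result, not a benchmark.

[CODE]
// Back-of-the-envelope only; the 45 fps baseline is hypothetical.
#include <cstdio>

int main() {
    const double single_gpu_fps = 45.0;   // hypothetical single-GPU 4K result
    const double scaling        = 0.80;   // "80% scaling" from adding a 2nd GPU
    const double dual_gpu_fps   = single_gpu_fps * (1.0 + scaling);
    std::printf("1 GPU: %.0f fps, 2 GPUs at 80%% scaling: %.0f fps\n",
                single_gpu_fps, dual_gpu_fps);   // 45 -> 81 fps
}
[/CODE]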
 
Then we are s*it out of luck.

Because it is not getting much smaller than 7nm; soon we are going to hit the wall on die shrinks.
What is next, 3nm? Where to after that, and how many years will it take?

So I sure hope they can get MCM working.



There is basically nothing beyond 7nm while the chip is made of silicon, as the gate that allows current to flow through the transistor (or not), and that controls its behavior, is so thin at that point (think 2 to 3 atoms wide) that, even in its closed position, current will pass right through it when it's not supposed to (quantum tunneling)..... The transistor becomes too erratic in its behavior and unreliable, as it has a hard time producing the same results under the same conditions...… AKA, useless for a computing device of any kind, where consistent and repeatable results are a requirement, not just a nice thing to have.
 
They've been talking 5nm and 3nm though... mind you, they tend to measure transistors differently from one node to the next even if it's the same number, such as Intel nodes usually being smaller than others at the same number.

Some are even saying Intel's 10nm is in some ways smaller than GF's or TSMC's 7nm.
 
They've been talking 5nm and 3nm though... mind you, they tend to measure transistors differently from one node to the next even if it's the same number, such as Intel nodes usually being smaller than others at the same number.

Some are even saying Intel's 10nm is in some ways smaller than GF's or TSMC's 7nm.


That's because it is, since all the parts within the transistor are made at 10nm, even the most difficult part, which is the gate I mentioned earlier, while TSMC and GlobalFoundries may be using 7nm on the source and drain side but still go at it with 12nm at the gate itself, which is the hardest part.



Add to the above the use of regular ultraviolet light rather than deep ultraviolet (DUV) or even extreme ultraviolet (EUV), and it's not too surprising that, for the first time in 30+ years, Intel doesn't have the fab process advantage it used to have (18 to 24 months ahead of everyone else), which always gave them that edge in pulling off a design while keeping die size and power use reasonable.


I can't wait another 6~7 weeks until 2nd generation Threadripper is out and I get a 32-core / 64-thread one, built at 12nm, still backwards compatible with existing boards to boot, and for much less than the 28-core monstrosity shown by Intel, which was cooled with chilled water and devouring nearly 1000 watts of power while needing a custom 3647-pin socket usually reserved for servers..... :lol:


Then add 3rd gen Ryzen next year at 7nm, and 3rd gen Epyc and Threadripper also at 7nm, still keeping the pressure on Intel..... :evil:
 
The latest version of DX12 has multi-GPU support built in, without resorting to driver profiles from either the GPU vendors or MS itself..... It's all up to developers to use it, but it is there.

That's a blessing and a curse at the same time. On one hand, great! It's supported via the API.

On the other hand, it's up to devs instead of IHVs to support it... meaning IHVs no longer have an incentive to invest time and money into mGPU. They have to rely on devs that primarily code for consoles to implement support... mGPU is dead.
 
Then we are s*it out of luck.

So I sure hope they can get MCM working.

MCM working isn't a problem; we've had gfx cards with two GPUs on them before, and AMD have interconnect stuff working for CPUs, so the basics are there.

It depends on what they come up with, however, as that could make things complicated for games:
1) 1 'device' which is really two under the hood is 'welcome to driver profiles' land
2) 2 devices, each with a dedicated bus connection to their own memory, would result in a faster SLI/CFX setup (no PCIe bus transfers) but relies on games being coded for it and has some of the old problems of resource ownership
2a) As above but with a NUMA-style setup, so GPU0 can ask GPU1 to fetch some data - still requires support, and 'remote' access would be slow, but it removes classic SLI/CFX data transfer options at the expense of 'remote' bandwidth on the other GPU
3) 2 devices with a shared memory bus means the resource ownership and transfer problems go away, but now everyone is hitting the same bus, so unless bandwidth goes up you risk slowing things down

(I could do a much longer post on the problems of work sync but I'll refrain for now at least)

All of that assumes they just glue multiple GPUs as we know them today on to a single interconnect.

Maybe the solution is to go sideways from where we are and break the GPU apart, redesigning how everything fits together? But that comes with a series of massive unknowns and, more than likely, requires developer buy-in when it comes to the newer APIs.

In short; simple MCM is possible, but comes with a metric ****ton of issues where SLI/CFX and shared memory bandwidth collide in an orgy of potential performance issues.
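
To make option 2a in the list above concrete, here is a hedged Windows/DX12 sketch of a NUMA-style arrangement as the API already exposes it for linked-node adapters: a buffer placed in GPU0's memory but made visible to GPU1. It assumes a device whose GetNodeCount() is at least 2 and skips error handling; the function name is made up.

[CODE]
// Windows-only sketch of the "NUMA-style" option (2a above) as DX12 exposes
// it for linked-node adapters: a buffer that lives in GPU0's memory but is
// visible to (and readable by) GPU1 over the inter-GPU link.
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// 'device' is assumed to be an ID3D12Device whose GetNodeCount() >= 2.
ComPtr<ID3D12Resource> CreateNode0BufferVisibleToNode1(ID3D12Device* device, UINT64 bytes) {
    D3D12_HEAP_PROPERTIES heap = {};
    heap.Type             = D3D12_HEAP_TYPE_DEFAULT;
    heap.CreationNodeMask = 0x1;          // physical memory on GPU/node 0
    heap.VisibleNodeMask  = 0x1 | 0x2;    // ...but mapped into node 1 as well

    D3D12_RESOURCE_DESC desc = {};
    desc.Dimension        = D3D12_RESOURCE_DIMENSION_BUFFER;
    desc.Width            = bytes;
    desc.Height           = 1;
    desc.DepthOrArraySize = 1;
    desc.MipLevels        = 1;
    desc.Format           = DXGI_FORMAT_UNKNOWN;
    desc.SampleDesc.Count = 1;
    desc.Layout           = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;   // required for buffers

    ComPtr<ID3D12Resource> buffer;
    device->CreateCommittedResource(&heap, D3D12_HEAP_FLAG_NONE, &desc,
                                    D3D12_RESOURCE_STATE_COMMON, nullptr,
                                    IID_PPV_ARGS(&buffer));
    // Reads from node 1 work but go over the inter-GPU link, so they are
    // slower than local memory -- exactly the trade-off described above.
    return buffer;
}
[/CODE]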
 
That's a blessing and a curse at the same time. On one hand, great! It's supported via the API.

On the other hand, it's up to devs instead of IHVs to support it... meaning IHVs no longer have an incentive to invest time and money into mGPU. They have to rely on devs that primarily code for consoles to implement support... mGPU is dead.



It's looking more and more like it, but it can also backfire big time: if games are inherently designed to only require a single GPU to run at 60+ fps, even at high quality settings and resolutions, then sales to those who were doing multi-GPU up to that point disappear, and those are the customers who usually buy the higher-end / higher-profit-margin cards, even if the total volume is peanuts compared to mainstream cards in the $200~300 range.

MCM working isn't a problem; we've had gfx cards with two GPUs on them before, and AMD have interconnect stuff working for CPUs, so the basics are there.

It depends on what they come up with, however, as that could make things complicated for games:
1) 1 'device' which is really two under the hood is 'welcome to driver profiles' land
2) 2 devices, each with a dedicated bus connection to their own memory, would result in a faster SLI/CFX setup (no PCIe bus transfers) but relies on games being coded for it and has some of the old problems of resource ownership
2a) As above but with a NUMA-style setup, so GPU0 can ask GPU1 to fetch some data - still requires support, and 'remote' access would be slow, but it removes classic SLI/CFX data transfer options at the expense of 'remote' bandwidth on the other GPU
3) 2 devices with a shared memory bus means the resource ownership and transfer problems go away, but now everyone is hitting the same bus, so unless bandwidth goes up you risk slowing things down

(I could do a much longer post on the problems of work sync but I'll refrain for now at least)

All of that assumes they just glue multiple GPUs as we know them today on to a single interconnect.

Maybe the solution is to go sideways from where we are and break the GPU apart, redesigning how everything fits together? But that comes with a series of massive unknowns and, more than likely, requires developer buy-in when it comes to the newer APIs.

In short; simple MCM is possible, but comes with a metric ****ton of issues where SLI/CFX and shared memory bandwidth collide in an orgy of potential performance issues.


At least the bandwidth issue is easily solved with the use of HBM, as the bus runs inside the GPU package (over the interposer) and no longer across the video card's PCB, so it can be made as wide as it needs to be..... Current Vega chips use an HBM memory bus 1024 bits wide for each memory stack, which is undoable on a PCB in the traditional way, and the memory runs at a pretty low clock speed, so the next generation coming up next year passing the 1 TB/sec mark wouldn't be the least bit surprising, and it doesn't even require a new memory type either..... It's all in how wide the memory bus is.
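
For a rough sense of the numbers: peak bandwidth is just bus width times per-pin data rate. The figures below use the commonly quoted Vega 64 configuration plus two hypothetical faster/wider HBM2 setups, not announced products.

[CODE]
// Back-of-the-envelope: peak bandwidth = (bus width / 8) * per-pin data rate.
// Vega 64 figures are the commonly quoted ones; the others are hypothetical.
#include <cstdio>

static double peak_gbps(int bus_bits, double gbps_per_pin) {
    return bus_bits / 8.0 * gbps_per_pin;          // result in GB/s
}

int main() {
    // Vega 64: 2 HBM2 stacks * 1024-bit = 2048-bit bus at ~1.89 Gbps/pin.
    std::printf("Vega 64-ish: %.0f GB/s\n", peak_gbps(2048, 1.89));   // ~484 GB/s
    // Same 2048-bit bus with ~2.4 Gbps/pin HBM2 (hypothetical next step):
    std::printf("faster HBM2: %.0f GB/s\n", peak_gbps(2048, 2.4));    // ~614 GB/s
    // Four 1024-bit stacks at ~2.0 Gbps/pin crosses the 1 TB/s mark:
    std::printf("4 stacks:    %.0f GB/s\n", peak_gbps(4096, 2.0));    // ~1024 GB/s
}
[/CODE]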



It would be a sad joke if, now that they've got the bandwidth issue licked once and for all, even in a multi-GPU-on-a-single-card design, they gave up on it just the same.
 
I thought the whole point of HBM was to do MCM eventually.


Going back to GDDR6 for Navi is one reason MCM on it is out.


But they need a card fast, so Navi is going the old way: big chip and GDDR6.
 
I thought the whole point of HBM was to do MCM eventually.


Going back to GDDR6 for Navi is one reason MCM on it is out.


But they need a card fast, so Navi is going the old way: big chip and GDDR6.


At this point, I think it would be harder to get Navi working with GDDR6, since it means redesigning the PCB of the card the same way they were before HBM was available, essentially making them more complicated and making it harder to reach a given capacity without surrounding the main GPU package with a huge number of memory chips, along with power delivery to each memory module, which uses more than HBM does. The higher clocks of that memory, given that the bus can't be as wide as HBM's, also mean the timings of the GDDR6 memory end up higher than they are with HBM, since the latter runs at a much lower clock speed to begin with.


With Navi potentially being the 4th GPU using HBM since AMD started with the 28nm R9 Fury GPU and its near-600mm² die, it would be beyond dumb on their part to throw all that R&D in the trash, having gained serious experience in integrating HBM into the same package, which is just like MCM except that the companion chips are for memory rather than graphics processing. Never mind the MCM approach being used in Threadripper, with the 4-die, 32-core / 64-thread part being released a little over a month from now using Infinity Fabric, much to the grief of Intel, which is still sticking to a monolithic die...…




Seems that socket 2066 / X299 for Intel will max out with a 22-core / 44-thread part on a single die, still Skylake-architecture based, so its IPC is no better than Threadripper's at the same single-core clock speed, but the latter throws in another 10 cores and 20 threads for the highest-end version. So for multitasking and/or multithreading scenarios, it's about to become very painful for Intel for the next year or two, especially if 3rd generation Threadripper simply adds more improvements to the same formula, this time built on a 7nm process rather than the 12nm used for the 2nd gen Threadripper being released soon... (wallet is ready now.... :lol:).



Not applying the same trick to GPUs by having something similar would be dumb on their part: the single-GPU Navi version is the successor to the RX 580, and 2~3~4 GPU versions within the same HBM-style packaging cover all the other markets, from high-end gaming desktops to workstations and compute cards like the Instinct series..... One chip covers it all, rather than making ~3 distinct chips for the low end, mid range and high end of the market, and the only trick is whether this multi-GPU approach is completely transparent to the OS and developers..... If it is, it's the death of the big monolithic GPU as we know it, along with absolutely needing the latest cutting-edge fab process to be even remotely feasible.



So that is what it comes down to.... Can they pull off on the GPU side what they've done with high-end CPUs, and go MCM there?.... My guess is an easy yes.
 