A follow-up to my previous thread: https://www.reddit.com/r/forza/comments/56mn7n/help_me_figure_out_whats_really_behind_the_forza/
u/letmebehealthy was kind enough to record a few traces for me, and I was able to draw a few solid conclusions: a few things that are definitely causing performance issues, and a few things that definitely are not.
The specs of the system tested:
* CPU: i7 4700MQ (3.4 GHz turbo)
* RAM: 16GB 1600 MHz
* HDD where installed: 1024GB @ 5400RPM w/ 13% fragmentation
* GPU: 780M (~830 MHz GPU clock max), 4GB GDDR5 memory
A solid system that comfortably meets the minimum requirements. These tests were run at 720p at the absolute minimum settings, and he still wasn't able to maintain 60 fps.
- Exhibit 1: The entire trace: http://imgur.com/SCov0W6
- Exhibit 2: Roughly a single frame: http://imgur.com/RPn319l
Forza Horizon 3 is the destroyer of CPUs. The CPU is nearly pegged the entire time, so even the slightest extra activity is enough to make GPU usage and frame rate drop (as seen in the green areas). The Xbox One has an 8-core CPU with AMD Jaguar cores, which a Haswell i7 will eat for breakfast; it's not even close. So why is the CPU usage so high? I can mostly tell you what isn't causing it.
-
1 - It's not EFS or encryption of any kind. The CPU usage associated with EFS is so negligible that I can hardly even point it out on the graph. This theory just needs to be laid to rest already.
-
2 - It's not decompression... sort of. The first half of the trace is idle, the second half is in motion. You can see it start to hit the hard drive, and there is an associated increase in CPU usage and a few new threads are spawned. The CPU starts context switching like crazy. The GPU usage is dropping, so there are likely some stutters here, but the CPU usage associated with the asset streaming is mild. The only reason it's causing issues is that the overall CPU usage is already so high. If the rest of the code weren't so heavy, it wouldn't be an issue.
-
3 - It's not running the frame buffer out of DRAM. It's such a silly theory that it's hard to explain why it's false, because the premise is so ridiculous. If it were true, you would see large dead zones in activity on the timelines while that slow transfer takes place. The easiest way to disprove it would be for someone to limit their PCI-E slot to x8 or Gen2, since the route from the GPU to DRAM is PCI-E. DRAM is already much faster than Gen3 x16, so if that route were the bottleneck, cutting the slot's bandwidth in half would cut the GPU's bandwidth to DRAM in half, and performance should show a sustained drop compared to x16. I seriously doubt it will.
-
4 - 16GB probably isn't enough to eliminate all stutters. Towards the end of the trace, Windows kicks in memory compression to preserve physical memory, and that causes a CPU spike. While this is happening, performance drops hard. But after it's done, performance looks arguably even better than before. So it worked as intended, but this never would have happened in the first place if there had been an excess of physical memory. The tracing program itself takes a good chunk of memory, but FH3 is still using a lot of memory. The solution here isn't to turn off memory compression or Superfetch, it's to have more memory in the first place. I suspect those with 24-32GB of memory are having far fewer hard stutters. Memory is cheap: another 8GB is like $30 right now, about half the price of the game. It'll never hurt to have too much memory.
-
5 - FH3 is heavily multithreaded. That's normally good. Still, a few main threads dominate, so single-threaded performance is also a bottleneck. From a long view, it looks good. But this is still extremely heavy CPU usage, the kind you would expect to see running a game like Black Ops 3 at 200+ fps, not a racer at <60 fps. From the looks of it, you need an absolute top-tier CPU to maintain 60 fps.
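To put rough numbers on the bandwidth argument in point 3, here is a minimal sketch of the theoretical peak figures. It assumes the tested laptop runs DDR3-1600 in dual channel (the spec list only says "1600 MHz", so the channel count is my assumption), and it uses the published per-lane rates and line encodings for PCI-E Gen2 and Gen3. The function names are made up for illustration, and real-world throughput will be lower than these peaks.

```python
# Theoretical peak bandwidth only; measured throughput is always lower.

def pcie_bandwidth_gbs(gen: int, lanes: int) -> float:
    """Peak one-direction PCI-E bandwidth in GB/s for Gen2 or Gen3."""
    # gen -> (transfer rate per lane in GT/s, line-encoding efficiency)
    specs = {
        2: (5.0, 8 / 10),     # Gen2 uses 8b/10b encoding
        3: (8.0, 128 / 130),  # Gen3 uses 128b/130b encoding
    }
    rate_gt, efficiency = specs[gen]
    return rate_gt * efficiency * lanes / 8  # bits -> bytes


def ddr_bandwidth_gbs(mega_transfers: int, channels: int) -> float:
    """Peak DRAM bandwidth in GB/s, assuming a 64-bit (8-byte) bus per channel."""
    return mega_transfers * 8 * channels / 1000


if __name__ == "__main__":
    # Dual channel is an assumption, not something stated in the spec list.
    print(f"DDR3-1600 x2 channels: {ddr_bandwidth_gbs(1600, 2):.1f} GB/s")
    print(f"PCI-E Gen3 x16:        {pcie_bandwidth_gbs(3, 16):.1f} GB/s")
    print(f"PCI-E Gen3 x8:         {pcie_bandwidth_gbs(3, 8):.1f} GB/s")
    print(f"PCI-E Gen2 x16:        {pcie_bandwidth_gbs(2, 16):.1f} GB/s")
```

Under those assumptions the PCI-E link, not the DRAM, is the narrow part of the path, which is why halving the slot bandwidth would be a clean test of the theory.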
So... I just want to preface this by saying that professional game devs are extraordinarily intelligent and some of the hardest working people on the planet, and they deserve your absolute respect. Whatever is going on here, it's not due to laziness or incompetence. But it really looks like a CPU optimization issue in this game, and considering how extreme the recommended CPU specs are, they're probably quite aware of it.
So why is a game that runs a solid 30 fps on piddly little Jaguar cores on Xbox struggling to hit 60 fps on a Haswell i7? It's not obvious, and I don't really want to speculate too specifically about it, but it's not something silly like running the frame buffer out of DRAM or any sort of encryption. x86 code is x86 code, and DX12 code is DX12 code, so it's not the lower-level code itself that's the issue. My best guess is that there's something they're doing either in hardware or through GPGPU on the Xbox One that they need to do in software on PC. Maybe they got a little too clever in optimizing for the Xbox architecture, that didn't translate well, and they didn't have the time or resources to do it right on PC.
Since it's sustained high usage and not an anomaly, I suspect there isn't a quick fix, and as much as I hate to say it, it just seems like poorly optimized PC CPU code. I have absolutely no doubt in my mind that with unlimited time and money they could make a killer PC port; I have complete faith in the competence of devs at the AAA level. Most likely it just had to hit that ship date and they ran out of time. Whether we like it or not, PC isn't the #1 priority for Forza. But I hope they can fix it in time.
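For a sense of what the 30 vs 60 fps gap means in CPU terms, here is a quick sketch of the per-frame time budget at the frame rates mentioned in this post. It is plain arithmetic, not anything measured from the traces, and the function name is just for illustration.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to finish one frame, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Frame rates mentioned above: console target, PC target, and the
    # kind of frame rate this level of CPU load would normally suggest.
    for fps in (30, 60, 200):
        print(f"{fps:>3} fps -> {frame_budget_ms(fps):.2f} ms per frame")
```

Going from 30 to 60 fps halves the time the CPU gets to finish each frame, so the PC needs to get through the same per-frame work about twice as fast just to hit the target.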
Submitted by Darius510