Some games tend towards multithreading by breaking particular tasks off into their own threads (likely as not with some kind of work pool as an auxiliary thing); others have some more granular way of splitting up most of the work (often along these or these lines). This distinction is really not binary and a lot of stuff doesn't fit into it well, but for the sake of explanation I'm going to roll with it for a bit.
In the first case, popular intuition on why things perform as they do is mostly correct. If you've got as many hardware threads as the game expects, framerate is probably limited by just one or two of the game's threads. The trouble is that the expected number is as likely as not to be six (due to the number of cores consoles leave free for devs). On a 4C8T CPU, this is almost certainly completely fine, because each logical core is still a whole lot faster than a physical 1.6 GHz Jaguar core. On a 4C4T CPU, there's still more than enough raw power to handle the work that needs to be done (for normal framerates), but with some threading architectures and/or load patterns, thread management might start to take a bite out of it. At 2C4T, you've no longer got a big advantage over consoles in raw power unless those two cores are clocked to the moon. At 2C2T, thread contention can probably more than eat up any advantage in raw power you can get.
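As a rough way to picture the paragraph above, here is a toy model, not a measurement. It uses the standard scheduling lower bound: a frame can't finish faster than its longest single thread, nor faster than the total work divided by the number of hardware threads. The per-thread millisecond figures are made up for illustration, and the model ignores SMT sharing, context-switch cost, and the fact that desktop cores are faster per thread than console cores.

```python
def frame_time_lower_bound(thread_ms, hw_threads):
    """Lower bound on frame time in ms.

    thread_ms  -- work per frame for each software thread (hypothetical values)
    hw_threads -- number of hardware threads available
    """
    if hw_threads < 1:
        raise ValueError("need at least one hardware thread")
    longest_thread = max(thread_ms)             # one thread can't be split up
    spread_evenly = sum(thread_ms) / hw_threads  # perfect packing, zero overhead
    return max(longest_thread, spread_evenly)


# Six threads, as a console-targeted game might expect (numbers are invented).
six_threads = [10, 10, 6, 6, 4, 4]

for hw in (8, 4, 2):
    print(hw, "hardware threads ->", frame_time_lower_bound(six_threads, hw), "ms")
```

With these invented numbers the bound is 10 ms with 8 or 4 hardware threads (limited by the heaviest thread) and 20 ms with 2 (limited by total work). Real contention overhead only pushes the 2-thread case further above that floor.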
As an example of that first kind of threading architecture, see Shadow of Mordor (nothing against it, it's just what I tried to play lately and got data for). It runs three heavy threads full-time (clearly doing different kinds of work from one another) and three more heavy ones during camera movement (those three apparently all doing the same thing). With little camera movement, the three full-time threads can share a G3258's two hardware threads well enough and everything is pretty great. When I moved the camera 90 degrees, the other three threads stole ~30% each (of 200% total) for a second or so, in which time maybe two or three frames would get rendered, input would get polled maybe two or three times, and input interpolation weirdness would likely as not have moved the camera another 90-180 degrees, starting the process over. In other words, it was completely unplayable due to this thread contention. The minspec is an i5-750 (4C4T 2.66 GHz Nehalem) and I had my G3258 clocked to 4.1 GHz for that, so I should have been barely below minspec on raw power, but the thread contention killed it.
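To put rough numbers on that, the CPU budget from the example can be worked through in a few lines. This only restates the figures quoted above (three camera threads at ~30% each, out of 200% for two hardware threads). The function name is made up for illustration.

```python
def remaining_budget(total_pct, extra_threads, pct_each):
    """CPU percentage left for the full-time threads while the extra ones run."""
    return total_pct - extra_threads * pct_each


total = 200  # two hardware threads, 100% each
left = remaining_budget(total, extra_threads=3, pct_each=30)

print("left for the three full-time threads:", left, "% of", total, "%")
print("fraction of the original budget:", left / total)
```

That leaves about 110% of 200%, or a bit over half the budget, for the three full-time threads. On raw arithmetic alone that would suggest something like a halved framerate, not two or three frames in a second, which fits the point that the contention itself and not the lost cycles is what does the damage.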
The other threading pattern (GCD etc) makes it relatively easy to keep however many cores are handy 80+% busy on something or other, but if not done carefully this architecture can be hell on data access patterns. For instance, in the Naughty Dog presentation they say they're doing about a bajillion fiber switches per frame (I don't remember and am on a really cheesy internet connection at the moment so I don't want to search through the video to find out). Work that should be adjacent is probably getting scheduled across different cores in an interleaved way that's tough for caches to handle in the best of cases, and I'd be very impressed if hardware prefetch can see beyond a fiber switch. I'd bet games like that love a fast, unified, and low-latency L3 cache (or an L4), and I bet that explains a lot of Intel's continued dominance at gaming in titles that would otherwise scale very well to higher core counts.
Another thing that may be particularly important for this is what work is synchronous with framerate. If all the heavy logic is synchronized to framerate and it's a heavy game, it may want a whole lot of power across a lot of threads. If a lot of heavy logic is on a separate tick, framerate is more likely to be able to fly on the strength of one thread while fixed-rate logic that takes many console threads fits in 1.5-2 big cores. Other work may be entirely asynchronous.
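The "separate tick" idea can be sketched with the common fixed-timestep accumulator loop. This is a minimal single-threaded illustration, not how any particular engine does it (a real game might well run the ticks on other threads). The function name and the integer-millisecond timings are assumptions made for the example.

```python
def ticks_per_frame(frame_durations_ms, tick_ms):
    """For each rendered frame, return how many fixed-rate logic ticks ran.

    frame_durations_ms -- how long each rendered frame took (ms, integers)
    tick_ms            -- fixed logic timestep (ms)
    """
    accumulator = 0
    result = []
    for frame_ms in frame_durations_ms:
        accumulator += frame_ms
        ticks = 0
        # Run as many fixed-size logic steps as the elapsed time has paid for.
        while accumulator >= tick_ms:
            accumulator -= tick_ms
            ticks += 1
        result.append(ticks)
    return result


# 100 fps rendering with a 20 Hz logic tick: most frames run no logic at all.
print(ticks_per_frame([10] * 10, tick_ms=50))
# 10 fps rendering with the same tick: logic still gets its 20 steps per second.
print(ticks_per_frame([100, 100], tick_ms=50))
```

The amount of logic work per second stays the same in both cases; only the framerate changes. That is why a game built this way can render quickly on one fast thread while the fixed-rate work fits wherever there is room.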
It's nearly impossible to test objectively if games don't provide their own tools for it, but personally I'd like to see more focus on the performance of things other than framerate. Games with a slow tick, games that can't stream fast enough to avoid pop-in, networked logic that can't keep up on a slow connection, weirdness that results in high input lag in certain situations: it's all annoying, CPU power is often the limitation, and things like that can often show up while framerate looks perfectly fine.