There are secondary considerations as well. For example, the bigger the cache, the larger a target it is for cosmic rays that cause soft errors (process node reductions shrink the physical size, of course, but they also make the bits easier to flip). The larger L2+ caches get around this by including a lot of error-correction circuitry, but that increases latency and power usage (yet another reason why those caches are slower than L1). From the intro of
a paper (PDF) from Carnegie Mellon:
Rising soft-error rates are a major concern for modern microprocessor designers. The reduction in charge stored in memory cells, a result of continued technology scaling, leaves on-chip SRAMs (e.g., caches, TLBs, register files) highly susceptible to soft errors. Coding techniques, such as SECDED ECC (single-error correct, double-error detect), are widely utilized for protecting on-chip SRAMs. For L1 data caches, however, where low access latencies are critical, the additional delay to correct ECC errors prohibits inline correction on a read. In the event an error is detected on a read, recent designs such as the AMD Opteron throw a machine check exception asynchronously, potentially halting the machine to prevent silent data corruption.
Further compounding problems, recent work suggests that spatial multi-bit errors, where a single cosmic particle strike upsets multiple neighboring memory cells, are increasingly likely at future technology nodes. Bit interleaving, also called column multiplexing, is the conventional approach used to protect memory arrays from spatial multi-bit errors. In bit interleaving, bits belonging to multiple ECC check words are physically interleaved so that a spatial multi-bit error does not affect adjacent bits from a single check word. For SRAMs in a high-performance processor, however, our results indicate that interleaving beyond two-way is prohibitively expensive from a power perspective as a result of the additional precharging of bitlines from the interleaved data.
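To make the interleaving idea in that excerpt concrete, here is a minimal Python sketch. It is a toy model, not how any real cache is built: it assumes an 8-bit extended Hamming code (4 data bits, SECDED) rather than the wider codes real SRAMs use, and every function name in it (`encode`, `decode`, `store`, `load`, `strike`, `read_after_strike`) is made up for illustration. It shows why a two-cell strike is uncorrectable when a check word's bits sit next to each other, but correctable when two check words are interleaved.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming (SECDED) word.
    Index 0 is the overall parity bit; indices 1..7 are Hamming(7,4) positions."""
    d = [(nibble >> i) & 1 for i in range(4)]
    w = [0] * 8
    w[3], w[5], w[6], w[7] = d            # data bits
    w[1] = w[3] ^ w[5] ^ w[7]             # parity over positions 1,3,5,7
    w[2] = w[3] ^ w[6] ^ w[7]             # parity over positions 2,3,6,7
    w[4] = w[5] ^ w[6] ^ w[7]             # parity over positions 4,5,6,7
    w[0] = sum(w[1:]) % 2                 # makes the whole word even parity
    return w

def decode(word):
    """Return (status, nibble); nibble is None if the error is uncorrectable."""
    w = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if w[pos]:
            syndrome ^= pos               # XOR of the positions of all set bits
    odd_parity = sum(w) % 2 == 1
    if syndrome == 0 and not odd_parity:
        status = "ok"
    elif odd_parity:
        w[syndrome] ^= 1                  # single error; syndrome 0 means bit 0 itself
        status = "corrected"
    else:
        return "uncorrectable", None      # even parity, nonzero syndrome: double error
    return status, w[3] | (w[5] << 1) | (w[6] << 2) | (w[7] << 3)

def store(words, interleaved):
    """Lay check words out in one physical row, optionally bit-interleaved."""
    if interleaved:
        return [w[i] for i in range(8) for w in words]
    return [bit for w in words for bit in w]

def load(row, n_words, interleaved):
    """Inverse of store(): recover the individual check words from a row."""
    if interleaved:
        return [row[k::n_words] for k in range(n_words)]
    return [row[k * 8:(k + 1) * 8] for k in range(n_words)]

def strike(row, start, width=2):
    """Model a spatial multi-bit upset: flip `width` adjacent physical cells."""
    hit = list(row)
    for i in range(start, start + width):
        hit[i] ^= 1
    return hit

def read_after_strike(nibbles, interleaved, start=4):
    words = [encode(n) for n in nibbles]
    row = strike(store(words, interleaved), start)
    return [decode(w) for w in load(row, len(words), interleaved)]

print(read_after_strike([0b1010, 0b0110], interleaved=False))
# [('uncorrectable', None), ('ok', 6)]
print(read_after_strike([0b1010, 0b0110], interleaved=True))
# [('corrected', 10), ('corrected', 6)]
```

With two-way interleaving, the same two-cell strike lands as one flipped bit in each check word, which SECDED can fix. The cost the paper points at does not show up in a toy like this: in real hardware, reading one word from an interleaved row means precharging the bitlines for the other words' cells too.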
The bigger the L1 cache, the more its designers have to worry about soft errors (and what to do about them). This may not be as important a factor in keeping L1 caches small as some of the other things already mentioned, but when you're designing the critical-path parts of a processor, everything has an effect that has to be considered.