How much L2 cache do you need?
So why add continually larger caches in the first place? Because each additional memory pool pushes back the need to access main memory and can improve performance in specific cases. Plot memory latency against working-set size and each stair step in the resulting graph represents a new level of cache. Larger caches are both slower and more expensive.
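Those stair steps are easy to see for yourself. Below is a minimal pointer-chasing sketch, assuming a Unix-like system with clock_gettime; as the working set outgrows each cache level, the average load latency jumps. The sizes and iteration counts are arbitrary choices for illustration, not properties of any particular CPU.

/* latency_steps.c: crude pointer-chase that exposes cache "stair steps".
   Build with: cc -O2 latency_steps.c && ./a.out */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

volatile size_t sink; /* keeps the compiler from deleting the chase loop */

static double chase_ns(size_t bytes) {
    size_t n = bytes / sizeof(size_t);
    size_t *buf = malloc(n * sizeof(size_t));
    size_t *idx = malloc(n * sizeof(size_t));

    /* Build one shuffled cycle through the buffer so the hardware
       prefetcher cannot predict the next address. */
    for (size_t i = 0; i < n; i++) idx[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = rand() % (i + 1);
        size_t t = idx[i]; idx[i] = idx[j]; idx[j] = t;
    }
    for (size_t i = 0; i < n; i++) buf[idx[i]] = idx[(i + 1) % n];

    /* Chase pointers; each load depends on the previous one, so the
       measured time is dominated by latency, not bandwidth. */
    const size_t steps = 10 * 1000 * 1000;
    size_t p = idx[0];
    struct timespec s, e;
    clock_gettime(CLOCK_MONOTONIC, &s);
    for (size_t i = 0; i < steps; i++) p = buf[p];
    clock_gettime(CLOCK_MONOTONIC, &e);
    sink = p;

    free(buf); free(idx);
    return ((e.tv_sec - s.tv_sec) * 1e9 + (e.tv_nsec - s.tv_nsec)) / steps;
}

int main(void) {
    /* Working sets from 16KB to 64MB: expect flat plateaus with jumps
       roughly where the L1, L2, and L3 capacities are crossed. */
    for (size_t kb = 16; kb <= 64 * 1024; kb *= 2)
        printf("%8zu KB: %5.1f ns per load\n", kb, chase_ns(kb * 1024));
    return 0;
}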

At six transistors per bit of SRAM (6T), cache is also expensive in terms of die size, and therefore dollar cost. The performance impact of adding a CPU cache is directly related to its efficiency, or hit rate; repeated cache misses can have a catastrophic impact on CPU performance.

The following example is vastly simplified but should serve to illustrate the point. Imagine that a CPU has to load data from the L1 cache 100 times in a row. The L1 cache has a 1ns access latency and a 100 percent hit rate. It therefore takes our CPU 100 nanoseconds to perform this operation.

Haswell-E die shot. The repetitive structures in the middle of the chip are 20MB of shared L3 cache.

Now, assume the cache has a 99 percent hit rate, but the data the CPU actually needs for its 100th access is sitting in L2, with a 10-cycle (10ns) access latency. That means it takes the CPU 99 nanoseconds to perform the first 99 reads and 10 nanoseconds to perform the 100th. A 1 percent reduction in hit rate has just slowed the CPU down by roughly 10 percent. If the data has been evicted from the cache and is sitting in main memory, with an access latency on the order of 100ns, the performance difference between a 95 and 97 percent hit rate could nearly double the total time needed to execute the code.
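The arithmetic is simple enough to check directly. The sketch below reruns the scenarios above; the latencies and hit rates are the hypothetical round numbers from the example, not measurements from any real chip.

/* hit_rate.c: effective access times for the scenarios above. */
#include <stdio.h>

int main(void) {
    const double l1_ns  = 1.0;   /* assumed L1 hit latency      */
    const double l2_ns  = 10.0;  /* assumed L2 hit latency      */
    const double mem_ns = 100.0; /* assumed main memory latency */
    const int accesses  = 100;

    /* 100 percent L1 hit rate: every read costs 1ns. */
    double all_hits = accesses * l1_ns;        /* 100 ns */

    /* 99 percent hit rate: 99 L1 reads plus one trip to L2. */
    double one_miss = 99 * l1_ns + 1 * l2_ns;  /* 109 ns */

    /* 97 vs 95 percent hit rates with misses going to main memory. */
    double hit97 = 97 * l1_ns + 3 * mem_ns;    /* 397 ns */
    double hit95 = 95 * l1_ns + 5 * mem_ns;    /* 595 ns */

    printf("100%% hits: %.0f ns\n", all_hits);
    printf(" 99%% hits: %.0f ns (%.0f%% slower)\n",
           one_miss, (one_miss / all_hits - 1) * 100);
    printf(" 97%% hits: %.0f ns\n", hit97);
    printf(" 95%% hits: %.0f ns (%.2fx the 97%% case)\n",
           hit95, hit95 / hit97);
    return 0;
}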

A cache is contended when two different threads are writing and overwriting data in the same memory space. This hurts the performance of both threads: each core is forced to spend time writing its own preferred data into the L1, only for the other core to promptly overwrite that information.
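A closely related effect, false sharing, is easy to provoke on most multicore systems: two threads hammering counters that happen to share a cache line keep invalidating each other's copy of that line. The sketch below is a minimal POSIX-threads illustration; the 64-byte line size and the iteration count are assumptions, not universal constants.

/* false_sharing.c: compile with: cc -O2 -pthread false_sharing.c */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define ITERS 100000000L

/* Contended: both counters share one (assumed 64-byte) cache line. */
struct { volatile long a, b; } tight;

/* Uncontended: padding pushes each counter onto its own line. */
struct { volatile long a; char pad[64]; volatile long b; } padded;

static void *bump(void *p) {
    volatile long *c = p;
    for (long i = 0; i < ITERS; i++) (*c)++;
    return NULL;
}

static double run(volatile long *x, volatile long *y) {
    pthread_t t1, t2;
    struct timespec s, e;
    clock_gettime(CLOCK_MONOTONIC, &s);
    pthread_create(&t1, NULL, bump, (void *)x);
    pthread_create(&t2, NULL, bump, (void *)y);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    clock_gettime(CLOCK_MONOTONIC, &e);
    return (e.tv_sec - s.tv_sec) + (e.tv_nsec - s.tv_nsec) / 1e9;
}

int main(void) {
    /* The same work runs measurably slower when the counters share a line. */
    printf("same cache line:      %.2fs\n", run(&tight.a, &tight.b));
    printf("separate cache lines: %.2fs\n", run(&padded.a, &padded.b));
    return 0;
}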

Later Ryzen CPUs do not share cache in this fashion and do not suffer from this problem. Testing showed how the hit rate of the Opteron (an original Bulldozer processor) dropped off when both cores were active, at least in some tests.

Isolating the effects of L2 cache size on CPU performance naturally raises the question of just what the L2 cache is and what it does. The L1 cache is a small amount of memory, generally 32KB to 64KB, that runs at the same speed as the processor and stores instructions and data to feed to the CPU core. It sits directly on the CPU die itself. It is the task of the L1 cache to send new instructions to the core as quickly as the core can take them; otherwise, the core will sit idle and underutilized.

To assist the L1 cache, most modern processors have another, larger layer of cache: L2. The L2 cache will not only act as a buffer for incoming instructions and data, but will also store recent instructions and data and try to anticipate what will be done next. When successful, this anticipation can feed instructions to the CPU even faster and thereby increase its utilization.

Clearly, when the L2 cache works at its best, the CPU can be more effectively used. Traditionally, the L2 cache was located on the motherboard. At that time, the L2 cache was still much faster than the system memory, but it was hamstrung by the fact that it was so far from the CPU core and L1 cache. The next step was to move the L2 cache onto the processor packaging itself. This was a great improvement for CPU performance, but as clock speeds increased, it was once again a bottleneck. More recently, the L2 cache was moved from the processor packaging to the CPU die itself.

While the cache size had to be reduced to fit alongside the then-large CPU cores, it reaped a large benefit: it ran at full speed, rather than half speed. This more than made up for the reduced L2 cache size on the processors of the day. At this point, the reader may wonder just why the cache is smaller as it gets closer to the CPU core, just where it is needed the most. In short, manufacturing memory on the CPU die itself is very expensive, and space is limited.

The idea is to keep the most frequently used instructions in L1, with the L2 cache holding the next most likely needed bits of data, and L3 following suit. Cache design is a key strategy in the highly competitive microprocessor market, as it is directly responsible for improved CPU and system performance. When looking at new computers, check out the amounts of L1, L2, and L3 cache. All else being equal, a system with more CPU cache will perform better, and synchronous cache is faster than asynchronous.
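One way to make that hierarchy concrete is the standard average memory access time (AMAT) calculation, where each level's miss penalty is the cost of going one level further out. The latencies and hit rates in the sketch below are hypothetical, chosen only to show the shape of the formula.

/* amat.c: average memory access time for a three-level cache hierarchy. */
#include <stdio.h>

int main(void) {
    /* Hypothetical latencies (ns) and per-level hit rates. */
    const double t_l1 = 1, t_l2 = 10, t_l3 = 40, t_mem = 100;
    const double h1 = 0.95, h2 = 0.90, h3 = 0.80;

    /* Misses cascade outward: an access only pays the L2 cost if L1
       missed, the L3 cost if L2 also missed, and so on. */
    double amat = t_l1
                + (1 - h1) * (t_l2
                + (1 - h2) * (t_l3
                + (1 - h3) * t_mem));

    printf("average access time: %.2f ns\n", amat); /* 1.80 ns here */
    return 0;
}

Notice how heavily the result leans on the L1 hit rate: even with main memory at 100ns, the average access time stays under 2ns because only a twentieth of accesses ever leave L1.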
