Maxwell was about doing more with less. By rethinking the SM design into the new SMM (Streaming Multiprocessor Maxwell), NVIDIA delivered nearly 2× the performance per watt of Kepler. Each SMM has 128 CUDA cores (down from 192 on Kepler's SMX), split into four 32-core partitions that each have their own warp scheduler. On second-generation Maxwell (GM20x), dedicated shared memory grew to 96 KB per SMM.
⚙️ Key Highlights
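- 16 SMMs on GM204 (the chip behind the GTX 980), for 2,048 CUDA cores in total
- Four warp schedulers per SMM, each dual-issue capable
- 2 MB of L2 cache (up from 512 KB on Kepler's GK104)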
- Fast Shared Memory Atomics: Unlike Fermi and Kepler, which implemented shared memory atomics with a lock/update/unlock pattern, Maxwell introduced built-in atomic operations for 32-bit integers, plus native 32-bit and 64-bit compare-and-swap (CAS) directly in shared memory.
- Dynamic Parallelism (first introduced with Kepler) was now fully available across the product line, enabling more flexible, GPU-driven workloads.
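
To make the shared memory atomics point concrete, here is a minimal sketch of a byte histogram, a classic workload that leans on them. Each block accumulates counts in a block-private shared memory array with `atomicAdd`, then merges its result into global memory. The names `histogram_shared` and `histogram_on_gpu`, and the launch configuration of 64 blocks of 256 threads, are illustrative assumptions rather than anything from NVIDIA's materials. The same source compiles for earlier architectures too; what changes on Maxwell is that the shared memory `atomicAdd` can map to a native instruction instead of a lock/update/unlock sequence.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

constexpr int NUM_BINS = 256;  // one bin per possible byte value

// Each block builds a private histogram in shared memory, then merges it
// into the global histogram.
__global__ void histogram_shared(const unsigned char* data, size_t n,
                                 unsigned int* global_bins) {
    __shared__ unsigned int local_bins[NUM_BINS];

    // Zero the block-private bins cooperatively.
    for (int i = threadIdx.x; i < NUM_BINS; i += blockDim.x) {
        local_bins[i] = 0;
    }
    __syncthreads();

    // Grid-stride loop: every increment is a 32-bit integer atomic on
    // shared memory, the operation Maxwell accelerates.
    size_t stride = static_cast<size_t>(blockDim.x) * gridDim.x;
    size_t start = static_cast<size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    for (size_t i = start; i < n; i += stride) {
        atomicAdd(&local_bins[data[i]], 1u);
    }
    __syncthreads();

    // Merge the block's counts into the global result (global atomics).
    for (int i = threadIdx.x; i < NUM_BINS; i += blockDim.x) {
        atomicAdd(&global_bins[i], local_bins[i]);
    }
}

// Hypothetical host helper: copies the input to the GPU, runs the kernel,
// and returns the 256 bin counts. Error handling is kept minimal; on any
// CUDA failure it returns all-zero bins.
std::vector<unsigned int> histogram_on_gpu(const std::vector<unsigned char>& input) {
    std::vector<unsigned int> bins(NUM_BINS, 0);
    if (input.empty()) return bins;

    unsigned char* d_data = nullptr;
    unsigned int* d_bins = nullptr;
    const size_t bins_bytes = NUM_BINS * sizeof(unsigned int);

    if (cudaMalloc(&d_data, input.size()) != cudaSuccess) return bins;
    if (cudaMalloc(&d_bins, bins_bytes) != cudaSuccess) {
        cudaFree(d_data);
        return bins;
    }

    cudaMemcpy(d_data, input.data(), input.size(), cudaMemcpyHostToDevice);
    cudaMemset(d_bins, 0, bins_bytes);

    // Assumed launch shape: 64 blocks of 256 threads.
    histogram_shared<<<64, 256>>>(d_data, input.size(), d_bins);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(bins.data(), d_bins, bins_bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_data);
    cudaFree(d_bins);
    return bins;
}
```

Calling `histogram_on_gpu` on a byte vector returns 256 counts, one per byte value. Keeping the hot atomics in shared memory means contention stays within a block, and only one merge per bin per block reaches global memory.
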
👉 Maxwell powered some of the most iconic GPUs of its time (like the GTX 980).