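Maxwell was about doing more with less. By redesigning the SM as the new SMM (Maxwell Streaming Multiprocessor), NVIDIA claimed nearly 2× the performance per watt of Kepler. Each SMM has 128 CUDA cores, down from 192 on Kepler's SMX, organized as four 32-core partitions. Shared memory became a dedicated block rather than sharing storage with the L1 cache: 64 KB per SMM on first-generation Maxwell (GM107) and 96 KB per SMM on second-generation parts such as GM204. A single thread block is still limited to 48 KB.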
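
To check these per-SM figures on an actual card, the CUDA runtime exposes them through `cudaGetDeviceProperties`. The following is a minimal sketch, assuming a CUDA toolkit and at least one CUDA-capable GPU. The helper names `coresPerSm` and `printSmSummary` are illustrative and not part of the CUDA API, and the core-count lookup only covers the two architectures compared here.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative helper (not a CUDA API): CUDA cores per SM for the two
// architectures compared in this article, keyed by compute capability major
// version. Returns 0 for anything else rather than guessing.
int coresPerSm(int major) {
    if (major == 3) return 192;  // Kepler SMX
    if (major == 5) return 128;  // Maxwell SMM
    return 0;
}

// Illustrative helper: prints the SM-level properties of device `dev`.
// Returns false if the CUDA runtime reports an error.
bool printSmSummary(int dev) {
    cudaDeviceProp p;
    if (cudaGetDeviceProperties(&p, dev) != cudaSuccess) return false;

    std::printf("%s (compute capability %d.%d)\n", p.name, p.major, p.minor);
    std::printf("  SMs:                   %d\n", p.multiProcessorCount);
    std::printf("  CUDA cores per SM:     %d (0 = not in lookup)\n", coresPerSm(p.major));
    std::printf("  Shared memory per SM:  %zu KB\n", p.sharedMemPerMultiprocessor / 1024);
    std::printf("  Shared memory / block: %zu KB\n", p.sharedMemPerBlock / 1024);
    std::printf("  L2 cache:              %d KB\n", p.l2CacheSize / 1024);
    return true;
}
```

Calling `printSmSummary(0)` from `main` reports the first GPU in the system. The gap between the per-SM and per-block shared memory values is the point to notice: the extra capacity on Maxwell lets more blocks stay resident on an SMM, but does not give a single block more shared memory.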

⚙️ Key Highlights

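16 SMMs on the full GM204 chip (GTX 980), for 2,048 CUDA cores in total
Four warp schedulers per SMM, each capable of dual-issue
2 MB of L2 cache on GM204, up from 512 KB on Kepler's GK104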
Fast Shared Memory Atomics: Unlike Fermi and Kepler, which implemented shared memory atomics with a lock/update/unlock pattern, Maxwell introduced built-in atomic operations for 32-bit integers, plus native 32-bit and 64-bit compare-and-swap (CAS) directly in shared memory
Dynamic Parallelism (first introduced with Kepler's GK110) was now supported across the Maxwell product line, including mainstream chips such as GM107, enabling more flexible, GPU-driven workloads
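
A histogram is a common workload that benefits from fast shared memory atomics. The sketch below is one illustrative way to write it, not a tuned implementation: each block accumulates a private histogram in shared memory with `atomicAdd`, then merges it into the global result. The names `histogramKernel` and `histogramOnGpu`, the 256-bin layout, and the launch configuration of 64 blocks of 256 threads are assumptions chosen for the example.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int kBins = 256;  // one bin per possible byte value

// Each block builds a private histogram in shared memory, then merges it
// into the global bins. The first atomicAdd targets shared memory, which is
// the case Maxwell handles natively for 32-bit integers.
__global__ void histogramKernel(const unsigned char* data, int n,
                                unsigned int* bins) {
    __shared__ unsigned int local[kBins];

    // Zero the block-private histogram.
    for (int i = threadIdx.x; i < kBins; i += blockDim.x) local[i] = 0;
    __syncthreads();

    // Grid-stride loop over the input; colliding threads are serialized
    // by the shared memory atomic.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += blockDim.x * gridDim.x) {
        atomicAdd(&local[data[i]], 1u);
    }
    __syncthreads();

    // Merge into the global histogram (a global memory atomic).
    for (int i = threadIdx.x; i < kBins; i += blockDim.x) {
        atomicAdd(&bins[i], local[i]);
    }
}

// Illustrative host wrapper: returns kBins counts, or an empty vector if
// any CUDA call fails.
std::vector<unsigned int> histogramOnGpu(const std::vector<unsigned char>& input) {
    const int n = static_cast<int>(input.size());
    const size_t binBytes = kBins * sizeof(unsigned int);
    unsigned char* dData = nullptr;
    unsigned int* dBins = nullptr;
    std::vector<unsigned int> result(kBins, 0);

    bool ok = cudaMalloc(&dData, n > 0 ? n : 1) == cudaSuccess &&
              cudaMalloc(&dBins, binBytes) == cudaSuccess &&
              cudaMemcpy(dData, input.data(), n, cudaMemcpyHostToDevice) == cudaSuccess &&
              cudaMemset(dBins, 0, binBytes) == cudaSuccess;
    if (ok) {
        histogramKernel<<<64, 256>>>(dData, n, dBins);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(result.data(), dBins, binBytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dData);
    cudaFree(dBins);
    if (!ok) result.clear();
    return result;
}
```

The same source compiles for Kepler as well. The difference is in what the compiler emits for the shared memory `atomicAdd`: a lock/update/unlock sequence on Fermi and Kepler, and a native instruction when targeting Maxwell (for example with `nvcc -arch=sm_50`).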

👉 Maxwell powered some of the most iconic GPUs of its time (like the GTX 980).