Systems now handle new requests (compute-bound prefill) and ongoing generation (memory-bound decode) in the same forward pass. Addresses opposing computational characteristics that previously forced inefficient resource allocation. Engineering substance over algorithmic novelty.