This number determines everything else. A chatbot can wait two seconds. A self-driving system has milliseconds. Your answer dictates edge versus cloud, how large a model you can run, and whether your cost structure even pencils out. Get it wrong and you're rebuilding the entire stack.
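To make the arithmetic concrete, here is a minimal sketch, assuming the number in question is an end-to-end latency budget in milliseconds. The function names, the simple additive model (network round trip plus inference time), and every figure in the usage note are illustrative assumptions, not measurements.

```python
def fits_budget(budget_ms: float, network_rtt_ms: float, inference_ms: float) -> bool:
    """Return True if round trip plus inference stays within the budget."""
    return network_rtt_ms + inference_ms <= budget_ms


def placement_options(
    budget_ms: float,
    cloud_rtt_ms: float,
    cloud_inference_ms: float,
    edge_inference_ms: float,
) -> list[str]:
    """List the deployment targets that fit the latency budget.

    Assumes the edge device pays no network round trip, and that the
    cloud model pays the full round trip on every request.
    """
    options = []
    if fits_budget(budget_ms, cloud_rtt_ms, cloud_inference_ms):
        options.append("cloud")
    if fits_budget(budget_ms, 0.0, edge_inference_ms):
        options.append("edge")
    return options
```

With a hypothetical 80 ms round trip, `placement_options(2000, 80, 400, 900)` leaves both targets open, while `placement_options(20, 80, 5, 10)` leaves only the edge: the round trip alone exceeds the budget, no matter how fast the cloud model is. An empty list means no placement fits and the model itself has to shrink.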