A neat research direction gaining traction: instead of processing requests strictly one after another, you can batch thinking, I/O, and multiple prompts together. The practical payoff is faster responses and better resource utilization.
HN discussion digs into the implications.
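
To make the sequential-versus-batched contrast concrete, here is a minimal Python sketch. It is not the method from the linked work: `fake_model_call` is a hypothetical stand-in that just sleeps, and the sketch only shows the general idea of keeping several requests in flight at once with `asyncio.gather` rather than awaiting them one by one.

```python
import asyncio
import time


async def fake_model_call(prompt: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for a model or I/O call; it sleeps instead of doing work."""
    await asyncio.sleep(delay)
    return f"reply to: {prompt}"


async def run_sequential(prompts: list[str]) -> list[str]:
    # One request at a time: total time is roughly the sum of the delays.
    return [await fake_model_call(p) for p in prompts]


async def run_batched(prompts: list[str]) -> list[str]:
    # All requests in flight at once: total time is roughly the longest delay.
    # gather() returns results in the same order as its inputs.
    return list(await asyncio.gather(*(fake_model_call(p) for p in prompts)))


def timed(coro_fn, prompts: list[str]) -> tuple[list[str], float]:
    """Run one of the strategies above and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(prompts))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    prompts = ["a", "b", "c", "d"]
    _, seq_time = timed(run_sequential, prompts)
    _, bat_time = timed(run_batched, prompts)
    print(f"sequential: {seq_time:.2f}s, batched: {bat_time:.2f}s")
```

Both strategies return the same replies in the same order; only the wall-clock time differs. A real system would also have to deal with batch-size limits, memory, and uneven request lengths, none of which this toy models.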