You're overcomplicating it. My actual target is simpler: 10K RPS on, let's say, four cores with SO_REUSEPORT. That's just 2,500 RPS per core, so a budget of 400µs of CPU time per request. If a friendly, full-featured async library API fits into 100µs, we have 300µs left for everything else. It's not easy, but a bit more than maybe
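
To spell out the arithmetic, here is a quick sketch in Python. The function name is made up for illustration, and the 100µs library cost is just the assumed figure from above, not a measurement. It also assumes load is spread evenly across cores and each core works on one request at a time.

```python
def per_request_budget_us(total_rps: int, cores: int) -> float:
    """CPU time available per request on one core, in microseconds.

    Assumes requests are spread evenly across cores (e.g. via SO_REUSEPORT)
    and that each core processes one request at a time.
    """
    rps_per_core = total_rps / cores
    return 1_000_000 / rps_per_core


budget_us = per_request_budget_us(total_rps=10_000, cores=4)

# Assumed cost of the async library per request (a target, not a measurement).
library_overhead_us = 100.0

# What is left for parsing, business logic, I/O syscalls, and so on.
remaining_us = budget_us - library_overhead_us

print(f"budget per request: {budget_us:.0f} us")
print(f"left after library: {remaining_us:.0f} us")
```

Plugging in other core counts or targets shows how fast the budget shrinks: the same 10K RPS on a single core leaves 100µs in total, which the assumed library overhead alone would consume.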