I pushed Python to 20,000 requests sent per second. Here's the code and kernel tuning I used.