I am torn about writing this. AsyncIO in Python is always a mess, protobuf is another, and gRPC is the worst of them all, with all that boilerplate code that does nothing but cause trouble.
The task in front of me: integrating a mesh API server built on our internal codebase.
non-gRPC options
I mean, gRPC is just h2 + protobuf, how hard could it be? Even uWSGI has had h2 for ages.
Turns out the options are quite limited: uWSGI's h2 support is major versions behind, and SPDYv3 never took off. And gRPC isn't plain h2 either; it layers its own message framing and trailer handling on top of h2 streams, so the frames are marked and handled differently.
So it's either Hypercorn or falling back to plain gRPC. To avoid further mess, I decided to stick with gRPC.
infectious async/await
Now I face another challenge: the existing business logic is written in async/await style (cue the FastAPI fad).
I carefully studied the gRPC AsyncIO hello world example.
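Roughly what that example boils down to, with the stock helloworld stubs (helloworld_pb2 / helloworld_pb2_grpc) standing in for the real service:

```python
import asyncio

import grpc
import helloworld_pb2
import helloworld_pb2_grpc


class Greeter(helloworld_pb2_grpc.GreeterServicer):
    async def SayHello(self, request, context):
        # async handler, scheduled on the grpc.aio event loop
        return helloworld_pb2.HelloReply(message=f"Hello, {request.name}!")


async def serve():
    server = grpc.aio.server()
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
    server.add_insecure_port("[::]:50051")
    await server.start()
    await server.wait_for_termination()


if __name__ == "__main__":
    asyncio.run(serve())
```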
Everything ran great, except for the notorious GIL: my gRPC server runs, but only on one single CPU.
multiprocess
Old-school solution to the GIL: spawn many processes, one worker per CPU. Easy? There's an official multiprocessing example.
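Condensed, the pattern in that example looks roughly like this (the toy Greeter is repeated so the sketch is self-contained; grpc.so_reuseport is the channel option the real example relies on):

```python
import multiprocessing
from concurrent import futures

import grpc
import helloworld_pb2
import helloworld_pb2_grpc


class Greeter(helloworld_pb2_grpc.GreeterServicer):
    def SayHello(self, request, context):
        return helloworld_pb2.HelloReply(message=f"Hello, {request.name}!")


def _run_server(bind_address):
    # each worker process runs its own sync server, all binding the
    # same port thanks to SO_REUSEPORT
    server = grpc.server(
        futures.ThreadPoolExecutor(max_workers=1),
        options=[("grpc.so_reuseport", 1)],
    )
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
    server.add_insecure_port(bind_address)
    server.start()
    server.wait_for_termination()


if __name__ == "__main__":
    workers = [
        multiprocessing.Process(target=_run_server, args=("[::]:50051",))
        for _ in range(multiprocessing.cpu_count())
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```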
It worked... until it didn't. The major selling point of h2 is connection multiplexing: one TCP connection serves all the concurrency. And our mesh client is so good at this that a single worker eats 100% of one CPU while the rest simply idle. 🤣
SO_REUSEPORT
I also tried to implement a prefork worker on my own. Let's get rid of the master: partly for political correctness, but mostly because we have SO_REUSEPORT already.
Unfortunately, it didn't work at all, again because of h2's multiplexing nature: the kernel balances connections across listeners, not requests, so a single connection never spreads beyond one worker.
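For reference, a hypothetical sketch of what that master-less prefork amounts to (the port number and the serving stub are made up):

```python
import os
import socket

PORT = 50051  # hypothetical

# fork one child per remaining CPU; every process, parent included,
# falls through and listens on the same port
for _ in range(os.cpu_count() - 1):
    if os.fork() == 0:
        break

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
sock.bind(("0.0.0.0", PORT))
sock.listen()
# ... hand the listening socket to the h2/gRPC machinery here ...
# The kernel load-balances incoming *connections* across these
# listeners; with one multiplexed connection there is nothing to balance.
```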
ProcessPoolExecutor
I looked closely and found how the gRPC server is initialized:
grpc.server(futures.ThreadPoolExecutor(max_workers=10))
Maybe just swap it with ProcessPoolExecutor()?
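For the record, the naive swap is literally one substitution (this is the variant that hung for me):

```python
from concurrent.futures import ProcessPoolExecutor

import grpc

# naive executor swap — this is what goes dead with a timeout
server = grpc.server(ProcessPoolExecutor(max_workers=10))
```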
Nope, the server went dead with a timeout. I don't have time to dig into the C/C++ details. Nope.
It seems gRPC only allows ThreadPoolExecutor().
Why does Google even allow it as a parameter then?
The apply_async() hallucination
Out of despair, I next asked ChatGPT. The advanced AI model said: just use multiprocessing inside your handlers.
Yeah, why not. So how do I run async code under multiprocessing?
ChatGPT hallucinated: use apply_async. I believed that shit at first, only to find that the "async" there means the call returns an AsyncResult object immediately; it has nothing to do with running async/await code. By the way, I found that .apply() is just a shortcut for .apply_async().get()
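A minimal demonstration of the mix-up; the asyncio.run() inside the worker is my addition, since someone still has to drive the coroutine:

```python
import asyncio
from multiprocessing import Pool


async def work():
    return 42


def run_work():
    # apply_async ships a *sync* callable to the worker; passing an
    # async def would just produce an un-awaited coroutine object,
    # so the worker has to run an event loop itself
    return asyncio.run(work())


if __name__ == "__main__":
    with Pool(2) as pool:
        result = pool.apply_async(run_work)  # returns AsyncResult immediately
        print(result.get())  # and .apply() is just .apply_async().get()
```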
Putting it together
I got the mess to work eventually.
- Create a normal gRPC server with `add_generic_rpc_handlers` and stuff
- Create a `pool = ProcessPoolExecutor(...)` before the `unary_unary_rpc_method_handler`, with an `initializer` that spawns a global `loop = asyncio.new_event_loop()`. It had to be global because `concurrent.futures` only allows it this way
- Run `loop.run_until_complete()` inside `pool.submit()` (see the sketch after this list)
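A minimal sketch of the whole arrangement, with the stock helloworld stubs standing in for the real service, and the generated servicer wrappers used in place of raw add_generic_rpc_handlers for brevity:

```python
import asyncio
from concurrent import futures

import grpc
import helloworld_pb2
import helloworld_pb2_grpc

loop = None  # one event loop per worker process, set by the initializer


def _init_worker():
    # concurrent.futures gives workers no per-task context, so the loop
    # has to live in a module-level global, created once per process
    global loop
    loop = asyncio.new_event_loop()


async def handle(request):
    # stand-in for the existing async/await business logic
    return helloworld_pb2.HelloReply(message=f"Hello, {request.name}!")


def _run_in_worker(request_bytes):
    # runs inside a pool process: drive the coroutine to completion
    request = helloworld_pb2.HelloRequest.FromString(request_bytes)
    reply = loop.run_until_complete(handle(request))
    return reply.SerializeToString()


class Greeter(helloworld_pb2_grpc.GreeterServicer):
    def __init__(self, pool):
        self._pool = pool

    def SayHello(self, request, context):
        # ship serialized bytes across the process boundary
        data = self._pool.submit(
            _run_in_worker, request.SerializeToString()
        ).result()
        return helloworld_pb2.HelloReply.FromString(data)


def serve():
    pool = futures.ProcessPoolExecutor(max_workers=8, initializer=_init_worker)
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(pool), server)
    server.add_insecure_port("[::]:50051")
    server.start()
    server.wait_for_termination()


if __name__ == "__main__":
    serve()
```

The thread pool stays as the gRPC executor; it only blocks on `pool.submit(...).result()`, while the actual work (and the event loop) lives in the process pool, one loop per CPU.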
lessons learned
If you aren’t a try-hard:
avoid async
avoid gRPC