Backpressure in asyncio without tears

2026-05-18 · ~6 min

The first concurrent scraper everyone writes spawns one task per URL and waits on asyncio.gather. It works on the ten-item test list and falls over on the real one: ten thousand coroutines, ten thousand open sockets, and a resident set that climbs until the OOM killer makes the decision for you.

The fix is backpressure — letting the slow part of the pipeline push back on the fast part. Two tools cover almost every case.

A semaphore to cap concurrency

If all you need is "no more than N in flight," a Semaphore is enough:

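import asyncio

# At most 20 requests in flight. Fine at module level on Python 3.10+;
# on older versions, create it inside a coroutine instead.
sem = asyncio.Semaphore(20)

async def fetch(session, url):
    # session is assumed to be an aiohttp.ClientSession
    async with sem:  # waits here while 20 fetches are already running
        async with session.get(url) as resp:
            return await resp.text()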

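You can still create all the tasks up front, but only twenty hold a socket at any moment. The expensive resources (connections, response buffers) now scale with the cap rather than with the length of the URL list. The waiting tasks still cost a little memory each, so for very long lists it can be worth feeding them in lazily too.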
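
To watch the cap hold without touching the network, here is a minimal sketch that swaps the HTTP call for asyncio.sleep and records how many coroutines are inside the semaphore at once. The names run_capped and fake_fetch and the two counters are made up for illustration; they are not part of the scraper above.

```python
import asyncio


async def run_capped(n_urls: int, limit: int) -> tuple[int, int]:
    """Run n_urls fake fetches with at most `limit` in flight.

    Returns (number completed, peak number in flight).
    """
    sem = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def fake_fetch(i: int) -> int:
        nonlocal in_flight, peak
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.001)  # stand-in for session.get(...)
            in_flight -= 1
            return i

    # All coroutines are created up front; the semaphore gates the real work.
    results = await asyncio.gather(*(fake_fetch(i) for i in range(n_urls)))
    return len(results), peak


if __name__ == "__main__":
    completed, peak = asyncio.run(run_capped(n_urls=200, limit=20))
    print(f"completed={completed} peak_in_flight={peak}")
```

The peak should never exceed the limit, however many fetches are queued behind it.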

A bounded queue when producers and consumers differ

When something is generating work faster than you can process it — reading a huge file, paginating an API — a bounded Queue makes the producer wait:

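import asyncio

queue = asyncio.Queue(maxsize=100)

async def producer():
    async for item in source():   # source() is whatever yields your work items
        await queue.put(item)     # waits while 100 items are already buffered
    await queue.put(None)         # sentinel: tells the single consumer to stop

async def consumer():
    # Assumes None is never a real item; with N consumers, send N sentinels.
    while (item := await queue.get()) is not None:
        await handle(item)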

The maxsize is the whole point. Drop it and you've reinvented the unbounded list that ate your RAM.
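
A self-contained version of the same shape, as a rough sketch: the producer is a plain range, the handler is a short sleep, and there are several consumers, so the producer sends one sentinel per consumer. The function name run_pipeline and the deepest counter are illustrative assumptions, added only to show that the queue never grows past maxsize.

```python
import asyncio


async def run_pipeline(
    n_items: int, maxsize: int, n_consumers: int
) -> tuple[list[int], int]:
    """Push n_items through a bounded queue.

    Returns (items handled, deepest queue size observed by the producer).
    """
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    handled: list[int] = []
    deepest = 0

    async def producer() -> None:
        nonlocal deepest
        for item in range(n_items):      # stand-in for `async for item in source()`
            await queue.put(item)        # waits while the queue is full
            deepest = max(deepest, queue.qsize())
        for _ in range(n_consumers):
            await queue.put(None)        # one stop sentinel per consumer

    async def consumer() -> None:
        while (item := await queue.get()) is not None:
            await asyncio.sleep(0.001)   # stand-in for handle(item)
            handled.append(item)

    await asyncio.gather(producer(), *(consumer() for _ in range(n_consumers)))
    return handled, deepest


if __name__ == "__main__":
    handled, deepest = asyncio.run(run_pipeline(200, maxsize=10, n_consumers=3))
    print(f"handled={len(handled)} deepest_queue={deepest}")
```

Because the consumers are slower than the producer, the producer spends most of its time parked in queue.put, which is exactly the pushback you want.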

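Rule of thumb: every place data enters your program faster than it leaves needs a bound. The bound is where you get to choose the failure mode: make the producer wait (await queue.put), reject new work (queue.put_nowait raises asyncio.QueueFull), or drop items. Without a bound the choice is made for you, and it is running out of memory.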
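
Two of the non-waiting choices can be sketched with standard asyncio calls. The helper names offer and put_or_timeout are hypothetical; what to do with a rejected item (log it, count it, return an error to the caller) is left to you.

```python
import asyncio


def offer(queue: asyncio.Queue, item: object) -> bool:
    """Try to enqueue without waiting; return False if the item was shed."""
    try:
        queue.put_nowait(item)
        return True
    except asyncio.QueueFull:
        return False


async def put_or_timeout(queue: asyncio.Queue, item: object, timeout: float) -> bool:
    """Wait up to `timeout` seconds for space; return False if none appeared."""
    try:
        await asyncio.wait_for(queue.put(item), timeout)
        return True
    except asyncio.TimeoutError:
        return False


async def demo() -> list[bool]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)
    outcomes = [offer(queue, n) for n in range(3)]           # third offer finds the queue full
    outcomes.append(await put_or_timeout(queue, 99, 0.01))   # nobody is consuming, so this times out
    return outcomes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Shedding suits work that is cheap to retry or safe to lose; a timeout suits callers who can wait a little but need an answer eventually.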

Neither of these is clever. That's the appeal — they're the boring primitives that keep a long-running service from surprising you at 3am.