Description
Currently, if a user wants to perform multiple actions in parallel (like interacting with multiple tabs at the same time), they must manually create tasks and use asyncio.gather. While functional, this exposes internal async logic and adds unnecessary verbosity.
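For reference, the manual pattern this would replace looks roughly like this (a sketch reusing the scrap_google/scrap_github helpers from the usage example further down):

```python
import asyncio

# Manual version today: create the tasks by hand, then gather them.
google_task = asyncio.create_task(scrap_google(tab1))
github_task = asyncio.create_task(scrap_github(tab2))
results = await asyncio.gather(google_task, github_task)
```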
We should introduce a new method in the Browser class: run_in_parallel(*coroutines), which would act as a typed and thread-safe wrapper around asyncio.gather.
This method would:
- Accept any number of coroutine objects
- Internally run them using asyncio.gather
- Return their results in order
This is a small abstraction, but it drastically improves developer experience when writing concurrent scraping logic; a rough sketch of the method follows the requirements list below.
Requirements:
- Must be thread-safe (in case the user calls it from different threads)
- Should be fully typed (generic return types, inference for result list)
- Should propagate exceptions properly like asyncio.gather does
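A minimal sketch of what this could look like, assuming the Browser keeps a reference to the event loop it was started on (the _loop attribute and the thread-routing approach below are illustration only, not the library's actual internals):

```python
import asyncio
from collections.abc import Coroutine
from typing import Any, TypeVar

T = TypeVar("T")


class Browser:
    # Only the parts relevant to the sketch are shown. _loop is a
    # hypothetical attribute holding the event loop the browser was
    # started on, so calls from other threads can be routed back to it.
    _loop: asyncio.AbstractEventLoop | None = None

    async def run_in_parallel(
        self, *coroutines: Coroutine[Any, Any, T]
    ) -> list[T]:
        """Run the coroutines concurrently and return their results in
        argument order; the first exception raised by any of them
        propagates to the caller, exactly as with bare asyncio.gather."""

        async def _gather() -> list[T]:
            return list(await asyncio.gather(*coroutines))

        current = asyncio.get_running_loop()
        if self._loop is None or current is self._loop:
            # Usual case: the caller is already on the browser's loop.
            return await _gather()
        # Caller is awaiting from a different thread's event loop:
        # schedule the work on the browser's loop and bridge the
        # result back through a thread-safe future.
        future = asyncio.run_coroutine_threadsafe(_gather(), self._loop)
        return await asyncio.wrap_future(future)
```

One typing caveat: a single TypeVar makes all results share one inferred type, so heterogeneous calls (an input element and an h1, say) infer a common base; per-position inference would need overloads, the way asyncio.gather is stubbed in typeshed.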
Usage example:
```python
async def scrap_google(tab: Tab):
    await tab.go_to("https://google.com")
    return await tab.find(tag_name="input")


async def scrap_github(tab: Tab):
    await tab.go_to("https://github.com")
    return await tab.find(tag_name="h1")


results = await browser.run_in_parallel(
    scrap_google(tab1),
    scrap_github(tab2)
)
```
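Exception behavior would match plain asyncio.gather; for instance (assuming the proposed API):

```python
try:
    results = await browser.run_in_parallel(
        scrap_google(tab1),
        scrap_github(tab2),
    )
except Exception as exc:
    # Mirrors asyncio.gather's default: the first exception raised by
    # any coroutine is re-raised at the await point.
    print(f"one of the scrapers failed: {exc!r}")
```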
If a large number of coroutines is passed (e.g. 20+), it may not be efficient or safe to run them all in parallel. It would be worth introducing a configurable limit on the maximum number of concurrent coroutines. This could be defined via the Options object (max_parallel_tasks, for example).
The run_in_parallel method would then respect this limit and execute the coroutines in batches or using a bounded semaphore internally.
This behavior is open to discussion. The main idea is to give users more control over concurrent load without manually handling throttling logic.
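For illustration, the bounded-semaphore variant might look like the standalone sketch below (max_parallel_tasks mirrors the Options field suggested above; the helper name is made up):

```python
import asyncio
from collections.abc import Coroutine
from typing import Any, TypeVar

T = TypeVar("T")


async def gather_bounded(
    max_parallel_tasks: int,
    *coroutines: Coroutine[Any, Any, T],
) -> list[T]:
    # One semaphore shared by all wrappers caps how many coroutines
    # actually run at once; the rest queue up and wait their turn.
    semaphore = asyncio.Semaphore(max_parallel_tasks)

    async def bounded(coro: Coroutine[Any, Any, T]) -> T:
        async with semaphore:
            return await coro

    # gather still preserves argument order and exception semantics.
    return list(await asyncio.gather(*(bounded(c) for c in coroutines)))
```

Compared to fixed-size batches, a semaphore keeps the pipeline full: a new coroutine starts the moment any running one finishes.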