Mantisus
Collaborator

Description

  • Add an example of running PlaywrightCrawler using local Chrome and Firefox profiles.

@Mantisus Mantisus self-assigned this Sep 24, 2025
Collaborator

@Pijukatel Pijukatel left a comment

Nice. I learned something new.

One question came to mind after reading the docs: is it safer, even for Firefox, to copy the profile to a temp dir? I guess the profile can change during crawling. Is that a desired side effect or a bad one?

@Mantisus
Collaborator Author

Is it safer, even for Firefox, to copy the profile to a temp dir? I guess the profile can change during crawling. Is that a desired side effect or a bad one?

It depends on the goal. If it should not affect the main profile, then yes, it is safer to use a copy.

But I can imagine a use case where all changes made during crawling must be written back to the profile.
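
To illustrate the "use a copy" option discussed above, here is a minimal sketch in Node.js. It assumes the profile is an ordinary directory on disk; the helper name `copyProfileToTempDir` and the `crawler-profile-` prefix are hypothetical and are not taken from the PR.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: copy a browser profile directory into a fresh
// temporary directory, so a crawler can modify the copy without touching
// the original profile.
function copyProfileToTempDir(profileDir) {
    // mkdtempSync creates a uniquely named, empty directory under the OS temp dir.
    const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'crawler-profile-'));
    // Recursively copy the profile contents into the temporary directory.
    fs.cpSync(profileDir, tempDir, { recursive: true });
    return tempDir;
}
```

The returned path would then be passed to the browser as its user data directory in place of the real profile. If changes should be kept in the main profile, as in the second use case, this step would simply be skipped.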

@Mantisus
Collaborator Author

This doc may also resolve issue #1071, since it shows how to use channel to run the installed Chrome.

@B4nan, what do you think?

Collaborator

@janbuchar janbuchar left a comment

This is pretty cool, do you think you could make a JS counterpart to the example as well?

@Mantisus
Collaborator Author

do you think you could make a JS counterpart to the example as well?

I think I'll try to do that. 🙂

@B4nan
Member

B4nan commented Sep 26, 2025

This doc may also resolve issue #1071, since it shows how to use channel to run the installed Chrome.

In the JS version we have the useChrome flag under launchContext in browser crawlers, and a default mechanism for inferring the browser executable path:

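import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    launchContext: {
        // Use the locally installed Chrome rather than the default browser.
        useChrome: true,
    },
    requestHandler() { ... },
});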

https://github.com/apify/crawlee/blob/07cd2c43fcbf21c6917b5bf55cd1af64c4bdcb00/packages/browser-crawler/src/internals/browser-launcher.ts#L202-L204
https://github.com/apify/crawlee/blob/07cd2c43fcbf21c6917b5bf55cd1af64c4bdcb00/packages/browser-crawler/src/internals/browser-launcher.ts#L220

Not saying we need to port this 1:1, but IMO it's very common to use Chrome for scraping, so having a bit more native support than just a guide on how to use it with Playwright feels right to me.

And more importantly, we have a ready-made Docker image with Chrome preinstalled:

https://github.com/apify/apify-actor-docker/tree/master/node-playwright-chrome

4 participants