protoss70
Contributor

Added docs for the new n8n WCC single-Actor app I am developing. Most of the text is copied from the Make.com AI crawling documentation.

@protoss70 protoss70 self-assigned this Aug 5, 2025
@protoss70 protoss70 added the t-integrations Issues with this label are in the ownership of the integrations team. label Aug 5, 2025
@apify-service-account

Preview for this PR was built for commit 41701e0 and is ready at https://pr-1763.preview.docs.apify.com!

@protoss70 protoss70 marked this pull request as ready for review August 11, 2025 15:39
@protoss70 protoss70 requested a review from TC-MO as a code owner August 11, 2025 15:39
@apify-service-account

Preview for this PR was built for commit 090a2925 and is ready at https://pr-1763.preview.docs.apify.com!

@protoss70 protoss70 requested a review from drobnikj August 11, 2025 15:39
@apify-service-account

Preview for this PR was built for commit 5bbe301f and is ready at https://pr-1763.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 7bdb83a7 and is ready at https://pr-1763.preview.docs.apify.com!

Member

@drobnikj drobnikj left a comment

It would be nice to add an example workflow that uses the node with an AI agent.

Pre-approved 👍

@@ -0,0 +1,165 @@
---
title: N8N - AI crawling Actor integration
Member

Is N8N correct? I think they spell it n8n.

@apify-service-account

Preview for this PR was built for commit 028d4bc1 and is ready at https://pr-1763.preview.docs.apify.com!


## Apify Scraper for AI Crawling

Apify Scraper for AI Crawling from [Apify](https://apify.com/) lets you extract text content from websites to feed AI models, LLM applications, vector databases, or Retrieval Augmented Generation (RAG) pipelines. It supports rich formatting using Markdown, cleans the HTML of irrelevant elements, downloads linked files, and integrates with AI ecosystems like LangChain, LlamaIndex, and other LLM frameworks.
Contributor

Shouldn't this link to the specific Actor rather than to apify.com in general?


![Apify token on n8n](images/token.png)

Once connected, you can build workflows to automate website extraction and integrate results into your AI applications.
Contributor

This repeats a sentence from the intro section and seems redundant.

Comment on lines 10 to 15
## Prerequisites

Before you begin, make sure you have:

- An [Apify account](https://console.apify.com/)
- An [n8n instance](https://docs.n8n.io/getting-started/) (self‑hosted or cloud)
Contributor

This section should be below ## Apify Scraper for AI crawling. First we need to introduce the concept, then show what is needed

Comment on lines 23 to 56
## Install the Apify Node (self-hosted)

If you're running a self-hosted n8n instance, you can install the Apify community node directly from the editor. This process adds the node to your available tools, enabling Apify operations in workflows.

1. Open your n8n instance.
1. Go to **Settings > Community Nodes**.
1. Select **Install**.
1. Enter the npm package name: `@apify/n8n-nodes-apify-content-crawler` (for latest version). To install a specific [version](https://www.npmjs.com/package/@apify/n8n-nodes-apify-content-crawler?activeTab=versions), append it to the package name, e.g. `@apify/n8n-nodes-apify-content-crawler@<version>`.
1. Agree to the [risks](https://docs.n8n.io/integrations/community-nodes/risks/) of using community nodes and select **Install**.
1. You can now use the node in your workflows.

![Apify Install Node](images/install.png)

## Connect Apify Scraper for AI Crawling (self-hosted)

1. Create an account at [Apify](https://console.apify.com/). You can sign up using your email, Gmail, or GitHub account.

![Sign up page](../make/images/ai-crawling/wcc-signup.png)

1. To connect your Apify account to n8n, you can use an OAuth connection (recommended) or an Apify API token. To get the Apify API token, navigate to **[Settings > API & Integrations](https://console.apify.com/settings/integrations)** in the Apify Console.

![Apify Console token for n8n](../make/images/Apify_Console_token_for_Make.png)

1. Find your token under **Personal API tokens** section. You can also create a new API token with multiple customizable permissions by clicking on **+ Create a new token**.
1. Click the **Copy** icon next to your API token to copy it to your clipboard. Then, return to your n8n workflow interface.

![Apify token on n8n](../make/images/Apify_token_on_Make.png)

1. In n8n, click **Create new credential** of the chosen Apify Scraper module.
1. In the **API key** field, paste the API token you copied from Apify and click **Save**.

![Apify token on n8n](images/token.png)

Once connected, you can build workflows to automate website extraction and integrate results into your AI applications.
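
Before pasting the token into n8n, it can help to confirm that the token itself is valid. The following is a minimal sketch, not part of the node or of this PR: it assumes the Apify API base URL `https://api.apify.com/v2`, the `users/me` endpoint, and Bearer-token authentication. The helper name `build_token_check_request` is hypothetical. The request is only built, not sent, so the snippet runs without network access.

```python
import urllib.request

# Assumed base URL of the Apify API (not stated in the docs text above).
APIFY_API_BASE = "https://api.apify.com/v2"


def build_token_check_request(api_token: str) -> urllib.request.Request:
    """Build (but do not send) a request asking Apify which user owns the token."""
    token = api_token.strip()
    if not token:
        raise ValueError("The Apify API token must not be empty")
    return urllib.request.Request(
        f"{APIFY_API_BASE}/users/me",
        # The token travels in the Authorization header, never in the URL.
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


if __name__ == "__main__":
    request = build_token_check_request("<YOUR_APIFY_API_TOKEN>")
    print(request.get_method(), request.full_url)
    # To actually send it, uncomment the next two lines:
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```

A successful response to this request would suggest the token is accepted; an HTTP 401 would suggest it was copied incorrectly or has been revoked.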
Contributor

We mention that cloud is also an option. Shouldn't there also be a brief explanation of how to configure it for the cloud version?

Co-authored-by: Michał Olender <[email protected]>
@apify-service-account

Preview for this PR was built for commit 313cebd3 and is ready at https://pr-1763.preview.docs.apify.com!

Co-authored-by: Michał Olender <[email protected]>
@apify-service-account

Preview for this PR was built for commit 7c7aec9f and is ready at https://pr-1763.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 4cb46462 and is ready at https://pr-1763.preview.docs.apify.com!

@protoss70
Contributor Author

Hi @TC-MO, I did a small refactor since we made some changes to the WCC n8n node. One change is to the title, which was previously Apify Scraper for AI Crawling. We also removed the standard module, leaving only a default module. Could you please help review the latest text changes? 🙏

@protoss70 protoss70 requested a review from TC-MO September 10, 2025 13:52
@apify-service-account

Preview for this PR was built for commit d7d937ab and is ready at https://pr-1763.preview.docs.apify.com!

Member

@drobnikj drobnikj left a comment

just two notes, otherwise fine

- An [Apify account](https://console.apify.com/)
- An [n8n instance](https://docs.n8n.io/getting-started/) (self‑hosted or cloud)

## Install the Apify Node (self-hosted)
Member

It is not the Apify node; it is the Website Content Crawler by Apify node.

toc_max_heading_level: 4
---

## Website Content Crawler By Apify
Member

The second caption looks like a duplicate. I would remove it and update the title to
n8n - Website Content Crawler by Apify

@apify-service-account

Preview for this PR was built for commit dc1ff9c3 and is ready at https://pr-1763.preview.docs.apify.com!

@TC-MO
Contributor

TC-MO commented Sep 11, 2025

@protoss70 Hi there, thanks for the update and the extra context. I'll take a look this evening, or tomorrow at the latest.

@apify-service-account

Preview for this PR was built for commit afa76fe5 and is ready at https://pr-1763.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 6c0884b3 and is ready at https://pr-1763.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 8c9f2f0d and is ready at https://pr-1763.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit d5258461 and is ready at https://pr-1763.preview.docs.apify.com!

@apify-service-account

Preview for this PR was built for commit 5bff6afa and is ready at https://pr-1763.preview.docs.apify.com!

@protoss70 protoss70 requested a review from TC-MO September 22, 2025 13:34
@apify-service-account

Preview for this PR was built for commit 62133cbf and is ready at https://pr-1763.preview.docs.apify.com!

@drobnikj drobnikj self-requested a review September 22, 2025 14:37
Member

@drobnikj drobnikj left a comment

One nit, otherwise fine. Let's merge it, and @TC-MO will check it once he is back.


### How it works

The **Advanced Settings** module provides granular control over the entire crawling process. For _Crawler selection_, you can choose from Playwright (Firefox/Chrome) or Cheerio, depending on the complexity of the target website. _URL management_ allows you to define the crawling scope with include and exclude URL patterns. You can also exercise precise _DOM manipulation_ by controlling which HTML elements to keep or remove. To ensure the best results, you can apply specialized algorithms for _Content transformation_ and select from various _Output formatting_ options for better AI model compatibility.
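
To make the settings named in this paragraph concrete, here is a rough sketch of how they might be expressed as a single crawler input object. It is an illustration only: the helper `build_crawler_input` is hypothetical, and the field names and values (`startUrls`, `crawlerType`, `includeUrlGlobs`, `excludeUrlGlobs`, `removeElementsCssSelector`, `htmlTransformer`, `saveMarkdown`) are assumptions modeled on the Website Content Crawler Actor's input and should be checked against the Actor's input schema before use.

```python
import json
from typing import Any, Dict, List, Optional


def build_crawler_input(
    start_url: str,
    crawler_type: str = "playwright:firefox",  # assumed value; "cheerio" for static pages
    include_globs: Optional[List[str]] = None,
    exclude_globs: Optional[List[str]] = None,
    remove_selector: str = "nav, footer, script, style",
) -> Dict[str, Any]:
    """Assemble an input object covering the five groups of settings."""
    return {
        # Where the crawl starts.
        "startUrls": [{"url": start_url}],
        # Crawler selection: a browser for dynamic sites, plain HTTP for simple ones.
        "crawlerType": crawler_type,
        # URL management: limit the crawl scope with include and exclude patterns.
        "includeUrlGlobs": [{"glob": g} for g in (include_globs or [])],
        "excludeUrlGlobs": [{"glob": g} for g in (exclude_globs or [])],
        # DOM manipulation: drop elements that add noise for an AI model.
        "removeElementsCssSelector": remove_selector,
        # Content transformation: keep the main readable text of each page.
        "htmlTransformer": "readableText",
        # Output formatting: Markdown tends to suit LLM pipelines.
        "saveMarkdown": True,
    }


if __name__ == "__main__":
    crawler_input = build_crawler_input(
        "https://docs.apify.com/",
        include_globs=["https://docs.apify.com/platform/**"],
        exclude_globs=["https://docs.apify.com/**/changelog/**"],
    )
    print(json.dumps(crawler_input, indent=2))
```

Keeping the input in one plain dictionary makes it easy to paste into a JSON field or to compare against the options the node exposes.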
Member

There is no advanced module anymore

@apify-service-account

Preview for this PR was built for commit 66541367 and is ready at https://pr-1763.preview.docs.apify.com!

@drobnikj drobnikj requested a review from lukas-bekr September 22, 2025 15:16
Labels
t-integrations Issues with this label are in the ownership of the integrations team.

4 participants