Node Input

  • url (string or string[]): The URL or URLs to scrape, given either as a single URL string or as a list of URL strings.
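
Because url may arrive as either a string or a list, code that prepares input for this node often normalizes it to one shape first. The following is a minimal sketch of that idea in TypeScript. The ScraperInput type and the toUrlList helper are hypothetical names used for illustration only and are not part of the node's API.

```typescript
// Hypothetical type mirroring the documented `url` input field.
type ScraperInput = { url: string | string[] };

// Normalize `url` to an array so later steps only deal with one shape.
function toUrlList(input: ScraperInput): string[] {
  return Array.isArray(input.url) ? input.url : [input.url];
}

// A single URL becomes a one-element list; a list is passed through unchanged.
const single = toUrlList({ url: "https://example.com/article" });
const many = toUrlList({
  url: ["https://example.com/a", "https://example.com/b"],
});
```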

Node Output

  • content (string): The content scraped from the provided URL(s), formatted as markdown.

Function

The WebsiteScraper node extracts content from the websites you specify. It navigates to the provided URL(s), executes JavaScript when a page requires it, and returns the site’s content as markdown. The node handles general websites as well as specific formats, such as LinkedIn profiles, by applying a scraping strategy suited to each.
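
The node's internal logic is not documented here, but the idea of applying a different scraping strategy per site can be illustrated with a small dispatcher. The sketch below is an assumption about how such a choice could be made, not the node's actual implementation. The Strategy type, the pickStrategy function, and the rule that LinkedIn profile paths start with /in/ are all illustrative.

```typescript
// Hypothetical strategy names; the node's real strategies are not documented.
type Strategy = "linkedin-profile" | "general";

// Choose a strategy from the URL's host and path (illustrative rule only).
function pickStrategy(rawUrl: string): Strategy {
  const { hostname, pathname } = new URL(rawUrl);
  // Drop a leading "www." so both host variants match.
  const host = hostname.replace(/^www\./, "");
  if (host === "linkedin.com" && pathname.startsWith("/in/")) {
    return "linkedin-profile";
  }
  return "general";
}

const profileStrategy = pickStrategy("https://www.linkedin.com/in/some-user/");
const articleStrategy = pickStrategy("https://example.com/news/article-1");
```

Keeping the selection in one function makes it easy to add further site-specific cases without touching the general path.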

When to Use It?

The WebsiteScraper node is especially useful in cases such as:

  • Extracting structured content from profiles, news articles, or product pages
  • Gathering text-based data for further analysis
  • Automating web content extraction within workflows

For best results, provide a url that points directly to the page containing the content you want to extract. The node accepts either a single URL or a list of URLs.
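
One way to keep the input specific is to clean a URL list before passing it to the node. The sketch below drops entries that are not valid absolute http or https URLs and removes duplicates. The cleanUrls helper is a hypothetical name, and whether the node performs any such validation itself is not stated in this document.

```typescript
// Hypothetical pre-processing step: keep only valid, unique http(s) URLs.
function cleanUrls(urls: string[]): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const raw of urls) {
    let parsed: URL;
    try {
      // new URL() throws on strings that are not absolute URLs.
      parsed = new URL(raw.trim());
    } catch {
      continue;
    }
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      continue;
    }
    // Use the parser's serialized form so trivially different spellings match.
    const normalized = parsed.href;
    if (!seen.has(normalized)) {
      seen.add(normalized);
      result.push(normalized);
    }
  }
  return result;
}

const cleaned = cleanUrls([
  "https://example.com/blog/post-1",
  " https://example.com/blog/post-1 ",
  "not a url",
  "ftp://example.com/file",
]);
```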