Some of Moonlit’s core features rely on web scraping. These include Apps that use functions such as the Sitemap Extractor or Webpage Scraper, as well as the “Import from Website” option in the knowledge base. However, some websites are protected by anti-bot services, so this guide shows you how to whitelist requests coming from Moonlit’s server.

How the Moonlit Scraping Bot Identifies Itself

A user agent is a short string identifier for an agent performing a web request. Moonlit’s User Agent string is:

Mozilla/5.0 (compatible; Moonlit/1.0; +https://moonlitplatform.com)

Depending on your hosting provider or security software, you can add rules that allow this user agent to scrape your website.
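
To illustrate the kind of rule this involves, the sketch below checks whether a request's User-Agent header contains “Moonlit”. The function name is_moonlit_request is hypothetical and is not part of Moonlit or any hosting provider. Adapt the idea to the rule syntax your provider or security software uses.

```python
from typing import Optional

# Token that appears in Moonlit's User Agent string (compared case-insensitively).
MOONLIT_TOKEN = "moonlit"


def is_moonlit_request(user_agent: Optional[str]) -> bool:
    """Return True when the User-Agent header value contains the Moonlit token."""
    if not user_agent:
        # A missing or empty header cannot identify Moonlit.
        return False
    return MOONLIT_TOKEN in user_agent.lower()
```

A substring match is used rather than an exact match so the rule keeps working if the version number in the string changes.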

Whitelisting in Robots.txt

If you have custom rules set in your site’s robots.txt file, you can whitelist Moonlit by adding the following lines:

User-agent: Moonlit
Allow: /
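
To confirm that a robots.txt file lets Moonlit through, you can test it with Python's standard urllib.robotparser module. The sketch below is a minimal example under assumptions: the sample rules (a catch-all block followed by a Moonlit group with “Allow: /”), the helper name moonlit_allowed, and the example.com URL are all illustrative.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt for this sketch: blocks every crawler, then allows Moonlit.
# The catch-all block and the "Allow: /" path are assumptions, not your real rules.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Moonlit
Allow: /
"""


def moonlit_allowed(robots_txt: str, url: str) -> bool:
    """Check whether the Moonlit user agent may fetch `url` under these rules."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Moonlit", url)


if __name__ == "__main__":
    print(moonlit_allowed(ROBOTS_TXT, "https://example.com/blog/post"))
```

Keep the User-agent line and its Allow line together with no blank line between them, since a blank line ends a rule group for this parser.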

Hosting Websites

If your hosting provider is not listed in this guide, please contact us through the live chat widget in the bottom-right corner and we’ll help you.

Bot Detection Software

Cloudflare

  1. Log in to your Cloudflare account, go to the Security tab, and select Firewall Rules.
  2. Click Create a Firewall Rule and give it a name (e.g. Moonlit).
  3. Set the field to User Agent, the operator to ‘contains’, and the value to “Moonlit”.
  4. Turn on the Allow switch and click Deploy in the bottom-right corner.

Incapsula

  1. In your control panel, navigate to Settings > Security and click Whitelist Specific Sources.
  2. Click Add Exception to open the whitelist firewall modal.
  3. From the dropdown menu, select ‘User Agent’ and set the value to “Moonlit”.