HivemunkCrawler

HivemunkCrawler is the web crawler operated by hivemunk. It visits publicly available beekeeping-related business websites to help build a directory and product index for beekeeping supplies. The purpose of the crawler is to help beekeepers find equipment, compare availability and prices across sellers, and discover beekeeping supply businesses they may not already know about. HivemunkCrawler is focused on beekeeping businesses and beekeeping equipment. It is not a general-purpose web crawler.

User agent

HivemunkCrawler identifies itself with the following user agent:

HivemunkCrawler/1.0 (+https://hivemunk.com/crawler)

The URL in the user agent points to this page so website owners can understand why the crawler visited their site and how to contact hivemunk.
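Website owners who want to find these visits in their own access logs could match on the product token in the User-Agent header. The following is a minimal sketch, not something hivemunk provides: the function name `hivemunk_version` and the regular expression are illustrative assumptions. A User-Agent header can be set by any client, so a match only shows what the client claimed to be.

```python
import re
from typing import Optional

# Matches the product token "HivemunkCrawler/<version>" anywhere in a
# User-Agent header value. The pattern is an assumption for illustration.
CRAWLER_TOKEN = re.compile(r"\bHivemunkCrawler/(\d+(?:\.\d+)*)")


def hivemunk_version(user_agent: str) -> Optional[str]:
    """Return the claimed crawler version, or None if the token is absent."""
    match = CRAWLER_TOKEN.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    header = "HivemunkCrawler/1.0 (+https://hivemunk.com/crawler)"
    print(hivemunk_version(header))
    print(hivemunk_version("Mozilla/5.0 (X11; Linux x86_64)"))
```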

Why HivemunkCrawler visits websites

HivemunkCrawler visits websites for a few related purposes:

  1. Business discovery
    To identify businesses that may be relevant to beekeepers, including beekeeping supply stores, honey producers, farm stores, beekeeper associations, and related organizations.

  2. Business verification
    To determine whether a discovered website is actually relevant to beekeeping, whether it sells physical products, and whether it appears to be a business that should be included in the hivemunk directory.

  3. Directory building
    To collect limited business-level information such as website URL, business type, public address, and general product focus.

  4. Product indexing
    When crawling appears to be allowed, to collect limited product-level information from beekeeping supply websites so beekeepers can search and compare products across sellers.

How websites are discovered

Candidate websites may be discovered from public sources such as:

  • Search engines
  • Public map and business listing results
  • Supplier directories
  • Beekeeping association pages
  • Public business websites
  • Other publicly available references

A website being discovered does not automatically mean it will be added to the hivemunk directory or product index. Candidate websites are reviewed by automated checks first.

What HivemunkCrawler may collect

Depending on the website and what is publicly available, HivemunkCrawler may collect the following business-level information:

  • Website URL
  • Domain name
  • Business name
  • Business type or category
  • Whether the business appears relevant to beekeeping
  • Whether the business appears to sell physical products
  • Public business address, if shown on the website
  • Contact or location page URLs
  • E-commerce platform, if identifiable
  • Whether a terms of service, terms of use, or legal page exists
  • Whether a sitemap or robots.txt file exists

For product indexing, HivemunkCrawler may collect limited product-level information such as:

  • Product name
  • Product page URL
  • Listed price
  • Currency
  • Product category or collection
  • Availability text, such as "in stock," "out of stock," or "sold out"
  • Basic product attributes visible on listing or product pages

The product index is intended to point users back to the original seller. hivemunk does not intend to replace the seller's website.
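To make the scope of the product-level fields above concrete, they could be modeled as a small record like the one below. This is only a sketch: the class name `ProductRecord`, the field names, and the types are assumptions chosen for illustration, not hivemunk's actual schema. Note that the record holds a link to the seller's page and no description or image fields.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Dict, Optional


@dataclass(frozen=True)
class ProductRecord:
    """Hypothetical shape for one indexed product (illustrative only)."""

    name: str                                  # Product name
    url: str                                   # Product page URL on the seller's site
    price: Optional[Decimal] = None            # Listed price, if shown
    currency: Optional[str] = None             # For example "USD" or "EUR"
    category: Optional[str] = None             # Product category or collection
    availability_text: Optional[str] = None    # For example "in stock" or "sold out"
    attributes: Dict[str, str] = field(default_factory=dict)  # Basic visible attributes


if __name__ == "__main__":
    record = ProductRecord(
        name="Hive tool",
        url="https://example.com/products/hive-tool",
        price=Decimal("12.50"),
        currency="USD",
        availability_text="in stock",
    )
    print(record)
```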

What HivemunkCrawler does not do

HivemunkCrawler does not:

  • Create accounts
  • Log in to websites
  • Submit forms
  • Sign up for newsletters
  • Add products to carts
  • Start checkout flows
  • Place orders
  • Bypass paywalls
  • Bypass CAPTCHAs
  • Access password-protected pages
  • Access private customer areas
  • Attempt to evade bot protections
  • Copy customer reviews
  • Download or rehost product images
  • Republish full website content
  • Republish full product descriptions as a substitute for visiting the seller's website

HivemunkCrawler is designed to collect limited structured information for discovery, comparison, and referral.

How HivemunkCrawler behaves

HivemunkCrawler works in stages.

1. Discovery stage

During discovery, hivemunk identifies possible beekeeping-related businesses from public sources.

At this stage, the crawler may not have visited the business website yet. The goal is to create a candidate list of websites that may be relevant to beekeepers.

2. Verification stage

During verification, HivemunkCrawler may visit a small number of publicly available pages to understand what kind of business the website represents.

This may include requests to:

  • The homepage
  • robots.txt
  • sitemap.xml
  • Contact pages
  • About pages
  • Visit or location pages
  • Terms of service, terms of use, or legal pages

In some cases, the website URL that was originally discovered may redirect to a different URL before reaching the actual business website. HivemunkCrawler may follow these redirects to find the correct page.

HivemunkCrawler may also scan related pages beyond the homepage to find information needed for the directory. For example, it may visit contact, about, or similar pages if the homepage does not contain enough information.

The verification stage looks for limited signals such as:

  • Whether the site is relevant to beekeeping
  • Whether the site sells physical products
  • Whether the site appears to be a supply store, honey producer, association, educational site, or another type of business
  • Whether a public business address is shown
  • Whether product pages appear to exist
  • Whether the website publishes crawling restrictions

This stage is intentionally limited. Its purpose is to decide whether the site should be considered for the directory or later product indexing.

3. Permission review stage

Before product indexing, HivemunkCrawler checks for signs that automated access should be avoided or limited.

This includes reviewing:

  • robots.txt, specifically for rules addressed to HivemunkCrawler
  • Terms of service
  • Terms of use
  • Legal pages
  • Other pages that appear to describe automated access, scraping, crawling, bots, or data extraction

Terms of service, terms of use, and legal pages are analyzed using automated tools to determine whether they explicitly restrict automated access or data collection. General "all rights reserved" or intellectual property clauses do not count as restrictions. Only explicit prohibitions on automated access or data collection are treated as blocks.

If product crawling appears to be disallowed, the site may be excluded from product indexing or marked for manual review.
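The distinction above between generic intellectual property clauses and explicit prohibitions can be illustrated with a simple keyword heuristic. This page does not describe the automated tools hivemunk actually uses, so everything here is an assumption: the phrase lists, the sentence splitting, and the function name `explicitly_restricts_automation` are hypothetical, and a real review would need to handle far more wording than this sketch does.

```python
import re

# Assumed phrase lists; not hivemunk's actual rules.
PROHIBITION = re.compile(
    r"\b(?:may not|must not|shall not|prohibited|forbidden|not permitted)\b",
    re.IGNORECASE,
)
AUTOMATION = re.compile(
    r"\b(?:scrap(?:e|er|ers|ing)|crawl(?:er|ers|ing)?|spiders?|bots?|robots?"
    r"|automated|data (?:extraction|mining))\b",
    re.IGNORECASE,
)


def explicitly_restricts_automation(terms_text: str) -> bool:
    """Return True if any single sentence pairs a prohibition with an automation term."""
    # Rough sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", terms_text)
    return any(
        PROHIBITION.search(sentence) and AUTOMATION.search(sentence)
        for sentence in sentences
    )


if __name__ == "__main__":
    generic = "All rights reserved. You may not reproduce our photos."
    explicit = "You may not use bots or scrapers to access this site."
    print(explicitly_restricts_automation(generic))
    print(explicitly_restricts_automation(explicit))
```

Requiring both terms in the same sentence is what keeps a generic "all rights reserved" notice from counting as a block in this sketch.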

The verification stage (step 2) involves only a small number of requests and occurs before this permission review. The full robots.txt and terms review happens here, in step 3, before any product indexing begins.

4. Product indexing stage

If a website appears relevant to beekeeping and product crawling appears to be allowed, HivemunkCrawler may visit product-related pages.

When the e-commerce platform provides methods to access product data, HivemunkCrawler uses those methods. When such methods are not available, HivemunkCrawler may visit individual product pages directly.

The goal of this stage is to collect limited structured product information, not to copy the website.

5. Refresh stage

Business and product information may be refreshed periodically.

Refreshes are used to keep information accurate, including product availability, listed prices, business URLs, and whether a business still appears relevant to beekeeping.

Refresh frequency may vary depending on the type of information and the behavior of the website.

Headless browser

HivemunkCrawler renders pages using a headless browser. This means it loads and executes JavaScript on the pages it visits, as a regular web browser would. This may trigger analytics, tracking, or statistics scripts on the visited site. The crawler does not interact with the page beyond loading it: it does not click buttons, scroll, or fill in forms.

robots.txt

HivemunkCrawler checks robots.txt for rules specifically addressed to it. If HivemunkCrawler is disallowed, the site will be excluded from product indexing.

To block HivemunkCrawler from your entire site, add the following to the robots.txt file at the root of your domain:

User-agent: HivemunkCrawler
Disallow: /

Blocking HivemunkCrawler may prevent your business from appearing in the hivemunk beekeeping supply directory or product index.
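Before publishing the rule above, you can check how a standards-following parser reads it. The sketch below uses Python's standard-library `urllib.robotparser`. It shows how that parser interprets the rule, which is an approximation of, not a statement about, how HivemunkCrawler itself parses robots.txt. The helper name `is_allowed` and the `example.com` URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

# The site-wide block shown above.
ROBOTS_TXT = """\
User-agent: HivemunkCrawler
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/products/hive-tool"
    print(is_allowed(ROBOTS_TXT, "HivemunkCrawler", url))  # blocked by the rule
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", url))     # no rule applies to it
```

Because the group names only HivemunkCrawler, other crawlers are unaffected by this rule.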

Blocking product indexing only

If you want your business to appear in the hivemunk directory but do not want product pages crawled, you can block product-related paths instead of blocking the entire site.

Exact paths vary by website platform. If you are not sure what to block, contact us and we can help.
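As an illustration only, the sketch below blocks two path prefixes and then checks the result with Python's standard-library `urllib.robotparser`. The paths `/products/` and `/collections/` are hypothetical placeholders, not a recommendation for any particular platform, and the helper name `crawler_may_fetch` and the `example.com` host are likewise assumptions. Substitute the paths your own site uses for product pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical product paths; replace with the paths your platform uses.
ROBOTS_TXT = """\
User-agent: HivemunkCrawler
Disallow: /products/
Disallow: /collections/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawler_may_fetch(path: str) -> bool:
    """Return True if the rules above let HivemunkCrawler fetch the given path."""
    return parser.can_fetch("HivemunkCrawler", "https://example.com" + path)


if __name__ == "__main__":
    for path in ("/", "/contact", "/products/hive-tool", "/collections/smokers"):
        print(path, crawler_may_fetch(path))
```

With rules like these, the homepage and contact page stay reachable for the directory while paths under the blocked prefixes do not.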

Allowing HivemunkCrawler

To explicitly allow HivemunkCrawler, you can add:

User-agent: HivemunkCrawler
Allow: /

You do not need to add this if your site already allows general crawling.

Terms of service and legal restrictions

HivemunkCrawler may check terms of service, terms of use, or legal pages for restrictions related to:

  • Scraping
  • Crawling
  • Bots
  • Automated access
  • Data extraction
  • Screen scraping
  • Use of robots.txt

If your website terms prohibit automated crawling, scraping, or data extraction, HivemunkCrawler may exclude your site from product indexing or mark it for manual review.

Product images

HivemunkCrawler is not intended to download, copy, or rehost product images.

If product images are displayed in hivemunk in the future, they should generally be handled by linking back to the original seller or by using images only with permission, license, partnership, or another appropriate basis.

Product descriptions

HivemunkCrawler is not intended to republish full product descriptions.

Product indexing may use short product names, categories, prices, availability indicators, and product URLs to help beekeepers find relevant products. Users should visit the seller's website for the complete product listing, description, shipping terms, return policy, and checkout process.

Accuracy

HivemunkCrawler collects information from public websites, but product data can change quickly.

Prices, availability, shipping costs, taxes, promotions, and product details may change after a page is crawled. hivemunk may display when information was last checked, but the seller's website is the authoritative source for current product details.

Before purchasing, users should confirm all information directly on the seller's website.

Removal requests

If you want your website removed from the hivemunk directory or product index, contact:

hello@hivemunk.com

Please include:

  • Your domain name
  • Whether you want the entire business listing removed
  • Whether you only want product indexing disabled
  • Whether any information is inaccurate and should be corrected

Removal and correction requests are reviewed manually.

Correction requests

If hivemunk has incorrect information about your business, contact:

hello@hivemunk.com

Please include the correct information and the URL where it can be verified.

Examples of corrections include:

  • Business name
  • Business address
  • Website URL
  • Whether you sell beekeeping supplies
  • Whether you sell online
  • Product category information

Contact

For crawler questions, removal requests, correction requests, or crawl-rate concerns, contact:

hello@hivemunk.com

Please include your domain name so we can review the correct website.

© 2026 hivemunk·about|terms|privacy|cookies
