# ===== Walton / UArk - Robots for AI & General Crawlers =====
# Contact: webmaster@walton.uark.edu

# ---- Shared disallows (repeat in each bot group to ensure effect) ----
# Paths we don't want crawled by any agent. They are listed here as comments
# for reference only: rules that appear before any User-agent line belong to
# no group and are ignored by parsers.
#   Disallow: /test/
#   Disallow: /be-epic-podcast/
#   Disallow: /_dev/
#   Disallow: /news/posts/archive/

# ---- OpenAI ----
User-agent: GPTBot
Allow: /
Disallow: /test/
Disallow: /be-epic-podcast/
Disallow: /_dev/
Disallow: /news/posts/archive/

User-agent: OAI-SearchBot
Allow: /
Disallow: /test/
Disallow: /be-epic-podcast/
Disallow: /_dev/
Disallow: /news/posts/archive/

# ---- Anthropic ----
User-agent: ClaudeBot
Allow: /
Disallow: /test/
Disallow: /be-epic-podcast/
Disallow: /_dev/
Disallow: /news/posts/archive/

User-agent: Claude-SearchBot
Allow: /
Disallow: /test/
Disallow: /be-epic-podcast/
Disallow: /_dev/
Disallow: /news/posts/archive/

# ---- Google Extended (Gemini training/grounding control) ----
# Note: This does NOT control inclusion in AI Overviews.
# Change to "Disallow: /" if you want to opt out.
User-agent: Google-Extended
Allow: /

# ---- Apple Extended ----
User-agent: Applebot-Extended
Allow: /

# ---- Perplexity ----
User-agent: PerplexityBot
Allow: /
Disallow: /test/
Disallow: /be-epic-podcast/
Disallow: /_dev/
Disallow: /news/posts/archive/

# Perplexity-User may treat requests as made on behalf of a user and ignore robots.txt.
# For sensitive paths, coordinate with IT for WAF/IP controls.
User-agent: Perplexity-User
Disallow: /

# ---- Unknown / non-compliant ----
User-agent: BlackboxBot
Disallow: /

# ---- Catch-all (kept for completeness; not relied on) ----
User-agent: *
Disallow: /test/
Disallow: /be-epic-podcast/
Disallow: /_dev/
Disallow: /news/posts/archive/

# ---- Sitemaps ----
Sitemap: https://walton.uark.edu/sitemap.xml
Sitemap: https://walton.uark.edu/news/sitemap.xml
Sitemap: https://walton.uark.edu/insights/sitemap.xml
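
# ---- Template for adding another crawler (illustrative only) ----
# A minimal sketch, assuming the new crawler honors robots.txt.
# "ExampleBot" is a hypothetical user-agent token, not a real crawler.
# A crawler that matches a named group does not also apply the "*" group,
# so the shared disallows need to be repeated inside the new group.
# To use: uncomment the lines below and replace the token.
# User-agent: ExampleBot
# Allow: /
# Disallow: /test/
# Disallow: /be-epic-podcast/
# Disallow: /_dev/
# Disallow: /news/posts/archive/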