# Allowed search engines directives
User-agent: Mediapartners-Google
User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-Mobile
User-agent: Googlebot-News
User-agent: Googlebot-Video
User-agent: Adsbot-Google
User-agent: Twitterbot
User-agent: Applebot
User-agent: Bingbot
User-agent: SiteAuditBot
User-agent: Publication-Access-for-Facebook
User-agent: facebookexternalhit
User-agent: Flipboard
User-agent: FlipboardProxy
User-agent: upday
# Sitemap
Sitemap: https://www.lesoir.be/sitemap-correctif-ls-0.xml
Sitemap: https://www.lesoir.be/sites/default/files/sitemaps/plus_lesoir_be/sitemapnews-0.xml
Sitemap: https://www.lesoir.be/sitemap.xml
# Directories
Disallow: /includes/
Disallow: /misc/
Disallow: */modules/*
Disallow: /profiles/
Disallow: /scripts/
Disallow: /themes/
# Files
Disallow: /CHANGELOG.txt
Disallow: /cron.php
Disallow: /INSTALL.mysql.txt
Disallow: /INSTALL.pgsql.txt
Disallow: /INSTALL.sqlite.txt
Disallow: /install.php
Disallow: /INSTALL.txt
Disallow: /LICENSE.txt
Disallow: /MAINTAINERS.txt
Disallow: /update.php
Disallow: /UPGRADE.txt
Disallow: /xmlrpc.php
# Paths (clean URLs)
Disallow: /admin/
Disallow: /comment/reply/
Disallow: /filter/tips/
Disallow: /search/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
Disallow: /user/logout/
# Paths (no clean URLs)
Disallow: /?q=admin/
Disallow: /?q=comment/reply/
Disallow: /?q=filter/tips/
Disallow: /?q=node/add/
Disallow: /?q=search/
Disallow: /?q=user/password/
Disallow: /?q=user/register/
Disallow: /?q=user/login/
Disallow: /?q=user/logout/
# Other paths
Disallow: /81985301/LESOIR/
Disallow: /Sections/Soir.be/
Disallow: /archives/recherche*
Disallow: /sport/*
Disallow: /node
Disallow: /*?page=
Disallow: /art/art/*
Disallow: */undefined
Disallow: */export_header_and_footer
Disallow: /*?from_direct=true
Disallow: /*reactions_callback
Disallow: /*?referer=
Disallow: /*?fb_comment_id=
Disallow: /*?wptouch_switch=
Disallow: */misc/*
# CSS, JS, Images
Allow: /misc/*.css$
Allow: /misc/*.css?
Allow: /misc/*.js$
Allow: /misc/*.js?
Allow: /misc/*.gif
Allow: /misc/*.jpg
Allow: /misc/*.jpeg
Allow: /misc/*.png
Allow: /modules/*.css$
Allow: /modules/*.css?
Allow: /modules/*.js$
Allow: /modules/*.js?
Allow: /modules/*.gif
Allow: /modules/*.jpg
Allow: /modules/*.jpeg
Allow: /modules/*.png
Allow: /profiles/*.css$
Allow: /profiles/*.css?
Allow: /profiles/*.js$
Allow: /profiles/*.js?
Allow: /profiles/*.gif
Allow: /profiles/*.jpg
Allow: /profiles/*.jpeg
Allow: /profiles/*.png
Allow: /themes/*.css$
Allow: /themes/*.css?
Allow: /themes/*.js$
Allow: /themes/*.js?
Allow: /themes/*.gif
Allow: /themes/*.jpg
Allow: /themes/*.jpeg
Allow: /themes/*.png

# Not allowed bots
User-agent: 5emeRue
User-agent: 5erue
User-agent: adequat
User-agent: adequat-systems
User-agent: AmiSoftware
User-agent: AwarioRssBot
User-agent: AwarioSmartBot
User-agent: Argus
User-agent: Ask n read
User-agent: asknread.com
User-agent: Augure
User-agent: auramundi
User-agent: Bloodhound
User-agent: Cision
User-agent: coexel
User-agent: ConveraCrawler
User-agent: Corporama
User-agent: cydralspider
User-agent: Digimind
User-agent: Download Ninja
User-agent: downloadexpress
User-agent: EDD
User-agent: ellisphere
User-agent: eureka
User-agent: Europresse
User-agent: Explore
User-agent: Factiva
User-agent: Fasterfox
User-agent: Fetch
User-agent: gammaSpider
User-agent: grub-client
User-agent: HTTrack
User-agent: ia_archiver
User-agent: ia_archiver-web.archive.org
User-agent: indexer
User-agent: infoseek
User-agent: Jetbot
User-agent: k2spider
User-agent: Kantar
User-agent: kbcrawl
User-agent: Knowings
User-agent: larbin
User-agent: leadbox
User-agent: libwww
User-agent: linkfluence
User-agent: linko
User-agent: manageo
User-agent: mediacompil
User-agent: Meltwater
User-agent: mention
User-agent: Moreover
User-agent: MSIECrawler
User-agent: mytwip
User-agent: newscan-online
User-agent: NewsNow
User-agent: Newzbin
User-agent: NPBot
User-agent: ObjectsSearch
User-agent: Offline Explorer
User-agent: opinion-tracker
User-agent: Pimptrain
User-agent: proxem
User-agent: QuepasaCreep
User-agent: Qwam content intelligence
User-agent: Raven
User-agent: readability.com
User-agent: scoop.it
User-agent: score3
User-agent: Sindup
User-agent: sitecheck.internetseer.com
User-agent: SiteSnagger
User-agent: spotter
User-agent: Synthesio
User-agent: Talkwater
User-agent: Teleport
User-agent: TeleportPro
User-agent: trendeo
User-agent: trendybuzz
User-agent: TunitinBot
User-agent: TurnitinBot
User-agent: up2news
User-agent: vecteurplus
User-agent: Verif
User-agent: verticalsearch
User-agent: vsw
User-agent: wapspider
User-agent: WebCopier
User-agent: WebReaper
User-agent: WebStripper
User-agent: WebZinger
User-agent: WebZIP
User-agent: Wget
User-agent: winello
User-agent: Youmag
User-agent: Zealbot
User-agent: Zite
User-agent: ZyBORG
Disallow: /

# AI Data Scrapers
User-agent: AI2Bot
User-agent: Amazonbot
User-agent: Applebot-Extended
User-agent: anthropic-ai
User-agent: Bytespider
User-agent: CCBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: Claude-Web
User-agent: cohere-ai
User-agent: Diffbot
User-agent: DuckAssistBot
User-agent: FacebookBot
User-agent: Google-Extended
User-agent: GPTBot
User-agent: Meta-ExternalAgent
User-agent: Meta-ExternalFetcher
User-agent: OAI-SearchBot
User-agent: PerplexityBot
Disallow: /