robots.txt Crash Course

Nerd Cafe | نرد کافه

1. What robots.txt Does

robots.txt tells search engines like Google:

  • Which pages they can crawl

  • Which pages they cannot crawl

  • Where your sitemap is

It helps with SEO and keeps private sections of a site out of search results (note: robots.txt is a crawling hint, not real access control).

Example:

https://example.com/robots.txt

The file must be in the root directory.

2. Basic Structure

A simple robots.txt file:

User-agent: *
Disallow:

Meaning:

Directive     Meaning
User-agent    Which bot the rule applies to
Disallow      Pages the bot cannot access

* = all bots.

3. First Working Example

Allow everything:

User-agent: *
Disallow:

This means:

  • All search engines can crawl every page.
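You can verify this behavior with Python's standard-library robots.txt parser. A minimal sketch that parses an allow-everything file (User-agent: * with an empty Disallow) directly from a string; the URL is illustrative:

```python
import urllib.robotparser

# Build a parser and feed it an "allow everything" robots.txt
# as text, so no network access is needed.
rp = urllib.robotparser.RobotFileParser()
rp.parse("User-agent: *\nDisallow:\n".splitlines())

# An empty Disallow line permits every path for every bot.
print(rp.can_fetch("Googlebot", "https://example.com/any/page.html"))  # True
```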

4. Blocking Pages

Block a folder (the folder name here is just an example):

User-agent: *
Disallow: /admin/

Blocks every URL whose path starts with /admin/, for example:

https://example.com/admin/login.html

Block a single file:

User-agent: *
Disallow: /secret.html

5. Allow Specific Pages

User-agent: *
Disallow: /private/
Allow: /private/public.html

Meaning:

❌ Block /private/ ✔ Allow /private/public.html
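This Allow/Disallow pair can be checked with urllib.robotparser. One caveat: Python's parser applies rules in file order (first match wins), so the more specific Allow line is placed first in this sketch; Google instead picks the most specific matching rule regardless of order.

```python
import urllib.robotparser

# Allow one page inside an otherwise blocked folder. The Allow
# line comes first because urllib.robotparser uses
# first-match-wins ordering.
rules = """\
User-agent: *
Allow: /private/public.html
Disallow: /private/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "https://example.com/private/public.html"))  # True
print(rp.can_fetch("*", "https://example.com/private/secret.html"))  # False
```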

6. Target Specific Bots

Example for Google (the blocked folder is illustrative):

User-agent: Googlebot
Disallow: /no-google/

This affects only Google's crawler (Googlebot).

Other bots are unaffected.
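Per-agent matching can also be verified with urllib.robotparser. A sketch assuming an illustrative file that blocks only Googlebot from a hypothetical /no-google/ folder while allowing everyone else:

```python
import urllib.robotparser

# Googlebot gets its own rule group; every other bot falls
# through to the wildcard group, which allows everything.
rules = """\
User-agent: Googlebot
Disallow: /no-google/

User-agent: *
Disallow:
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Googlebot", "https://example.com/no-google/page.html"))  # False
print(rp.can_fetch("Bingbot", "https://example.com/no-google/page.html"))    # True
```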

7. Common Bots

Examples:

  • Googlebot (Google)

  • Bingbot (Bing)

  • DuckDuckBot (DuckDuckGo)

  • YandexBot (Yandex)

Or use:

User-agent: *

for all bots.

8. Blocking Entire Site

User-agent: *
Disallow: /

Meaning:

❌ No search engine is allowed to crawl any page.

Useful for:

  • Testing websites

  • Private projects

9. Adding Sitemap

Add a Sitemap line (the URL is an example):

Sitemap: https://example.com/sitemap.xml

Helps search engines find your pages faster.

10. Real Example robots.txt

A typical file combining the rules from the sections above (folder names are illustrative):

User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /private/public.html

Sitemap: https://example.com/sitemap.xml

11. How to Create robots.txt

Step 1

Create a file named:

robots.txt

Step 2

Paste your rules, for example:

User-agent: *
Disallow:

Step 3

Upload it to the root of your site so it is reachable at:

https://example.com/robots.txt
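The first two steps can be sketched in Python; this writes the file locally before you upload it (the rules shown are the illustrative allow-all example):

```python
# Step 1 + 2: create robots.txt with the desired rules.
rules = "User-agent: *\nDisallow:\n"

with open("robots.txt", "w", encoding="utf-8") as f:
    f.write(rules)

# Step 3 (uploading to the web root) depends on your host:
# FTP, a CMS setting, or a deploy script.
print(open("robots.txt", encoding="utf-8").read())
```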

12. Comments

Everything after a # is a comment and is ignored by crawlers:

# Block the admin area (example folder)
User-agent: *
Disallow: /admin/

13. Wildcards

Block all PDF files using the * and $ wildcards (supported by major crawlers such as Google and Bing, though not part of the original standard):

User-agent: *
Disallow: /*.pdf$

* matches any sequence of characters; $ anchors the rule to the end of the URL.

Examples blocked:

https://example.com/guide.pdf
https://example.com/docs/manual.pdf
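Note that Python's urllib.robotparser does not understand the * and $ wildcards. A simplified sketch of how a Google-style pattern can be translated into a regular expression (the helper name is ours, and real matchers handle more edge cases):

```python
import re

def robots_pattern_to_regex(pattern: str) -> str:
    """Translate a Google-style robots.txt path pattern to a regex.

    Simplified: '*' matches any run of characters, and a trailing
    '$' anchors the match to the end of the URL path.
    """
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"
    return regex

rx = robots_pattern_to_regex("/*.pdf$")

# Matching is anchored at the start of the path, like robots.txt rules.
print(bool(re.match(rx, "/files/report.pdf")))  # True
print(bool(re.match(rx, "/report.pdf.html")))   # False
```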

14. Common Mistakes

Wrong filename

The file must be named exactly robots.txt, all lowercase. Robots.txt or robot.txt will not be found.

Wrong location

The file must live at the site root (https://example.com/robots.txt), not in a subfolder.

Blocking the entire site accidentally

Disallow: / under User-agent: * blocks everything. Very common error.

💖 Support Our Work

If you find this post helpful and would like to support our work, you can send a donation via TRC-20 (USDT). Your contributions help us keep creating and sharing more valuable content.


Thank you for your generosity! 🙏

Channel Overview

🌐 Website: www.nerd-cafe.ir

📺 YouTube: @nerd-cafe

🎥 Aparat: nerd_cafe

📌 Pinterest: nerd_cafe

📱 Telegram: @nerd_cafe

📝 Blog: Nerd Café on Virgool

💻 GitHub: nerd-cafe
