What Is Meta Robots? SEO Glossary
Learn what meta robots means in SEO, why it matters, and how to implement it.
What Is Meta Robots?
The meta robots tag is an HTML element placed in the <head> section of a web page that instructs search engine crawlers how to handle that page. It controls whether search engines should index the page, follow its links, or display snippets and image previews in search results. The minimal valid form is a single directive, for example <meta name="robots" content="noindex">. Google documents all as the default value, which carries no restrictions and has no effect if listed explicitly (see Google Search Central, "Robots Meta Tags Specifications").
Unlike robots.txt, which operates at the crawl level by blocking access to URLs, the meta robots tag provides page-level instructions that are processed after a page has been crawled. This distinction is important because the search engine must actually fetch and read the page to discover the directives. MDN makes the same point: a noindex directive only takes effect after the crawler revisits the page, so robots.txt must not block that page or the crawler will never see the directive.
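One way to sanity-check this interaction is to ask whether a given URL is even fetchable under your robots.txt before relying on a noindex tag on it. The sketch below uses Python's standard urllib.robotparser. The robots.txt content, the example.com URLs, and the function name can_see_noindex are all illustrative assumptions, and Python's parser does simple prefix matching, so it may not mirror every detail of how Google evaluates wildcard rules.

```python
from urllib.robotparser import RobotFileParser


def can_see_noindex(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots.txt lets the crawler fetch the URL.

    Only a fetchable page can have its meta robots tag read, so a False
    result means any noindex on that page would go unseen.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks internal search results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""

print(can_see_noindex(ROBOTS_TXT, "https://example.com/search/shoes"))
print(can_see_noindex(ROBOTS_TXT, "https://example.com/tags/red/"))
```

In this setup the /search/ URL is blocked, so a noindex placed on it would never be read, while the tag page remains fetchable and its directives can be processed.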
Why Meta Robots Matters for SEO
Index control. The meta robots tag is the most reliable way to prevent specific pages from appearing in search results while still allowing search engines to crawl them. This is critical for managing which pages represent your site in search results.
Crawl budget management. Applying noindex to low-value pages like internal search results, tag archives, or parameter-based URLs keeps them out of the index and signals that they are not worth prioritizing. It does not stop crawling on its own, because the crawler must still fetch a page to read its noindex, so any crawl budget benefit is indirect rather than immediate.
Link equity flow. The nofollow directive in a meta robots tag prevents search engines from following any links on the page, which affects how link equity flows through your site. This gives you granular control over your site's internal link architecture.
Snippet control. Directives like nosnippet and max-snippet let you limit or suppress the text Google shows for a page, and max-image-preview controls thumbnail size. These are useful for managing how premium, time-limited, or licensed content appears in results. Note that the old noarchive directive no longer does anything at Google because the cached-page feature was retired.
Penalty prevention. Properly using meta robots to exclude thin content, duplicate pages, and auto-generated pages from the index prevents quality issues that could trigger algorithmic penalties or manual actions.
How Meta Robots Works
The meta robots tag communicates specific directives to search engine crawlers. Here are the most commonly used values.
index / noindex. The noindex directive tells Google not to show the page, media, or resource in search results. The index value (and the broader all) is the default. Google lists index and follow as the implicit baseline, so if no meta robots tag exists the page is eligible for indexing. The active levers are the restrictive directives, not the permissive ones.
follow / nofollow. The nofollow directive tells search engines not to follow the links on the page. The follow value is the default behavior. MDN documents both all (equivalent to index, follow) and none (equivalent to noindex, nofollow).
Common combinations:
- all: no indexing or serving restrictions. This is the default and the same as index, follow.
- noindex, follow: do not show this page in search results, but follow its links.
- noindex, nofollow (or none): do not index this page and do not follow its links.
- index, nofollow: this page is eligible for indexing, but do not follow its links.
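The combinations above reduce to two yes/no answers: may the page be indexed, and may its links be followed. A minimal sketch of that reduction follows. The function name effective_index_follow is hypothetical, and the logic assumes only what is stated here: all and an empty value carry no restrictions, and none is shorthand for noindex, nofollow.

```python
def effective_index_follow(content: str) -> tuple[bool, bool]:
    """Map a meta robots content value to (indexable, followable).

    Permissive values (index, follow, all) are defaults, so only the
    restrictive directives change the result.
    """
    directives = {d.strip().lower() for d in content.split(",") if d.strip()}
    indexable = not ({"noindex", "none"} & directives)
    followable = not ({"nofollow", "none"} & directives)
    return indexable, followable


for value in ["all", "noindex, follow", "none", "index, nofollow", ""]:
    print(repr(value), effective_index_follow(value))
```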
Additional directives (per Google's current specification):
- nosnippet: do not show a text snippet or video preview in the search results for this page. Google notes it also prevents the content from being used as a direct input for AI Overviews and AI Mode.
- max-snippet:[number]: use at most this many characters as a textual snippet. max-snippet:0 opts out of any text snippet.
- max-image-preview:[setting]: set the maximum image preview size. The three valid values are none, standard, and large.
- max-video-preview:[number]: use at most this many seconds of any video on the page as a snippet.
- noimageindex: do not index images on this page.
- notranslate: do not offer translation of this page in search results.
- unavailable_after:[date/time]: do not show this page in search results after the specified date and time.
- indexifembedded: allow Google to index the page's content when it is embedded in another page through iframes, even alongside a noindex.
Obsolete: noarchive. Google no longer uses noarchive (or its synonym nocache). Google retired the cached-page feature, so the directive that once suppressed the "Cached" link has no effect in Google Search today. MDN still lists noarchive and nocache as requests not to cache page content, but treat them as no-ops for Google. If you need to limit how Google displays your text, use nosnippet or max-snippet instead.
Conflict resolution. When directives conflict, Google applies the more restrictive rule. For example, a page carrying both max-snippet:50 and nosnippet is treated as nosnippet.
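The "more restrictive rule wins" behavior can be sketched for the snippet directives as follows. This is an illustration, not Google's implementation: the function name effective_max_snippet is hypothetical, picking the smallest of several max-snippet values is an assumption extrapolated from the rule above, and the treatment of max-snippet:-1 as "no limit" follows Google's documented meaning for that value.

```python
from typing import Optional


def effective_max_snippet(content: str) -> Optional[int]:
    """Return the snippet limit in characters.

    None means no limit, 0 means no text snippet at all.
    """
    limit: Optional[int] = None
    for raw in content.split(","):
        directive = raw.strip().lower()
        if directive == "nosnippet":
            return 0  # most restrictive option, overrides any max-snippet
        if directive.startswith("max-snippet:"):
            try:
                value = int(directive.split(":", 1)[1])
            except ValueError:
                continue  # ignore malformed values
            if value < 0:
                continue  # -1 is documented as "no limit"
            # Assumption: among several limits, the smallest one applies.
            limit = value if limit is None else min(limit, value)
    return limit


print(effective_max_snippet("max-snippet:50, nosnippet"))
print(effective_max_snippet("max-snippet:50"))
```

The first call yields 0 because nosnippet overrides the 50-character limit, and the second yields 50.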
X-Robots-Tag HTTP header. For non-HTML resources like PDFs, images, or video files that cannot contain meta tags, you can send the same directives via the HTTP response header, for example X-Robots-Tag: noindex. Any directive valid in the meta robots tag is also valid in the header.
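When auditing non-HTML files, it can help to check header values programmatically. The sketch below parses X-Robots-Tag values you have already collected (for example from a crawl export) and reports whether they block indexing for a given crawler. The function names are hypothetical. It assumes the optional user agent prefix form that Google documents, such as X-Robots-Tag: googlebot: noindex, and it does not handle unavailable_after dates that contain commas.

```python
# Directives that legitimately contain a colon, so the text before the
# colon is not mistaken for a user agent prefix.
VALUE_DIRECTIVES = {
    "max-snippet",
    "max-image-preview",
    "max-video-preview",
    "unavailable_after",
}


def header_directives(value: str, crawler: str = "googlebot") -> set[str]:
    """Return the directives in one X-Robots-Tag value that apply to crawler."""
    head, sep, rest = value.partition(":")
    prefix = head.strip().lower()
    if sep and "," not in prefix and prefix not in VALUE_DIRECTIVES:
        # The value starts with a user agent token such as "googlebot:".
        if prefix != crawler.lower():
            return set()  # addressed to a different crawler
        value = rest
    return {d.strip().lower() for d in value.split(",") if d.strip()}


def blocks_indexing(header_values: list[str], crawler: str = "googlebot") -> bool:
    """True if any X-Robots-Tag value carries noindex or none for crawler."""
    return any(
        {"noindex", "none"} & header_directives(value, crawler)
        for value in header_values
    )


print(blocks_indexing(["noindex"]))
print(blocks_indexing(["otherbot: noindex"]))
```

A response may carry several X-Robots-Tag headers, which is why blocks_indexing takes a list of values rather than a single string.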
Bot-specific directives. You can target a specific crawler with its user agent token: <meta name="googlebot" content="noindex"> applies only to Google's main web crawler, and <meta name="googlebot-news" content="noindex"> applies only to Google News. The generic robots name applies to all compliant crawlers.
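To see which directives a particular crawler would pick up from a page, you can extract the relevant meta tags with Python's standard html.parser. In this sketch the class and function names are hypothetical, and combining the generic robots directives with the bot-specific ones is an assumption based on the "more restrictive rule applies" behavior, not a statement of how any crawler is implemented.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect the content directives of every named <meta> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.by_name: dict[str, set[str]] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").strip().lower()
        if not name:
            return
        directives = {
            d.strip().lower() for d in attr.get("content", "").split(",") if d.strip()
        }
        self.by_name.setdefault(name, set()).update(directives)


def directives_for(html: str, crawler: str) -> set[str]:
    """Union of the generic robots directives and the crawler-specific ones."""
    collector = RobotsMetaCollector()
    collector.feed(html)
    generic = collector.by_name.get("robots", set())
    specific = collector.by_name.get(crawler.lower(), set())
    return generic | specific


PAGE = """
<head>
  <meta name="robots" content="nofollow">
  <meta name="googlebot" content="noindex">
</head>
"""

print(sorted(directives_for(PAGE, "googlebot")))
print(sorted(directives_for(PAGE, "bingbot")))
```

Here the googlebot token picks up both noindex and nofollow, while a crawler without its own tag sees only the generic nofollow.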
Best Practices
Use noindex for low-value pages. Internal search results pages, filter/sort variations, thin tag archive pages, and paginated archives beyond page one are common candidates for noindex. These pages rarely provide unique value to searchers and can dilute your site's overall quality signals.
Prefer noindex over robots.txt blocking. If you want a page excluded from search results, use noindex. Blocking a URL in robots.txt prevents crawling, which means search engines may never see your noindex directive. Worse, a blocked page with external links pointing to it can still appear in search results with limited information.
Use noindex, follow for hub pages. If a page primarily exists to link to other important pages (like a tag page or category archive), consider using noindex, follow. This keeps the page out of search results while preserving the link equity flow to the pages it links to.
Audit meta robots regularly. CMS updates, theme changes, and plugin modifications can silently alter meta robots tags. A single misconfigured template adding noindex to your entire blog section can devastate organic traffic. Include meta robots verification in every technical SEO audit.
Be intentional with nofollow on the page level. The page-level nofollow directive affects all links on the page. If you only want to nofollow specific links, use the rel="nofollow" attribute on individual link elements instead of the meta robots tag.
Prefer canonical tags for duplicate content. For pages with duplicate content that you want to consolidate, a canonical tag pointing to the preferred version is often better than noindex. Canonical tags consolidate ranking signals on the preferred URL, while noindex simply removes the page from results.
Common Mistakes
Accidentally noindexing important pages. This is the most damaging meta robots mistake. A staging environment configuration left on production, a theme setting checked incorrectly, or a plugin applying noindex globally can remove your entire site from search results. Monitor the Page indexing report in Google Search Console (formerly the index coverage report) for unexpected drops.
Using noindex and canonical together. Placing a noindex tag on a page while also having a canonical tag pointing to a different URL creates conflicting signals. If the page should not be indexed, use noindex. If it should redirect ranking signals to another URL, use canonical. Do not use both.
Blocking in robots.txt and using noindex. If robots.txt blocks a URL, search engines cannot crawl the page and therefore cannot see the meta robots tag. The noindex directive will never be processed. Remove the robots.txt block if you need the noindex directive to work.
Forgetting about HTTP headers. PDF files, images, and other non-HTML resources cannot contain meta tags. If these files should not be indexed, you must use the X-Robots-Tag HTTP header, which requires server configuration.
Relying on noindex as a security measure. The noindex directive is an instruction that compliant search engines honor, not a security mechanism. It does not prevent anyone from accessing the page directly, and non-compliant bots can ignore it. Sensitive content should be protected by authentication, not meta robots tags.
Not testing after deployment. Always verify meta robots tags in the live HTML source after deployment. CMS caching, server-side rendering, and build processes can all alter the final output. Use the "View Page Source" function or a crawl tool to confirm the correct directives are in place.
In Practice
Say you run a faceted ecommerce category that generates thousands of filter and sort URLs (color, size, price order). Those URLs are crawlable and they dilute index quality, but the links on them still point to canonical product pages you want crawled. The right directive is noindex, follow. In the page template's <head> you would render this exact tag:
<meta name="robots" content="noindex, follow">
Google will drop the filtered URL from results while continuing to follow its links. For a downloadable price sheet PDF that should never appear in search, you cannot add a meta tag to a binary file, so you set the directive at the server instead. On Nginx that looks like this:
location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex" always;
}
Both approaches require that robots.txt leave the URL crawlable. If you also Disallow the path in robots.txt, Google never fetches the page or the header, so the noindex is never seen and the URL can still surface as a bare link.
Related Terms
- What Is robots.txt? explains the crawl-level control that must not block pages you want deindexed.
- What Is noindex? covers the single most important directive inside the meta robots tag in depth.
- What Is nofollow? compares the page-level directive with the link-level rel="nofollow" attribute.
- What Are Canonical Tags? covers the consolidation signal you usually want instead of noindex for duplicate content.
- What Is Crawling? describes the step that has to happen before any meta robots directive can be read.
Conclusion
The meta robots tag gives you precise, page-level control over how search engines handle your content. Used correctly, it keeps low-value pages out of search results, manages crawl budget, controls snippet display, and protects content presentation. The tag is simple to implement but powerful in its impact, which also makes it dangerous when misconfigured. Regular auditing, careful template management, and a clear understanding of each directive's behavior are essential for using meta robots effectively as part of your technical SEO strategy.
Sources
- Google Search Central, "Robots Meta Tags Specifications" (valid directives, conflict resolution, retired noarchive, X-Robots-Tag): https://developers.google.com/search/docs/crawling-indexing/robots-meta-tag (checked 2026-05-30)
- Google Search Central, "Block Search Indexing with noindex": https://developers.google.com/search/docs/crawling-indexing/block-indexing (checked 2026-05-30)
- MDN Web Docs, <meta name="robots"> (directive values, default behavior, robots.txt interaction): https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/meta/name/robots (checked 2026-05-30)