Image SEO 2026: Beyond Alt Text
Image SEO that compounds rankings. AVIF and WebP stack, alt text for accessibility first, LCP image rules, and visual search optimization.
Image SEO in 2026 is no longer the "remember to add alt text" sidebar most guides treat it as. Images are the LCP element on 73 percent of mobile pages and 83 percent of desktop pages, which makes image delivery the single biggest lever on Core Web Vitals. They also account for 50+ percent of total page weight on most content sites, which means the format stack you choose changes both ranking outcomes and hosting costs. And visual search in AI Overviews now drives a measurable share of impressions on image-heavy queries, so the long-ignored image sitemap is suddenly worth setting up.
Quick Answer: Image SEO in 2026 starts with accessibility-first alt text (written for screen readers, not for keywords), then layers on the AVIF then WebP then JPEG format stack served via the picture element, an LCP-protected hero image that never lazy loads, native lazy loading on everything else, and a maintained image sitemap that surfaces visual content to Google Images and AI Overviews.
Key Takeaways
- Images are the LCP element on 73 percent of mobile and 83 percent of desktop pages. Format choice directly impacts Core Web Vitals.
- AVIF first, WebP second, JPEG fallback is the 2026 standard format stack. AVIF saves an additional 20 to 50 percent over WebP.
- Alt text is accessibility-first, SEO-second. Screen reader users determine what alt text should sound like.
- Never lazy load the LCP image. Native loading="lazy" applies to below-fold images only.
- Image sitemaps surface content to Google Images and increasingly to AI Overviews' multimodal results.
Why Alt Text Should Be Written for Screen Readers First
Alt text is accessibility infrastructure that happens to be useful for SEO. The common mistake is reversing that relationship and writing alt text as if it were a keyword field. Doing so produces alt text that screen reader users hate and that Google's image understanding models treat as low quality.
The correct framing: write alt text as if you were describing the image out loud to someone who cannot see it. That description should be specific, concise, and accurate. If the image is a screenshot of a product dashboard, the alt text describes what is in the dashboard ("Project management dashboard showing four columns of tasks with priority badges"). It does not stuff keywords like "best project management software project management tool SaaS dashboard."
Google's image understanding has gotten substantially better. The image content gets recognized and contextualized regardless of the alt text. The alt text adds to that recognition by providing the context the image alone does not convey. Keyword-stuffed alt text actively hurts because it signals to Google that the page is optimizing for search rather than serving the user.
The accessibility-first standard also catches a common error around alt text for decorative images. A decorative divider, a background pattern, or a flourish under a heading should have alt="" (empty alt). Screen readers then skip the image entirely, which is the correct behavior because announcing every decorative element creates a painful listening experience. This is a WCAG requirement and Google treats it the same way.
For a primer on the broader role of images in SEO, the "What Is Image SEO" explainer covers the foundational concepts.
Decorative Versus Meaningful Images: When to Use Empty Alt
The decision tree for alt text is straightforward but consistently misapplied. Three categories:
Meaningful images carry information the surrounding text does not. Product photos, screenshots, diagrams, infographics, charts. These need descriptive alt text. The description should convey what a sighted user would understand from looking at the image.
Decorative images are visual flourishes that add no information. Section dividers, background textures, repeated icons next to bullet points, ornamental borders. These should have alt="" so screen readers skip them.
Functional images are images that act as controls (an icon button, a clickable logo, a social share image). These need alt text that describes the function, not the visual. A magnifying glass icon for search should have alt="Search" not alt="Magnifying glass icon."
A working test: read the alt text out loud in the context of the surrounding paragraph. Does it make the page flow better or worse? If it adds essential information, keep it. If it interrupts with redundant or vague text, either rewrite it or set it empty.
The WebAIM alt text guide is the canonical reference. It is the resource Google itself points to in accessibility documentation, and it is short enough to read in full.
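As a rough illustration of the three categories above, the following sketch sorts the img tags in a page into "missing alt", "decorative" (alt=""), and "described". It cannot judge whether a description is good, only whether one exists, so the described list still needs the read-aloud test. The class and function names are hypothetical, chosen for this example.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Sort every <img> into missing-alt, decorative (alt=""), or described."""

    def __init__(self):
        super().__init__()
        self.missing = []      # no alt attribute at all: always worth fixing
        self.decorative = []   # alt="": screen readers skip these
        self.described = []    # (src, alt) pairs to review by reading aloud

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "")
        if "alt" not in attributes:
            self.missing.append(src)
        elif not (attributes["alt"] or "").strip():
            self.decorative.append(src)
        else:
            self.described.append((src, attributes["alt"]))


def audit_alt_text(html):
    """Parse an HTML string and return the populated AltAudit."""
    audit = AltAudit()
    audit.feed(html)
    return audit
```

Calling audit_alt_text(page_html).missing gives the list of image URLs that have no alt attribute at all, which is usually the first thing to fix.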
The 2026 Format Stack: AVIF Then WebP Then JPEG
The image format conversation in 2026 has stabilized. WebP is fully supported in every major browser and accounts for roughly 35 to 40 percent of all image bytes served on the web. AVIF support has reached approximately 93 percent of browsers globally and is now the preferred format for new content. JPEG remains a fallback for legacy compatibility and has shrunk to roughly 40 percent of image bytes.
The recommended stack is:
- AVIF as the primary delivery format for modern browsers.
- WebP as the secondary fallback for browsers without AVIF support.
- JPEG as the universal fallback for browsers without either.
The compression difference is substantial. AVIF typically produces files 20 to 50 percent smaller than WebP at equivalent quality, and 50 to 70 percent smaller than JPEG. For an image-heavy page, switching from JPEG to AVIF can cut image bytes by half or more, and because images make up so much of total page weight on content sites, the page as a whole gets noticeably lighter. That reduction translates into faster LCP and better Core Web Vitals scores, which in turn tends to lower bounce rates.
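To make the percentages concrete, here is a small arithmetic sketch. It uses the 50 to 70 percent JPEG-to-AVIF saving quoted above; the page size and image share in the usage note are made-up inputs, not measurements.

```python
def avif_size_range(jpeg_kb, low_saving=0.50, high_saving=0.70):
    """Return (smallest, largest) expected AVIF size in KB for a JPEG of jpeg_kb."""
    return (jpeg_kb * (1 - high_saving), jpeg_kb * (1 - low_saving))


def page_weight_after(page_kb, image_share, saving):
    """Page weight in KB after shrinking only the image bytes by `saving`.

    image_share is the fraction of the page made up of images (0.5 = half).
    """
    image_kb = page_kb * image_share
    return page_kb - image_kb * saving
```

Under these assumptions a 400 KB JPEG lands somewhere between roughly 120 and 200 KB as AVIF, and a hypothetical 2,000 KB page that is half images drops to roughly 1,300 to 1,500 KB. The higher the image share, the closer the total saving gets to the per-image saving.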
PNG should be reserved for what it actually does well: logos, icons, UI elements with transparency, and very simple graphics. Using PNG for photographs is a waste of bandwidth that AVIF or WebP would solve immediately. Modern PNG usage has dropped to roughly 12 percent of image bytes and that share continues to shrink.
The implementation gotcha is that you cannot serve AVIF to all browsers and expect it to work. About 7 percent of users will hit a fallback path. The picture element is the standard mechanism for this.
Picture Element Fallback That Just Works
The picture element is the HTML standard for serving different image formats and sizes based on browser support. The pattern that handles AVIF, WebP, and JPEG fallback in one block:
<picture>
  <source srcset="hero.avif" type="image/avif">
  <source srcset="hero.webp" type="image/webp">
  <img src="hero.jpg" alt="Descriptive alt text here"
       width="1200" height="630" loading="eager" fetchpriority="high">
</picture>
The browser tries each source in order. AVIF support? Serve AVIF. No AVIF but WebP? Serve WebP. Neither? Fall back to the img tag's JPEG src. The img tag is required as the actual rendered element; the source tags only inform which file to fetch.
Two groups of attributes on the img tag matter for Core Web Vitals. The width and height attributes prevent CLS (Cumulative Layout Shift) because the browser reserves the correct space before the image loads. The loading="eager" and fetchpriority="high" attributes on the LCP image tell the browser to prioritize this download over other resources.
For images below the fold, the pattern simplifies because lazy loading is acceptable:
<picture>
  <source srcset="content.avif" type="image/avif">
  <source srcset="content.webp" type="image/webp">
  <img src="content.jpg" alt="Descriptive alt text"
       width="800" height="500" loading="lazy">
</picture>
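If you are templating this markup by hand rather than through a framework component, both patterns can come from one helper. This is a minimal sketch: the function name is hypothetical, and it assumes the three files share a base name (hero.avif, hero.webp, hero.jpg).

```python
from html import escape


def build_picture(base, alt, width, height, is_lcp=False):
    """Return <picture> markup with AVIF, WebP, and JPEG fallback.

    `base` is the file path without extension, e.g. "hero" gives hero.avif,
    hero.webp, and hero.jpg (a naming assumption made for this sketch).
    """
    if is_lcp:
        # The LCP image is fetched eagerly and at high priority.
        loading = 'loading="eager" fetchpriority="high"'
    else:
        # Everything else is deferred with native lazy loading.
        loading = 'loading="lazy"'
    src = escape(base)
    return (
        "<picture>\n"
        f'  <source srcset="{src}.avif" type="image/avif">\n'
        f'  <source srcset="{src}.webp" type="image/webp">\n'
        f'  <img src="{src}.jpg" alt="{escape(alt)}"\n'
        f'       width="{int(width)}" height="{int(height)}" {loading}>\n'
        "</picture>"
    )
```

build_picture("hero", "Descriptive alt text here", 1200, 630, is_lcp=True) reproduces the hero pattern, and leaving is_lcp at its default reproduces the below-fold pattern. Escaping the alt text keeps a stray quotation mark in a description from breaking the attribute.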
Most modern frameworks generate this markup automatically via image components (Next.js Image, Astro Image, etc.). Verifying that your framework produces the picture element with all three formats is a 5-minute check worth doing today.
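One way to run that check is to save the rendered HTML of a page and scan it. The sketch below flags any picture element that lacks an AVIF source, a WebP source, or an img fallback. It looks only at the type attribute of each source, so markup that omits type will be reported as incomplete; the names are hypothetical.

```python
from html.parser import HTMLParser


class PictureFormatCheck(HTMLParser):
    """Collect the source types and fallback img of each <picture>."""

    def __init__(self):
        super().__init__()
        self.pictures = []    # one dict per completed <picture>
        self._current = None  # the <picture> currently being read, if any

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "picture":
            self._current = {"types": [], "fallback": None}
        elif self._current is not None and tag == "source":
            self._current["types"].append(attributes.get("type"))
        elif self._current is not None and tag == "img":
            self._current["fallback"] = attributes.get("src")

    def handle_endtag(self, tag):
        if tag == "picture" and self._current is not None:
            self.pictures.append(self._current)
            self._current = None


def incomplete_pictures(html):
    """Return pictures missing an AVIF source, a WebP source, or an <img>."""
    parser = PictureFormatCheck()
    parser.feed(html)
    return [
        picture for picture in parser.pictures
        if "image/avif" not in picture["types"]
        or "image/webp" not in picture["types"]
        or not picture["fallback"]
    ]
```

An empty list means every picture element on the page offers all three formats.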
For the deeper context on how images intersect with technical SEO, the best image SEO tools roundup covers the tooling that automates the format conversion at build time.
LCP Image Rules (Never Lazy Load the Hero)
The single most common image SEO mistake in 2026 audits is lazy loading the LCP image. Native lazy loading via loading="lazy" is a standard HTML attribute that tells the browser to defer an image download until the image is close to the viewport. For below-fold images, this is exactly what you want. For the LCP image (almost always the hero image), it adds 200 to 800 milliseconds of delay because the browser does not start fetching until after the page layout is calculated.
The correct rules for the LCP image:
- loading="eager" (or omit loading entirely, which defaults to eager).
- fetchpriority="high" to tell the browser this is the most important resource.
- Inline width and height attributes to prevent CLS.
- Served from the same origin or a fast CDN (not a third-party domain with extra DNS lookup).
- Preconnect or preload hints in the document head if the image is on a separate domain.
For the rest of the images on the page, the inverse rules apply:
- loading="lazy" to defer fetching until needed.
- fetchpriority="low" if you want to be explicit (default behavior is fine in most cases).
- Width and height attributes still required for CLS prevention.
The performance impact of getting these two patterns right is substantial. On a 4G mobile connection, the LCP for a properly optimized image is typically 1.2 to 1.8 seconds. The same image lazy loaded comes in at 2.0 to 2.6 seconds, which is the difference between a green LCP score and a yellow or red one.
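The LCP rules lend themselves to an automated check. The sketch below treats the first img in document order as the likely LCP image, which is only a heuristic: the real LCP element should be confirmed with a lab or field measurement. Function and class names are hypothetical.

```python
from html.parser import HTMLParser


class FirstImage(HTMLParser):
    """Capture the attributes of the first <img> in document order."""

    def __init__(self):
        super().__init__()
        self.attributes = None

    def handle_starttag(self, tag, attrs):
        if tag == "img" and self.attributes is None:
            self.attributes = dict(attrs)


def lcp_image_problems(html):
    """List rule violations on the first image (assumed to be the LCP image)."""
    parser = FirstImage()
    parser.feed(html)
    img = parser.attributes
    if img is None:
        return ["no <img> found"]
    problems = []
    if (img.get("loading") or "").lower() == "lazy":
        problems.append('loading="lazy" on the likely LCP image')
    if (img.get("fetchpriority") or "").lower() != "high":
        problems.append('fetchpriority="high" is missing')
    if not img.get("width") or not img.get("height"):
        problems.append("width/height missing")
    return problems
```

An empty list means the first image follows all three markup rules; the origin and preconnect rules still need a manual look at the document head.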
For a deeper dive into how images interact with the broader Core Web Vitals story, Google's web.dev LCP guidance is the canonical reference.
Lazy Load Everything Else With Native Loading Attribute
Below-fold images should all use loading="lazy". Native browser lazy loading has been supported in Chrome, Edge, Firefox, and Safari for years now and the implementation is reliable. There is no longer a reason to load a lazy-loading JavaScript library unless you need scroll-triggered animations beyond image loading.
The pattern is simple:
<img src="content-image.jpg"
loading="lazy"
width="800"
height="500"
alt="Descriptive alt text">
The width and height attributes remain critical even with lazy loading because they reserve layout space. Without them, the page reflows as each image loads, hurting CLS scores.
Lazy loading is also worth applying to iframes (loading="lazy" works on iframes too). Embedded YouTube players, social media widgets, maps, and any other iframe-based content can be deferred until needed, saving meaningful bytes on initial page load.
The exception to the lazy loading rule (besides the LCP image) is any image that appears within the first viewport on the most common device size. For most sites, this is 1 to 3 images. Loading these eagerly ensures they paint at the same time as the LCP and reduces the visual jank of images popping in as the user starts to scroll.
To verify the lazy loading is working, open Chrome DevTools, Network tab, and reload the page. The lazy loaded images should not appear in the initial request waterfall. They should only appear as you scroll down the page.
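The DevTools check can be complemented by a static scan of the markup. This sketch assumes the first eager_count images sit in the first viewport (a guess you should adjust per template) and reports every later img or iframe that lacks loading="lazy", plus any img without width and height. All names are hypothetical.

```python
from html.parser import HTMLParser


class LoadingAudit(HTMLParser):
    """Flag images and iframes that break the lazy-loading rules."""

    def __init__(self, eager_count):
        super().__init__()
        self.eager_count = eager_count  # images assumed to be in the first viewport
        self.images_seen = 0
        self.issues = []                # (tag, src, problem) tuples

    def handle_starttag(self, tag, attrs):
        if tag not in ("img", "iframe"):
            return
        attributes = dict(attrs)
        src = attributes.get("src", "")
        if tag == "img":
            self.images_seen += 1
            if not attributes.get("width") or not attributes.get("height"):
                self.issues.append((tag, src, "missing width/height"))
            if self.images_seen <= self.eager_count:
                return  # first-viewport images are allowed to load eagerly
        if (attributes.get("loading") or "").lower() != "lazy":
            self.issues.append((tag, src, 'missing loading="lazy"'))


def audit_loading(html, eager_count=1):
    """Return a list of (tag, src, problem) tuples for the given HTML."""
    audit = LoadingAudit(eager_count)
    audit.feed(html)
    return audit.issues
```

Raising eager_count to 2 or 3 matches pages where several images share the first viewport.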
Image Sitemap and Indexing for Google Images
Image sitemaps are an underused mechanism for telling Google about the images on your site. Google can discover images through normal crawling, but an image sitemap helps it find images it might otherwise miss, such as those loaded by JavaScript. Note that Google deprecated the optional caption, title, geo-location, and license tags in 2022, so the only image tags it still reads are image:image and image:loc. Context for each image now comes from the page itself: alt text, visible captions, and structured data.
The XML format is an extension of the standard sitemap protocol:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/blog/post-name</loc>
    <image:image>
      <image:loc>https://example.com/images/hero.avif</image:loc>
    </image:image>
  </url>
</urlset>
Most CMS platforms generate image sitemaps automatically via plugins or built-in modules. If your CMS does not, generating one manually for the top 50 image-heavy pages is a single afternoon's work and worth doing.
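For the manual route, a short script is usually easier than hand-writing XML. The sketch below builds a sitemap from a mapping of page URLs to image URLs using Python's standard library. The function name and the input shape are assumptions for this example, and it emits only the page location and image location for each entry.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """Build sitemap XML from a dict of {page_url: [image_url, ...]}."""
    # Register prefixes so the output uses xmlns and xmlns:image.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    return xml_bytes.decode("utf-8")
```

Writing the returned string to a file such as image-sitemap.xml (a hypothetical name) and submitting it in Search Console completes the setup.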
The traffic from Google Images is meaningful for image-rich verticals (recipes, design portfolios, ecommerce, real estate, travel). For text-heavy blogs, image traffic is typically 1 to 5 percent of total traffic, but the absolute volume can still be significant on high-traffic sites.
The 2026 development worth noting is that AI Overviews increasingly include image results in multimodal answers. A query like "what does a Tudor-style house look like" surfaces an AI Overview with both text and image citations. Pages that maintain image sitemaps and use descriptive alt text get cited at noticeably higher rates in these multimodal Overviews.
For more on how AI search engines surface visual content, the answer engine optimization guide covers the broader citation mechanics.
Visual Search and the Multimodal AI Overview Bonus
Visual search is the use case where a user searches with an image instead of text. Google Lens is the most prominent implementation, but Pinterest Visual Search, Bing Visual Search, and increasingly the multimodal modes of ChatGPT and Claude all process images as queries.
Optimizing for visual search is a small additional layer on top of standard image SEO. The factors that improve visual search ranking:
- High image quality (sharp focus, well-lit subject).
- Descriptive alt text and surrounding caption text.
- Structured data on the page (Product schema, Recipe schema, Article schema).
- Original photos rather than stock (originality is a signal).
- Proper format and compression (slow-loading images are deprioritized).
The multimodal AI Overview bonus is the newer factor. When AI engines respond to a query with a visual component, they cite pages that have both relevant text and image content. Pages with only text get fewer multimodal citations. Pages with only images get fewer text-extraction citations. Pages with both, especially when the image is genuinely useful and well-described, win in both citation tracks.
For ecommerce sites, visual search is increasingly important because customers search with product photos. A user might photograph a couch they like and search for similar styles. Product pages with rich image catalogs and product schema rank in these visual search results. Pages with one tiny thumbnail per product do not.
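Structured data is the piece of this list that is easiest to generate programmatically. The sketch below produces a minimal schema.org Product block in JSON-LD with several image URLs. The function name and sample values are hypothetical, and Google's rich-result requirements may call for additional properties (such as offers), so check the current documentation before shipping.

```python
import json


def product_json_ld(name, image_urls, description):
    """Return a JSON-LD <script> block describing a product and its images."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        # Listing several angles gives visual search more to match against.
        "image": list(image_urls),
        "description": description,
    }
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )
```

The returned string can be placed in the product page template alongside the visible image gallery, so the markup and the on-page images describe the same catalog.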
File Naming Conventions That Help the Crawler
File naming is a minor signal but worth getting right because it costs nothing. The standard pattern:
- Lowercase only.
- Hyphens between words, not underscores.
- Descriptive of the content, not generic.
- Include the primary subject and any relevant qualifier.
A good file name: running-shoes-nike-air-zoom-pegasus-side-view.avif
A bad file name: IMG_2384.jpg
A keyword-stuffed file name (also bad): best-running-shoes-best-nike-shoes-buy-shoes.jpg
The file name appears in the image URL and in image search results when a thumbnail is displayed. Descriptive file names slightly improve image search CTR and provide a tiny signal to image understanding models. The signal is small. Do not over-engineer this.
Folder structure also matters mildly. Organizing images by content type or page rather than by upload date helps with site architecture clarity. /images/blog/article-name/hero.avif is cleaner than /uploads/2026/05/12345_hash.avif.
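The naming rules are mechanical enough to automate at upload time. This is a minimal sketch with a hypothetical function name; it turns a short human description into a lowercase, hyphenated, ASCII file name and leaves the choice of description to the editor.

```python
import re
import unicodedata


def image_file_name(description, extension="avif"):
    """Turn a short description into a lowercase, hyphenated file name."""
    # Strip accents so the URL stays plain ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", description)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of spaces, underscores, or punctuation into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
    return f"{slug}.{extension}"
```

For example, image_file_name("Running Shoes: Nike Air Zoom Pegasus (side view)") returns "running-shoes-nike-air-zoom-pegasus-side-view.avif". The helper does not stop keyword stuffing; a stuffed description still produces a stuffed file name.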
Astro SEO Blog has consistently found that the sites with the strongest image SEO are not necessarily the ones with the most images. They are the ones who treat each image as a deliberate asset with proper formatting, accessibility-first alt text, and an LCP-aware loading strategy. That discipline compounds across hundreds of images and the rankings reflect it.
For the broader context of how images fit into a technical SEO audit, the technical SEO audit primer covers the audit cadence, and the "What Is Technical SEO" explainer covers the foundational concepts. External references worth keeping bookmarked: Google Search Central's image SEO documentation for official guidance, and web.dev's image optimization guide for the performance side.
FAQ
What is image SEO?
Image SEO is the practice of optimizing images for both search engines and users. It includes accessibility-first alt text, modern format delivery (AVIF, WebP), LCP-aware loading, structured data, image sitemaps, and descriptive file naming.
Is AVIF better than WebP for SEO in 2026?
Yes, when the user's browser supports it. AVIF produces files 20 to 50 percent smaller than WebP at equivalent quality, which improves LCP and Core Web Vitals. The picture element handles browsers without AVIF support by falling back to WebP or JPEG.
Should I write alt text for keywords or for users?
For users, specifically for screen reader users. Alt text written as keyword stuffing is hostile to accessibility and ranks worse than descriptive, accurate alt text. Treat alt text as accessibility infrastructure first, SEO benefit second.
Why should I never lazy load the LCP image?
Native lazy loading defers image downloads until the image is near the viewport. For the LCP image (almost always the hero), this adds 200 to 800 ms of delay because the browser does not start fetching until after layout calculation. Use loading="eager" and fetchpriority="high" on the LCP image.
How big should images be for web?
The right size matches the largest dimension the image will actually display at. A hero image displayed at 1200px wide should not be 4000px wide. Use srcset to serve different sizes for different viewports. AVIF or WebP format with quality around 75 to 85 is the sweet spot for most photographs.
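As a sketch of how the srcset value might be generated, the helper below builds the string from a list of widths. It assumes resized files are named base-width.extension (for example hero-800.avif) and that candidates up to twice the display width are worth keeping for high-density screens; both are assumptions for this example, not rules from the text above.

```python
def widths_up_to(display_width, steps=(400, 800, 1200, 1600, 2400)):
    """Keep candidate widths no larger than 2x the display width."""
    return [w for w in steps if w <= display_width * 2]


def build_srcset(base, extension, widths):
    """Return a srcset value such as "hero-400.avif 400w, hero-800.avif 800w"."""
    return ", ".join(f"{base}-{w}.{extension} {w}w" for w in sorted(widths))
```

For a hero displayed at 1200px, build_srcset("hero", "avif", widths_up_to(1200)) lists five candidates from 400w to 2400w, and nothing close to 4000px is ever generated. The result goes into the srcset attribute of a source or img tag, paired with a sizes attribute that describes the layout.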
Do image sitemaps still matter in 2026?
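Yes. Image sitemaps help Google discover images it might otherwise miss. The optional caption, geo-location, and license tags were deprecated in 2022, so the sitemap itself carries only image locations and the surrounding context has to come from the page. Maintained image sitemaps also increasingly correlate with citation rates in multimodal AI Overview results.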
What is the picture element used for?
The picture element is the HTML standard for serving different image formats and sizes based on browser support. It lets you offer AVIF first, WebP as fallback, and JPEG as the universal fallback in a single declarative block.
Does Google penalize sites with stock photos?
Not directly, but original photos consistently outperform stock photos in image search ranking and increasingly in visual search and multimodal Overviews. Original imagery is treated as an Experience signal under the E-E-A-T framework, which makes it a measurable ranking factor in 2026.