Semantic HTML
How semantic HTML elements help AI agents understand your content structure.
2025-02-15
Semantic HTML means using the right HTML element for the right job. Instead of building everything with <div> and <span>, you use elements that carry inherent meaning. This helps AI agents — and search engines, and screen readers — understand your content structure without guessing.
Why semantics matter for agents
AI crawlers parse your HTML to extract:
- What kind of content this is (article, product, navigation…)
- What the main content is vs. sidebar/footer
- How content is organized (headings, sections, lists)
- What's important vs. supplementary
Semantic elements state this information directly, so an agent does not have to infer it from class names or visual positioning.
Essential semantic elements
Document structure
<header> <!-- Site or section header, logo, nav -->
<nav> <!-- Navigation links -->
<main> <!-- The main content of the page (one per page) -->
<article> <!-- Self-contained content: blog post, news item -->
<section> <!-- Thematic grouping within a page -->
<aside> <!-- Secondary content: sidebar, related links -->
<footer> <!-- Site or section footer -->
Use one <main> per page: AI agents use <main> to locate the primary content and skip navigation, ads, and sidebars.
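To make the role of <main> concrete, here is a minimal sketch of how a crawler might isolate the primary content. It uses Python's standard html.parser module; the class and function names (MainTextExtractor, extract_main_text) are hypothetical, and a real agent would likely use a more forgiving parser and handle hidden or nested regions.

```python
from html.parser import HTMLParser


class MainTextExtractor(HTMLParser):
    """Collects text that appears inside <main> and ignores everything else."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # greater than 0 while the parser is inside <main>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "main" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth > 0 and text:
            self.chunks.append(text)


def extract_main_text(html: str) -> str:
    """Hypothetical helper: returns the visible text found inside <main>."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = """
<header><nav><a href="/">Home</a></nav></header>
<main><article><h1>Page Title</h1><p>Body text.</p></article></main>
<footer>Copyright notice</footer>
"""
print(extract_main_text(page))  # Page Title Body text.
```

On a page built only from <div> elements, the same function returns an empty string, which is the guessing problem semantic markup avoids.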
Headings
<h1>Page title (one per page)</h1>
<h2>Major section</h2>
<h3>Subsection</h3>
<h4>Sub-subsection</h4>
Rules:
- Exactly one <h1> per page: the page title.
- Never skip levels (don't jump from <h2> to <h4>).
- Use headings for structure, not for visual styling; use CSS for size.
AI agents use headings to understand the document outline and to chunk content when processing long pages.
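The heading rules above can be checked mechanically. The sketch below, again using only Python's standard library, builds a flat outline of (level, text) pairs and reports the two violations named in the rules. The names OutlineParser, heading_outline, and heading_problems are illustrative, not part of any real tool.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineParser(HTMLParser):
    """Records (level, text) for every heading in document order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_outline(html):
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


def heading_problems(outline):
    """Returns readable descriptions of rule violations in an outline."""
    problems = []
    h1_count = sum(1 for level, _ in outline if level == 1)
    if h1_count != 1:
        problems.append(f"expected exactly one <h1>, found {h1_count}")
    previous = 0
    for level, text in outline:
        # A heading may go at most one level deeper than the one before it.
        if level > previous + 1:
            problems.append(f"skipped level before <h{level}> {text!r}")
        previous = level
    return problems


doc = "<h1>Guide</h1><h2>Setup</h2><h4>Flags</h4>"
outline = heading_outline(doc)
print(outline)                    # [(1, 'Guide'), (2, 'Setup'), (4, 'Flags')]
print(heading_problems(outline))  # ["skipped level before <h4> 'Flags'"]
```

The same outline is a natural basis for chunking: each heading starts a new chunk, and its level says which parent chunk it belongs to.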
Lists
Use lists for enumerated items, not paragraphs:
<!-- Unordered list: items without sequence -->
<ul>
<li>First point</li>
<li>Second point</li>
</ul>
<!-- Ordered list: steps or ranked items -->
<ol>
<li>Install the package</li>
<li>Configure the settings</li>
<li>Deploy</li>
</ol>
<!-- Description list (formerly "definition list"): terms and their descriptions -->
<dl>
<dt>GEO</dt>
<dd>Generative Engine Optimization — the practice of optimizing for AI agents.</dd>
</dl>
<dl> (called a description list in current HTML, formerly a definition list) is particularly useful for GEO: it explicitly marks up term-definition pairs that AI agents can extract cleanly.
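As an illustration of what "extract cleanly" can mean, the following sketch pulls (term, description) pairs out of a <dl>. It assumes every <dt> and <dd> has an explicit closing tag, as in the example above; HTML allows those end tags to be omitted, which this simplified parser does not handle. DefinitionParser and extract_definitions are hypothetical names.

```python
from html.parser import HTMLParser


class DefinitionParser(HTMLParser):
    """Pairs each <dt> term with the <dd> descriptions that follow it."""

    def __init__(self):
        super().__init__()
        self.pairs = []        # list of (term, description) tuples
        self._current = None   # "dt" or "dd" while inside one of them
        self._buffer = []
        self._term = None

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current:
            return  # an inline tag such as </em>, or a tag outside dt/dd
        text = " ".join("".join(self._buffer).split())  # collapse whitespace
        if tag == "dt":
            self._term = text
        elif self._term is not None:
            self.pairs.append((self._term, text))
        self._current = None


def extract_definitions(html):
    parser = DefinitionParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


snippet = "<dl><dt>GEO</dt><dd>Generative <em>Engine</em> Optimization</dd></dl>"
print(extract_definitions(snippet))  # [('GEO', 'Generative Engine Optimization')]
```

The same pairs written as "<p><b>GEO</b>: ...</p>" would need a heuristic to split term from description.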
Time and dates
<time datetime="2025-01-15">January 15, 2025</time>
<time datetime="2025-01-15T09:00:00Z">9:00 AM UTC</time>
The datetime attribute provides a machine-readable date. AI agents use this to understand the freshness of content.
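A short sketch of how a freshness check might read these values follows. It handles only the two forms shown above (a date, and a date-time with a "Z" suffix); the datetime attribute also permits other forms, such as durations, which would raise ValueError here. Treating offset-less values as UTC is an assumption made for the example, and TimeParser, parse_datetime_attr, and newest_timestamp are hypothetical names.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser


class TimeParser(HTMLParser):
    """Collects the datetime attribute of every <time> element."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == "time":
            value = dict(attrs).get("datetime")
            if value:
                self.values.append(value)


def parse_datetime_attr(value):
    """Parses a date or date-time string into a timezone-aware datetime."""
    # Older Python versions reject a trailing "Z", so normalise it first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: values without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def newest_timestamp(html):
    """Returns the most recent <time datetime> value, or None if there is none."""
    parser = TimeParser()
    parser.feed(html)
    parser.close()
    stamps = [parse_datetime_attr(v) for v in parser.values]
    return max(stamps) if stamps else None


sample = (
    '<time datetime="2025-01-15">January 15, 2025</time>'
    '<time datetime="2025-01-15T09:00:00Z">9:00 AM UTC</time>'
)
print(newest_timestamp(sample))  # 2025-01-15 09:00:00+00:00
```

Without the attribute, the agent would have to parse "January 15, 2025" or "9:00 AM UTC" as free text, with all the locale ambiguity that implies.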
Inline semantics
<strong>important text</strong> <!-- Strong importance -->
<em>emphasized text</em> <!-- Emphasis -->
<abbr title="Generative Engine Optimization">GEO</abbr> <!-- Abbreviation with expansion -->
<code>robots.txt</code> <!-- Code or technical term -->
<cite>Title of Work</cite> <!-- Title of a cited work (not a person's name) -->
<q cite="https://source.com">Quoted text</q> <!-- Inline quotation -->
<abbr> with a title attribute is especially useful for GEO — it expands acronyms that AI agents might encounter without context.
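One way an agent could use this is to build a small glossary of acronyms while reading the page. The sketch below does that with Python's standard library; AbbrParser and build_glossary are hypothetical names, and <abbr> elements without a title attribute are simply skipped.

```python
from html.parser import HTMLParser


class AbbrParser(HTMLParser):
    """Maps the text of each <abbr> to the expansion in its title attribute."""

    def __init__(self):
        super().__init__()
        self.glossary = {}
        self._title = None   # title of the <abbr> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "abbr":
            self._title = dict(attrs).get("title")
            self._text = []

    def handle_data(self, data):
        if self._title is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "abbr" and self._title is not None:
            self.glossary["".join(self._text).strip()] = self._title
            self._title = None


def build_glossary(html):
    parser = AbbrParser()
    parser.feed(html)
    parser.close()
    return parser.glossary


text = '<p><abbr title="Generative Engine Optimization">GEO</abbr> builds on <abbr>SEO</abbr>.</p>'
print(build_glossary(text))  # {'GEO': 'Generative Engine Optimization'}
```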
Anti-patterns to avoid
<!-- Bad: non-semantic div soup -->
<div class="header">
<div class="menu">...</div>
</div>
<div class="content">
<div class="title">Page Title</div>
<div class="body">...</div>
</div>
<!-- Good: semantic markup -->
<header>
<nav>...</nav>
</header>
<main>
<article>
<h1>Page Title</h1>
<p>...</p>
</article>
</main>
Practical checklist
- One <h1> per page matching the <title>.
- Headings form a logical hierarchy (h1 → h2 → h3).
- <main> wraps the primary content.
- <nav> is used for navigation blocks.
- <article> wraps self-contained content.
- Lists use <ul>, <ol>, or <dl>, not <p> tags.
- Dates use <time datetime="...">.
- Technical terms are wrapped in <code>.
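Several of the checklist items can be verified automatically. The following is a rough sketch of such an audit, not a complete linter: it only counts tags, so it cannot tell whether <main> really wraps the primary content or whether the <h1> matches the <title>. The names TagCounter and audit, and the keys of the returned dictionary, are made up for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts start tags and notes <time> elements lacking a datetime attribute."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.time_without_datetime = 0

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        if tag == "time" and not dict(attrs).get("datetime"):
            self.time_without_datetime += 1


def audit(html):
    """Returns a pass/fail flag for each checklist item this sketch can test."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    tags = counter.tags
    return {
        "one_h1": tags["h1"] == 1,
        "has_main": tags["main"] >= 1,
        "has_nav": tags["nav"] >= 1,
        "times_have_datetime": counter.time_without_datetime == 0,
    }


div_soup = '<div class="header"><div class="menu">...</div></div><div class="title">Page Title</div>'
semantic = "<header><nav>...</nav></header><main><article><h1>Page Title</h1><p>...</p></article></main>"

print(audit(div_soup))
# {'one_h1': False, 'has_main': False, 'has_nav': False, 'times_have_datetime': True}
print(audit(semantic))
# {'one_h1': True, 'has_main': True, 'has_nav': True, 'times_have_datetime': True}
```

Running a check like this in a build step is one way to keep templates from drifting back toward the div-only pattern.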
Semantic HTML is the foundation of accessibility and agent-readability. It costs nothing extra and pays dividends across SEO, GEO, and screen reader compatibility.