How to Create llms.txt: A Comprehensive Guide

Creating an llms.txt file is a crucial step for making your website more accessible and understandable to Large Language Models (LLMs). This guide covers everything you need to know about the llms.txt standard, from its purpose and structure to step-by-step creation, automation, and best practices.

What is llms.txt?

The llms.txt file is a standardized Markdown document placed in the root directory of your website (e.g., https://yourwebsite.com/llms.txt). Its main purpose is to provide LLMs with a concise, structured overview of your site's content.

It enables them to understand, navigate, and retrieve information more effectively. It serves a similar function to robots.txt and sitemap.xml, but is specifically tailored for AI systems rather than search engines or crawlers.

Why Use llms.txt?

  • Optimized for LLMs: Provides LLMs with a clear, easily digestible summary of your website, improving their ability to answer questions and generate content based on your site.
  • Reduces Ambiguity: Strips away navigation, ads, and scripts, focusing on core content.
  • Improves AI Accuracy: Structured data leads to more accurate and context-aware AI responses.
  • Easy Maintenance: Can be updated manually or generated automatically for dynamic sites.
  • Standardized Format: Ensures consistency across the web, making it easier for LLMs to process information from different sites.

llms.txt vs. llms-full.txt

| File Name | Purpose | Content Level |
| --- | --- | --- |
| llms.txt | Concise summary, key links, and essential documentation | High-level |
| llms-full.txt | Comprehensive, full documentation in Markdown format | Detailed/full |

  • llms.txt: Ideal for quick AI context and navigation.
  • llms-full.txt: Used when AIs need the entire documentation set for in-depth tasks.

llms.txt File Structure

The llms.txt file follows a specific Markdown-based structure, making it both human- and machine-readable:

  1. H1 Header: Project or website name.
  2. Blockquote: Brief summary of the website or project.
  3. (Optional) Paragraphs: Additional context or important notes.
  4. H2 Sections: Lists of core documentation or resources, with Markdown links and optional descriptions.
  5. Optional Section: Secondary resources that can be skipped if brevity is needed.

Example llms.txt

# MyWebsite Docs

> MyWebsite is a platform offering e-commerce solutions, including product automation and AI-driven analytics.

This website contains extensive documentation on our API, automation tools, and AI-powered insights.

## Core Documentation

## Optional
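The example above shows the section headings but not the link entries that sit under them. As a rough sketch of the full layout described earlier (H1 header, blockquote summary, optional paragraph, H2 sections of Markdown links), the following Python snippet renders a complete file from structured data. The `Link` and `Section` classes, the `render_llms_txt` function, and the page URLs are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    description: str = ""  # optional note shown after the link


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(name: str, summary: str, context: str, sections: list[Section]) -> str:
    """Render the H1 / blockquote / paragraph / H2-list layout as Markdown."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    if context:
        lines += [context, ""]
    for section in sections:
        lines += [f"## {section.heading}", ""]
        for link in section.links:
            entry = f"- [{link.title}]({link.url})"
            if link.description:
                entry += f": {link.description}"
            lines.append(entry)
        lines.append("")
    # Exactly one trailing newline at the end of the file.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical pages, used only to illustrate the link-list format.
example = render_llms_txt(
    name="MyWebsite Docs",
    summary="MyWebsite is a platform offering e-commerce solutions, "
            "including product automation and AI-driven analytics.",
    context="This website contains extensive documentation on our API, "
            "automation tools, and AI-powered insights.",
    sections=[
        Section("Core Documentation", [
            Link("API Reference", "https://yourwebsite.com/docs/api.md",
                 "Endpoints and authentication"),
        ]),
        Section("Optional", [Link("Blog", "https://yourwebsite.com/blog.md")]),
    ],
)
print(example)
```

Keeping the content in plain data structures like this makes it easy to regenerate the file whenever pages are added or renamed.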

Step-by-Step: How to Create llms.txt

1. Plan Your Content

  • Identify the most important sections and documents on your site.
  • Prioritize core documentation, guides, and references.

2. Create the Markdown File

  • Open a text editor and create a new file named llms.txt.

a. Add the H1 Header

# MyWebsite Docs

b. Write a Blockquote Summary

> MyWebsite is a platform offering e-commerce solutions, including product automation and AI-driven analytics.

c. Add Additional Context (Optional)

This website contains extensive documentation on our API, automation tools, and AI-powered insights.

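d. List Core Documentation in H2 Sections

## Core Documentation

Under the heading, list each resource as a Markdown link, optionally followed by a short description.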


e. Add an Optional Section

## Optional

3. Place the File in Your Website Root

  • Upload llms.txt to your website’s root directory so it is accessible at https://yourwebsite.com/llms.txt.

4. Validate Formatting

  • Ensure all links are correct.
  • Use Markdown syntax for headers, blockquotes, and lists.
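These formatting checks can be partly automated. The sketch below is one possible validator, not an official tool: it assumes the layout described in this guide (H1 first, then a blockquote summary, then H2 sections whose list items are Markdown links). The function name `validate_llms_txt` and the exact rules are illustrative assumptions.

```python
import re

# A list item of the form "- [Title](url)" with an optional ": description".
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\([^)\s]+\)(: .+)?$")


def validate_llms_txt(text: str) -> list[str]:
    """Return a list of formatting problems; an empty list means none were found."""
    problems = []
    lines = [line.rstrip() for line in text.splitlines()]
    content = [line for line in lines if line]  # non-empty lines only

    if not content or not content[0].startswith("# "):
        problems.append("first non-empty line should be an H1 header ('# Name')")
    if len(content) < 2 or not content[1].startswith("> "):
        problems.append("H1 should be followed by a blockquote summary ('> ...')")
    if not any(line.startswith("## ") for line in content):
        problems.append("no H2 sections ('## ...') found")

    # Inside H2 sections, every list item should be a Markdown link.
    in_section = False
    for number, line in enumerate(lines, start=1):
        if line.startswith("## "):
            in_section = True
        elif in_section and line.startswith("- ") and not LINK_ITEM.match(line):
            problems.append(f"line {number}: list item is not a Markdown link")
    return problems


sample = (
    "# MyWebsite Docs\n\n> Summary.\n\n## Core Documentation\n\n"
    "- [API](https://yourwebsite.com/docs/api.md): API guide\n- Blog\n"
)
for problem in validate_llms_txt(sample):
    print(problem)
```

The validator only inspects structure. It does not confirm that a link target actually resolves, so a manual or automated link check is still worthwhile.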

Automating llms.txt Generation

For large or frequently updated sites, automation is recommended.

Python Script Example

import os

docs_directory = "./docs"
llms_txt_path = "./llms.txt"

with open(llms_txt_path, "w", encoding="utf-8") as f:
    # H1 header and blockquote summary
    f.write("# MyWebsite Docs\n\n")
    f.write("> MyWebsite provides AI-powered tools for e-commerce.\n\n")

    # One Markdown link per .md file in the docs directory
    f.write("## Core Documentation\n\n")
    for doc in sorted(os.listdir(docs_directory)):
        if doc.endswith(".md"):
            doc_name = os.path.splitext(doc)[0].replace("-", " ").title()
            doc_url = f"https://mywebsite.com/docs/{doc}"
            f.write(f"- [{doc_name}]({doc_url})\n")

    # Secondary resources; replace these plain entries with Markdown links to the real pages
    f.write("\n## Optional\n\n")
    f.write("- Blog\n")
    f.write("- Customer Stories\n")

print("llms.txt generated successfully!")

Using Tools and APIs

Several tools can automate or assist in generating llms.txt:

  • Firecrawl LLMs.txt Generator: Crawls your site and generates both llms.txt and llms-full.txt via API [1].
  • Mintlify, llmstxt CLI: Tools for generating and validating llms.txt files.
  • Custom Scripts: Tailor scripts to your site’s structure and update frequency.

Best Practices and Tips

  • Keep It Concise: Focus on key resources and summaries.
  • Use Markdown: Ensure all formatting follows Markdown conventions for maximum compatibility.
  • Update Regularly: Reflect changes in your site’s structure or documentation.
  • Test Accessibility: Visit https://yourwebsite.com/llms.txt to confirm it’s publicly accessible.
  • Leverage the Optional Section: Place less critical information here to help LLMs prioritize core content.
  • Document All Key Resources: Include links to guides, API docs, FAQs, and other essential materials.
  • Avoid Overloading: Don’t include every page—curate for clarity and relevance.
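The accessibility check mentioned above can also be scripted. The following is a minimal sketch using only the Python standard library. It assumes that "accessible" means an HTTP 200 response with a non-empty body. The function name `llms_txt_is_reachable` and its `opener` parameter (added so the check can be exercised without a network connection) are illustrative choices.

```python
from urllib.request import urlopen


def llms_txt_is_reachable(base_url: str, opener=urlopen, timeout: float = 10.0) -> bool:
    """Return True if <base_url>/llms.txt answers with HTTP 200 and a non-empty body."""
    url = base_url.rstrip("/") + "/llms.txt"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200 and bool(response.read().strip())
    except (OSError, ValueError):
        # OSError covers URLError/HTTPError and timeouts;
        # ValueError covers malformed URLs such as a missing scheme.
        return False
```

Calling `llms_txt_is_reachable("https://yourwebsite.com")` after each deployment gives a quick signal that the file is still publicly served.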

Advanced: Creating llms-full.txt

For sites with extensive documentation, consider also providing an llms-full.txt file:

  • Purpose: Offers the complete documentation set in a single Markdown file.
  • Use Case: When LLMs or AI agents need the full context for complex queries or training.
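One simple way to produce such a file is to concatenate the site's Markdown sources. The sketch below assumes the documentation lives as .md files under a single directory. The function name `build_llms_full` and the choice to label each document with its relative path are illustrative assumptions, not part of any standard.

```python
from pathlib import Path


def build_llms_full(docs_dir: str, title: str, summary: str) -> str:
    """Concatenate every .md file under docs_dir into one Markdown document."""
    parts = [f"# {title}", "", f"> {summary}", ""]
    for path in sorted(Path(docs_dir).rglob("*.md")):
        body = path.read_text(encoding="utf-8").strip()
        # Label each document with its relative path so sources stay traceable.
        parts += [f"## {path.relative_to(docs_dir).as_posix()}", "", body, ""]
    return "\n".join(parts).rstrip() + "\n"
```

The result can then be written out with, for example, `Path("llms-full.txt").write_text(build_llms_full("./docs", "MyWebsite Docs", "Summary of the site."), encoding="utf-8")`. Note that headings inside the source files are copied unchanged, so documents that start with their own H1 may need their heading levels adjusted.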

Common Mistakes to Avoid

  • Incorrect Placement: The file must be served from the site root or from an explicitly specified subpath.
  • Poor Formatting: Inconsistent Markdown can confuse both humans and LLMs.
  • Broken Links: Double-check all URLs.
  • Redundant Content: Avoid duplicating information already presented in other sections.
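Some of these mistakes can be caught before publishing. The sketch below extracts Markdown link targets and reports two kinds of suspicious entries: targets that are not absolute http(s) URLs and targets listed more than once. Treating relative links as a problem is an assumption made for this example, and the function name `audit_links` is hypothetical. The check is purely textual and does not fetch any URL.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches "[Title](target)" and captures the title and the target.
MARKDOWN_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def is_absolute_http(url: str) -> bool:
    """True for URLs with an http or https scheme and a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def audit_links(text: str) -> dict[str, list[str]]:
    """Group suspicious link targets found in an llms.txt body."""
    urls = [url for _title, url in MARKDOWN_LINK.findall(text)]
    counts = Counter(urls)
    return {
        # Targets without a scheme and host cannot be fetched as-is.
        "not_absolute": sorted({u for u in urls if not is_absolute_http(u)}),
        # The same target listed more than once is redundant.
        "duplicates": sorted(u for u, n in counts.items() if n > 1),
    }


report = audit_links(
    "- [A](https://x.com/a.md)\n- [B](/b.md)\n- [A again](https://x.com/a.md)\n"
)
print(report)
```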

Real-World Examples

  • Open Source Projects: List core documentation, contribution guides, and API references.
  • SaaS Platforms: Highlight onboarding guides, API docs, and support resources.
  • Educational Sites: Link to course overviews, syllabi, and learning resources.

Conclusion

Implementing an llms.txt file is a straightforward yet powerful way to make your website more accessible to LLMs and AI agents. By following the standardized Markdown format, keeping the file up to date, and leveraging automation where possible, you help AI systems understand and use your content efficiently.

References

  1. llm.txt Explained: A Comprehensive Guide
  2. LLMs.txt Generator: A Quick and Complete Guide for AI