Getting Started with catalog - AI-Ready Documentation Indexer

From installation to your first AI-ready documentation index
Transform directories of Markdown and HTML files into structured llms.txt files for AI-powered workflows. The output follows the llms.txt standard and can be checked with the built-in --validate option.

Installation Options

Choose the installation method that works best for your environment

⚑

One-Line Install Script

    # Automatically detects your platform and installs the binary
    curl -fsSL https://raw.githubusercontent.com/fwdslsh/catalog/main/install.sh | bash

    # Verify installation
    catalog --version

    # Start indexing immediately
    catalog --validate

The install script automatically detects your platform (Linux, macOS, Windows) and downloads the appropriate binary. Zero dependencies required!
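Scripts that depend on the installed binary may want to confirm that it is actually reachable on the PATH before calling it. A minimal sketch in POSIX shell; the helper name require_tool is hypothetical and not part of catalog:

```shell
# require_tool NAME: succeed if NAME resolves to a command on PATH,
# otherwise print a hint to stderr and return 1.
# (Hypothetical helper for illustration; not shipped with catalog.)
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "error: '$1' not found on PATH; re-run the installer or adjust PATH" >&2
  return 1
}
```

A typical use would be `require_tool catalog && catalog --version` at the top of a build script.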

Install with Bun

    # Install globally with Bun
    bun install -g @fwdslsh/catalog

    # Or add to project
    bun add @fwdslsh/catalog

    # Run catalog
    catalog --help

Native Bun package for maximum performance and built-in optimizations. Requires Bun v1.0.0 or higher.

🐳

Docker Container

    # Pull latest image
    docker pull fwdslsh/catalog:latest

    # Run catalog with mounted directories
    # (paths are quoted so directories containing spaces still mount correctly)
    docker run --rm -v "$(pwd)/docs:/input" -v "$(pwd)/output:/output" \
      fwdslsh/catalog --input /input --output /output

    # Or use interactive shell
    docker run -it fwdslsh/catalog bash

Containerized for consistent environments and CI/CD pipelines. Perfect for automated documentation processing workflows.

Manual Download

    # Download from GitHub Releases
    # Visit: https://github.com/fwdslsh/catalog/releases

    # Linux
    wget https://github.com/fwdslsh/catalog/releases/latest/download/catalog-linux
    chmod +x catalog-linux
    ./catalog-linux --help

    # macOS
    curl -L -o catalog-mac https://github.com/fwdslsh/catalog/releases/latest/download/catalog-mac
    chmod +x catalog-mac
    ./catalog-mac --help

    # Windows
    # Download catalog-win.exe and run from command prompt

Download pre-built binaries directly. Each release includes binaries for Linux, macOS, and Windows with ARM64 support.

Create Your First Documentation Index

Learn the basics with hands-on examples

Basic Index Generation

Generate llms.txt from your current directory

    # Generate basic llms.txt in current directory
    catalog

    # Or specify input and output directories
    catalog --input docs --output build

Result: Creates llms.txt and llms-full.txt with proper H1 → blockquote → sections format.

🌐

SEO-Optimized Output

Generate with sitemaps and absolute URLs

    # Generate with sitemap and absolute URLs
    catalog --input docs --output build \
      --base-url https://docs.example.com \
      --sitemap

Result: Creates llms.txt with absolute links plus sitemap.xml for search engine optimization.
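To make the effect of --base-url concrete, the sketch below joins a base URL and a relative document path with exactly one slash between them. It only illustrates the expected shape of an absolute link; how catalog builds links internally is not documented here, and the helper name join_url is hypothetical:

```shell
# join_url BASE REL: print BASE and REL joined with exactly one slash.
# (Illustrative helper; not part of catalog.)
join_url() {
  base=${1%/}   # drop one trailing slash from the base, if present
  rel=${2#/}    # drop one leading slash from the relative path, if present
  printf '%s/%s\n' "$base" "$rel"
}

join_url https://docs.example.com api/authentication.md
# prints https://docs.example.com/api/authentication.md
join_url https://docs.example.com/ /getting-started.md
# prints https://docs.example.com/getting-started.md
```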

Validated Compliance

Ensure llms.txt standard compliance with validation

    # Generate and validate compliance
    catalog --input docs --output build \
      --validate \
      --index

Result: Validates H1 → blockquote → sections format and generates navigation metadata.

🎯

Optional Content Patterns

Mark supplementary content as optional

    # Mark draft content as optional
    catalog --input docs --output build \
      --optional "drafts/**/*" \
      --optional "**/CHANGELOG.md"

Result: Creates llms-ctx.txt without optional content for context-limited AI systems.

Understanding Output Files

Learn what catalog generates and how to use each format

llms.txt (Structured Index)

Standard-compliant structured index with H1 → blockquote → sections format:

    # Documentation Project

    > Complete API and user guide documentation

    ## Core Documentation

    - [index.md](index.md) - Project overview and introduction
    - [getting-started.md](getting-started.md) - Quick start guide
    - [tutorial.md](tutorial.md) - Step-by-step tutorial

    ## API Reference

    - [api/authentication.md](api/authentication.md) - Authentication methods
    - [api/endpoints.md](api/endpoints.md) - API endpoints reference
    - [api/errors.md](api/errors.md) - Error handling guide

    ## Optional

    - [drafts/future-plans.md](drafts/future-plans.md) - Future development plans
    - [archive/changelog.md](archive/changelog.md) - Historical changes
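The layout of the sample above (an H1 title, then a blockquote summary, then H2 sections) can be spot-checked with standard tools. A minimal sketch, assuming a POSIX shell and awk; the function name check_llms_structure is hypothetical, and it only checks the order of the leading elements, which is likely far less than catalog's own --validate option covers:

```shell
# check_llms_structure FILE: succeed if the first non-blank line of FILE is
# an H1 title, the next non-blank line is a blockquote, and at least one
# H2 section follows. (Illustrative check; not part of catalog.)
check_llms_structure() {
  awk '
    /^[ \t]*$/ { next }                                  # ignore blank lines
    state == 0 { if ($0 ~ /^# /) { state = 1; next } else { exit } }
    state == 1 { if ($0 ~ /^> /) { state = 2; next } else { exit } }
    state == 2 && /^## / { state = 3 }                   # first section found
    END { exit (state == 3 ? 0 : 1) }
  ' "$1"
}
```

For example, `check_llms_structure build/llms.txt || echo "unexpected layout"` would flag a file whose title or summary is missing.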

llms-full.txt (Complete Content)

Full concatenated content with clear separators for comprehensive AI analysis:

    # Documentation Project

    > Complete API and user guide documentation

    ## index.md

    # Welcome to Our Documentation

    This is the main overview with complete content including all text, code examples, and formatting preserved.

    ---

    ## getting-started.md

    # Getting Started Guide

    Step-by-step instructions with full content preserved...

    ---

    ## api/authentication.md

    # Authentication

    Complete authentication documentation with examples...

    [Content continues for all files]
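To make the layout concrete, the following sketch reproduces the same shape by hand: a title, a summary, then each file under a heading with its path, separated by `---` lines. It is an approximation of the format in the sample, not catalog's implementation, and the function name concat_docs is hypothetical:

```shell
# concat_docs TITLE SUMMARY FILE...: print a combined document with a title,
# a blockquote summary, and each FILE under a "## path" heading, with a
# "---" separator between files. (Illustrative only; not part of catalog.)
concat_docs() {
  printf '# %s\n\n> %s\n' "$1" "$2"
  shift 2
  first=1
  for f in "$@"; do
    # Emit the separator before every file except the first one.
    [ "$first" -eq 1 ] || printf '\n---\n'
    first=0
    printf '\n## %s\n\n' "$f"
    cat "$f"
  done
}
```

Running `concat_docs "Documentation Project" "Complete API and user guide documentation" index.md getting-started.md` from inside a docs directory prints a document shaped like the sample.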

🎯 llms-ctx.txt (Context-Only)

Structured index without optional sections, optimized for context-limited AI systems:

# Documentation Project > Complete API and user guide documentation ## Core Documentation - [index.md](index.md) - Project overview and introduction - [getting-started.md](getting-started.md) - Quick start guide - [tutorial.md](tutorial.md) - Step-by-step tutorial ## API Reference - [api/authentication.md](api/authentication.md) - Authentication methods - [api/endpoints.md](api/endpoints.md) - API endpoints reference - [api/errors.md](api/errors.md) - Error handling guide # Note: Optional sections excluded for context optimization
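The relationship between the two index files can be illustrated by filtering an existing llms.txt. A minimal sketch, assuming the optional section is headed exactly `## Optional` as in the llms.txt sample; the function name strip_optional is hypothetical, and this is not how catalog itself produces llms-ctx.txt:

```shell
# strip_optional FILE: print FILE without its "## Optional" section.
# Every H2 heading decides whether the lines that follow it are skipped.
# (Illustrative filter; not part of catalog.)
strip_optional() {
  awk '
    /^## /  { skip = ($0 == "## Optional") }
    !skip   { print }
  ' "$1"
}
```

For example, `strip_optional build/llms.txt` prints the index with only the core sections.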

sitemap.xml (SEO Optimization)

XML sitemap with intelligent priority assignment and change frequencies:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://docs.example.com/</loc>
        <lastmod>2024-01-15T10:30:00Z</lastmod>
        <changefreq>weekly</changefreq>
        <priority>1.0</priority>
      </url>
      <url>
        <loc>https://docs.example.com/getting-started</loc>
        <lastmod>2024-01-15T10:30:00Z</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.8</priority>
      </url>
    </urlset>
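A quick way to review which pages ended up in a generated sitemap is to list its `<loc>` values. A minimal sketch using tr and sed; it is a text filter rather than a real XML parser, so it assumes plain `<loc>URL</loc>` elements like those in the sample, and the function name list_sitemap_urls is hypothetical:

```shell
# list_sitemap_urls FILE: print the URL inside every <loc> element of FILE.
# Splits the input at each "<" so every tag starts a new line, then keeps
# only the lines that begin with "loc>". (Illustrative; not part of catalog.)
list_sitemap_urls() {
  tr '<' '\n' < "$1" | sed -n 's/^loc>//p'
}
```

For example, `list_sitemap_urls build/sitemap.xml | sort` gives a stable list that can be compared between runs.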

Common Usage Patterns

Real-world workflows for different documentation scenarios

AI Training Data

2 minutes

Prepare documentation for AI model training or fine-tuning

    # Generate AI-optimized documentation
    catalog --input docs --output ai-training \
      --optional "examples/**/*" \
      --optional "appendix/**/*" \
      --validate

    # Results in:
    # - llms.txt: Structured index for AI context
    # - llms-full.txt: Complete content for training
    # - llms-ctx.txt: Essential content only

Knowledge Base Creation

3 minutes

Create searchable knowledge bases with comprehensive indexing

    # Generate comprehensive knowledge base
    catalog --input knowledge --output kb-site \
      --index \
      --sitemap \
      --base-url https://kb.company.com \
      --validate

    # Include internal documentation as optional
    catalog --input knowledge --output kb-site \
      --optional "internal/**/*" \
      --optional "drafts/**/*"

Documentation Website

4 minutes

Generate SEO-optimized sitemaps for documentation websites

    # Generate for static site generators
    catalog --input docs --output build \
      --sitemap \
      --sitemap-no-extensions \
      --base-url https://docs.example.com \
      --index

    # Clean URLs without .html extensions
    # Perfect for Hugo, Jekyll, or unify integration
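The idea behind extension-free URLs can be shown with a small path transformation. A minimal sketch that strips a trailing .md or .html from a document path; the exact mapping catalog applies with --sitemap-no-extensions (for example, how index files are handled) is not described here, and the helper name clean_path is hypothetical:

```shell
# clean_path REL: print REL without a trailing .md or .html extension.
# Other paths are printed unchanged. (Illustrative; not part of catalog.)
clean_path() {
  case $1 in
    *.md)   printf '%s\n' "${1%.md}" ;;
    *.html) printf '%s\n' "${1%.html}" ;;
    *)      printf '%s\n' "$1" ;;
  esac
}

clean_path getting-started.md
# prints getting-started
clean_path api/errors.html
# prints api/errors
```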

CI/CD Pipeline

1 minute

Automate documentation processing in continuous integration

    # Automated documentation pipeline
    catalog --input docs --output dist \
      --validate \
      --sitemap \
      --base-url https://docs.company.com

    # Exit with error code if validation fails
    # Perfect for automated quality checks
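Because a failed validation is reported through the exit code, a pipeline script can stop as soon as a step fails. A minimal sketch of such a wrapper in POSIX shell; the function name run_or_fail is hypothetical, and the specific exit codes catalog uses are not assumed:

```shell
# run_or_fail LABEL CMD...: run CMD with its arguments. On a non-zero exit,
# report LABEL on stderr and return the same exit code so the CI job fails.
# (Illustrative wrapper; not part of catalog.)
run_or_fail() {
  label=$1
  shift
  status=0
  "$@" || status=$?
  if [ "$status" -ne 0 ]; then
    echo "FAILED: $label (exit code $status)" >&2
  fi
  return "$status"
}
```

In a CI job this might be used as `run_or_fail "docs validation" catalog --input docs --output dist --validate || exit 1`.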

Integration with fwdslsh Ecosystem

Combine catalog with other tools for powerful documentation workflows

inform (extract web content)

    inform https://docs.site.com --output-dir docs

→ catalog (generate AI-ready indexes)

    catalog --input docs --output indexed --sitemap

→ unify (build static sites)

    unify build --input indexed --output dist

→ giv (generate commit messages)

    giv message

Complete Documentation Pipeline

    # Extract content from multiple sources
    inform https://docs.example.com --output-dir docs
    inform https://api.example.com --output-dir api-docs --include "*/reference/*"

    # Combine and index all content
    # (create the api/ subdirectory first so the second copy has a target)
    mkdir -p combined-docs/api
    cp -r docs/* combined-docs/
    cp -r api-docs/* combined-docs/api/

    # Generate comprehensive indexes with validation
    catalog --input combined-docs --output production \
      --base-url https://docs.company.com \
      --optional "archive/**/*" \
      --sitemap --validate --index

    # Build final site
    unify build --input production --output public

    # Professional commit with AI
    giv message
    # "docs: integrate API and user documentation with comprehensive indexing"

Pattern Matching and Filtering

Advanced content selection with include, exclude, and optional patterns

Include Patterns (Whitelist)

    # Include only specific file types
    catalog --include "*.md" --include "*.html"

    # Include specific directories
    catalog --include "docs/*.md" --include "guides/*.html"

    # Complex patterns
    catalog --include "**/{docs,guides}/**/*.{md,html}"

Use include patterns to specify exactly which files should be processed.

🚫 Exclude Patterns (Blacklist)

# Exclude draft files catalog --exclude "*.draft.md" --exclude "*draft*" # Exclude temporary directories catalog --exclude "temp/*" --exclude "backup/*" # Exclude test files catalog --exclude "**/*test*" --exclude "**/*.spec.md"

Use exclude patterns to filter out unwanted content during processing.
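Before adding an exclude pattern, it can be useful to preview which files a file-name pattern would catch. A rough sketch using find; note that find -name matches only the file name, so it approximates patterns such as `*.draft.md` or `**/*.spec.md` but not directory patterns like `temp/*`, and catalog's own glob matching may differ. The function name preview_matches is hypothetical:

```shell
# preview_matches DIR NAME_PATTERN: list regular files under DIR whose file
# name matches NAME_PATTERN, sorted for stable output.
# (Uses find(1), not catalog; results are an approximation.)
preview_matches() {
  find "$1" -type f -name "$2" | sort
}
```

For example, `preview_matches docs '*.draft.md'` lists the draft files anywhere under docs.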

Optional Patterns (Supplementary)

    # Mark content as optional for AI contexts
    catalog --optional "drafts/**/*" --optional "archive/**/*"

    # Mark reference material as optional
    catalog --optional "**/CHANGELOG.md" --optional "**/LICENSE.md"

    # Multiple optional categories
    catalog --optional "examples/**/*" --optional "appendix/**/*"

Optional patterns create separate sections in llms.txt and are excluded from llms-ctx.txt for context-limited scenarios.

Pattern Matching Benefits

🎯

Precise Control

Target exactly the content you need with sophisticated glob patterns

🧹

Clean Output

Exclude drafts, temporary files, and noise automatically

πŸ“Š

Content Organization

Organize content by importance with core and optional sections

πŸ€–

AI Optimization

Create context-optimized versions for different AI use cases

What's Next?

Continue your catalog journey