Getting Started with inform
Installation Options
Choose the installation method that works best for your environment
One-Line Install Script
# Automatically detects your platform and installs the binary
curl -fsSL https://raw.githubusercontent.com/fwdslsh/inform/main/install.sh | sh
# Verify installation
inform --version
# Start crawling immediately
inform https://docs.example.com
The install script automatically detects your platform (Linux, macOS, Windows) and downloads the appropriate binary. No dependencies required!
Install with Bun
# Install globally with Bun
bun install -g @fwdslsh/inform
# Or add to project
bun add @fwdslsh/inform
# Run inform
inform --help
Installing through Bun gives fast installs and startup. Requires Bun v1.0.0 or higher.
Docker Container
# Pull latest image
docker pull fwdslsh/inform:latest
# Run crawler with mounted output directory
docker run --rm -v "$(pwd)/output:/output" fwdslsh/inform https://docs.example.com --output-dir /output
# Or use interactive shell
docker run -it fwdslsh/inform bash
Containerized for consistent environments and CI/CD pipelines. Perfect for automated content extraction workflows.
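For a CI pipeline, the container invocation above can be wrapped in a small script. A minimal sketch, assuming a nightly job; the target URL and the `snapshots/` layout are placeholders, not part of inform:

```shell
#!/bin/sh
# Hypothetical nightly CI step: snapshot a docs site into a dated
# directory (the URL and "snapshots/" layout are placeholders).
OUT="snapshots/docs-$(date +%Y-%m-%d)"
mkdir -p "$OUT"

# Run the crawler only where Docker is available; ignore pull failures
# so the rest of the pipeline can report a missing snapshot itself.
if command -v docker >/dev/null 2>&1; then
  docker run --rm -v "$(pwd)/$OUT:/output" \
    fwdslsh/inform https://docs.example.com --output-dir /output || true
fi

echo "snapshot directory: $OUT"
```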
Manual Download
# Download from GitHub Releases
# Visit: https://github.com/fwdslsh/inform/releases
# Linux
wget https://github.com/fwdslsh/inform/releases/latest/download/inform-linux
chmod +x inform-linux
./inform-linux --help
# macOS
curl -L -o inform-mac https://github.com/fwdslsh/inform/releases/latest/download/inform-mac
chmod +x inform-mac
./inform-mac --help
# Windows
# Download inform-win.exe and run from command prompt
Download pre-built binaries directly. Each release includes binaries for Linux, macOS, and Windows.
Your First Content Extraction
Learn the basics with hands-on examples
Single Page Extraction
Extract one page to see how inform works
# Extract a single page
inform https://docs.example.com/getting-started
# Output: Creates getting-started.md in current directory
Result: A clean Markdown file containing just the main content, with navigation and ads removed.
Full Site Extraction
Crawl an entire documentation site with structure preservation
# Extract up to 50 pages from a documentation site
inform https://docs.example.com \
  --output-dir extracted-docs \
  --max-pages 50 \
  --delay 1000
Result: Complete site structure with organized folders and properly formatted Markdown files.
Performance Optimized
High-speed extraction with concurrent processing
# High-performance extraction
inform https://large-site.com \
  --concurrency 5 \
  --delay 500 \
  --max-pages 200
Result: Fast parallel processing that still rate-limits requests to respect the server.
Content Filtering
Extract only specific content with pattern matching
# Extract only documentation pages
inform https://mixed-site.com \
  --include "*/docs/*" \
  --exclude "*/blog/*" \
  --output-dir docs-only
Result: Precisely filtered content matching your inclusion and exclusion patterns.
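As a rough mental model, the patterns behave like shell globs matched against each page's full URL, with excludes taking priority. This is assumed semantics sketched with `case` patterns, not inform's actual matcher:

```shell
# Sketch of "--include '*/docs/*' --exclude '*/blog/*'" semantics.
# Assumption: excludes are checked first, then includes; inform's
# real matcher may differ in details.
keep_url() {
  case "$1" in
    */blog/*) return 1 ;;   # excluded URLs are always dropped
  esac
  case "$1" in
    */docs/*) return 0 ;;   # included URLs are kept
  esac
  return 1                  # everything else is skipped
}

keep_url "https://mixed-site.com/docs/setup" && echo "kept"
keep_url "https://mixed-site.com/blog/post" || echo "dropped"
```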
Understanding Output
How inform structures and formats extracted content
📁 File Structure Preservation
Inform maintains the original site's URL structure for easy navigation:
Original URLs
https://docs.example.com/
https://docs.example.com/guide/setup
https://docs.example.com/api/auth
https://docs.example.com/tutorials/basics
Generated Files
extracted-docs/
├── index.md
├── guide/
│   └── setup.md
├── api/
│   └── auth.md
└── tutorials/
    └── basics.md
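The mapping can be pictured as: drop the scheme and host, keep the remaining path, and use index.md for the site root. An illustration of the convention shown above, not inform's internal code:

```shell
# Illustrative URL-to-file mapping matching the layout above.
url_to_path() {
  rest="${1#*://}"            # strip the "https://" scheme
  case "$rest" in
    */*) rest="${rest#*/}" ;; # strip the host, keep the path
    *)   rest="" ;;           # bare host, no path component
  esac
  rest="${rest%/}"            # drop any trailing slash
  [ -z "$rest" ] && rest="index"
  printf '%s.md\n' "$rest"
}

url_to_path "https://docs.example.com/guide/setup"   # guide/setup.md
url_to_path "https://docs.example.com/"              # index.md
```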
📝 Markdown Format
Each extracted file includes metadata and clean formatting:
---
title: "Getting Started Guide"
url: "https://docs.example.com/getting-started"
extracted_at: "2024-01-15T10:30:00Z"
---
# Getting Started Guide
Clean content with proper formatting, links, and images preserved.
## Features and Benefits
- Lists are properly formatted
- **Bold** and *italic* text preserved
- [Links](https://example.com) work correctly
- Images are included
> Blockquotes and code blocks are maintained
```javascript
// Code examples become proper code blocks
function example() {
return "formatted correctly";
}
```
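Because every file carries this front matter, downstream scripts can query it with standard tools. A sketch; the sample file below mirrors the format shown above:

```shell
# Build a tiny sample file in the documented front-matter format.
mkdir -p extracted-docs
cat > extracted-docs/getting-started.md <<'EOF'
---
title: "Getting Started Guide"
url: "https://docs.example.com/getting-started"
extracted_at: "2024-01-15T10:30:00Z"
---
# Getting Started Guide
EOF

# Pull the title out of each file's YAML front matter
grep -rh '^title:' extracted-docs/ | sed 's/^title: *//; s/"//g'
```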
Common Usage Patterns
Real-world workflows for different use cases
Documentation Migration
Moving from an old documentation platform to a new one (about 5 minutes)
# Step 1: Extract all content
inform https://old-docs.company.com \
  --output-dir migrated-content \
  --max-pages 200 \
  --delay 2000
# Step 2: Review and organize
ls migrated-content/
# Edit files as needed, organize structure
# Step 3: Import to new platform
# Use with unify for static site generation
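A common cleanup during step 2 is rewriting absolute links that still point at the old host. A sketch; the domains are placeholders, the demo file stands in for an extracted page, and BSD sed would need `-i ''` instead of `-i`:

```shell
# Demo input standing in for an extracted page
mkdir -p migrated-content
printf 'See [setup](https://old-docs.company.com/setup).\n' \
  > migrated-content/index.md

# Point old-domain links at the new docs root (GNU sed shown)
find migrated-content -name '*.md' -exec \
  sed -i 's|https://old-docs\.company\.com|https://docs.company.com|g' {} +

cat migrated-content/index.md
```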
Content Research
Analyze competitor documentation and content strategies (about 3 minutes)
# Extract competitor docs for analysis
inform https://competitor.com/docs \
  --output-dir research/competitor-analysis \
  --max-pages 30 \
  --delay 3000
# Extract a second copy to a separate directory for your own notes and summaries
inform https://competitor.com/docs \
  --output-dir research/summaries
Content Backup
Preserve important web content for archival purposes (about 2 minutes)
# Create complete backup with date
inform https://important-site.com \
  --output-dir "backups/important-site-$(date +%Y-%m-%d)" \
  --max-pages 100 \
  --delay 1500
# Include images and preserve structure
inform https://site.com \
  --output-dir backup \
  --max-pages 50
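To keep snapshots compact, the dated directory can then be packed into a single archive. A minimal sketch; the demo file stands in for real crawled output:

```shell
# Archive a dated backup directory into one tarball.
SNAP="backups/important-site-$(date +%Y-%m-%d)"
mkdir -p "$SNAP"
printf '# demo page\n' > "$SNAP/index.md"   # placeholder for crawled output

# -C keeps the archive paths relative to backups/
tar -czf "$SNAP.tar.gz" -C backups "$(basename "$SNAP")"
ls "$SNAP.tar.gz"
```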
Integration with fwdslsh Ecosystem
Combine inform with other tools for powerful workflows
inform: Crawl and extract content
catalog: Generate llms.txt indexes
🔄 Complete Documentation Pipeline
# Extract content from multiple sources
inform https://docs.example.com --output-dir docs
inform https://api.example.com --output-dir api --include "*/reference/*"
# Generate AI-ready indexes
catalog --input docs --input api --output build --sitemap --base-url https://newdocs.com
What's Next?
Continue your inform journey