Real-World Examples
Practical workflows for corpus management
Single Source Sync
# Sync single documentation site
bun run packages/inform/src/index.ts sync official-docs --print-cmds

# Output:
# > Cloning https://github.com/owner/repo...
# > Running: gather https://docs.example.com --output-dir .fwdslsh/corpus/sources/official-docs/content
# > Running: catalog .fwdslsh/corpus/sources/official-docs/content --output .fwdslsh/corpus/sources/official-docs/catalog
Multiple Sources
Sync multiple documentation sources
# corpus.yml
sources:
  - id: main-docs
    type: http
    url: https://docs.example.com
  - id: api-reference
    type: git
    url: https://github.com/owner/api-docs
  - id: internal-guides
    type: local
    path: ./internal-docs

# Sync all sources
bun run packages/inform/src/index.ts sync --all --mirror
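When one source in a multi-source corpus is unreliable, it can help to sync sources one at a time and report failures at the end. The following is a minimal sketch, not part of the tool itself: it assumes the flat `- id: <name>` layout of the corpus.yml shown above, and both function names and the `SYNC_CMD` override are hypothetical helpers introduced only for illustration.

```shell
# Sketch: sync each source listed in corpus.yml individually, so that one
# failing source does not stop the others. Assumes each source entry starts
# with a "- id: <name>" line, as in the example config.

# Print one source id per line from the given config file.
list_source_ids() {
  sed -n 's/^[[:space:]]*-[[:space:]]*id:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1"
}

# Sync every listed source and keep going after a failure.
# SYNC_CMD is a hypothetical override (e.g. SYNC_CMD=echo for a dry run);
# by default the documented sync command is used.
sync_each_source() {
  config="$1"
  failed=0
  for id in $(list_source_ids "$config"); do
    if ! ${SYNC_CMD:-bun run packages/inform/src/index.ts sync} "$id"; then
      echo "sync failed: $id" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every source synced.
  [ "$failed" -eq 0 ]
}
```

Running `SYNC_CMD=echo; sync_each_source corpus.yml` prints the ids that would be synced without invoking the tool. A real YAML parser would be more robust than `sed` if the config uses nested or quoted ids.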
CI/CD Automation
GitHub Actions workflow for corpus management
# .github/workflows/corpus-sync.yml
name: Corpus Sync
on:
  schedule:
    - cron: '0 0 * * *'   # daily at 00:00 UTC
  workflow_dispatch:
permissions:
  contents: write   # lets the job push the manifest commit
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install bun
        run: |
          curl -fsSL https://bun.sh/install | bash
          # Make bun visible to later steps
          echo "$HOME/.bun/bin" >> "$GITHUB_PATH"
      - name: Sync corpus
        run: |
          bun run packages/inform/src/index.ts sync --all \
            --timeout 600000 \
            --print-cmds
      - name: Commit manifests
        run: |
          git config user.name "Corpus Bot"
          git config user.email "bot@example.com"
          git add .fwdslsh/corpus/corpus-manifest.yml
          # Commit only when the manifest actually changed
          git diff --cached --quiet || git commit -m "Update corpus manifests"
          git push
Troubleshooting Scenarios
Source Timeout
# Increase timeout for slow sources
export FWD_CORPUS_TIMEOUT=600000
bun run packages/inform/src/index.ts sync problematic-source
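If a fixed timeout is hard to pick, one option is to retry with a progressively larger value. The sketch below is illustrative only: it relies on the `FWD_CORPUS_TIMEOUT` variable shown above, while the function name, the doubling strategy, the default values, and the `SYNC_CMD` override are assumptions rather than features of the tool.

```shell
# Sketch: retry a sync, doubling FWD_CORPUS_TIMEOUT after each failed attempt.
# Usage: sync_with_backoff <source-id> [initial-timeout] [max-attempts]
# SYNC_CMD is a hypothetical override for testing; the default is the
# documented sync command.
sync_with_backoff() {
  source_id="$1"
  timeout="${2:-150000}"   # assumed starting value, same unit as the tool expects
  attempts="${3:-3}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    echo "attempt $i: FWD_CORPUS_TIMEOUT=$timeout" >&2
    # The variable is set only for this one invocation.
    if FWD_CORPUS_TIMEOUT="$timeout" \
       ${SYNC_CMD:-bun run packages/inform/src/index.ts sync} "$source_id"; then
      return 0
    fi
    timeout=$((timeout * 2))
    i=$((i + 1))
  done
  return 1
}
```

For example, `sync_with_backoff problematic-source 150000 3` would try with 150000, then 300000, then 600000 before giving up with a non-zero exit status.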
Source Failure
# Check logs for details
jq '.errors' ~/.hyphn/logs/inform-sync.log

# Re-sync failed source with verbose output
bun run packages/inform/src/index.ts sync source-id --verbose