ingest
RAG document ingestion pipeline
Process markdown into semantic vector embeddings stored in PostgreSQL + pgvector. Add, list, delete, and search documents for retrieval-augmented generation.
Features
- Ingest markdown files into vector embeddings
- PostgreSQL + pgvector storage
- Ollama embedding via nomic-embed-text
- Add, list, delete, and search operations
- Language filtering for search results
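As a rough illustration of the embedding step listed above, the sketch below requests a vector for one piece of text from a local Ollama server. It assumes Ollama's `/api/embeddings` endpoint on the default port 11434 and that the `nomic-embed-text` model has been pulled. The helper names `embed` and `parseEmbedding` are hypothetical and are not taken from ingest's source, which may do this differently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// embedRequest mirrors the JSON body accepted by Ollama's /api/embeddings.
type embedRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

// embedResponse mirrors the JSON body returned by Ollama's /api/embeddings.
type embedResponse struct {
	Embedding []float64 `json:"embedding"`
}

// parseEmbedding (hypothetical helper) decodes a response body into a vector.
func parseEmbedding(data []byte) ([]float64, error) {
	var out embedResponse
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, err
	}
	if len(out.Embedding) == 0 {
		return nil, fmt.Errorf("response contained no embedding")
	}
	return out.Embedding, nil
}

// embed (hypothetical helper) asks an Ollama server at baseURL to embed text.
func embed(baseURL, text string) ([]float64, error) {
	body, err := json.Marshal(embedRequest{Model: "nomic-embed-text", Prompt: text})
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(baseURL+"/api/embeddings", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("ollama returned %s", resp.Status)
	}
	var buf bytes.Buffer
	if _, err := buf.ReadFrom(resp.Body); err != nil {
		return nil, err
	}
	return parseEmbedding(buf.Bytes())
}

func main() {
	// Assumes a local Ollama instance; prints the error and exits otherwise.
	vec, err := embed("http://localhost:11434", "How do I handle authentication?")
	if err != nil {
		fmt.Println("embedding failed:", err)
		return
	}
	fmt.Println("dimensions:", len(vec))
}
```

Keeping the response parsing separate from the HTTP call makes that part easy to check without a running server.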
Install
go install github.com/hegner123/ingest@latest
The Problem: AI agents can't search your documentation by meaning
# You have 200 markdown docs across your project.
# Grep finds exact strings, not concepts.
# "How do I handle authentication?" returns nothing
# because no file contains that exact phrase.
Solution
$ ingest add --path ./docs --language go --recursive
Output
{"ingested":15,"skipped":2,"errors":0,"language":"go","path":"./docs"}
Comparison
| Metric | Value |
|---|---|
| Search type | Semantic (meaning-based, not keyword) |
| Embedding model | nomic-embed-text via Ollama (768 dims) |
| Storage | PostgreSQL + pgvector |
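To make "meaning-based, not keyword" concrete, here is a minimal sketch of ranking documents by cosine similarity between a query vector and stored document vectors. It uses toy 3-dimensional vectors in place of the 768-dimensional ones in the table, and the `doc` type and `rank` function are hypothetical names for illustration only. With pgvector, a comparison like this would normally run inside PostgreSQL (its `<=>` operator computes cosine distance); whether ingest uses that operator is an assumption, not something the README states.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosineSimilarity returns the cosine of the angle between a and b,
// or 0 when the vectors differ in length, are empty, or have zero norm.
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// doc is a hypothetical stored document: a name plus its embedding.
type doc struct {
	Name string
	Vec  []float64
}

// rank returns a copy of docs ordered from most to least similar to query.
func rank(query []float64, docs []doc) []doc {
	out := append([]doc(nil), docs...)
	sort.SliceStable(out, func(i, j int) bool {
		return cosineSimilarity(query, out[i].Vec) > cosineSimilarity(query, out[j].Vec)
	})
	return out
}

func main() {
	// Toy vectors; real nomic-embed-text embeddings have 768 dimensions.
	docs := []doc{
		{Name: "deploy.md", Vec: []float64{0.0, 0.2, 0.9}},
		{Name: "auth.md", Vec: []float64{0.9, 0.1, 0.0}},
	}
	query := []float64{0.8, 0.2, 0.1}
	for _, d := range rank(query, docs) {
		fmt.Printf("%s %.3f\n", d.Name, cosineSimilarity(query, d.Vec))
	}
}
```

Because the score depends on vector direction rather than shared words, a query can match a document that never contains the query's exact phrasing.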