utf8
Fix malformed UTF-16/UTF-8 files
Converts malformed UTF-16 and UTF-8 files to readable UTF-8. Fixes null-laced ASCII, missing BOMs, and unpaired surrogates in place, creating a backup first.
Features
- Detect and fix UTF-16 LE/BE with or without BOM
- Fix null-laced ASCII (ASCII text stored as UTF-16, so every other byte is NUL)
- Handle unpaired surrogates
- In-place conversion with .bak backup
- MCP server and CLI modes
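The detection and conversion steps listed above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: `detectUTF16` and `toUTF8` are hypothetical helper names, and the no-BOM heuristic (NUL bytes on only one side of each byte pair, in more than half of the pairs) is an assumption about one reasonable way to spot null-laced ASCII. Unpaired surrogates are handled by the standard library's `utf16.Decode`, which replaces them with U+FFFD.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// detectUTF16 guesses whether data is UTF-16 and in which byte order.
// It returns the byte order, the number of BOM bytes to skip, and ok=false
// when data does not look like UTF-16. Hypothetical helper for illustration.
func detectUTF16(data []byte) (order binary.ByteOrder, skip int, ok bool) {
	if len(data) >= 2 {
		switch {
		case data[0] == 0xFF && data[1] == 0xFE:
			return binary.LittleEndian, 2, true
		case data[0] == 0xFE && data[1] == 0xFF:
			return binary.BigEndian, 2, true
		}
	}
	// No BOM: UTF-16 needs an even, non-zero byte count.
	if len(data) < 2 || len(data)%2 != 0 {
		return nil, 0, false
	}
	// ASCII stored as UTF-16 LE has NULs at odd offsets; BE has them at even offsets.
	var even, odd int
	for i, b := range data {
		if b != 0 {
			continue
		}
		if i%2 == 0 {
			even++
		} else {
			odd++
		}
	}
	half := len(data) / 2
	switch {
	case odd*2 > half && even == 0:
		return binary.LittleEndian, 0, true
	case even*2 > half && odd == 0:
		return binary.BigEndian, 0, true
	}
	return nil, 0, false
}

// toUTF8 decodes UTF-16 bytes (BOM already stripped) into a UTF-8 string.
// A trailing odd byte is dropped, and utf16.Decode turns any unpaired
// surrogate into U+FFFD.
func toUTF8(data []byte, order binary.ByteOrder) string {
	units := make([]uint16, 0, len(data)/2)
	for i := 0; i+1 < len(data); i += 2 {
		units = append(units, order.Uint16(data[i:]))
	}
	return string(utf16.Decode(units))
}

func main() {
	// "hi" as UTF-16 LE without a BOM (null-laced ASCII).
	data := []byte{'h', 0, 'i', 0}
	if order, skip, ok := detectUTF16(data); ok {
		fmt.Println(toUTF8(data[skip:], order)) // prints "hi"
	}
}
```

A heuristic like this can misfire on very short or binary inputs, which is one reason keeping a backup before rewriting a file is a sensible default.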
Install
go install github.com/hegner123/utf8@latest
The Problem: AI agents can't read UTF-16 encoded files
# Some tools (looking at you, Supabase CLI) output UTF-16 files.
# AI agents see garbled binary or empty content.
# The Read tool returns gibberish or fails entirely.
#
# Manual fix: open in editor, re-save as UTF-8.
# But AI agents can't do that.
Solution
$ utf8 --cli --file /path/to/database.types.ts
Output
{"file":"/path/to/database.types.ts","original_encoding":"utf16le_bom","size_before":250656,"size_after":125327,"backup":"/path/to/database.types.ts.bak"}
Comparison
| Metric | Value |
|---|---|
| Size reduction (UTF-16 to UTF-8) | ~50% for mostly-ASCII content |
| Encoding detection | UTF-16 LE/BE, BOM, null-laced ASCII |
| Safety | Creates .bak backup before modifying |
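For scripts that call the CLI, the JSON report shown under Output can be decoded and checked. The sketch below assumes the report always carries the five fields seen in that single sample; the `Report` type, `parseReport`, and `reduction` are hypothetical names introduced here, not part of the tool.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Report mirrors the fields in the sample JSON report. The field set is
// inferred from one example and may not be complete.
type Report struct {
	File             string `json:"file"`
	OriginalEncoding string `json:"original_encoding"`
	SizeBefore       int64  `json:"size_before"`
	SizeAfter        int64  `json:"size_after"`
	Backup           string `json:"backup"`
}

// reduction returns the fraction of bytes saved by the conversion.
func (r Report) reduction() float64 {
	if r.SizeBefore == 0 {
		return 0
	}
	return 1 - float64(r.SizeAfter)/float64(r.SizeBefore)
}

// parseReport decodes one JSON report line.
func parseReport(line []byte) (Report, error) {
	var r Report
	err := json.Unmarshal(line, &r)
	return r, err
}

func main() {
	line := []byte(`{"file":"/path/to/database.types.ts","original_encoding":"utf16le_bom","size_before":250656,"size_after":125327,"backup":"/path/to/database.types.ts.bak"}`)
	r, err := parseReport(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// prints "/path/to/database.types.ts: utf16le_bom, 50.0% smaller"
	fmt.Printf("%s: %s, %.1f%% smaller\n", r.File, r.OriginalEncoding, r.reduction()*100)
}
```

The sample numbers line up with the ~50% figure: 250656 bytes of UTF-16 is a 2-byte BOM plus 125327 two-byte code units, which become 125327 single bytes in UTF-8 when the content is ASCII.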