utf8

Fix malformed UTF-16/UTF-8 files


Converts malformed UTF-16 and UTF-8 files into readable UTF-8. Fixes null-laced ASCII, missing BOMs, and unpaired surrogates in place, keeping a backup of the original.

Features

  • Detect and fix UTF-16 LE/BE with or without BOM
  • Fix null-laced ASCII (ASCII text encoded as UTF-16, so every other byte is NUL)
  • Handle unpaired surrogates
  • In-place conversion with .bak backup
  • MCP server and CLI modes
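
How the tool detects encodings internally is not shown here. As a rough illustration of the idea behind the first two features, the following minimal sketch checks for a BOM and otherwise looks at where NUL bytes fall. The function name `detectEncoding`, the thresholds, and every label except `utf16le_bom` (which appears in the sample output) are assumptions made for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

// detectEncoding guesses an encoding from a BOM or, failing that, from the
// position of NUL bytes. Hypothetical helper, not the tool's actual code.
func detectEncoding(data []byte) string {
	switch {
	case bytes.HasPrefix(data, []byte{0xEF, 0xBB, 0xBF}):
		return "utf8_bom"
	case bytes.HasPrefix(data, []byte{0xFF, 0xFE}):
		return "utf16le_bom"
	case bytes.HasPrefix(data, []byte{0xFE, 0xFF}):
		return "utf16be_bom"
	}

	// No BOM: count NUL bytes at even and odd offsets.
	var evenNul, oddNul int
	for i, b := range data {
		if b != 0 {
			continue
		}
		if i%2 == 0 {
			evenNul++
		} else {
			oddNul++
		}
	}

	// Null-laced ASCII has a NUL in the high byte of most code units:
	// odd offsets for little-endian, even offsets for big-endian.
	pairs := len(data) / 2
	switch {
	case pairs > 0 && oddNul*2 > pairs && evenNul == 0:
		return "utf16le"
	case pairs > 0 && evenNul*2 > pairs && oddNul == 0:
		return "utf16be"
	}
	return "utf8"
}

func main() {
	fmt.Println(detectEncoding([]byte{0xFF, 0xFE, 'h', 0x00})) // BOM present
	fmt.Println(detectEncoding([]byte{'h', 0x00, 'i', 0x00})) // null-laced, no BOM
	fmt.Println(detectEncoding([]byte("hi")))                 // plain ASCII
}
```

The NUL-position heuristic only covers text that is mostly ASCII; UTF-16 content dominated by other scripts would need a different check.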

Install

go install github.com/hegner123/utf8@latest

The Problem: AI agents can't read UTF-16 encoded files

# Some tools (looking at you, Supabase CLI) output UTF-16 files.
# AI agents see garbled binary or empty content.
# The Read tool returns gibberish or fails entirely.
#
# Manual fix: open in editor, re-save as UTF-8.
# But AI agents can't do that.
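
For reference, the manual "re-save as UTF-8" step amounts to decoding the 16-bit code units and re-encoding them. The sketch below uses only the Go standard library; the helper name `utf16ToUTF8` and its `bigEndian` parameter are assumptions for illustration, not the tool's API. It relies on `utf16.Decode` replacing unpaired surrogates with U+FFFD.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// utf16ToUTF8 decodes UTF-16 bytes into a UTF-8 string. Hypothetical helper.
// A leading BOM is dropped, and a trailing odd byte is ignored.
func utf16ToUTF8(data []byte, bigEndian bool) string {
	var order binary.ByteOrder = binary.LittleEndian
	if bigEndian {
		order = binary.BigEndian
	}

	// Read the input two bytes at a time into 16-bit code units.
	units := make([]uint16, 0, len(data)/2)
	for i := 0; i+1 < len(data); i += 2 {
		units = append(units, order.Uint16(data[i:]))
	}

	// Drop the BOM (U+FEFF) if one is present.
	if len(units) > 0 && units[0] == 0xFEFF {
		units = units[1:]
	}

	// utf16.Decode substitutes U+FFFD for any unpaired surrogate.
	return string(utf16.Decode(units))
}

func main() {
	// "hi" as UTF-16 LE with a BOM.
	in := []byte{0xFF, 0xFE, 'h', 0x00, 'i', 0x00}
	out := utf16ToUTF8(in, false)
	fmt.Printf("%q: %d bytes in, %d bytes out\n", out, len(in), len(out))

	// A lone high surrogate (0xD800) followed by 'x'.
	lone := []byte{0x00, 0xD8, 'x', 0x00}
	fmt.Printf("%q\n", utf16ToUTF8(lone, false))
}
```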

Solution

$ utf8 --cli --file /path/to/database.types.ts

Output

{"file":"/path/to/database.types.ts","original_encoding":"utf16le_bom","size_before":250656,"size_after":125327,"backup":"/path/to/database.types.ts.bak"}
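
A calling program could consume this JSON line as in the sketch below. The field names come from the sample output above; the `Result` type, the `parseResult` helper, and the chosen Go types are assumptions for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result mirrors the fields visible in the sample output.
// The struct itself is hypothetical, not part of the tool.
type Result struct {
	File             string `json:"file"`
	OriginalEncoding string `json:"original_encoding"`
	SizeBefore       int64  `json:"size_before"`
	SizeAfter        int64  `json:"size_after"`
	Backup           string `json:"backup"`
}

// parseResult decodes one JSON line of CLI output.
func parseResult(line string) (Result, error) {
	var r Result
	err := json.Unmarshal([]byte(line), &r)
	return r, err
}

func main() {
	const sample = `{"file":"/path/to/database.types.ts","original_encoding":"utf16le_bom","size_before":250656,"size_after":125327,"backup":"/path/to/database.types.ts.bak"}`

	r, err := parseResult(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}

	// Percentage of bytes saved by the conversion.
	saved := 100 * float64(r.SizeBefore-r.SizeAfter) / float64(r.SizeBefore)
	fmt.Printf("%s (%s): %d -> %d bytes, %.1f%% smaller, backup at %s\n",
		r.File, r.OriginalEncoding, r.SizeBefore, r.SizeAfter, saved, r.Backup)
}
```

The sample sizes are what all-ASCII content would produce: 2 × 125327 bytes of UTF-16 plus a 2-byte BOM gives 250656.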

Comparison

Metric                            Value
Size reduction (UTF-16 to UTF-8)  ~50% (for mostly ASCII content)
Encoding detection                UTF-16 LE/BE, BOM, null-laced ASCII
Safety                            Creates .bak backup before modifying