Deterministic rule

Encoding Validation Check

The encoding validation check flags file-level encoding risks that can appear when JSON translation files move between tools, vendors, and editors.

Examples

Source JSON
{ "settings.title": "設定", "user.name": "ユーザー名" }
Target JSON
{ "settings.title": "���", "user.name": "ユーザー名" }
Expected findings
Corrupted multibyte text candidate: settings.title
Encoding/BOM state warning, raised when the encoding or BOM metadata captured at upload differs between the source and target files
Release impact
Japanese text can become unreadable after a bad export, wrong editor encoding, or copy/paste through a non-UTF-8 workflow.

What LocaleQA checks

LocaleQA looks for deterministic encoding warning patterns, including replacement-character corruption and UTF-8 BOM state changes captured during upload.
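As a rough illustration of the replacement-character pattern, the sketch below compares two parsed translation files and lists the keys whose target value contains U+FFFD when the source value does not. The function name `corrupted_keys` and the flat key layout are assumptions made for this example; this is not LocaleQA's actual implementation.

```python
import json

# U+FFFD is the Unicode replacement character that decoders emit for invalid bytes.
REPLACEMENT_CHAR = "\ufffd"


def corrupted_keys(source: dict, target: dict) -> list:
    """Return keys whose target value contains U+FFFD but whose source value does not.

    Hypothetical helper for illustration only.
    """
    flagged = []
    for key, target_value in target.items():
        if not isinstance(target_value, str):
            continue  # only string values can carry text corruption
        source_value = str(source.get(key, ""))
        if REPLACEMENT_CHAR in target_value and REPLACEMENT_CHAR not in source_value:
            flagged.append(key)
    return flagged


# Same shape as the example on this page; the target title is three U+FFFD characters.
source = json.loads('{ "settings.title": "設定", "user.name": "ユーザー名" }')
target = json.loads('{ "settings.title": "\\ufffd\\ufffd\\ufffd", "user.name": "ユーザー名" }')

print(corrupted_keys(source, target))  # prints ['settings.title']
```

Because the comparison is a plain substring test, the same input always yields the same list, which is the sense in which a check like this stays deterministic.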

The check is most useful in Japanese localization workflows, where multibyte characters can be damaged by legacy encodings, spreadsheet exports, or editor settings while the file still parses as valid JSON.
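The following sketch shows one way this kind of damage can happen, assuming a hypothetical handoff in which one tool saves the file as Shift_JIS and the next tool reads it as UTF-8 with invalid bytes replaced. The scenario and variable names are illustrative assumptions, not a description of any specific vendor tool.

```python
import json

original = '{ "settings.title": "設定" }'

# Assumed bad handoff, step 1: a tool writes the file in a legacy encoding.
legacy_bytes = original.encode("shift_jis")

# Step 2: the next tool reads those bytes as UTF-8 and substitutes U+FFFD
# for every byte sequence that is not valid UTF-8.
misread = legacy_bytes.decode("utf-8", errors="replace")

# The ASCII structure (braces, quotes, key) survives, so parsing still succeeds.
parsed = json.loads(misread)

print("settings.title" in parsed)            # the key is intact
print("\ufffd" in parsed["settings.title"])  # the value now carries U+FFFD
print(parsed["settings.title"] == "設定")     # the original text is gone
```

A JSON syntax check alone would pass this file, which is why a separate scan for corrupted values is worth running.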

What appears in the report

The report shows the affected path or file-level warning so the team can check the export source, editor settings, or vendor handoff process.

The rule does not transcode, normalize, or repair text. Reporting without modifying keeps the scan deterministic and leaves file correction to the owner of the export or handoff step.

When to keep or skip this check

Some pipelines intentionally preserve a UTF-8 BOM, while others require no BOM. What matters is whether the BOM handling in the source and target files matches what the product pipeline expects.
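A BOM state comparison can be sketched as below, assuming the raw bytes of both files are available. The helper names `has_utf8_bom` and `bom_state_warning` and the wording of the warning are hypothetical; LocaleQA's own upload metadata and messages may differ.

```python
import codecs
from typing import Optional


def has_utf8_bom(raw: bytes) -> bool:
    """True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return raw.startswith(codecs.BOM_UTF8)


def bom_state_warning(source_raw: bytes, target_raw: bytes) -> Optional[str]:
    """Return a warning string if the two files disagree on BOM presence, else None.

    Hypothetical helper: it only compares state and never rewrites either file.
    """
    source_bom = has_utf8_bom(source_raw)
    target_bom = has_utf8_bom(target_raw)
    if source_bom == target_bom:
        return None
    return f"BOM state differs: source BOM={source_bom}, target BOM={target_bom}"


body = '{ "settings.title": "設定" }'.encode("utf-8")
source_raw = codecs.BOM_UTF8 + body  # source was saved with a BOM
target_raw = body                    # target came back without one

print(bom_state_warning(source_raw, target_raw))
print(bom_state_warning(target_raw, target_raw))  # prints None
```

The sketch treats a BOM on both sides, or on neither, as consistent. A pipeline that forbids a BOM outright would need a stricter rule than a source-versus-target comparison.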

Disable this rule only when encoding state is controlled elsewhere or when BOM differences are known and harmless. Keep it enabled for Japanese projects, vendor handoffs, spreadsheet-mediated workflows, and files edited across multiple operating systems.

How to use the finding

LocaleQA reports the affected JSON path, the source and target values involved, and the rule that triggered the finding. Developers, localization PMs, vendors, and QA reviewers can use that as a specific file-level item to fix, approve, or discuss before release.

Some findings are valid exceptions. A migration file, documented style choice, or locale-specific convention may explain the difference. In those cases, teams can leave the string as-is or disable the check for that scan.

For Japanese projects, width, spacing, encoding, and katakana style often belong to the client style guide. Keep this check on when those conventions matter, and leave it off for projects where the pattern is not part of review.
