BOM Detection Tool

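Upload a file to check automatically whether it starts with a BOM (byte order mark). The tool detects the BOMs of common encodings, including UTF-8, UTF-16, and UTF-32, and can remove or add a UTF-8 BOM in one click.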
Supported text files include .txt, .csv, .html, .xml, .json, .js, .css, and .py.
UTF-8 BOM: EF BB BF
UTF-16 LE BOM: FF FE
UTF-16 BE BOM: FE FF
UTF-32 LE BOM: FF FE 00 00
UTF-32 BE BOM: 00 00 FE FF

📦 What is a BOM?

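A BOM (byte order mark) is the Unicode character U+FEFF placed at the very start of a text file to signal the file's encoding and byte order. Each encoding serializes it as a different byte sequence. In UTF-8, which has no byte-order variants, it serves only as an encoding signature.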
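To see where these byte sequences come from, the sketch below encodes U+FEFF with Python's explicit-endianness codecs and prints the resulting bytes. The helper name `bom_bytes` is made up for this illustration; the codec names are standard Python ones.

```python
def bom_bytes(encoding_name: str) -> bytes:
    """Return U+FEFF serialized in the given encoding.

    Codecs with an explicit byte order (such as "utf-16-le") do not
    prepend a BOM of their own, so the result is exactly the BOM bytes.
    """
    return "\ufeff".encode(encoding_name)


for name in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    # bytes.hex(sep) requires Python 3.8 or later.
    print(f"{name:10} {bom_bytes(name).hex(' ').upper()}")
```

The printed bytes should match the BOM values listed on this page.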

Common BOM types

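Encoding     BOM bytes      Notes
UTF-8        EF BB BF       The most common BOM; Windows Notepad adds it when saving as "UTF-8 with BOM" (older versions add it to every UTF-8 file)
UTF-16 LE    FF FE          Little-endian UTF-16
UTF-16 BE    FE FF          Big-endian UTF-16
UTF-32 LE    FF FE 00 00    Little-endian UTF-32
UTF-32 BE    00 00 FE FF    Big-endian UTF-32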
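The table can be turned into a small detector. The following is a minimal sketch, not the code this tool actually runs; `BOM_TABLE` and `detect_bom` are hypothetical names, while the `codecs.BOM_*` constants come from Python's standard library. The one subtlety is ordering: the UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE), so the longer signature has to be tested first.

```python
import codecs

# Longer signatures first, so that FF FE 00 00 is not mistaken for FF FE.
BOM_TABLE = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(data: bytes):
    """Return (encoding name, BOM length), or (None, 0) if no BOM matches."""
    for bom, name in BOM_TABLE:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


print(detect_bom(b"\xef\xbb\xbfhello"))
print(detect_bom(b"hello"))
```

Only the first four bytes of a file are needed, so `detect_bom(f.read(4))` on a file opened in `'rb'` mode is enough. Note that a UTF-16 LE file whose first character after the BOM is U+0000 would also begin with FF FE 00 00; this sketch reports such a file as UTF-32 LE.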

Why handle the BOM?

🐍 Detecting and removing a BOM in Python

# Check whether a file starts with a UTF-8 BOM
with open('file.txt', 'rb') as f:
    head = f.read(3)  # the UTF-8 BOM is 3 bytes long, so 3 bytes suffice
if head.startswith(b'\xef\xbb\xbf'):
    print("File has a UTF-8 BOM")
else:
    print("No UTF-8 BOM found")

# Remove a UTF-8 BOM (writes the result to a new file)
with open('file.txt', 'rb') as f:
    content = f.read()
if content.startswith(b'\xef\xbb\xbf'):
    content = content[3:]  # drop the 3 BOM bytes
with open('file_nobom.txt', 'wb') as f:
    f.write(content)

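# Add a UTF-8 BOM (skipped if the file already starts with one)
with open('file.txt', 'rb') as f:
    content = f.read()
if not content.startswith(b'\xef\xbb\xbf'):
    content = b'\xef\xbb\xbf' + content
with open('file_with_bom.txt', 'wb') as f:
    f.write(content)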
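When the file is handled as text rather than raw bytes, Python's built-in `utf-8-sig` codec may be a simpler route: it drops a leading UTF-8 BOM on read and writes one on output. The sketch below assumes the file really is UTF-8; the helper names and the temporary demo path are made up for illustration.

```python
import os
import tempfile


def write_text_with_bom(path, text):
    # "utf-8-sig" writes EF BB BF before the first character.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)


def read_text_without_bom(path):
    # "utf-8-sig" skips a leading BOM if present; otherwise it decodes
    # the file exactly like plain "utf-8".
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        return f.read()


# Round trip on a temporary file (the path is chosen only for this demo).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
write_text_with_bom(demo_path, "hello")
with open(demo_path, "rb") as f:
    raw = f.read()
print(raw)
print(read_text_without_bom(demo_path))
```

Reading with plain `encoding="utf-8"` instead would leave the BOM in the string as a leading `'\ufeff'` character, which is a common source of surprises when parsing CSV or JSON.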