Fix broken UTF-8 characters by `IO#getc`
Character (multi-byte UTF-8) is destroyed when character spanning `IO::BUF_SIZE` (4096 bytes) exist. - Prepare file: ```ruby File.open("sample", "wb") { |f| f << "●" * 1370 } ``` - Before patched: ```ruby File.open("sample") { |f| a = []; while ch = f.getc; a << ch; end; p a } # => ["●", "●", ..., "●", "\xe2", "\x97", "\x8f", "●", "●", "●", "●"] - After patched: ```ruby File.open("sample") { |f| a = []; while ch = f.getc; a << ch; end; p a } # => ["●", "●", ..., "●", "●", "●", "●", "●", "●"]
Showing
Please register or sign in to comment