    Fix broken UTF-8 characters by `IO#getc` · 992ba476
    dearblue authored
    A multi-byte UTF-8 character is broken into separate bytes when it
    straddles an `IO::BUF_SIZE` (4096 bytes) boundary.
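    
    The arithmetic behind the choice of 1370 repetitions can be checked with a
    short sketch. It assumes the 4096-byte buffer size quoted above and uses a
    local `BUF_SIZE` constant as a stand-in, so it does not depend on the
    actual `IO` implementation:

    ```ruby
    # Local stand-in for the IO::BUF_SIZE value quoted above (assumption: 4096).
    BUF_SIZE = 4096

    char = "●"                  # U+25CF, encoded in UTF-8 as E2 97 8F
    char_bytes = char.bytesize  # bytes per character

    # Whole characters that fit in the first buffer, and the bytes left over.
    whole_chars, leftover = BUF_SIZE.divmod(char_bytes)

    puts "#{whole_chars} whole characters fit, #{leftover} byte(s) left over"
    puts "character ##{whole_chars + 1} straddles the buffer boundary"
    puts "#{1370 - whole_chars - 1} characters follow it in the sample"
    ```

    With a 3-byte character, 1365 characters fill 4095 bytes, so the 1366th
    character has one byte in the first buffer and two in the next. The 4
    characters after it are the intact ones at the end of the "before" output.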
    
    - Prepare file:
    
      ```ruby
      File.open("sample", "wb") { |f| f << "●" * 1370 }
      ```
    
    - Before the patch:
    
      ```ruby
      File.open("sample") { |f| a = []; while ch = f.getc; a << ch; end; p a }
      # => ["●", "●", ..., "●", "\xe2", "\x97", "\x8f", "●", "●", "●", "●"]
      ```
    
    - After the patch:
    
      ```ruby
      File.open("sample") { |f| a = []; while ch = f.getc; a << ch; end; p a }
      # => ["●", "●", ..., "●", "●", "●", "●", "●", "●"]
      ```
io.rb 6.58 KB