@tigris
Created September 17, 2012 14:17
Ruby Net::FTP screwing encoding?

I don't get why, when I upload (ftp.put) a copy of a file and then download that copy, Ruby thinks the content is different. It looks like something in Net::FTP is messing with the ISO-8859-1 character in my file, but I'm still trying to track it down.

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body>
<div id="main">
<div title="Skötsel">
<p>F&ouml;r mer information, se Natursten Sk&ouml;tsel Inomhus, utgiven av Sveriges Stenindustrif&ouml;rbund (SSF)</p>
</div>
</div>
</body>
</html>
default encoding: UTF-8
default encoding valid?: true
forced encoding: ISO-8859-1
forced encoding valid?: true
content same after force encode?: true
encoding after ftp actions: UTF-8
encoding valid after ftp actions?: true
content match original after ftp actions?: false
content match original force encoded after ftp actions?: false
#!/usr/bin/env ruby
require 'net/ftp'

Net::FTP.open('tigris.id.au', 'danial', 'xxxxxx') do |ftp|
  ftp.binary = true
  ftp.passive = true

  # Download the original. File.read labels the string with
  # Encoding.default_external; it does not detect the file's encoding.
  ftp.get 'iso-8859-1.html'
  content = File.read('iso-8859-1.html')
  puts "default encoding: #{content.encoding}"
  puts "default encoding valid?: #{content.valid_encoding?}"

  # Note: force_encoding relabels the receiver in place and returns it, so
  # forced_encoding and content are the same String object from here on.
  forced_encoding = content.force_encoding('iso-8859-1')
  puts "forced encoding: #{forced_encoding.encoding}"
  puts "forced encoding valid?: #{forced_encoding.valid_encoding?}"
  puts "content same after force encode?: #{content.to_s == forced_encoding.to_s}"

  # Upload a local copy, then download it again.
  `cp iso-8859-1.html iso-8859-1_new.html`
  ftp.put 'iso-8859-1_new.html'
  sleep 1
  ftp.get 'iso-8859-1_new.html'

  # Compare the round-tripped copy with the original.
  new_content = File.read('iso-8859-1_new.html')
  puts "encoding after ftp actions: #{new_content.encoding}"
  puts "encoding valid after ftp actions?: #{new_content.valid_encoding?}"
  puts "content match original after ftp actions?: #{content.to_s == new_content.to_s}"
  puts "content match original force encoded after ftp actions?: #{forced_encoding.to_s == new_content.to_s}"

  # Clean up the local files.
  `rm iso-8859-1.html`
  `rm iso-8859-1_new.html`
end
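
One thing that might be worth ruling out before blaming Net::FTP: `String#==` returns false for two strings that carry different encoding labels and contain non-ASCII bytes, even when the bytes are identical. The sketch below reproduces that locally without any FTP. The sample word, the variable names, and the `same_bytes?` helper are assumptions made up for illustration; they are not part of the script above.

```ruby
# Local sketch, no FTP involved. The sample word and all names here are
# assumptions chosen for illustration.

# Byte-for-byte comparison that ignores the encoding label on each string.
def same_bytes?(a, b)
  a.b == b.b # String#b returns a copy labelled ASCII-8BIT (binary)
end

original = "Sk\u00F6tsel".dup # UTF-8; the "ö" is stored as the two bytes C3 B6
copy     = original.dup       # stands in for the re-downloaded file

# force_encoding relabels the receiver in place and returns that same object.
forced = original.force_encoding('ISO-8859-1')

puts "same object?:       #{forced.equal?(original)}"
puts "original label:     #{original.encoding}"
puts "== across labels:   #{original == copy}"
puts "same bytes?:        #{same_bytes?(original, copy)}"

# A real ISO-8859-1 "ö" is the single byte F6, which is not valid UTF-8.
latin1 = [0x53, 0x6B, 0xF6, 0x74].pack('C*')
puts "F6 valid as UTF-8?: #{latin1.force_encoding('UTF-8').valid_encoding?}"
```

If this applies to the script, comparing `content.b == new_content.b` (or reading both files with `File.binread`) would show whether the bytes really changed during the round trip. The `default encoding valid?: true` line in the output would also be consistent with the local file already holding UTF-8 bytes rather than a single-byte ISO-8859-1 character, though that is a guess from the output alone.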

brantz commented May 8, 2015

Hi tigris, I am currently experiencing the same issue (ISO-8859-1 encoded files always end up as UTF-8 on the server).

Were you able to track down the issue or gain any further insight into this?

Cheers,
brantz


xpac27 commented Jun 15, 2015

Same problem here 😢 Have you made any progress on that issue?
