Here is a quick overview of how to stream gzip with Ruby. I needed an OS-agnostic script to gzip a SQL dump. Unfortunately, the obvious approach reads the whole file into memory first, so for large dumps that do not fit in RAM, Ruby fails with an error like this:
failed to allocate memory (NoMemoryError)
So I decided to stream the file instead, combining some basic examples found online. As I was not able to find anything ready-made, I combined file streaming with gzipping.
The script below does a SQL dump with the date in the file name, gzips it in chunks with Ruby, and deletes the unzipped dump. (It is written for a Windows server, so change the system commands if moving to Linux.)
Additionally, I wanted to check performance, so I also added timing.
Stream gzip with Ruby
require 'zlib'

time = Time.new
date = time.to_s[0, 10]
database = 'mydatabasename'
source = database + '_' + date + '.sql'
dest = source + '.gz'
MEGABYTE = 1024 * 1024 # change the chunk size if you find larger chunks run faster

# Do not use exec to start the MySQL dump, or the Ruby script will terminate.
system 'mysqldump --port=3307 --host=127.0.0.1 -u root --password=mypassword ' + database + ' > ' + source
dump_time = Time.new

class File
  def each_chunk(chunk_size = MEGABYTE)
    yield read(chunk_size) until eof?
  end
end

# 'wb' matters on Windows: text mode would mangle the binary gzip output.
File.open(dest, 'wb') do |d|
  gz = Zlib::GzipWriter.new(d)
  gz.mtime = File.mtime(source)
  gz.orig_name = source
  File.open(source, 'rb') do |f|
    f.each_chunk { |chunk| gz.write chunk }
  end
  gz.close
end

system 'del ' + source # delete the original dump (use rm on Linux)

# Show some stats:
puts source
end_time = Time.new
duration = end_time - time
dump_duration = dump_time - time
gzip_duration = end_time - dump_time
puts 'Dump duration: ' + dump_duration.to_s + ' seconds'
puts 'Gzip duration: ' + gzip_duration.to_s + ' seconds'
puts 'Total duration: ' + duration.to_s + ' seconds'
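Two small notes. First, system 'del ' + source is the one Windows-specific line left; File.delete(source) would be the portable Ruby equivalent. Second, restoring works the same way in reverse with Zlib::GzipReader. A minimal sketch, assuming the file names produced by the script above (the date is a placeholder):

require 'zlib'

MEGABYTE = 1024 * 1024

# Stream the gzipped dump back to disk without loading it all into RAM.
# GzipReader#read(length) returns nil at the end of the stream.
Zlib::GzipReader.open('mydatabasename_2013-01-01.sql.gz') do |gz|
  File.open('mydatabasename_2013-01-01.sql', 'wb') do |out|
    while chunk = gz.read(MEGABYTE)
      out.write chunk
    end
  end
end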
There you go, an easy way to stream gzip with Ruby. Enjoy!