Live Dangerously.

Problem

I ran into a strange problem today. From a deeply buried memory, I recalled that R can read gzip-compressed files directly.

So I compressed a tab-delimited file (a MAF file, ~1.5 GB) and tried to read it with fread and read.table.

  • fread failed.

  • read.table succeeded, but only half of the rows were read in. What’s worse, when I tried to tabulate a column with table(df$XXX), the R session ate up all my memory and crashed.

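It turned out that the cause was a missing option, quote="\"", in the read.table call. By default read.table treats both " and ' as quote characters (quote = "\"'"), so a stray apostrophe in a field opens a quoted string that swallows everything, line breaks included, up to the next apostrophe. That is presumably how half of my rows disappeared.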
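Here is a minimal sketch of the effect. The toy rows and column names are made up for illustration and only stand in for the real MAF file; I pass the file through gzfile() to make the gzip reading explicit.

```r
# Hypothetical toy data: two of the fields contain an apostrophe.
lines <- c(
  "gene\tnote",
  "TP53\t5'UTR",
  "KRAS\tmissense",
  "EGFR\t3'UTR",
  "BRAF\tmissense"
)

# Write the rows to a gzip-compressed temporary file.
path <- tempfile(fileext = ".tsv.gz")
con <- gzfile(path, "w")
writeLines(lines, con)
close(con)

# Default quote = "\"'": the apostrophe in 5'UTR opens a quoted string
# that runs across line breaks until the apostrophe in 3'UTR.
bad <- read.table(gzfile(path), header = TRUE, sep = "\t",
                  stringsAsFactors = FALSE)

# Only the double quote is a quote character: apostrophes are plain text.
good <- read.table(gzfile(path), header = TRUE, sep = "\t", quote = "\"",
                   stringsAsFactors = FALSE)

cat("rows with default quote:  ", nrow(bad), "\n")
cat("rows with quote = \"\\\"\":   ", nrow(good), "\n")
```

The second call should return all four data rows, while the first one returns fewer because several lines get merged into a single field. If the file has no meaningful quoting at all, quote = "" switches quote handling off entirely.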

Summary

fread from data.table doesn’t support compressed files yet. But it’s super smart at figuring out all the annoying I/O details in R on its own.

And it’s superfast.
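One possible workaround, sketched here under assumptions rather than tested on the real file: let a shell tool decompress the file and have fread parse the stream. This assumes a data.table version whose fread has the cmd argument and a gzip binary on the PATH; the toy file below is hypothetical.

```r
# Hypothetical stand-in for the compressed MAF file.
path <- tempfile(fileext = ".tsv.gz")
con <- gzfile(path, "w")
writeLines(c("gene\tnote", "TP53\t5'UTR", "KRAS\tmissense"), con)
close(con)

# Only run when data.table and gzip are both available (assumption).
if (requireNamespace("data.table", quietly = TRUE) &&
    nzchar(Sys.which("gzip"))) {
  # gzip -dc writes the decompressed data to stdout; fread reads that output.
  dt <- data.table::fread(cmd = paste("gzip -dc", shQuote(path)), sep = "\t")
  print(dim(dt))
}
```

Decompressing to a temporary file first and calling fread on that file is the more conservative alternative if piping is not an option.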

The vanilla read.table requires carefully configured options. It is a day ruiner if you don’t pay enough attention.