Live Dangerously.
Problem
I ran into a strange problem today. I vaguely remembered that R can read gzip-compressed files directly.
So I compressed a tab-delimited file (a MAF file, ~1.5 GB) and tried to read it with fread and read.table.
fread failed. read.table succeeded, but only half of the rows were read in. What's worse, when I tried to build a table using table(df$XXX), the R session ate up all my memory and crashed.
It turned out that the read.table call was missing the option quote="\"". By default read.table treats both the double quote and the single quote as quoting characters, so an apostrophe inside a field opens a quoted string that swallows everything up to the next apostrophe, line breaks included.
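Here is a minimal sketch of that failure mode, assuming the culprit is an apostrophe inside a text field. The toy file, its column names, and its values are made up for illustration and are not taken from the original MAF file.

```r
# Write a tiny tab-delimited, gzip-compressed file (hypothetical data).
path <- tempfile(fileext = ".tsv.gz")
con <- gzfile(path, "w")
writeLines(c("gene\tnote",
             "TP53\t5' UTR",
             "KRAS\tmissense",
             "EGFR\t3' UTR",
             "BRAF\tsilent"), con)
close(con)

# Default quote = "\"'": the two apostrophes pair up as a quoted string,
# so the lines between them are folded into a single field.
bad <- read.table(path, header = TRUE, sep = "\t",
                  stringsAsFactors = FALSE)

# Only the double quote counts as a quoting character: each line is one row.
good <- read.table(path, header = TRUE, sep = "\t", quote = "\"",
                   stringsAsFactors = FALSE)

nrow(bad)   # fewer rows than the file has data lines
nrow(good)  # one row per data line
```

Comparing nrow() against the line count of the file (for example from count.fields with the same sep and quote settings) is a cheap sanity check before doing anything expensive with the data frame.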
Summary:
fread from data.table doesn't support compressed files yet. But it's super smart at figuring out all the annoying I/O problems in R.
And it’s superfast.
The vanilla read.table requires carefully configured options. It's a day ruiner if you don't pay enough attention.