I have a large dataset containing strings that I want to open via read_fwf using widths, like this:
    widths = [3, 7, ..., 9, 7]
    tp = pandas.read_fwf(file, widths=widths, header=None)
That lets me mark the data, but the system crashes (it works with nrows=20000). So I decided to read the file in chunks (e.g. 20000 rows each), like this:
    cs = 20000
    for chunk in pd.read_fwf(file, widths=widths, header=None, chunksize=cs):
        <some code using chunk>
My question is: should I merge (concatenate?) the chunks into a .csv file inside the loop, after processing each chunk (marking rows, dropping or modifying columns)? Or is there a better way?
I'm going to assume that since reading the entire file with

    tp = pandas.read_fwf(file, widths=widths, header=None)

fails while reading it in chunks works, the file is simply too big to read all at once and you ran into a MemoryError.
In that case, if you can process the data in chunks, then to concatenate the results into a CSV you can use chunk.to_csv to write the CSV in chunks:

    filename = ...
    for chunk in pd.read_fwf(file, widths=widths, header=None, chunksize=cs):
        # process the chunk here
        chunk.to_csv(filename, mode='a')
Note that mode='a' opens the file in append mode, so the output of each chunk.to_csv call is appended to the same file.
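Here is a minimal end-to-end sketch of that pattern. The file names, the concrete widths, and the per-chunk processing placeholder are hypothetical (they are not from the question), and writing the header only for the first chunk plus index=False are extra suggestions of mine, not part of the answer above:

    import os
    import pandas as pd

    # Hypothetical names and placeholder widths, just for illustration.
    infile = "data.fwf"
    outfile = "processed.csv"
    widths = [3, 7, 9, 7]
    cs = 20000

    # Remove any previous output so append mode starts from an empty file.
    if os.path.exists(outfile):
        os.remove(outfile)

    for i, chunk in enumerate(pd.read_fwf(infile, widths=widths,
                                          header=None, chunksize=cs)):
        # <process the chunk here: mark rows, drop or modify columns>

        # Write the column header only once (for the first chunk); since the
        # source has no header row the names are just 0, 1, 2, ... unless the
        # processing step renames them. index=False skips the row index.
        chunk.to_csv(outfile, mode='a', header=(i == 0), index=False)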