Today we are going to talk about the well-known fwrite (and fprintf) functions, which are used for writing binary (and text) files. Because fwrite is a low-level I/O function that is used very often, many users will wonder how it could possibly have room for performance improvement: if it did, surely MathWorks would have made that improvement already.


Flushing and buffering


Unlike in C/C++, when you call fwrite (or fprintf) in MATLAB, MATLAB automatically flushes the output buffer after each call. The MATLAB help does not state this directly; it is only hinted at in the fopen documentation, which lists the 'W' and 'A' permissions as writing and appending without automatic flushing.


At this point, many readers can probably guess what to do. If output is not buffered, every call to fwrite requires an actual file write operation, which seriously reduces I/O performance!


Let's look at some comparative timings. First, without buffering:

data = randi(250,1e6,1);  % generate test data for measuring I/O performance

% Standard form, unbuffered output - slow

fid = fopen('demo.dat', 'wb'); % lowercase w (auto-flush), b for binary

tic, for idx = 1:length(data), fwrite(fid,data(idx)); end, toc

fclose(fid);


Elapsed time is 14.983201 seconds.

That took about 15 s. Now let's see what happens with buffering:

% Buffered output - about 3x faster

fid = fopen('demo.dat', 'Wb'); % note the uppercase W (no auto-flush), b for binary

tic, for idx = 1:length(data), fwrite(fid,data(idx)); end, toc

fclose(fid);


Elapsed time is 5.616357 seconds.

With buffering, the time drops to 5.6 s, a considerable improvement!

We don't know why MathWorks made unbuffered mode the default for fopen, but they presumably had a reason. When you are writing large data files, though, buffered mode ('W') is the recommended choice.
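The comparison above can be wrapped into one small script that times both modes on the same data and then checks that the file contents are intact. This is only a sketch: the scratch file name, the smaller data size, and the variable names (`modes`, `elapsed`, `readBack`) are illustrative assumptions, and the timings will differ from machine to machine.

```matlab
% Sketch: time the same per-element write loop with 'w' and 'W'.
% File name and data size are arbitrary choices for illustration.
data  = randi(250, 1e5, 1);                     % smaller than the 1e6 used above
fname = fullfile(tempdir, 'flush_demo.dat');    % hypothetical scratch file
modes = {'w', 'W'};                             % 'w' = auto-flush, 'W' = no auto-flush
elapsed = zeros(1, numel(modes));

for m = 1:numel(modes)
    fid = fopen(fname, modes{m});
    assert(fid ~= -1, 'Could not open %s', fname);
    tic
    for idx = 1:numel(data)
        fwrite(fid, data(idx));                 % default precision is uint8
    end
    elapsed(m) = toc;
    fclose(fid);                                % closing writes out anything still buffered
    fprintf('mode ''%s'': %.3f s\n', modes{m}, elapsed(m));
end

% Read the file back to confirm that buffering did not change its contents
fid = fopen(fname, 'r');
readBack = fread(fid, Inf, 'uint8');
fclose(fid);
delete(fname);
```

Always call fclose when using 'W': with automatic flushing turned off, data may remain in the buffer until the file is closed.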


Chunking I/O


Buffered writing really just reduces the number of times the data file is accessed. So in MATLAB, if all the data is prepared first and then written to the file with a single fwrite call, the write is faster still:

fid = fopen('demo.dat', 'wb'); % note the lowercase w (unbuffered)

tic, fwrite(fid,data); toc

fclose(fid);


Elapsed time is 0.034816 seconds.

As you can see, even in unbuffered mode the write takes only about 35 ms, which is still very efficient.


But if we read or write files over a network, the operation may take a long time because of the network. In that case, it is wise to split the large data set into many small chunks and process it chunk by chunk.

h = waitbar(0, 'Saving data...', 'Name','Saving data...');

cN = 100;  % number of steps/chunks

% Divide the data into chunks (last chunk is smaller than the rest)

dN = length(data);

dataIdx = [1 : round(dN/cN) : dN, dN+1];  % cN+1 chunk location indexes

% Save the data

fid = fopen('test.dat', 'Wb');

for chunkIdx = 0 : cN-1

   % Update the progress bar

   fraction = chunkIdx/cN;

   msg = sprintf('Saving data... (%d%% done)', round(100*fraction));

   waitbar(fraction, h, msg);

   % Save the next data chunk

   chunkData = data(dataIdx(chunkIdx+1) : dataIdx(chunkIdx+2)-1);

   fwrite(fid, chunkData);

end

fclose(fid);

close(h);  % close the progress bar

In general, the techniques in this article apply to both fprintf and fwrite. However, writing and reading binary files (fwrite/fread) is far faster than writing and reading text files (fprintf/fscanf/textscan), so if the data does not need to be human-readable, save it in binary format!
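To see the binary versus text difference on your own machine, a minimal sketch along the following lines can help. The file names, the data size, and the timing variable names are assumptions made for illustration, and no particular speed ratio is implied; the script simply writes the same data both ways, reads it back, and reports the elapsed times.

```matlab
% Sketch: write and read the same data as binary and as text, timing each step.
data    = randi(250, 1e5, 1);
binFile = fullfile(tempdir, 'demo_bin.dat');    % hypothetical scratch files
txtFile = fullfile(tempdir, 'demo_txt.txt');

tic
fid = fopen(binFile, 'W');                      % buffered binary write
fwrite(fid, data, 'uint8');
fclose(fid);
tBinWrite = toc;

tic
fid = fopen(txtFile, 'W');                      % buffered text write
fprintf(fid, '%d\n', data);                     % format is reused for every element
fclose(fid);
tTxtWrite = toc;

tic
fid = fopen(binFile, 'r');
binData = fread(fid, Inf, 'uint8');             % returns a double column vector
fclose(fid);
tBinRead = toc;

tic
fid = fopen(txtFile, 'r');
txtData = fscanf(fid, '%d');                    % parses one integer per line
fclose(fid);
tTxtRead = toc;

fprintf('binary: write %.4f s, read %.4f s\n', tBinWrite, tBinRead);
fprintf('text:   write %.4f s, read %.4f s\n', tTxtWrite, tTxtRead);
delete(binFile);
delete(txtFile);
```

Note that the 'uint8' precision only suits values from 0 to 255, as in this test data; other data needs a matching precision argument (for example 'double') in both fwrite and fread.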


