声振论坛


[Programming Tips] Reading Data Files That Contain Text with MATLAB

Posted on 2005-7-26 21:31

<TABLE cellSpacing=0 cellPadding=0 width=525 border=0>

<TR>
<TD><B><FONT color=#003333 size=+1>Can MATLAB Read Data Files with Text Headers?</FONT></B> </TD></TR>
<TR>
<TD><FONT size=-1><I>Revision: 1.1</I></FONT> </TD>
<TD align=right><FONT size=-1><I>Last Date Modified: 31-August-2000</I></FONT> </TD></TR>
<TR>
<TD colSpan=2>
<HR noShade SIZE=2>
</TD></TR>
<TR>
<TD colSpan=2>
<P>This example examines a common task. Data is often stored in a text file as a descriptive header followed by columns of numbers. Because of the text header, the MATLAB load command cannot read these files.
<P>We can, however, write our own function to load files in this format.</P>
<P><B>The Algorithm</B>
<P>We step through the file one line at a time, storing each line in a variable. As soon as we find a line beginning with a number, we stop looking for header information and read the rest of the file directly using fscanf. At the end, we work out the dimensions of the two outputs, pad the header lines into a single character array, and reshape the data into a matrix with one row per line of the file.
<P>Let's see this work on a very simple file, called iotest.dat: <PRE>        Data file with header
        Created 4/9/95 at the MathWorks
        1 2 3
        4 5 6
      </PRE>
<OL>
<LI>We read in the first line, and find that there is no data to be read. We store this in an intermediate variable, line1, with size (1,21).
<p>
<LI>We read in the second line, and find that there is still no data. We store this in a second variable, line2, with size (1,31).
<p>
<LI>On the third line we have some data. We read in this line, keeping track of the number of data points read (which is 3). We now know that our data output will be n-by-3.
<p>
<LI>Now we stop reading in line by line and scan the rest of the data, all at once, into the variable data.
<p>
<LI>At this point we have three variables: line1, line2, and data.
<p>
<LI>We stack the two strings, line1 and line2, into a single output variable, header, of size (2,31). The shorter string, line1, is padded with trailing spaces so that both rows have the same length.
<p>
<LI>Finally, we reshape the data variable to the correct size. This is straightforward, since we know that there are three columns, and we know what the size of the data variable is (in this case, six elements). </LI></OL>
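<P>The last two steps can be sketched in isolation. This is only an illustration: it assumes the two header lines and the six scanned values are already in the workspace, and it uses char as a convenient way to pad and stack the strings, which is not necessarily how hdrload itself builds the header.
<PRE>
% Header lines as they would be read from iotest.dat.
line1 = 'Data file with header';
line2 = 'Created 4/9/95 at the MathWorks';

% char pads the shorter row with trailing spaces and stacks the rows.
header = char(line1, line2);
size(header)                 % 2-by-31

% Values in the order they are scanned from the file (row by row).
data = [1; 2; 3; 4; 5; 6];
ncols = 3;

% MATLAB fills matrices column by column, so reshape to
% ncols-by-nrows first and then transpose.
data = reshape(data, ncols, numel(data)/ncols)';
</PRE>
<P>Reshaping directly to 2-by-3 would interleave the values, because the scanned vector is in row order while reshape fills columns first.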
<P>And here's what happens when you load this file: <PRE>        &gt;&gt; [h, d] = hdrload('iotest.dat')

        h =

        Data file with header
        Created 4/9/95 at the MathWorks

        d =

             1     2     3
             4     5     6
      </PRE>
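<P>To try this yourself, the sample file can be created from within MATLAB. The following is one possible way to do it, assuming the current directory is writable; the file name matches the example above.
<PRE>
% Write the two header lines and the two rows of data to iotest.dat.
fid = fopen('iotest.dat', 'w');
if fid == -1
  error('Could not create iotest.dat');
end
fprintf(fid, 'Data file with header\n');
fprintf(fid, 'Created 4/9/95 at the MathWorks\n');
% fprintf walks the matrix column by column, so transpose it to
% write one row of the original matrix per line.
fprintf(fid, '%d %d %d\n', [1 2 3; 4 5 6]');
fclose(fid);

% Display the file to check its contents.
type iotest.dat
</PRE>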
<P><B>The Function Code</B>
<P>Now that we know what the function is supposed to do, we can show how it is implemented. <PRE>function [header, data] = hdrload(file)

% HDRLOAD Load data from an ASCII file containing a text header.
%     [header, data] = HDRLOAD('filename.ext') reads a data file
%     called 'filename.ext', which contains a text header.  There
%     is no default extension; any extensions must be explicitly
%     supplied.
%
%     The first output, HEADER, is the header information,
%     returned as a text array.
%     The second output, DATA, is the data matrix.  This data
%     matrix has the same dimensions as the data in the file, one
%     row per line of ASCII data in the file.  If the data is not
%     regularly spaced (i.e., each line of ASCII data does not
%     contain the same number of points), the data is returned as
%     a column vector.
%
%     Limitations:  No line of the text header can begin with
%     a number.  Only one header and data set will be read,
%     and the header must come before the data.
%
%     See also LOAD, SAVE, SPCONVERT, FSCANF, FPRINTF, STR2MAT.
%     See also the IOFUN directory.

% check number and type of arguments
if nargin &lt; 1
  error('Function requires one input argument');
elseif ~ischar(file)
  error('Input must be a string representing a filename');
end

% Open the file.  If this returns a -1, we did not open the file
% successfully.
fid = fopen(file);
if fid==-1
  error('File not found or permission denied');
  end

% Initialize loop variables
% We store the number of lines in the header, and the maximum
% length of any one line in the header.  These are used later
% in assigning the 'header' output variable.
no_lines = 0;
max_line = 0;

% We also store the number of columns in the data we read.  This
% way we can compute the size of the output based on the number
% of columns and the total number of data points.
ncols = 0;

% Finally, we initialize the data to [].
data = [];

% Start processing.
line = fgetl(fid);
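if ~ischar(line)
  % fgetl returns -1 at end of file, so the file is empty.  Return
  % now, because sscanf cannot scan a non-text value.
  disp('Warning: file contains no header and no data')
  fclose(fid);
  header = '';
  return
end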
[data, ncols, errmsg, nxtindex] = sscanf(line, '%f');

% One slight problem, pointed out by Peter vanderWal: If the
% first character of the line is 'e', then this will scan as
% 0.00e+00. We can trap this case specifically by using the
% 'next index' output: in the case of a stripped 'e' the next
% index is one, indicating zero characters read.  See the help
% entry for 'sscanf' for more information on this output
% parameter. We loop through the file one line at a time until
% we find some data.  After that point we stop checking for
% header information. This part of the program takes most of the
% processing time, because fgetl is relatively slow (compared to
% fscanf, which we will use later).
while isempty(data)|(nxtindex==1)
  no_lines = no_lines+1;
  max_line = max([max_line, length(line)]);
  % Create unique variable to hold this line of text information.
  % Store the last-read line in this variable.
  eval(['line', num2str(no_lines), '=line;']);
  line = fgetl(fid);
  if ~ischar(line)
    disp('Warning: file contains no data')
    break
    end;
  [data, ncols, errmsg, nxtindex] = sscanf(line, '%f');
  end % while

% Now that we have read in the first line of data, we can skip
% the processing that stores header information, and just read
% in the rest of the data.
data = [data; fscanf(fid, '%f')];
fclose(fid);

% Create header output from line information. The number of lines
% and the maximum line length are stored explicitly, and each
% line is stored in a unique variable using the 'eval' statement
% within the loop. Note that, if we knew a priori that the
% headers were 10 lines or less, we could use the STR2MAT
% function and save some work. First, initialize the header to an
% array of spaces.
header = char(' '*ones(no_lines, max_line));
for i = 1:no_lines
  varname = ['line' num2str(i)];
  % Note that we only assign this line variable to a subset of
  % this row of the header array.  We thus ensure that the matrix
  % sizes in the assignment are equal.
  eval(['header(i, 1:length(' varname ')) = ' varname ';']);
  end

% Resize output data, based on the number of columns (as returned
% from the sscanf of the first line of data) and the total number
% of data elements. Since the data was read in row-wise, and
% MATLAB stores data in columnwise format, we have to reverse the
% size arguments and then transpose the data.  If we read in
% irregularly spaced data, then the division we are about to do
% will not give a whole number and the reshape will fail.
% Therefore, we trap the error with TRY/CATCH; if the reshape
% fails, we just return the data as is (a column vector).
try
  data = reshape(data, ncols, length(data)/ncols)';
catch
  % Leave data as a column vector.
end

% And we're done!
</PRE></TD></TR></TABLE>
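<P>The function above stores each header line in its own variable by way of eval. As an alternative sketch, not part of the original note, the header lines can be collected in a cell array instead. The name hdrload_cell is hypothetical, and the sketch assumes a MATLAB version that supports cell arrays, the ~ output placeholder, and the || operator.
<PRE>
function [header, data] = hdrload_cell(file)
% HDRLOAD_CELL  Hypothetical variant of HDRLOAD that keeps header
% lines in a cell array rather than in EVAL-generated variables.

fid = fopen(file);
if fid == -1
  error('File not found or permission denied');
end

lines = {};    % header lines collected so far
data = [];     % numeric values from the first data line
ncols = 0;     % number of values on the first data line

line = fgetl(fid);
while ischar(line)                 % fgetl returns -1 at end of file
  [data, ncols, ~, nxtindex] = sscanf(line, '%f');
  if ~(isempty(data) || nxtindex == 1)
    break                          % first line of numeric data found
  end
  % Not a data line: discard any spurious match and keep the text.
  data = [];
  ncols = 0;
  lines{end+1} = line;
  line = fgetl(fid);
end

% Read the remaining numbers in one call, then close the file.
data = [data; fscanf(fid, '%f')];
fclose(fid);

% char pads each line with trailing spaces to a common length.
header = char(lines);

% Reshape only when the values divide evenly into rows; otherwise
% the data is returned as a column vector.
if ncols ~= 0
  if rem(numel(data), ncols) == 0
    data = reshape(data, ncols, numel(data)/ncols)';
  end
end
</PRE>
<P>Called as [h, d] = hdrload_cell('iotest.dat'), this is intended to behave like hdrload on the sample file. The same limitation applies: no header line may begin with a number.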