Stream Input/Output in Fortran

Background - Record-based versus Stream I/O

When Fortran was invented in 1957 its I/O facilities were entirely record-based.  This was fine if you were reading or writing a file where all records had the same length, or alternatively a text file, where each line up to the line terminator counted as a record whatever its length.  Line terminators depend upon the operating system: they may be line-feeds, carriage-returns, or both, but with record-based I/O you can ignore such differences and simply read or write whole lines.  If, on the other hand, you want to use Fortran to read a file generated by some instrument, or a file exported from another software package such as a spreadsheet or database system, then record-based I/O is a serious obstacle.  Such files often have neither line terminators nor fixed-length records, and their structure can be quite complex.  The solution is to use stream I/O, which was introduced with the Fortran 2003 Standard and is now widely implemented.

 
Stream I/O, as with sequential and direct-access I/O, comes in two flavours: formatted and unformatted.  It is the unformatted form which provides the more powerful facilities, so I will describe this first.


I think, incidentally, that the hardware used on early computers had an influence on the I/O facilities of early languages.  Fortran was invented by a team working for IBM where the main data medium was the 80-column punched card, so record-oriented I/O would have seemed natural.  The C language, on the other hand, was invented by people using computers made by Digital Equipment Corporation where paper tape was much more common, and C supported stream I/O from the outset.  I must admit that not everyone agrees with this suggestion.

Unformatted Stream I/O

In unformatted stream I/O the file is treated as a sequence of file storage units.  These units are in principle system-dependent, but the Standard recommends the use of bytes, and I doubt if any other unit will be used in practice.  The facilities are closely modeled on those of the binary stream file in C.


The file is opened using an OPEN statement containing ACCESS="STREAM" (note that FORM="UNFORMATTED" is the default so is optional).  A new file can be written using simple WRITE statements, just like those required to populate an unformatted sequential file, using any mixture of data types you choose.  The effect of each WRITE is simply to append the appropriate sequence of bytes to the file, uninterrupted by record markers.  Similarly, when reading an existing binary file opened for stream access, the READ statement will move the current position marker through the file by the number of bytes needed to satisfy its I/O list.  For numerical data types the order of bytes within each item will depend upon whether the processor uses big-endian or little-endian number formats.  Generally you will need to be concerned about this only if you transfer a file from a platform using one endian convention to one using the opposite convention (and the same concerns apply to record-based unformatted reads and writes).  The gfortran compiler has a non-standard CONVERT= specifier in its OPEN statement (and related compiler options) to allow the endianness to be specified.
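If you would rather stay within the Standard than rely on a compiler extension, the bytes of an item can be reversed by hand after reading.  The following is only a minimal sketch of that idea for 32-bit integers: the names swap_demo and swap32 are made up for this illustration, and it assumes that the processor provides the int8 and int32 kinds from ISO_FORTRAN_ENV.

```fortran
PROGRAM swap_demo
  ! Sketch only: swap_demo and swap32 are hypothetical names.
  USE, INTRINSIC :: iso_fortran_env, ONLY: int8, int32
  IMPLICIT NONE
  INTEGER(int32) :: n, swapped
  INTEGER(int8)  :: bytes(4)

  ! Look at how the value 1 is laid out in memory to find the byte order.
  bytes = TRANSFER(1_int32, bytes)
  PRINT *, "Little-endian processor: ", (bytes(1) == 1_int8)

  n = 12345_int32
  swapped = swap32(n)
  PRINT *, n, " with its bytes reversed is ", swapped

  ! Reversing twice must give back the original value.
  IF (swap32(swapped) /= n) ERROR STOP "byte swap round trip failed"

CONTAINS

  PURE FUNCTION swap32(x) RESULT(y)
    ! Returns x with the order of its four bytes reversed.
    INTEGER(int32), INTENT(IN) :: x
    INTEGER(int32) :: y
    INTEGER(int8)  :: b(4)
    b = TRANSFER(x, b)
    y = TRANSFER(b(4:1:-1), y)
  END FUNCTION swap32

END PROGRAM swap_demo
```

A function like this would be applied to each integer read from a file written with the opposite convention; reals of the same size could be handled the same way by going through TRANSFER.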


Here is a trivial example of writing a file using unformatted stream access (note that Fortran keywords are shown in upper-case only to distinguish them from user-chosen names):

PROGRAM writeUstream
  IMPLICIT NONE
  INTEGER :: myvalue = 12345, mypos, out
  OPEN(NEWUNIT=out, FILE="ustream.demo", STATUS="NEW", ACCESS="STREAM")
  WRITE(out) "first"
  WRITE(out) "second"
  INQUIRE(UNIT=out, POS=mypos)
  PRINT *, "Myvalue will be written at position ", mypos
  WRITE(out) myvalue
  CLOSE(UNIT=out)
END PROGRAM writeUstream

The first two WRITE statements will put a total of 11 bytes on the file; assuming the integer value occupies a 32-bit (4-byte) word then the third WRITE will extend the file to 15 bytes.  An INQUIRE statement can be used at any point to find out the next character position in the file; in this case it should return 12.  When as here the preceding WRITE was appending to the file, the value returned will be the current length of the file in bytes plus one. 


The power of stream I/O derives from the fact that a READ or WRITE statement can specify the position at which the operation is to start using a POS= specifier, remembering that POS=1 means the start of the file.  For example, if we were to open the file just produced we could access parts of it like this:

PROGRAM readUstream
  IMPLICIT NONE
  CHARACTER :: string*3
  INTEGER :: n, in
  OPEN(NEWUNIT=in, FILE="ustream.demo", STATUS="OLD", ACCESS="STREAM")
  READ(in, POS=4) string   ! bytes 4 to 6
  READ(in, POS=12) n       ! the integer written after the two strings
  PRINT *, string, n
  CLOSE(UNIT=in)
END PROGRAM readUstream

Then the character variable would be set to “sts”, being the contents from the end of the word “first” and the start of “second”, and the integer n would read in the number derived originally from myvalue.  This provides a form of random access to a file, similar to that provided by direct-access files, but with addresses specified to the byte rather than to the record.  It is permitted to write an unformatted stream file at any position: if this was beyond the previous end of the file, then the contents of the gap are left undefined.

In fact if one uses a POS= specifier in a WRITE statement with an empty I/O list, then it resets the position in the file without actually writing anything to the file.  If the position precedes the previous end of the file and the list is not empty, then the bytes at the positions specified are overwritten, but the length of the file is unchanged.  This is quite different from what happens if you write data at some intermediate point in a (non-stream) sequential file: the file is truncated there and everything beyond the point at which the WRITE was executed is lost.  But note that after a WRITE to a specific position set by POS=, any subsequent WRITE without a POS= specifier will simply write its data immediately after the bytes just written.  So if the file extends beyond that point and you simply want to append to it, the next WRITE needs to specify the position just past the end of the file.  Note that if a READ statement attempts to read at a position in a file which has never been written then the results are undefined, as one might expect.
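As a small illustration of these rules, here is a sketch which overwrites part of a file in place and then appends to it.  The file name overwrite.demo and the variable names are made up for this example; it uses INQUIRE with SIZE= to find the current length of the file, and the FLUSH statements are there only as a precaution so that the size reported reflects what has been written.

```fortran
PROGRAM overwriteUstream
  ! Sketch only: the file name and all variable names are hypothetical.
  IMPLICIT NONE
  CHARACTER(LEN=11) :: contents
  INTEGER :: u, fsize

  OPEN(NEWUNIT=u, FILE="overwrite.demo", STATUS="REPLACE", ACCESS="STREAM")
  WRITE(u) "first", "second"        ! the file now holds 11 bytes

  WRITE(u, POS=1) "FIRST"           ! overwrite bytes 1 to 5 in place
  FLUSH(u)
  INQUIRE(UNIT=u, SIZE=fsize)       ! length in file storage units (bytes)
  PRINT *, "Size after overwriting: ", fsize

  ! A WRITE without POS= at this point would start at byte 6 and overwrite
  ! "second", so to append we position just past the end of the file.
  WRITE(u, POS=fsize+1) "!"
  FLUSH(u)
  INQUIRE(UNIT=u, SIZE=fsize)
  PRINT *, "Size after appending:   ", fsize

  READ(u, POS=1) contents
  PRINT *, "First 11 bytes: ", contents
  CLOSE(u, STATUS="DELETE")
END PROGRAM overwriteUstream
```

The size should be the same before and after the overwrite, and should grow by one byte after the append.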


A more practical example is given here, which opens a file of type .DBF and lists some of its contents.  The .DBF format used to be used by many PC-based database management systems including dBASE, Alpha-5, and Paradox.  There are actually several variants of the format; this is merely one of the most common.  The file consists of a 32-byte header, followed by the column details (32 bytes per column), and then the data records themselves.  The integers in the file are stored in little-endian order, so the program as written assumes a little-endian processor with 4-byte default integers.

PROGRAM readbf
  ! Reads .DBF files, lists header and first few records.
  ! Clive Page, 2005 July 9
  IMPLICIT NONE
  INTEGER, PARAMETER :: maxcol = 128
  CHARACTER :: colname(maxcol)*11, coltype(maxcol)*1
  INTEGER :: colwidth(maxcol), coldec(maxcol), coloff(maxcol)
  CHARACTER :: version*1, year*1, month*1, day*1, &
               ca*4, cwidth, cdec, string*100, flag*1
  INTEGER :: in, nrecs, ncols, icol, irec, ioffset, dataoff, cw, k
  INTEGER(kind=selected_int_kind(3)) :: lhead, lenrec ! =integer*2
  !
  OPEN(NEWUNIT=in, file="recs.dbf", status='old', ACCESS='stream')
  ! Fixed header: version, date (3 bytes), row count, header length, row length
  READ(in) version, year, month, day, nrecs, lhead, lenrec
  ! One 32-byte descriptor per column; do not overrun the arrays
  ncols = MIN((lhead - 32)/32, maxcol)
  WRITE(*, '(a,i4, a,i4, 2("-",i2.2), 3(i6,a))') &
        'Version ', ichar(version), &
        ' Date ', ichar(year)+1900, ichar(month), ichar(day), &
        nrecs, ' rows', ncols, ' columns', lenrec, ' bytes/row'
  !
  WRITE(*,*)'Col ---Name--- T Width Decimals Offset'
  ioffset = 1          ! offset within a row; byte 0 is the deleted flag
  DO icol = 1,ncols
    READ(in, POS=32*icol+1) colname(icol), coltype(icol), ca, cwidth, cdec
    ! Names are null-terminated: blank out the null and anything after it
    k = INDEX(colname(icol), char(0))
    IF (k > 0) colname(icol)(k:) = " "
    colwidth(icol) = ichar(cwidth)
    coldec(icol) = ichar(cdec)
    coloff(icol) = ioffset
    ioffset = ioffset + colwidth(icol)
    WRITE(*, '(i3,1x,a,1x,a,2i6,i8)') icol, colname(icol), &
          coltype(icol), colwidth(icol), coldec(icol), coloff(icol)
  END DO
  ! print contents of first three records (fewer if the file is shorter)
  ! The data start immediately after the header, whose length is lhead
  dataoff = lhead + 1
  DO irec = 1,MIN(3, nrecs)
    WRITE(*,'(a,i0)') 'Record ', irec
    READ(in, pos=dataoff + (irec-1)*lenrec) flag
    WRITE(*, '(2A)') 'Deleted flag = ', flag
    DO icol = 1,ncols
      ioffset = dataoff + (irec-1) * lenrec + coloff(icol)
      cw = MIN(colwidth(icol), LEN(string))   ! do not overrun the buffer
      READ(in, pos=ioffset) string(1:cw)
      WRITE(*, '(i3,1x,3a)') icol, colname(icol), '=', string(1:cw)
    END DO
  END DO
  CLOSE(in)
END PROGRAM readbf

Formatted Stream Files

Formatted stream I/O is not all that useful; I suspect it was provided to give Fortran essentially the same I/O facilities as C.  It provides an alternative way of reading or writing a formatted sequential file, i.e. a text file, but with a little extra flexibility.  Such files are opened with ACCESS="STREAM" and FORM="FORMATTED", and every READ and WRITE statement must include a format specification.  It appears to be legal to use list-directed formatting or even NAMELIST input/output on such a file, but it is hard to see a good reason for wanting to do this.


The additional power of stream I/O arises from the fact that a READ or WRITE statement can specify the position at which the transfer starts, using a POS= specifier.  Unlike the case of unformatted stream files, however, this position cannot be chosen freely (one might say randomly): it must be either 1 (meaning the start of the file) or a position previously obtained using an INQUIRE statement with the file positioned after some earlier read or write operation.  The reason for this restriction, one can guess, is that the actual number of characters in a text file is somewhat system-dependent and hard to calculate.


When reading from a formatted stream file the usual rules concerning reading beyond the end of a record apply: the default setting is PAD="YES", which means that a record will appear to be extended with an indefinite number of spaces.  When writing, the records have no defined record terminators, but the intrinsic function NEW_LINE is provided to allow a record terminator to be produced (it takes a single character argument, the value of which is not used, but which is required to specify the kind of character value in use).  The result of NEW_LINE is usually ASCII 10 (line feed) but is strictly system-dependent.  It is always a single character, so on an operating system whose line terminator is a sequence of more than one character, any translation is presumably left to the I/O library.
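The following sketch shows both points: it prints the character code of NEW_LINE on the current system, then writes two short lines and reads them back into a longer variable so that the padding is visible.  The file name newline.demo and the variable names are made up for this illustration.

```fortran
PROGRAM newline_demo
  ! Sketch only: the file name and all variable names are hypothetical.
  IMPLICIT NONE
  CHARACTER(LEN=12) :: line
  INTEGER :: u

  PRINT "(A,I0)", "Character code of NEW_LINE: ", IACHAR(NEW_LINE("x"))

  OPEN(NEWUNIT=u, FILE="newline.demo", STATUS="REPLACE", &
       ACCESS="STREAM", FORM="FORMATTED")
  ! Two lines of text, each ended explicitly with a newline character.
  WRITE(u, "(4A)") "abc", NEW_LINE("x"), "defgh", NEW_LINE("x")

  ! With the default PAD="YES" each short record is extended with blanks
  ! to fill the 12-character variable.
  READ(u, "(A)", POS=1) line
  PRINT "(3A)", "[", line, "]"
  READ(u, "(A)") line
  PRINT "(3A)", "[", line, "]"

  CLOSE(u, STATUS="DELETE")
END PROGRAM newline_demo
```

The square brackets in the output are there only to make the trailing blanks visible.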


Another difference from unformatted stream output is that (according to my reading of the Standard) whenever a WRITE statement writes to a position preceding the end of the file, it has the effect of truncating the file at that position, i.e. all subsequent data in the file are lost. 


Here is a program fragment showing formatted stream output:

OPEN(NEWUNIT=out, FILE="mystream", STATUS="REPLACE", ACCESS="STREAM", FORM="FORMATTED")
WRITE(out, "(4A)") "first line", NEW_LINE("x"), "second line", NEW_LINE("x")
INQUIRE(UNIT=out, POS=mypos)

Now the integer variable mypos contains a value which can be used in a subsequent READ or WRITE statement using POS=mypos to return to the same point in the file.
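To show a saved position being used, here is a sketch of a complete program which remembers where the second line starts and later returns there to read it.  The file name fstream.demo and the variable names are made up for this illustration; note that the only POS= values used are 1 and a value obtained from INQUIRE, as the rules above require.

```fortran
PROGRAM fstream_pos
  ! Sketch only: the file name and all variable names are hypothetical.
  IMPLICIT NONE
  CHARACTER(LEN=20) :: line
  INTEGER :: u, mypos

  OPEN(NEWUNIT=u, FILE="fstream.demo", STATUS="REPLACE", &
       ACCESS="STREAM", FORM="FORMATTED")
  WRITE(u, "(2A)") "first line", NEW_LINE("x")
  INQUIRE(UNIT=u, POS=mypos)          ! remember where the next WRITE starts
  WRITE(u, "(2A)") "second line", NEW_LINE("x")
  WRITE(u, "(2A)") "third line", NEW_LINE("x")

  READ(u, "(A)", POS=1) line          ! POS=1 is always permitted
  PRINT "(A)", TRIM(line)
  READ(u, "(A)", POS=mypos) line      ! return to the saved position
  PRINT "(A)", TRIM(line)

  CLOSE(u, STATUS="DELETE")
END PROGRAM fstream_pos
```

This should list the first line and then the second line, without the program ever having to work out how many characters the first line occupies in the file.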


Note that it is advisable to insert a final newline sequence in the file, or the last line of the file will be incomplete and may be hard to read as a piece of text.  There are other minor restrictions on the file positioning statements.  As far as I can tell from the Standard, BACKSPACE cannot be used on an unformatted stream file (as there are no records to move back over), whereas ENDFILE is permitted on a stream file and simply makes the current position the end of the file - though this seems to me to be a largely redundant statement anyway.


Portability

All currently maintained Fortran compilers, including gfortran, support stream I/O so that programs using it should have few portability issues.


Clive Page

First draft: 2005 July 9.

Revised: 2005 Oct 31, following feedback from Richard Maine and others on the comp.lang.fortran newsgroup.

Revised 2006 April 14 following feedback from James van Buskirk on portability.

Revised 2021 February 25: many updates to text, use of NEWUNIT in OPEN statements.


As far as I can tell the Intel Fortran compiler also fully supports stream access IO. -IzaakBeekman