My typical use of Fortran begins with reading in a file of unknown size (usually 5-100 MB). My current approach to array allocation involves reading the file twice: first to determine the size of the problem (so I can allocate the arrays), and a second time to read the data into those arrays.
Are there better approaches to size determination/array allocation? I read about automatic array allocation (example below) in another post, and it seemed much easier.
array = [array,new_data]
What are the options, and what are their pros and cons?
I'll bite, though the question is teetering close to off-topicality. Your options are:
1. Read the file once to get the array size, allocate, then read it again.
2. Read piece-by-piece, (re-)allocating as you go. Choose the size of the piece to read as you wish (or, perhaps, as you think is likely to be speediest for your case).
3. Always, always, work with files that contain metadata telling an interested program how much data there is; for example, a block header line telling you how many data elements are in the next block.
Option 3 is the best by far. A little thought, and one whole line of code, at the beginning of a project, and a lot of wasted time and effort is saved down the line. You don't have to jump on HDF5 or a similar heavyweight file design method; just adopt enough discipline to last the useful life of the contents of the file. For iteration-by-iteration dumps from your simulation of the universe, a home-brewed approach will do (be honest, you're the only person who's ever going to look at them). For data gathered at an approximate cost of $1M per TB (satellite observations, offshore seismic traces, etc.), use HDF5 or similar.
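As a rough sketch of what that one line buys you, assume a home-brewed text format whose first line holds the number of values that follow. The format, the module name `header_read` and the routine name `read_with_header` are all made up for illustration, and `newunit=` needs a Fortran 2008 compiler.

```fortran
module header_read
  implicit none
contains
  ! Read a file whose first line holds the number of values that follow.
  ! This layout (count line, then values) is an assumed home-brewed format.
  subroutine read_with_header(filename, values)
    character(len=*), intent(in) :: filename
    real, allocatable, intent(out) :: values(:)
    integer :: unit, n

    open(newunit=unit, file=filename, status='old', action='read')
    read(unit, *) n            ! the header: how many values follow
    allocate(values(n))        ! size is known before any data is read
    read(unit, *) values       ! list-directed read fills the whole array
    close(unit)
  end subroutine read_with_header
end module header_read
```

The file is read once, and there is no counting loop and no reallocation.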
Option 1 is fine too. It's not as if you have to wait for tapes to rewind between reads any more. (Well, some people do, but they're in a niche these days, and a de-archiving system will usually move files from tape to disk if they're going to be used.)
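For reference, a minimal sketch of the two-pass approach, assuming a plain text file with one real value per line. The names `two_pass_read` and `read_two_pass` are hypothetical.

```fortran
module two_pass_read
  implicit none
contains
  ! Two-pass read: count the records, rewind, allocate, then read the data.
  ! Assumes one real value per line.
  subroutine read_two_pass(filename, values)
    character(len=*), intent(in) :: filename
    real, allocatable, intent(out) :: values(:)
    integer :: unit, n, ios
    real :: dummy

    open(newunit=unit, file=filename, status='old', action='read')

    ! First pass: count records until a read fails (normally end of file).
    n = 0
    do
      read(unit, *, iostat=ios) dummy
      if (ios /= 0) exit
      n = n + 1
    end do

    ! Second pass: go back to the start and read into the sized array.
    rewind(unit)
    allocate(values(n))
    if (n > 0) read(unit, *) values
    close(unit)
  end subroutine read_two_pass
end module two_pass_read
```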
Option 2 is a faff. It may be the worst performing on the largest files, but the worst performance may be within a nano-century of the best. If that's important to you, check it out.
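One way option 2 might look is sketched below, again assuming one real value per line. Instead of appending a single element at a time with `array = [array,new_data]`, which copies the whole array on every append, this version doubles the capacity when the buffer fills and trims at the end. The initial size of 1024 is an arbitrary guess, and the names `grow_read` and `read_growing` are made up.

```fortran
module grow_read
  implicit none
contains
  ! Single-pass read that grows the array as it goes.
  ! Assumes one real value per line.
  subroutine read_growing(filename, values)
    character(len=*), intent(in) :: filename
    real, allocatable, intent(out) :: values(:)
    real, allocatable :: tmp(:)
    integer :: unit, n, ios
    real :: x

    open(newunit=unit, file=filename, status='old', action='read')
    allocate(values(1024))      ! initial capacity: an arbitrary guess
    n = 0
    do
      read(unit, *, iostat=ios) x
      if (ios /= 0) exit
      if (n == size(values)) then
        ! Buffer full: double the capacity and move the data across.
        allocate(tmp(2*size(values)))
        tmp(1:n) = values
        call move_alloc(tmp, values)   ! tmp is left deallocated
      end if
      n = n + 1
      values(n) = x
    end do
    close(unit)

    ! Trim the buffer to the number of values actually read.
    allocate(tmp(n))
    tmp = values(1:n)
    call move_alloc(tmp, values)
  end subroutine read_growing
end module grow_read
```

Whether this beats reading the file twice depends on the file and the hardware, so time it before committing to it.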
If you want quantification of my opinions, run your own experiments on your files on your hardware.
PS: I haven't got a clue how much 1 TB of satellite or seismic data costs; it's a factoid I invented to support my argument.