python - Avoid reading all data into memory from HDF5 file when performing operations
I'm new to the HDF5 format, and I'm using h5py for these operations. I'm wondering what happens when I have a large data set and perform some kind of operation on it. For instance:
>>> import h5py
>>> import numpy as np
>>> f = h5py.File("mytestfile.hdf5", "w")
>>> dset = f.create_dataset("mydataset", (100000000,), dtype=np.float64)
>>> dset[...] = np.linspace(0, 100, 100000000)
>>> myresult = f["mydataset"][:] * 15  # graph myresult, or ...
Is the entirety of myresult in memory now? Is dset in memory? Is there a way to read from disk into memory part by part, 100,000 points at a time or so, to avoid overflow (let's assume I might end up with data far larger than this example, say, 50 GB worth)? Is there a way to do this efficiently, and in a modular fashion (such that saying data * 100 automatically does the operation in chunks, and stores the results again in a new place)? Thanks in advance.
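(For context, a minimal sketch of doing this manually with slicing: h5py only reads the portion of a dataset you index, so looping over slices keeps memory use bounded. The chunk size of 100,000 is just the figure from the question, and the output dataset name "myresult" is an assumption, not an established convention.)

import h5py

CHUNK = 100_000  # hypothetical chunk size; tune to available memory

with h5py.File("mytestfile.hdf5", "r+") as f:
    dset = f["mydataset"]
    # Write results to a new dataset on disk so they never all live in RAM.
    out = f.require_dataset("myresult", shape=dset.shape, dtype=dset.dtype)
    for start in range(0, dset.shape[0], CHUNK):
        stop = min(start + CHUNK, dset.shape[0])
        # Only this slice is read from disk into memory.
        out[start:stop] = dset[start:stop] * 15

(Note that dset[:] copies the whole dataset into a NumPy array, whereas slicing like dset[start:stop] reads only that range from disk.)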