Save evolution with hdf5

Jirawat S
Posts: 8
Joined: 01 Mar 2022, 14:12

Save evolution with hdf5

Post by Jirawat S »

I simulate the time evolution of an MPS using the ExpMPOEvolution method. Since I do not yet know which observables I will need, instead of computing expectation values during the evolution before updating to the next step, I would like to save the state at each step to a file, so I can open it later and compute the observables.

My code looks like this:

Code: Select all

   
self.engine = ExpMPOEvolution(self.state, self.model, self.options)

# save the state at this step
if filename is not None:
    self.write_hdf5(filename, step)

# update the state to the next time step
self.engine.run()
self.state = self.engine.psi
For saving in the HDF5 format:

Code: Select all

import h5py
from tenpy.tools import hdf5_io

def write_hdf5(self, filename: str, step: int):
    # append mode: each step gets its own subpath inside one file
    with h5py.File(filename, 'a') as f:
        saver = hdf5_io.Hdf5Saver(f)
        self.state.save_hdf5(saver, f, subpath=str(step) + '/')
However, compared to not calling the save function at all, write_hdf5 slows the computation down considerably.

Code: Select all

Unsaved
100%|██████████| 500/500 [00:02<00:00, 215.74it/s]
Saved
100%|██████████| 500/500 [00:14<00:00, 35.61it/s]
Question: Am I implementing this correctly? Or is there a better way to save with HDF5?
I'm not really familiar with HDF5.
Thanks.
Johannes
Site Admin
Posts: 428
Joined: 21 Jul 2018, 12:52
Location: TU Munich

Re: Save evolution with hdf5

Post by Johannes »

First of all, I would in general advise you to think about the extra amount of storage you need for this, and whether this is sensible, or whether it's easier to re-run the time evolution once you decide that you really need extra measurements. At least think about whether you want to save the state at *every* time step, or just at larger intervals where you actually want to measure.
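For instance, a minimal sketch of that interval idea, re-using your write_hdf5 (save_every and n_steps are hypothetical variables of your own driver loop, not TeNPy options):

Code: Select all

for step in range(n_steps):
    self.engine.run()              # evolve to the next time step
    self.state = self.engine.psi
    if step % save_every == 0:     # only save at larger intervals
        self.write_hdf5(filename, step)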

Second, if you use the tenpy simulation setup, things like this are already built in to a large extent ;-)
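To give you an idea, here is a rough sketch of the simulation framework (tenpy >= 0.8). The model, initial state and parameters below are purely illustrative assumptions - check the Simulation documentation for the exact options of your version:

Code: Select all

from tenpy.simulations.time_evolution import RealTimeEvolution

# illustrative options -- adapt model, initial state and parameters to your setup
options = {
    'model_class': 'SpinChain',
    'model_params': {'L': 16},
    'initial_state_params': {'method': 'lat_product_state',
                             'product_state': [['up'], ['down']]},
    'algorithm_class': 'ExpMPOEvolution',
    'algorithm_params': {'dt': 0.05, 'N_steps': 5},
    'final_time': 5.,
    'output_filename': 'results.h5',  # measurements get saved here
}
sim = RealTimeEvolution(options)
results = sim.run()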

Now, let's talk about speed. TeNPy's support for HDF5 is not optimized for maximum speed - it needs to recursively inspect and convert the data, and all of that is done in pure Python at the moment, giving you a bit of overhead.
Instead, the main focus and motivation of the HDF5 support was:
  1. have a data format that makes sense when you look at the data (you can open the file with an HDF5 viewer and gradually dig into the subpaths to see individual variables; there's a nice extension for JupyterLab that allows you to do that!)
  2. allow partially loading the data (see the loading sketch after this list)
  3. have the format be stable and compatible (cross-platform between different OSes and Python versions, and even with other languages like Julia / C++)
  4. not pay too much overhead in terms of storage size (tensors are saved in binary)
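For example, to use point 2 and load back only the state of a single step from the file written by your write_hdf5, a sketch assuming the subpath layout of your first post (the filename is just a placeholder):

Code: Select all

import h5py
from tenpy.networks.mps import MPS
from tenpy.tools import hdf5_io

# load only the state saved under subpath "100/", not the whole file
with h5py.File("states.h5", "r") as f:
    loader = hdf5_io.Hdf5Loader(f)
    psi_100 = MPS.from_hdf5(loader, f, subpath="100/")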
Now, if I see this correctly, your whole simulation took just 2 and 14 seconds, correct? That is too short for a fair benchmark of real-world simulations - I don't care whether a simulation runs 2 or 14 seconds, I care whether it runs 20 hours or 150 hours.
But the (relative) overhead of saving to disk will be much smaller once you crank up the bond dimension: computations scale as chi^3, while saving scales with the amount of data, i.e. as chi^2. Moreover, the pure-Python overhead only depends on the number of tensors (and the blocks within those tensors), so it is roughly constant!

If you're really worried about the HDF5 overhead, you can just use pickle instead - this might be faster, but you lose the advantages 1.-3. above.
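A minimal pickle sketch, with one (illustrative) file per step:

Code: Select all

import pickle

# dump the current state; the filename pattern is just an example
with open(f"psi_step_{step}.pkl", "wb") as f:
    pickle.dump(self.state, f, protocol=pickle.HIGHEST_PROTOCOL)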

And just for completeness: opening / closing files also takes a bit of time, and you might want to avoid that overhead. You could just open the HDF5 file in the beginning and re-use it! Watch out, though: the Hdf5Saver class assumes that the data doesn't change during saving. To get correct results, you need to either use a new Hdf5Saver at each write, or make sure you copy objects that were modified - in this particular case, a shallow copy of psi with again shallow copies of psi._B and psi._S should be enough, if I'm not mistaken. No guarantee, though - test it yourself if you really need this!
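Putting that together, a sketch of keeping the file open and using a fresh Hdf5Saver per write (again, no guarantee - test it yourself!):

Code: Select all

import h5py
from tenpy.tools import hdf5_io

f = h5py.File(filename, 'a')      # open once, re-use for all steps
for step in range(n_steps):
    self.engine.run()
    self.state = self.engine.psi
    saver = hdf5_io.Hdf5Saver(f)  # fresh saver per write, so its memo is empty
    self.state.save_hdf5(saver, f, subpath=str(step) + '/')
f.close()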