Mirror of https://github.com/fjosw/pyerrors.git (synced 2025-03-15 14:50:25 +01:00)

Merge branch 'develop' into feature/eigenvalue_smoothing

Commit 587cf6f9b0
11 changed files with 157 additions and 82 deletions
CHANGELOG.md (37 lines changed)

@@ -2,53 +2,54 @@
 All notable changes to this project will be documented in this file.

-## [2.0.0] - 2022-??-??
+## [2.0.0] - 2022-03-31

 ### Added
 - The possibility to work with Monte Carlo histories which are evenly or unevenly spaced was added.
 - `cov_Obs` added as a possibility to propagate the error of non Monte Carlo data together with Monte Carlo data.
 - `CObs` class added which can handle complex valued Markov chain Monte Carlo data and the corresponding error propagation.
 - Matrix to matrix operations like the matrix inverse now also work for complex matrices and matrices containing entries that are not `Obs` but `float` or `int`.
-- Support for a new `json.gz` file format was added
+- Support for a new `json.gz` file format was added.
 - The Corr class now has additional methods like `reverse`, `T_symmetry`, `correlate` and `reweight`.
-- `Corr.m_eff` can now cope with periodic and anti-periodic correlation functions
-- Forward, backward and improved variants of the first and second derivative were added to the `Corr` class
-- The `linalg` module now has explicit functions `inv` and `cholesky`.
+- `Corr.m_eff` can now cope with periodic and anti-periodic correlation functions.
+- Forward, backward and improved variants of the first and second derivative were added to the `Corr` class.
+- `GEVP` functionality of the `Corr` class was reworked and improved.
+- The `linalg` module now has explicit functions `inv`, `cholesky` and `det`.
 - `Obs` objects now have methods `is_zero` and `is_zero_within_error` as well as overloaded comparison operations.
-- Functions to convert Obs data to or from jackknife was added.
-- Alternative matrix multiplication routine `jack_matmul` was added to `linalg` module which makes use of the jackknife approximation and is much faster for large matrices.
+- Functions to convert `Obs` data to or from jackknife were added.
+- Alternative matrix multiplication routines `einsum` and `jack_matmul` were added to the `linalg` module which make use of the jackknife approximation and are much faster for large matrices.
 - Additional input routines for npr data added to `input.hadrons`.
 - The `sfcf` and `openQCD` input modules can now handle all recent file type versions.
-- `extract_t0` can now visualize the extraction on the fly
+- `extract_t0` can now visualize the extraction on the fly.
 - Module added which provides the Dirac gamma matrices in the Grid convention.
 - Version number added.

 ### Changed
 - The internal bookkeeping system for ensembles/replica was changed. The separator for replica is now `|`.
 - The fit functions were renamed to `least_squares` and `total_least_squares`.
-- The output of the fit functions is now a dedicated results class which keeps track of all relevant information
+- The output of the fit functions is now a dedicated results class which keeps track of all relevant information.
 - The fit functions can now deal with provided covariance matrices.
-- `covariance` can now operate on a list or array of `Obs` and returns a matrix
+- `covariance` can now operate on a list or array of `Obs` and returns a matrix. The covariance estimate by pyerrors is now always positive semi-definite (within machine precision). Various warnings and exceptions were added for cases in which estimated covariances are close to singular.
 - The convention for the fit range in the Corr class has been changed.
-- Various method of the `Corr` class were renamed
+- Various methods of the `Corr` class were renamed.
 - `Obs.print` was renamed to `Obs.details` and the output was improved.
 - The default value for `Corr.prange` is now `None`.
 - The `input` module was restructured to contain one submodule per data source.
 - Performance of `Obs.__init__` improved.

 ### Removed
-- The function `plot_corrs` was deprecated as all its functionality is now contained within `Corr.show`
-- `fits.covariance_matrix` was removed as it is now redundant with the functionality of `covariance`
-- The kwarg `bias_correction` in `derived_observable` was removed
-- Obs no longer have an attribute `e_Q`
-- Removed `fits.fit_exp`
-- Removed jackknife module
+- The function `plot_corrs` was deprecated as all its functionality is now contained within `Corr.show`.
+- `fits.covariance_matrix` was removed as it is now redundant with the functionality of `covariance`.
+- The kwarg `bias_correction` in `derived_observable` was removed.
+- `Obs` objects no longer have an attribute `e_Q`.
+- Removed `fits.fit_exp`.
+- Removed jackknife module.

 ## [1.1.0] - 2021-10-11

 ### Added
 - `Corr` class added
 - `roots` module added which can find the roots of a function that depends on Monte Carlo data via pyerrors `Obs`
 - `input/hadrons` module added which can read hdf5 files written by [Hadrons](https://github.com/aportelli/Hadrons)
-- `read_rwms` can now read reweighting factors in the format used by openQCD-2.0
+- `read_rwms` can now read reweighting factors in the format used by openQCD-2.0.

 ## [1.0.1] - 2020-11-03

 ### Fixed
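As a quick orientation for the entries above, a minimal sketch of how `cov_Obs` and the matrix-valued `covariance` fit together; the ensemble names and numerical values here are made up for illustration, not taken from the repository:

```python
import numpy as np
import pyerrors as pe

# Monte Carlo observable from synthetic samples on one ensemble
obs = pe.Obs([np.random.normal(1.02, 0.1, 500)], ['ensemble1'])
obs.gamma_method()

# non Monte Carlo datum with a plain variance, propagated alongside MC data
external = pe.cov_Obs(0.52, 0.03 ** 2, 'external_input')

product = obs * external
product.gamma_method()  # estimate the error before inspecting it
print(product.value, product.dvalue)

# covariance now acts on a list of Obs and returns a matrix
cov = pe.covariance([obs, product])
```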
@@ -794,7 +794,34 @@ class Corr:
         else:
             raise Exception("'save' has to be a string.")

         return

+    def spaghetti_plot(self, logscale=True):
+        """Produces a spaghetti plot of the correlator suited to monitor exceptional configurations.
+
+        Parameters
+        ----------
+        logscale : bool
+            Determines whether the scale of the y-axis is logarithmic or standard.
+        """
+        if self.N != 1:
+            raise Exception("Correlator needs to be projected first.")
+
+        mc_names = list(set([item for sublist in [o[0].mc_names for o in self.content if o is not None] for item in sublist]))
+        x0_vals = [n for (n, o) in zip(np.arange(self.T), self.content) if o is not None]
+
+        for name in mc_names:
+            data = np.array([o[0].deltas[name] + o[0].r_values[name] for o in self.content if o is not None]).T
+
+            fig = plt.figure()
+            ax = fig.add_subplot(111)
+            for dat in data:
+                ax.plot(x0_vals, dat, ls='-', marker='')
+
+            if logscale is True:
+                ax.set_yscale('log')
+
+            ax.set_xlabel(r'$x_0 / a$')
+            plt.title(name)
+            plt.draw()
+
     def dump(self, filename, datatype="json.gz", **kwargs):
         """Dumps the Corr into a file of chosen type
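A short usage sketch of the method added above, along the lines of the new test further down; the correlator content is synthetic:

```python
import numpy as np
import pyerrors as pe

# toy correlator with 16 timeslices on a single ensemble 't'
content = [pe.pseudo_Obs(np.exp(-0.2 * t), 0.01, 't', 100) for t in range(16)]
corr = pe.correlators.Corr(content)

corr.spaghetti_plot()                # logarithmic y-axis (the default)
corr.spaghetti_plot(logscale=False)  # linear y-axis
```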
@@ -464,8 +464,10 @@ def _standard_fit(x, y, func, silent=False, **kwargs):
         corr = covariance(y, correlation=True, **kwargs)
         covdiag = np.diag(1 / np.asarray(dy_f))
         condn = np.linalg.cond(corr)
-        if condn > 1e8:
-            warnings.warn("Correlation matrix may be ill-conditioned, condition number: %1.2e" % (condn), RuntimeWarning)
+        if condn > 0.1 / np.finfo(float).eps:
+            raise Exception(f"Cannot invert correlation matrix as its condition number exceeds machine precision ({condn:1.2e})")
+        if condn > 1 / np.sqrt(np.finfo(float).eps):
+            warnings.warn("Correlation matrix may be ill-conditioned, condition number: %1.2e" % (condn), RuntimeWarning)
         chol = np.linalg.cholesky(corr)
         chol_inv = np.linalg.inv(chol)
         chol_inv = np.dot(chol_inv, covdiag)
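For orientation, the two thresholds come out to roughly 6.7e7 and 4.5e14 in double precision; a small standalone sketch:

```python
import numpy as np

eps = np.finfo(float).eps           # ~2.22e-16 for float64
warn_threshold = 1 / np.sqrt(eps)   # ~6.7e7: warn about ill-conditioning
fail_threshold = 0.1 / eps          # ~4.5e14: refuse to invert the matrix

# nearly degenerate 2x2 correlation matrix
corr = np.array([[1.0, 1.0 - 1e-15], [1.0 - 1e-15, 1.0]])
print(np.linalg.cond(corr) > fail_threshold)  # True
```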
@@ -623,7 +625,7 @@ def qqplot(x, o_y, func, p):
     fit_stop = my_x[-1]
     samples = np.arange(fit_start, fit_stop, 0.01)
     plt.plot(samples, samples, 'k--', zorder=11, label='Standard normal distribution')
-    plt.plot(samples, probplot[1][0] * samples + probplot[1][1], zorder=10, label='Least squares fit, r=' + str(np.around(probplot[1][2], 3)))
+    plt.plot(samples, probplot[1][0] * samples + probplot[1][1], zorder=10, label='Least squares fit, r=' + str(np.around(probplot[1][2], 3)), marker='', ls='-')

     plt.xlabel('Theoretical quantiles')
     plt.ylabel('Ordered Values')
@@ -358,7 +358,7 @@ def _parse_json_dict(json_dict, verbose=True, full_output=False):
             ret = Obs([[ddi[0] + values[0] for ddi in di] for di in od['deltas']], od['names'], idl=od['idl'])
             ret.is_merged = od['is_merged']
         else:
-            ret = Obs([], [])
+            ret = Obs([], [], means=[])
             ret._value = values[0]
         for name in cd:
             co = cd[name][0]

@@ -383,7 +383,7 @@ def _parse_json_dict(json_dict, verbose=True, full_output=False):
                 ret.append(Obs([list(di[:, i] + values[i]) for di in od['deltas']], od['names'], idl=od['idl']))
                 ret[-1].is_merged = od['is_merged']
             else:
-                ret.append(Obs([], []))
+                ret.append(Obs([], [], means=[]))
                 ret[-1]._value = values[i]
             for name in cd:

@@ -410,7 +410,7 @@ def _parse_json_dict(json_dict, verbose=True, full_output=False):
                 ret.append(Obs([di[:, i] + values[i] for di in od['deltas']], od['names'], idl=od['idl']))
                 ret[-1].is_merged = od['is_merged']
             else:
-                ret.append(Obs([], []))
+                ret.append(Obs([], [], means=[]))
                 ret[-1]._value = values[i]
             for name in cd:
                 co = cd[name][i]
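The `means=[]` change matters when observables without Monte Carlo samples (pure covobs) travel through the json format; a sketch of the round trip, assuming the module's `dump_to_json`/`load_json` entry points:

```python
import pyerrors as pe

o = pe.cov_Obs(0.5, 0.01 ** 2, 'renorm')
pe.input.json.dump_to_json([o], 'example_file')  # writes example_file.json.gz
loaded = pe.input.json.load_json('example_file')
print(loaded)
```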
pyerrors/obs.py (100 lines changed)

@@ -90,51 +90,44 @@ class Obs:
         self.deltas = {}
         self._covobs = {}

         self._value = 0
         self.N = 0
         self.is_merged = {}
         self.idl = {}
-        if len(samples):
-            if idl is not None:
-                for name, idx in sorted(zip(names, idl)):
-                    if isinstance(idx, range):
-                        self.idl[name] = idx
-                    elif isinstance(idx, (list, np.ndarray)):
-                        dc = np.unique(np.diff(idx))
-                        if np.any(dc < 0):
-                            raise Exception("Unsorted idx for idl[%s]" % (name))
-                        if len(dc) == 1:
-                            self.idl[name] = range(idx[0], idx[-1] + dc[0], dc[0])
-                        else:
-                            self.idl[name] = list(idx)
-                    else:
-                        raise Exception('incompatible type for idl[%s].' % (name))
-            else:
-                for name, sample in sorted(zip(names, samples)):
-                    self.idl[name] = range(1, len(sample) + 1)
-
-            self._value = 0
-            self.N = 0
-            if kwargs.get("means") is not None:
-                for name, sample, mean in sorted(zip(names, samples, kwargs.get("means"))):
-                    self.shape[name] = len(self.idl[name])
-                    self.N += self.shape[name]
-                    self.r_values[name] = mean
-                    self.deltas[name] = sample
-            else:
-                for name, sample in sorted(zip(names, samples)):
-                    self.shape[name] = len(self.idl[name])
-                    self.N += self.shape[name]
-                    if len(sample) != self.shape[name]:
-                        raise Exception('Incompatible samples and idx for %s: %d vs. %d' % (name, len(sample), self.shape[name]))
-                    self.r_values[name] = np.mean(sample)
-                    self.deltas[name] = sample - self.r_values[name]
-                    self._value += self.shape[name] * self.r_values[name]
-                self._value /= self.N
-
-            self.is_merged = {}
-        else:
-            self._value = 0
-            self.is_merged = {}
-            self.N = 0
+        if idl is not None:
+            for name, idx in sorted(zip(names, idl)):
+                if isinstance(idx, range):
+                    self.idl[name] = idx
+                elif isinstance(idx, (list, np.ndarray)):
+                    dc = np.unique(np.diff(idx))
+                    if np.any(dc < 0):
+                        raise Exception("Unsorted idx for idl[%s]" % (name))
+                    if len(dc) == 1:
+                        self.idl[name] = range(idx[0], idx[-1] + dc[0], dc[0])
+                    else:
+                        self.idl[name] = list(idx)
+                else:
+                    raise Exception('incompatible type for idl[%s].' % (name))
+        else:
+            for name, sample in sorted(zip(names, samples)):
+                self.idl[name] = range(1, len(sample) + 1)
+
+        if kwargs.get("means") is not None:
+            for name, sample, mean in sorted(zip(names, samples, kwargs.get("means"))):
+                self.shape[name] = len(self.idl[name])
+                self.N += self.shape[name]
+                self.r_values[name] = mean
+                self.deltas[name] = sample
+        else:
+            for name, sample in sorted(zip(names, samples)):
+                self.shape[name] = len(self.idl[name])
+                self.N += self.shape[name]
+                if len(sample) != self.shape[name]:
+                    raise Exception('Incompatible samples and idx for %s: %d vs. %d' % (name, len(sample), self.shape[name]))
+                self.r_values[name] = np.mean(sample)
+                self.deltas[name] = sample - self.r_values[name]
+                self._value += self.shape[name] * self.r_values[name]
+            self._value /= self.N

         self._dvalue = 0.0
         self.ddvalue = 0.0
@@ -443,17 +436,15 @@ class Obs:
         """
         return self.is_zero() or np.abs(self.value) <= sigma * self._dvalue

-    def is_zero(self, rtol=1.e-5, atol=1.e-8):
+    def is_zero(self, atol=1e-10):
         """Checks whether the observable is zero within a given tolerance.

         Parameters
         ----------
-        rtol : float
-            Relative tolerance (for details see numpy documentation).
         atol : float
             Absolute tolerance (for details see numpy documentation).
         """
-        return np.isclose(0.0, self.value, rtol, atol) and all(np.allclose(0.0, delta, rtol, atol) for delta in self.deltas.values()) and all(np.allclose(0.0, delta.errsq(), rtol, atol) for delta in self.covobs.values())
+        return np.isclose(0.0, self.value, 1e-14, atol) and all(np.allclose(0.0, delta, 1e-14, atol) for delta in self.deltas.values()) and all(np.allclose(0.0, delta.errsq(), 1e-14, atol) for delta in self.covobs.values())

     def plot_tauint(self, save=None):
         """Plot integrated autocorrelation time for each ensemble.
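A brief illustration of the new semantics (the ensemble name is arbitrary): an exact cancellation has identically vanishing deltas, so only the absolute tolerance matters.

```python
import numpy as np
import pyerrors as pe

obs = pe.Obs([np.random.normal(1.0, 0.1, 100)], ['t'])
zero = obs - obs               # deltas cancel exactly

print(zero.is_zero())          # True with the default atol=1e-10
print(obs.is_zero())           # False, the value is of order one
```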
@@ -584,7 +575,7 @@ class Obs:
         ensemble to the error and returns a dictionary containing the fractions."""
         if not hasattr(self, 'e_dvalue'):
             raise Exception('Run the gamma method first.')
-        if self._dvalue == 0.0:
+        if np.isclose(0.0, self._dvalue, atol=1e-15):
             raise Exception('Error is 0.0')
         labels = self.e_names
         sizes = [self.e_dvalue[name] ** 2 for name in labels] / self._dvalue ** 2
@@ -731,6 +722,9 @@ class Obs:
     def __rsub__(self, y):
         return -1 * (self - y)

+    def __pos__(self):
+        return self
+
     def __neg__(self):
         return -1 * self
@@ -913,8 +907,11 @@ class CObs:
     def __abs__(self):
         return np.sqrt(self.real**2 + self.imag**2)

-    def __neg__(other):
-        return -1 * other
+    def __pos__(self):
+        return self
+
+    def __neg__(self):
+        return -1 * self

     def __eq__(self, other):
         return self.real == other.real and self.imag == other.imag
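With the added unary operators, complex observables behave like real ones under `+` and `-`; a minimal sketch with illustrative values:

```python
import pyerrors as pe

real = pe.pseudo_Obs(1.1, 0.1, 't')
imag = pe.pseudo_Obs(-0.2, 0.03, 't')
c = pe.CObs(real, imag)

assert +c == c        # __pos__ is the identity
assert -c == 0 - c    # __neg__ is consistent with subtraction
mod = abs(c)          # sqrt(real**2 + imag**2), itself an Obs
```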
@@ -1511,7 +1508,8 @@ def import_jackknife(jacks, name, idl=None):
     length = len(jacks) - 1
     prj = (np.ones((length, length)) - (length - 1) * np.identity(length))
     samples = jacks[1:] @ prj
-    new_obs = Obs([samples], [name], idl=idl)
+    mean = np.mean(samples)
+    new_obs = Obs([samples - mean], [name], idl=idl, means=[mean])
     new_obs._value = jacks[0]
     return new_obs
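Assuming the companion `Obs.export_jackknife` mentioned in the changelog, the rewritten constructor call keeps the jackknife round trip intact; a sketch with a synthetic ensemble:

```python
import numpy as np
import pyerrors as pe

obs = pe.Obs([np.random.normal(1.0, 0.1, 200)], ['ens'])
obs.gamma_method()

jacks = obs.export_jackknife()                    # value plus jackknife samples
reconstructed = pe.obs.import_jackknife(jacks, 'ens')
reconstructed.gamma_method()
print(obs.value, reconstructed.value)             # identical means
```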
@@ -1570,7 +1568,7 @@ def cov_Obs(means, cov, name, grad=None):
     co : Covobs
         Covobs to be embedded into the Obs
     """
-    o = Obs([], [])
+    o = Obs([], [], means=[])
     o._value = co.value
     o.names.append(co.name)
     o._covobs[co.name] = co
@@ -1 +1 @@
-__version__ = "2.0.0-rc.3+dev"
+__version__ = "2.1.0+dev"
setup.py (2 lines changed)

@@ -3,7 +3,7 @@
 from setuptools import setup, find_packages

 setup(name='pyerrors',
-      version='2.0.0-rc.3+dev',
+      version='2.1.0+dev',
       description='Error analysis for lattice QCD',
       author='Fabian Joswig',
       author_email='fabian.joswig@ed.ac.uk',
@@ -325,10 +325,20 @@ def test_corr_vector_operations():
     assert np.all([o == 0 for o in ((my_corr * my_vec) / my_vec) - my_corr])
     assert np.all([o == 0 for o in ((my_corr / my_vec) * my_vec) - my_corr])


-def _gen_corr(val):
+def test_spaghetti_plot():
+    corr = _gen_corr(12, 50)
+    corr += pe.pseudo_Obs(0.0, 0.1, 'another_ensemble')
+    corr += pe.cov_Obs(0.0, 0.01 ** 2, 'covobs')
+
+    corr.spaghetti_plot(True)
+    corr.spaghetti_plot(False)
+
+
+def _gen_corr(val, samples=2000):
     corr_content = []
     for t in range(16):
-        corr_content.append(pe.pseudo_Obs(val, 0.1, 't', 2000))
+        corr_content.append(pe.pseudo_Obs(val, 0.1, 't', samples))

     return pe.correlators.Corr(corr_content)
@@ -375,6 +375,12 @@ def test_fit_no_autograd():
         pe.total_least_squares(oy, oy, func)


+def test_singular_correlated_fit():
+    obs1 = pe.pseudo_Obs(1.0, 0.1, 'test')
+    with pytest.raises(Exception):
+        pe.fits.fit_lin([0, 1], [obs1, obs1], correlated_fit=True)
+
+
 def test_ks_test():
     def f(a, x):
         y = a[0] + a[1] * x
@@ -314,6 +314,9 @@ def test_matrix_functions():
     # Check determinant
     assert pe.linalg.det(np.diag(np.diag(matrix))) == np.prod(np.diag(matrix))

+    with pytest.raises(Exception):
+        pe.linalg.det(5)
+
     pe.linalg.pinv(matrix[:, :3])
@@ -347,3 +350,10 @@ def test_complex_matrix_operations():
         diff = ta * tb - 1
         for (i, j), entry in np.ndenumerate(diff):
             assert entry.is_zero()
+
+
+def test_complex_matrix_real_entries():
+    my_mat = get_complex_matrix(4)
+    my_mat[0, 1] = 4
+    my_mat[2, 0] = pe.Obs([np.random.normal(1.0, 0.1, 100)], ['t'])
+    assert np.all((my_mat @ pe.linalg.inv(my_mat) - np.identity(4)) == 0)
@@ -51,6 +51,12 @@ def test_Obs_exceptions():
     my_obs.gamma_method()
     my_obs.details()

+    obs = pe.Obs([np.random.normal(1.0, 0.1, 100)], ['t'])
+    one = obs / obs
+    one.gamma_method()
+    with pytest.raises(Exception):
+        one.plot_piechart()
+
 def test_dump():
     value = np.random.normal(5, 10)
     dvalue = np.abs(np.random.normal(0, 1))

@@ -86,6 +92,8 @@ def test_comparison():
     assert test_obs2 != value2
     assert test_obs1 != test_obs2
     assert test_obs2 != test_obs1
+    assert +test_obs1 == test_obs1
+    assert -test_obs1 == 0 - test_obs1


 def test_function_overloading():

@@ -142,7 +150,6 @@ def test_overloading_vectorization():

 def test_gamma_method_standard_data():
     for data in [np.tile([1, -1], 1000),
                  np.random.rand(100001),
                  np.zeros(1195),
                  np.sin(np.sqrt(2) * np.pi * np.arange(1812))]:
         test_obs = pe.Obs([data], ['t'])

@@ -368,6 +375,8 @@ def test_cobs():
     obs2 = pe.pseudo_Obs(-0.2, 0.03, 't')

     my_cobs = pe.CObs(obs1, obs2)
+    assert +my_cobs == my_cobs
+    assert -my_cobs == 0 - my_cobs
     my_cobs == my_cobs
     str(my_cobs)
     repr(my_cobs)

@@ -707,7 +716,7 @@ def test_covariance_rank_deficient():

 def test_empty_obs():
     o = pe.Obs([np.random.rand(100)], ['test'])
-    q = o + pe.Obs([], [])
+    q = o + pe.Obs([], [], means=[])
     assert q == o


@@ -759,3 +768,15 @@ def test_merge_idx():
     assert(new_idx[-1] > new_idx[0])
     for i in range(1, len(new_idx)):
         assert(new_idx[i - 1] < new_idx[i])
+
+
+def test_cobs_array():
+    cobs = pe.Obs([np.random.normal(1.0, 0.1, 100)], ['t']) * (1 + 2j)
+    np.identity(4) + cobs
+    cobs + np.identity(4)
+    np.identity(4) - cobs
+    cobs - np.identity(4)
+    np.identity(4) * cobs
+    cobs * np.identity(4)
+    np.identity(4) / cobs
+    cobs / np.ones((4, 4))