Merge branch 'fjosw:develop' into develop

JanNeuendorf 2022-01-18 13:21:01 +01:00 committed by GitHub
commit 39f176585e
22 changed files with 1050 additions and 1316 deletions


@@ -11,11 +11,17 @@ on:
jobs:
pytest:
runs-on: ubuntu-latest
runs-on: ${{ matrix.os }}
strategy:
fail-fast: true
matrix:
python-version: ["3.6", "3.7", "3.8", "3.9", "3.10"]
os: [ubuntu-latest]
python-version: ["3.7", "3.8", "3.9", "3.10"]
include:
- os: macos-latest
python-version: 3.9
- os: windows-latest
python-version: 3.9
steps:
- name: Checkout source


@@ -1,9 +1,9 @@
[![flake8](https://github.com/fjosw/pyerrors/actions/workflows/flake8.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/flake8.yml) [![pytest](https://github.com/fjosw/pyerrors/actions/workflows/pytest.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/pytest.yml) [![docs](https://github.com/fjosw/pyerrors/actions/workflows/docs.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/docs.yml) [![](https://img.shields.io/badge/python-3.6+-blue.svg)](https://www.python.org/downloads/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![flake8](https://github.com/fjosw/pyerrors/actions/workflows/flake8.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/flake8.yml) [![pytest](https://github.com/fjosw/pyerrors/actions/workflows/pytest.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/pytest.yml) [![docs](https://github.com/fjosw/pyerrors/actions/workflows/docs.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/docs.yml) [![](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
# pyerrors
`pyerrors` is a python package for error computation and propagation of Markov chain Monte Carlo data.
- **Documentation:** https://fjosw.github.io/pyerrors/pyerrors.html
- **Examples**: https://github.com/fjosw/pyerrors/tree/develop/examples (Do not work properly at the moment)
- **Examples**: https://github.com/fjosw/pyerrors/tree/develop/examples
- **Contributing:** https://github.com/fjosw/pyerrors/blob/develop/CONTRIBUTING.md
- **Bug reports:** https://github.com/fjosw/pyerrors/issues

4 file diffs suppressed because one or more lines are too long.


@@ -51,7 +51,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The standard matrix product can be performed with @"
"The standard matrix product can be performed with `@`"
]
},
{
@@ -101,13 +101,38 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Mathematical functions work elementwise"
"For large matrices overloading the standard operator `@` can become inefficient as pyerrors has to perform a large number of elementary opeations. For these situations pyerrors provides the function `linalg.matmul` which optimizes the required automatic differentiation. The function can take an arbitray number of operands."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[Obs[78.12099999999998] Obs[-22.909999999999997]]\n",
" [Obs[-22.909999999999997] Obs[7.1]]]\n"
]
}
],
"source": [
"print(pe.linalg.matmul(matrix, matrix, matrix)) # Equivalent to matrix @ matrix @ matrix but faster for large matrices"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Mathematical functions work elementwise"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -126,12 +126,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"For a vector of `Obs`, we again use np.asarray to end up with the correct object"
"For a vector of `Obs`, we again use `np.asarray` to end up with the correct object"
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 7,
"metadata": {},
"outputs": [
{
@@ -160,14 +160,14 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[Obs[7.2(1.7)] Obs[-1.00(45)]]\n"
"[Obs[7.2(1.7)] Obs[-1.00(46)]]\n"
]
}
],
@@ -182,40 +182,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Matrix to scalar operations\n",
"If we want to apply a numpy matrix function with a scalar return value we can use `scalar_mat_op`. __Here we need to use the autograd wrapped version of numpy__ (imported as anp) to use automatic differentiation."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"det \t Obs[3.10(28)]\n",
"trace \t Obs[5.10(20)]\n",
"norm \t Obs[4.45(19)]\n"
]
}
],
"source": [
"import autograd.numpy as anp # Thinly-wrapped numpy\n",
"funcs = [anp.linalg.det, anp.trace, anp.linalg.norm]\n",
"\n",
"for i, func in enumerate(funcs):\n",
" res = pe.linalg.scalar_mat_op(func, matrix)\n",
" res.gamma_method()\n",
" print(func.__name__, '\\t', res)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For matrix operations which are not supported by autograd we can use numerical differentiation"
"`pyerrors` provides the user with wrappers to the `numpy.linalg` functions which work on `Obs` valued matrices. We can for example calculate the determinant of the matrix via"
]
},
{
@@ -227,26 +227,21 @@
"name": "stdout",
"output_type": "stream",
"text": [
"cond \t Obs[6.23(58)]\n",
"expm_cond \t Obs[4.45(19)]\n"
"3.10(28)\n"
]
}
],
"source": [
"funcs = [np.linalg.cond, scipy.linalg.expm_cond]\n",
"\n",
"for i, func in enumerate(funcs):\n",
" res = pe.linalg.scalar_mat_op(func, matrix, num_grad=True)\n",
" res.gamma_method()\n",
" print(func.__name__, ' \\t', res)"
"det = pe.linalg.det(matrix)\n",
"det.gamma_method()\n",
"print(det)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Matrix to matrix operations\n",
"For matrix operations with a matrix as return value we can use another wrapper `mat_mat_op`. Take as an example the cholesky decompostion. __Here we need to use the autograd wrapped version of numpy__ (imported as anp) to use automatic differentiation."
"The cholesky decomposition can be obtained as follows"
]
},
{
@@ -259,12 +259,12 @@
"output_type": "stream",
"text": [
"[[Obs[2.025(49)] Obs[0.0]]\n",
" [Obs[-0.494(51)] Obs[0.870(29)]]]\n"
" [Obs[-0.494(50)] Obs[0.870(29)]]]\n"
]
}
],
"source": [
"cholesky = pe.linalg.mat_mat_op(anp.linalg.cholesky, matrix)\n",
"cholesky = pe.linalg.cholesky(matrix)\n",
"for (i, j), entry in np.ndenumerate(cholesky):\n",
" entry.gamma_method()\n",
"print(cholesky)"
@@ -321,7 +321,7 @@
}
],
"source": [
"inv = pe.linalg.mat_mat_op(anp.linalg.inv, cholesky)\n",
"inv = pe.linalg.inv(cholesky)\n",
"for (i, j), entry in np.ndenumerate(inv):\n",
" entry.gamma_method()\n",
"print(inv)\n",
@@ -334,59 +334,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Matrix to matrix operations which are not supported by autograd can also be computed with numeric differentiation"
"## Eigenvalues and eigenvectors\n",
"We can also compute eigenvalues and eigenvectors of symmetric matrices with a special wrapper `eigh`"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"orth\n",
"[[Obs[-0.9592(76)] Obs[0.283(26)]]\n",
" [Obs[0.283(26)] Obs[0.9592(76)]]]\n",
"expm\n",
"[[Obs[75(15)] Obs[-21.4(4.1)]]\n",
" [Obs[-21.4(4.1)] Obs[8.3(1.4)]]]\n",
"logm\n",
"[[Obs[1.334(57)] Obs[-0.496(61)]]\n",
" [Obs[-0.496(61)] Obs[-0.203(50)]]]\n",
"sinhm\n",
"[[Obs[37.3(7.4)] Obs[-10.8(2.1)]]\n",
" [Obs[-10.8(2.1)] Obs[3.94(68)]]]\n",
"sqrtm\n",
"[[Obs[1.996(51)] Obs[-0.341(37)]]\n",
" [Obs[-0.341(37)] Obs[0.940(14)]]]\n"
]
}
],
"source": [
"funcs = [scipy.linalg.orth, scipy.linalg.expm, scipy.linalg.logm, scipy.linalg.sinhm, scipy.linalg.sqrtm]\n",
"\n",
"for i,func in enumerate(funcs):\n",
" res = pe.linalg.mat_mat_op(func, matrix, num_grad=True)\n",
" for (i, j), entry in np.ndenumerate(res):\n",
" entry.gamma_method()\n",
" print(func.__name__)\n",
" print(res)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Eigenvalues and eigenvectors\n",
"We can also compute eigenvalues and eigenvectors of symmetric matrices with a special wrapper `eigh`"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -395,8 +395,8 @@
"Eigenvalues:\n",
"[Obs[0.705(57)] Obs[4.39(19)]]\n",
"Eigenvectors:\n",
"[[Obs[-0.283(26)] Obs[-0.9592(76)]]\n",
" [Obs[-0.9592(76)] Obs[0.283(26)]]]\n"
"[[Obs[-0.283(26)] Obs[-0.9592(75)]]\n",
" [Obs[-0.9592(75)] Obs[0.283(26)]]]\n"
]
}
],
@@ -421,7 +421,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 14,
"metadata": {},
"outputs": [
{
@@ -451,7 +451,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -465,7 +465,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
"version": "3.8.10"
}
},
"nbformat": 4,

4 binary files changed (not shown)

examples/data/f_A.json.gz Normal file (binary, not shown)

examples/data/f_P.json.gz Normal file (binary, not shown)


@@ -193,6 +193,56 @@ Make sure to check the autocorrelation time with e.g. `pyerrors.obs.Obs.plot_rho
For the full API see `pyerrors.obs.Obs`.
# Correlators
When one is not interested in single observables but correlation functions, `pyerrors` offers the `Corr` class which simplifies the corresponding error propagation and provides the user with a set of standard methods. In order to initialize a `Corr` object one needs to arrange the data as a list of `Obs`
```python
my_corr = pe.Corr([obs_0, obs_1, obs_2, obs_3])
print(my_corr)
> x0/a Corr(x0/a)
> ------------------
> 0 0.7957(80)
> 1 0.5156(51)
> 2 0.3227(33)
> 3 0.2041(21)
```
In case the correlation functions are not defined on the outermost timeslices, for example because of fixed boundary conditions, a padding can be introduced.
```python
my_corr = pe.Corr([obs_0, obs_1, obs_2, obs_3], padding_front=1, padding_back=1)
print(my_corr)
> x0/a Corr(x0/a)
> ------------------
> 0
> 1 0.7957(80)
> 2 0.5156(51)
> 3 0.3227(33)
> 4 0.2041(21)
> 5
```
The individual entries of a correlator can be accessed via slicing
```python
print(my_corr[3])
> 0.3227(33)
```
Error propagation with the `Corr` class works very similarly to that of `Obs` objects. Mathematical operations are overloaded and `Corr` objects can be combined with other `Corr` objects, `Obs` objects or real numbers.
```python
my_new_corr = 0.3 * my_corr[2] * my_corr * my_corr + 12 / my_corr
```
`pyerrors` provides the user with a set of regularly used methods for the manipulation of correlator objects:
- `Corr.gamma_method` applies the gamma method to all entries of the correlator.
- `Corr.m_eff` to construct effective masses. Various variants for periodic and fixed temporal boundary conditions are available.
- `Corr.deriv` returns the first derivative of the correlator as `Corr`. Different discretizations of the numerical derivative are available.
- `Corr.second_deriv` returns the second derivative of the correlator as `Corr`. Different discretizations of the numerical derivative are available.
- `Corr.symmetric` symmetrizes parity-even correlation functions, assuming periodic boundary conditions.
- `Corr.anti_symmetric` anti-symmetrizes parity-odd correlation functions, assuming periodic boundary conditions.
- `Corr.T_symmetry` averages a correlator with its time symmetry partner, assuming fixed boundary conditions.
- `Corr.plateau` extracts a plateau value from the correlator in a given range.
- `Corr.roll` periodically shifts the correlator.
- `Corr.reverse` reverses the time ordering of the correlator.
- `Corr.correlate` constructs a disconnected correlation function from the correlator and another `Corr` or `Obs` object.
- `Corr.reweight` reweights the correlator.
`pyerrors` can also handle matrices of correlation functions and extract energy states from these matrices via a generalized eigenvalue problem (see `pyerrors.correlators.Corr.GEVP`).
For the full API see `pyerrors.correlators.Corr`.
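These methods compose naturally. A minimal sketch, assuming the four-timeslice `my_corr` from above and an invented plateau range:
```python
my_corr.gamma_method()          # error estimation for every timeslice
m_eff = my_corr.m_eff()         # effective mass as a new Corr
m_eff.gamma_method()
mass = m_eff.plateau([1, 3])    # plateau average over a hypothetical range
mass.gamma_method()
print(mass)
```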
# Complex observables


@@ -74,7 +74,7 @@ class Corr:
@property
def reweighted(self):
bool_array = np.array([list(map(lambda x: x.reweighted, o)) for o in list(filter(None.__ne__, self.content))])
bool_array = np.array([list(map(lambda x: x.reweighted, o)) for o in [x for x in self.content if x is not None]])
if np.all(bool_array == 1):
return True
elif np.all(bool_array == 0):
@@ -614,7 +614,7 @@ class Corr:
if self.N != 1:
raise Exception("Correlator must be projected before plotting")
if x_range is None:
x_range = [0, self.T]
x_range = [0, self.T - 1]
fig = plt.figure()
ax1 = fig.add_subplot(111)


@@ -1,3 +1,4 @@
import gc
from collections.abc import Sequence
import warnings
import numpy as np
@@ -7,6 +7,8 @@ import scipy.stats
import matplotlib.pyplot as plt
from matplotlib import gridspec
from scipy.odr import ODR, Model, RealData
from scipy.stats import chi2
import iminuit
from autograd import jacobian
from autograd import elementwise_grad as egrad
@@ -45,6 +45,8 @@ class Fit_result(Sequence):
my_str += 'residual variance = ' + f'{self.residual_variance:2.6f}' + '\n'
if hasattr(self, 'chisquare_by_expected_chisquare'):
my_str += '\u03C7\u00b2/\u03C7\u00b2exp = ' + f'{self.chisquare_by_expected_chisquare:2.6f}' + '\n'
if hasattr(self, 'p_value'):
my_str += 'p-value = ' + f'{self.p_value:2.4f}' + '\n'
my_str += 'Fit parameters:\n'
for i_par, par in enumerate(self.fit_parameters):
my_str += str(i_par) + '\t' + ' ' * int(par >= 0) + str(par).rjust(int(par < 0.0)) + '\n'
@@ -306,6 +306,7 @@ def total_least_squares(x, y, func, silent=False, **kwargs):
output.odr_chisquare = odr_chisquare(np.concatenate((out.beta, out.xplus.ravel())))
output.dof = x.shape[-1] - n_parms
output.p_value = 1 - chi2.cdf(output.odr_chisquare, output.dof)
return output
@@ -619,6 +624,7 @@ def _standard_fit(x, y, func, silent=False, **kwargs):
output.chisquare = chisqfunc(fit_result.x)
output.dof = x.shape[-1] - n_parms
output.p_value = 1 - chi2.cdf(output.chisquare, output.dof)
if kwargs.get('resplot') is True:
residual_plot(x, y, func, result)
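For orientation, a sketch of how the new `p_value` attribute surfaces to the user; the data, the ensemble label and the call to `least_squares` are illustrative assumptions, not part of this diff:
```python
import numpy as np
import pyerrors as pe

x = np.arange(1, 8)
y = [pe.pseudo_Obs(0.5 * xi + 1.0 + np.random.normal(0.0, 0.1), 0.1, 'ens') for xi in x]
[o.gamma_method() for o in y]

res = pe.fits.least_squares(x, y, lambda a, x: a[0] * x + a[1])
print(res.p_value)  # 1 - chi2.cdf(chisquare, dof), as assigned above
```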
@@ -703,7 +709,7 @@ def residual_plot(x, y, func, fit_res):
ax1.plot(x, residuals, 'ko', ls='none', markersize=5)
ax1.tick_params(direction='out')
ax1.tick_params(axis="x", bottom=True, top=True, labelbottom=True)
ax1.axhline(y=0.0, ls='--', color='k')
ax1.axhline(y=0.0, ls='--', color='k', marker=" ")
ax1.fill_between(x_samples, -1.0, 1.0, alpha=0.1, facecolor='k')
ax1.set_xlim([xstart, xstop])
ax1.set_ylabel('Residuals')
@@ -740,3 +746,43 @@ def error_band(x, func, beta):
err = np.array(err)
return err
def ks_test(objects=None):
"""Performs a KolmogorovSmirnov test for the p-values of all fit object.
Parameters
----------
objects : list
List of fit results to include in the analysis (optional).
"""
if objects is None:
obs_list = []
for obj in gc.get_objects():
if isinstance(obj, Fit_result):
obs_list.append(obj)
else:
obs_list = objects
p_values = [o.p_value for o in obs_list]
bins = len(p_values)
x = np.arange(0, 1.001, 0.001)
plt.plot(x, x, 'k', zorder=1)
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.xlabel('p-value')
plt.ylabel('Cumulative probability')
plt.title(str(bins) + ' p-values')
n = np.arange(1, bins + 1) / np.float64(bins)
Xs = np.sort(p_values)
plt.step(Xs, n)
diffs = n - Xs
loc_max_diff = np.argmax(np.abs(diffs))
loc = Xs[loc_max_diff]
plt.annotate('', xy=(loc, loc), xytext=(loc, loc + diffs[loc_max_diff]), arrowprops=dict(arrowstyle='<->', shrinkA=0, shrinkB=0))
plt.draw()
print(scipy.stats.kstest(p_values, 'uniform'))
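A usage sketch for the new `ks_test`, modelled on the test added at the bottom of this commit; the fit function and ensemble label are placeholders:
```python
import numpy as np
import pyerrors as pe

def lin(a, x):
    return a[0] + a[1] * x

fit_res = []
for _ in range(20):
    data = [pe.pseudo_Obs(j + np.random.normal(0.0, 0.25), 0.25, 'test') for j in range(10)]
    fit_res.append(pe.Corr(data).fit(lin, silent=True))

pe.fits.ks_test(fit_res)  # plots the empirical CDF of the p-values and prints the KS result
```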


@@ -58,16 +58,13 @@ def read_meson_hd5(path, filestem, ens_id, meson='meson_0', tree='meson', idl=No
meson : str
label of the meson to be extracted, standard value meson_0 which
corresponds to the pseudoscalar pseudoscalar two-point function.
tree : str
Label of the upmost directory in the hdf5 file, default 'meson'
for outputs of the Meson module. Can be altered to read input
from other modules with similar structures.
idl : range
If specified only configurations in the given range are read in.
"""
files, idx = _get_files(path, filestem, idl)
tree = meson.rsplit('_')[0]
corr_data = []
infos = []
for hd5_file in files:


@@ -1,6 +1,3 @@
#!/usr/bin/env python
# coding: utf-8
import os
import fnmatch
import re
@@ -39,8 +36,9 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
if not ls:
raise Exception('Error, directory not found')
# Exclude files with different names
if 'files' in kwargs:
ls = kwargs.get('files')
else:
for exc in ls:
if not fnmatch.fnmatch(exc, prefix + '*' + postfix + '.dat'):
ls = list(set(ls) - set([exc]))
@@ -64,9 +62,9 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
else:
r_stop = [None] * replica
print('Read reweighting factors from', prefix[:-1], ',', replica, 'replica', end='')
print('Read reweighting factors from', prefix[:-1], ',',
replica, 'replica', end='')
# Adjust replica names to new bookmarking system
if names is None:
rep_names = []
for entry in ls:
@@ -85,7 +83,6 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
tmp_array = []
with open(path + '/' + ls[rep], 'rb') as fp:
# header
t = fp.read(4) # number of reweighting factors
if rep == 0:
nrw = struct.unpack('i', t)[0]
@@ -94,7 +91,7 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
for k in range(nrw):
deltas.append([])
else:
if ((nrw != struct.unpack('i', t)[0] and (not version == '2.0')) or (nrw != struct.unpack('i', t)[0] / 2 and version == '2.0')): # little weird if-clause due to the /2 operation needed.
if ((nrw != struct.unpack('i', t)[0] and (not version == '2.0')) or (nrw != struct.unpack('i', t)[0] / 2 and version == '2.0')):
raise Exception('Error: different number of reweighting factors for replicum', rep)
for k in range(nrw):
@@ -106,7 +103,6 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
for i in range(nrw):
t = fp.read(4)
nfct.append(struct.unpack('i', t)[0])
# print('nfct: ', nfct) # Hasenbusch factor, 1 for rat reweighting
else:
for i in range(nrw):
nfct.append(1)
@@ -119,7 +115,6 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
if not struct.unpack('i', fp.read(4))[0] == 0:
print('something is wrong!')
# body
while True:
t = fp.read(4)
if len(t) < 4:
@@ -135,8 +130,11 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
for j in range(tmpd['n'][0]):
tmp_nfct *= np.mean(np.exp(-np.asarray(tmp_rw[j])))
if print_err:
print(config_no, i, j, np.mean(np.exp(-np.asarray(tmp_rw[j]))), np.std(np.exp(-np.asarray(tmp_rw[j]))))
print('Sources:', np.exp(-np.asarray(tmp_rw[j])))
print(config_no, i, j,
np.mean(np.exp(-np.asarray(tmp_rw[j]))),
np.std(np.exp(-np.asarray(tmp_rw[j]))))
print('Sources:',
np.exp(-np.asarray(tmp_rw[j])))
print('Partial factor:', tmp_nfct)
elif version == '1.6' or version == '1.4':
tmp_nfct = 1.0
@@ -146,7 +144,9 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
tmp_rw = struct.unpack('d' * nsrc[i], t)
tmp_nfct *= np.mean(np.exp(-np.asarray(tmp_rw)))
if print_err:
print(config_no, i, j, np.mean(np.exp(-np.asarray(tmp_rw))), np.std(np.exp(-np.asarray(tmp_rw))))
print(config_no, i, j,
np.mean(np.exp(-np.asarray(tmp_rw))),
np.std(np.exp(-np.asarray(tmp_rw))))
print('Sources:', np.exp(-np.asarray(tmp_rw)))
print('Partial factor:', tmp_nfct)
tmp_array[i].append(tmp_nfct)
@@ -165,11 +165,14 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
return result
def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwargs):
def extract_t0(path, prefix, dtr_read, xmin,
spatial_extent, fit_range=5, **kwargs):
"""Extract t0 from given .ms.dat files. Returns t0 as Obs.
It is assumed that all boundary effects have sufficiently decayed at x0=xmin.
The data around the zero crossing of t^2<E> - 0.3 is fitted with a linear function
It is assumed that all boundary effects have
sufficiently decayed at x0=xmin.
The data around the zero crossing of t^2<E> - 0.3
is fitted with a linear function
from which the exact root is extracted.
Only works with openQCD v 1.2.
@@ -180,14 +183,17 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
prefix : str
Ensemble prefix
dtr_read : int
Determines how many trajectories should be skipped when reading the ms.dat files.
Determines how many trajectories should be skipped
when reading the ms.dat files.
Corresponds to dtr_cnfg / dtr_ms in the openQCD input file.
xmin : int
First timeslice where the boundary effects have sufficiently decayed.
First timeslice where the boundary
effects have sufficiently decayed.
spatial_extent : int
spatial extent of the lattice, required for normalization.
fit_range : int
Number of data points left and right of the zero crossing to be included in the linear fit. (Default: 5)
Number of data points left and right of the zero
crossing to be included in the linear fit. (Default: 5)
r_start : list
list which contains the first config to be read for each replicum.
r_stop: list
@@ -204,7 +210,6 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
if not ls:
raise Exception('Error, directory not found')
# Exclude files with different names
for exc in ls:
if not fnmatch.fnmatch(exc, prefix + '*.ms.dat'):
ls = list(set(ls) - set([exc]))
@@ -216,7 +221,6 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
r_start = kwargs.get('r_start')
if len(r_start) != replica:
raise Exception('r_start does not match number of replicas')
# Adjust Configuration numbering to python index
r_start = [o - 1 if o else None for o in r_start]
else:
r_start = [None] * replica
@@ -235,7 +239,6 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
for rep in range(replica):
with open(path + '/' + ls[rep], 'rb') as fp:
# Read header
t = fp.read(12)
header = struct.unpack('iii', t)
if rep == 0:
@@ -254,7 +257,6 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
Ysl = []
# Read body
while True:
t = fp.read(4)
if(len(t) < 4):
@@ -273,7 +275,9 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
Ysum.append([])
for i, item in enumerate(Ysl):
Ysum[-1].append([np.mean(item[current + xmin:current + tmax - xmin]) for current in range(0, len(item), tmax)])
Ysum[-1].append([np.mean(item[current + xmin:
current + tmax - xmin])
for current in range(0, len(item), tmax)])
t2E_dict = {}
for n in range(nn + 1):
@@ -286,10 +290,13 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
new_obs = Obs(samples, [(w.split('.'))[0] for w in ls])
t2E_dict[n * dn * eps] = (n * dn * eps) ** 2 * new_obs / (spatial_extent ** 3) - 0.3
zero_crossing = np.argmax(np.array([o.value for o in t2E_dict.values()]) > 0.0)
zero_crossing = np.argmax(np.array(
[o.value for o in t2E_dict.values()]) > 0.0)
x = list(t2E_dict.keys())[zero_crossing - fit_range: zero_crossing + fit_range]
y = list(t2E_dict.values())[zero_crossing - fit_range: zero_crossing + fit_range]
x = list(t2E_dict.keys())[zero_crossing - fit_range:
zero_crossing + fit_range]
y = list(t2E_dict.values())[zero_crossing - fit_range:
zero_crossing + fit_range]
[o.gamma_method() for o in y]
fit_result = fit_lin(x, y)
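Putting the hunks above together, a hedged call sketch for `extract_t0`; path, prefix and numbers are invented, only the signature is taken from the code:
```python
from pyerrors.input.openQCD import extract_t0

# hypothetical ensemble: every 2nd trajectory measured (dtr_read=2),
# boundary effects assumed decayed for x0 >= 5, spatial extent 32
t0 = extract_t0('/data/ensembleA', 'A1', dtr_read=2, xmin=5, spatial_extent=32)
t0.gamma_method()
print(t0)
```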
@@ -313,12 +320,6 @@ def _parse_array_openQCD2(d, n, size, wa, quadrupel=False):
return arr
# mimic the read_array routine of openQCD-2.0.
# fp is the opened file handle
# returns the dict array
# at this point we only parse a 2d array
# d = 2
# n = [nfct[irw], 2*nsrc[irw]]
def _read_array_openQCD2(fp):
t = fp.read(4)
d = struct.unpack('i', t)[0]
@@ -343,3 +344,232 @@ def _read_array_openQCD2(fp):
arr = _parse_array_openQCD2(d, n, size, tmp, quadrupel=True)
return {'d': d, 'n': n, 'size': size, 'arr': arr}
def read_qtop(path, prefix, c, dtr_cnfg=1, version="1.2", **kwargs):
"""Read qtop format from given folder structure.
Parameters
----------
path:
path of the measurement files
prefix:
prefix of the measurement files, e.g. <prefix>_id0_r0.ms.dat
c: double
Smearing radius in units of the lattice extent, c = sqrt(8 t0) / L
dtr_cnfg: int
(optional) parameter that specifies the number of trajectories
between two configs.
if it is not set, the distance between two measurements
in the file is assumed to be
the distance between two configurations.
steps: int
(optional) (maybe only necessary for openQCD2.0)
nt step size, guessed if not given
version: str
version string of the openQCD (sfqcd) version used to create
the ensemble
L: int
spatial length of the lattice in L/a.
HAS to be set if version != sfqcd, since openQCD does not provide
this in the header
r_start: list
offset of the first ensemble, making it easier to match
later on with other Obs
r_stop: list
last configurations that need to be read (per replicum)
files: list
specify the exact files that need to be read
from path, practical if e.g. only one replicum is needed
names: list
Alternative labeling for replicas/ensembles.
Has to have the appropriate length
"""
known_versions = ["1.0", "1.2", "1.4", "1.6", "2.0", "sfqcd"]
if version not in known_versions:
raise Exception("Unknown openQCD version.")
if "steps" in kwargs:
steps = kwargs.get("steps")
if version == "sfqcd":
if "L" in kwargs:
supposed_L = kwargs.get("L")
else:
if "L" not in kwargs:
raise Exception("This version of openQCD needs you to provide the spatial length of the lattice as parameter 'L'.")
else:
L = kwargs.get("L")
r_start = 1
if "r_start" in kwargs:
r_start = kwargs.get("r_start")
if "r_stop" in kwargs:
r_stop = kwargs.get("r_stop")
if "files" in kwargs:
files = kwargs.get("files")
else:
found = []
files = []
for (dirpath, dirnames, filenames) in os.walk(path + "/"):
# print(filenames)
found.extend(filenames)
break
for f in found:
if fnmatch.fnmatch(f, prefix + "*" + ".ms.dat"):
files.append(f)
print(files)
rep_names = []
deltas = []
idl = []
for rep, file in enumerate(files):
with open(path + "/" + file, "rb") as fp:
t = fp.read(12)
header = struct.unpack('<iii', t)
# step size in integration steps "dnms"
dn = header[0]
# number of measurements, so "ntot"/dn
nn = header[1]
# lattice T/a
tmax = header[2]
if version == "sfqcd":
t = fp.read(12)
Ls = struct.unpack('<iii', t)
if(Ls[0] == Ls[1] and Ls[1] == Ls[2]):
L = Ls[0]
if "L" in kwargs and not (supposed_L == L):
raise Exception("The spatial length 'L' you provided contradicts the one in the file header.")
else:
raise Exception("Found more than one spatial length in header!")
print('dnms:', dn)
print('nn:', nn)
print('tmax:', tmax)
t = fp.read(8)
eps = struct.unpack('d', t)[0]
print('eps:', eps)
Q = []
ncs = []
while True:
t = fp.read(4)
if(len(t) < 4):
break
ncs.append(struct.unpack('i', t)[0])
# Wsl
t = fp.read(8 * tmax * (nn + 1))
# Ysl
t = fp.read(8 * tmax * (nn + 1))
# Qsl, which is asked for in this method
t = fp.read(8 * tmax * (nn + 1))
# unpack the array of Qtops,
# on each timeslice t=0,...,tmax-1 and the
# measurement number in = 0...nn (see README.qcd1)
tmpd = struct.unpack('d' * tmax * (nn + 1), t)
Q.append(tmpd)
if len(set([ncs[i] - ncs[i - 1] for i in range(1, len(ncs))])) > 1:
raise Exception("Irregularities in stepsize found")
else:
if 'steps' in kwargs:
if steps != ncs[1] - ncs[0]:
raise Exception("steps and the found stepsize are not the same")
else:
steps = ncs[1] - ncs[0]
print(len(Q))
print('max_t:', dn * (nn) * eps)
t_aim = (c * L) ** 2 / 8
print('t_aim:', t_aim)
index_aim = round(t_aim / eps / dn)
print('index_aim:', index_aim)
Q_sum = []
for i, item in enumerate(Q):
Q_sum.append([sum(item[current:current + tmax])
for current in range(0, len(item), tmax)])
print(len(Q_sum))
print(len(Q_sum[0]))
Q_round = []
for i in range(len(Q) // dtr_cnfg):
Q_round.append(round(Q_sum[dtr_cnfg * i][index_aim]))
if len(Q_round) != len(ncs) // dtr_cnfg:
raise Exception("qtops and ncs dont have the same length")
truncated_file = file[:-7]
print(truncated_file)
idl_start = 1
if "r_start" in kwargs:
Q_round = Q_round[r_start[rep]:]
idl_start = r_start[rep]
if "r_stop" in kwargs:
Q_round = Q_round[:r_stop[rep]]
idl_stop = idl_start + len(Q_round)
# keyword "names" prevails over "ens_name"
if "names" not in kwargs:
try:
idx = truncated_file.index('r')
except Exception:
if "names" not in kwargs:
raise Exception("Automatic recognition of replicum failed, please enter the key word 'names'.")
if "ens_name" in kwargs:
ens_name = kwargs.get("ens_name")
else:
ens_name = truncated_file[:idx]
rep_names.append(ens_name + '|' + truncated_file[idx:])
else:
names = kwargs.get("names")
rep_names = names
deltas.append(np.array(Q_round))
idl.append(range(idl_start, idl_stop))
result = Obs(deltas, rep_names, idl=idl)
return result
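A hedged call sketch for the new `read_qtop`; all paths and numbers are invented:
```python
# openQCD 1.2 ensemble: smearing radius c = 0.35; for this version the
# spatial length L/a has to be passed by hand
qtop = read_qtop('/data/ensembleA', 'A1', 0.35, dtr_cnfg=2, version='1.2', L=32)
qtop.gamma_method()
print(qtop)
```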
def read_qtop_sector(target=0, **kwargs):
"""target: int
specifies the topological sector to be reweighted to (default 0)
q_top: Obs
alternatively takes args of read_qtop method as kwargs
"""
if "q_top" in kwargs:
qtop = kwargs.get("q_top")
else:
if "path" in kwargs:
path = kwargs.get("path")
del kwargs["path"]
else:
raise Exception("If you are not providing q_top, please provide path")
if "prefix" in kwargs:
prefix = kwargs.get("prefix")
del kwargs["prefix"]
else:
raise Exception("If you are not providing q_top, please provide prefix")
if "c" in kwargs:
c = kwargs.get("c")
del kwargs["c"]
else:
raise Exception("If you are not providing q_top, please provide c")
if "version" in kwargs:
version = kwargs.get("version")
del kwargs["version"]
else:
version = "1.2"
if "dtr_cnfg" in kwargs:
dtr_cnfg = kwargs.get("dtr_cnfg")
del kwargs["dtr_cnfg"]
else:
dtr_cnfg = 1
qtop = read_qtop(path, prefix, c, dtr_cnfg=dtr_cnfg,
version=version, **kwargs)
names = qtop.names
print(names)
print(qtop.deltas.keys())
proj_qtop = []
for n in qtop.deltas:
proj_qtop.append(np.array([1 if int(qtop.value + q) == target else 0 for q in qtop.deltas[n]]))
result = Obs(proj_qtop, qtop.names)
return result
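And the corresponding sector projection, reusing the charge read above; a sketch, not part of this diff:
```python
# reweighting factors for the Q = 0 sector from an existing q_top Obs
rw = read_qtop_sector(target=0, q_top=qtop)
rw.gamma_method()
```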


@@ -6,17 +6,58 @@ import fnmatch
import re
import numpy as np # Thinly-wrapped numpy
from ..obs import Obs
from . import utils
def read_sfcf(path, prefix, name, **kwargs):
"""Read sfcf C format from given folder structure.
def read_sfcf(path, prefix, name, quarks='.*', noffset=0, wf=0, wf2=0,
version="1.0c", **kwargs):
"""Read sfcf c format from given folder structure.
Parameters
----------
im -- if True, read imaginary instead of real part of the correlation function.
single -- if True, read a boundary-to-boundary correlation function with a single value
b2b -- if True, read a time-dependent boundary-to-boundary correlation function
names -- Alternative labeling for replicas/ensembles. Has to have the appropriate length
quarks: str
Label of the quarks used in the sfcf input file, e.g. "quark quark".
For version 0.0 this does NOT need to be given with the typical " - "
that is present in the output file;
this is done automatically for this version.
noffset: int
Offset of the source (only relevant when wavefunctions are used)
wf: int
ID of wave function
wf2: int
ID of the second wavefunction
(only relevant for boundary-to-boundary correlation functions)
im: bool
if True, read imaginary instead of real part
of the correlation function.
b2b: bool
if True, read a time-dependent boundary-to-boundary
correlation function
single: bool
if True, read a time-independent boundary-to-boundary
correlation function
names: list
Alternative labeling for replicas/ensembles.
Has to have the appropriate length
ens_name : str
replaces the name of the ensemble
version: str
version of SFCF, with which the measurement was done.
if the compact output option (-c) was specified,
append a "c" to the version (e.g. "1.0c")
if the append output option (-a) was specified,
append an "a" to the version. Currently supported versions
are "0.0", "1.0", "2.0", "1.0c", "2.0c", "1.0a" and "2.0a".
replica: list
list of replica to be read, default is all
files: list
list of files to be read per replica, default is all.
for non-compact output format, hand the folders to be read here.
check_configs:
list of lists of expected configs, e.g. [range(1, 1000)]
for one replicum with 1000 configs
TODO:
- what's going on with files here?
"""
if kwargs.get('im'):
im = 1
@@ -29,119 +70,43 @@ def read_sfcf(path, prefix, name, **kwargs):
b2b = 1
single = 1
else:
b2b = 0
single = 0
if kwargs.get('b2b'):
b2b = 1
else:
b2b = 0
single = 0
if "replica" in kwargs:
reps = kwargs.get("replica")
compact = True
appended = False
known_versions = ["0.0", "1.0", "2.0", "1.0c", "2.0c", "1.0a", "2.0a"]
if version not in known_versions:
raise Exception("This version is not known!")
if(version[-1] == "c"):
appended = False
compact = True
version = version[:-1]
elif(version[-1] == "a"):
appended = True
compact = False
version = version[:-1]
else:
compact = False
appended = False
read = 0
T = 0
start = 0
ls = []
if "replica" in kwargs:
ls = reps
else:
for (dirpath, dirnames, filenames) in os.walk(path):
if not appended:
ls.extend(dirnames)
break
if not ls:
raise Exception('Error, directory not found')
for exc in ls:
if fnmatch.fnmatch(exc, prefix + '*'):
ls = list(set(ls) - set(exc))
if len(ls) > 1:
ls.sort(key=lambda x: int(re.findall(r'\d+', x[len(prefix):])[0]))
replica = len(ls)
print('Read', part, 'part of', name, 'from', prefix, ',', replica, 'replica')
if 'names' in kwargs:
new_names = kwargs.get('names')
if len(new_names) != replica:
raise Exception('Names does not have the required length', replica)
else:
# Adjust replica names to new bookmarking system
new_names = []
for entry in ls:
idx = entry.index('r')
new_names.append(entry[:idx] + '|' + entry[idx:])
print(replica, 'replica')
for i, item in enumerate(ls):
print(item)
sub_ls = []
for (dirpath, dirnames, filenames) in os.walk(path + '/' + item):
sub_ls.extend(dirnames)
break
for exc in sub_ls:
if fnmatch.fnmatch(exc, 'cfg*'):
sub_ls = list(set(sub_ls) - set(exc))
sub_ls.sort(key=lambda x: int(x[3:]))
no_cfg = len(sub_ls)
print(no_cfg, 'configurations')
if i == 0:
with open(path + '/' + item + '/' + sub_ls[0] + '/' + name) as fp:
for k, line in enumerate(fp):
if read == 1 and not line.strip() and k > start + 1:
break
if read == 1 and k >= start:
T += 1
if '[correlator]' in line:
read = 1
start = k + 7 + b2b
T -= b2b
deltas = []
for j in range(T):
deltas.append([])
sublength = len(sub_ls)
for j in range(T):
deltas[j].append(np.zeros(sublength))
for cnfg, subitem in enumerate(sub_ls):
with open(path + '/' + item + '/' + subitem + '/' + name) as fp:
for k, line in enumerate(fp):
if(k >= start and k < start + T):
floats = list(map(float, line.split()))
deltas[k - start][i][cnfg] = floats[1 + im - single]
result = []
for t in range(T):
result.append(Obs(deltas[t], new_names))
return result
def read_sfcf_c(path, prefix, name, quarks='.*', noffset=0, wf=0, wf2=0, **kwargs):
"""Read sfcf c format from given folder structure.
Parameters
----------
quarks -- Label of the quarks used in the sfcf input file
noffset -- Offset of the source (only relevant when wavefunctions are used)
wf -- ID of wave function
wf2 -- ID of the second wavefunction (only relevant for boundary-to-boundary correlation functions)
im -- if True, read imaginary instead of real part of the correlation function.
b2b -- if True, read a time-dependent boundary-to-boundary correlation function
names -- Alternative labeling for replicas/ensembles. Has to have the appropriate length
ens_name : str
replaces the name of the ensemble
"""
if kwargs.get('im'):
im = 1
part = 'imaginary'
else:
im = 0
part = 'real'
if kwargs.get('b2b'):
b2b = 1
else:
b2b = 0
T = 0
ls = []
for (dirpath, dirnames, filenames) in os.walk(path):
ls.extend(dirnames)
ls.extend(filenames)
break
if not ls:
raise Exception('Error, directory not found')
@@ -150,54 +115,100 @@ def read_sfcf_c(path, prefix, name, quarks='.*', noffset=0, wf=0, wf2=0, **kwarg
if not fnmatch.fnmatch(exc, prefix + '*'):
ls = list(set(ls) - set([exc]))
if len(ls) > 1:
ls.sort(key=lambda x: int(re.findall(r'\d+', x[len(prefix):])[0])) # New version, to cope with ids, etc.
ls.sort(key=lambda x: int(re.findall(r'\d+', x[len(prefix):])[0]))
if not appended:
replica = len(ls)
else:
replica = len([file.split(".")[-1] for file in ls]) // len(set([file.split(".")[-1] for file in ls]))
print('Read', part, 'part of', name, 'from', prefix[:-1],
',', replica, 'replica')
if 'names' in kwargs:
new_names = kwargs.get('names')
if len(new_names) != len(set(new_names)):
raise Exception("names are not unique!")
if len(new_names) != replica:
raise Exception('Names does not have the required length', replica)
else:
# Adjust replica names to new bookmarking system
new_names = []
if not appended:
for entry in ls:
try:
idx = entry.index('r')
except Exception:
raise Exception("Automatic recognition of replicum failed, please enter the key word 'names'.")
if 'ens_name' in kwargs:
new_names.append(kwargs.get('ens_name') + '|' + entry[idx:])
else:
new_names.append(entry[:idx] + '|' + entry[idx:])
else:
print('Read', part, 'part of', name, 'from', prefix[:-1], ',', replica, 'replica')
for exc in ls:
if not fnmatch.fnmatch(exc, prefix + '*.' + name):
ls = list(set(ls) - set([exc]))
ls.sort(key=lambda x: int(re.findall(r'\d+', x)[-1]))
for entry in ls:
myentry = entry[:-len(name) - 1]
try:
idx = myentry.index('r')
except Exception:
raise Exception("Automatic recognition of replicum failed, please enter the key word 'names'.")
if 'ens_name' in kwargs:
new_names.append(kwargs.get('ens_name') + '|' + myentry[idx:])
else:
new_names.append(myentry[:idx] + '|' + myentry[idx:])
idl = []
if not appended:
for i, item in enumerate(ls):
sub_ls = []
if "files" in kwargs:
sub_ls = kwargs.get("files")
sub_ls.sort(key=lambda x: int(re.findall(r'\d+', x)[-1]))
else:
for (dirpath, dirnames, filenames) in os.walk(path + '/' + item):
if compact:
sub_ls.extend(filenames)
else:
sub_ls.extend(dirnames)
break
for exc in sub_ls:
if compact:
if not fnmatch.fnmatch(exc, prefix + '*'):
sub_ls = list(set(sub_ls) - set([exc]))
sub_ls.sort(key=lambda x: int(re.findall(r'\d+', x)[-1]))
first_cfg = int(re.findall(r'\d+', sub_ls[0])[-1])
last_cfg = len(sub_ls) + first_cfg - 1
for cfg in range(1, len(sub_ls)):
if int(re.findall(r'\d+', sub_ls[cfg])[-1]) != first_cfg + cfg:
last_cfg = cfg + first_cfg - 1
break
no_cfg = last_cfg - first_cfg + 1
print(item, ':', no_cfg, 'evenly spaced configurations (', first_cfg, '-', last_cfg, ') ,', len(sub_ls) - no_cfg, 'configs omitted\n')
sub_ls.sort(key=lambda x:
int(re.findall(r'\d+', x)[-1]))
else:
if not fnmatch.fnmatch(exc, 'cfg*'):
sub_ls = list(set(sub_ls) - set([exc]))
sub_ls.sort(key=lambda x: int(x[3:]))
rep_idl = []
no_cfg = len(sub_ls)
for cfg in sub_ls:
try:
if compact:
rep_idl.append(int(cfg.split("n")[-1]))
else:
rep_idl.append(int(cfg[3:]))
except Exception:
raise Exception("Couldn't parse idl from directroy, problem with file " + cfg)
rep_idl.sort()
print(item, ':', no_cfg, ' configurations')
idl.append(rep_idl)
if i == 0:
if compact:
pattern = 'name ' + name + '\nquarks ' + quarks + '\noffset ' + str(noffset) + '\nwf ' + str(wf)
if b2b:
pattern += '\nwf_2 ' + str(wf2)
with open(path + '/' + item + '/' + sub_ls[0], 'r') as file:
content = file.read()
match = re.search(pattern, content)
if match:
# the start and end point of the correlator
# in question is extracted for later use in
# the other files
start_read = content.count('\n', 0, match.start()) + 5 + b2b
end_match = re.search(r'\n\s*\n', content[match.start():])
T = content[match.start():].count('\n', 0, end_match.start()) - 4 - b2b
@@ -205,90 +216,136 @@ def read_sfcf_c(path, prefix, name, quarks='.*', noffset=0, wf=0, wf2=0, **kwarg
print(T, 'entries, starting to read in line', start_read)
else:
raise Exception('Correlator with pattern\n' + pattern + '\nnot found.')
else:
# this part does the same as above,
# but for non-compactified versions of the files
with open(path + '/' + item + '/' + sub_ls[0] + '/' + name) as fp:
for k, line in enumerate(fp):
if version == "0.0":
# check if this is really the right file
# by matching pattern similar to above
pattern = "# " + name + " : offset " + str(noffset) + ", wf " + str(wf)
# if b2b, a second wf is needed
if b2b:
pattern += ", wf_2 " + str(wf2)
qs = quarks.split(" ")
pattern += " : " + qs[0] + " - " + qs[1]
if read == 1 and not line.strip() and k > start + 1:
break
if read == 1 and k >= start:
T += 1
if version == "0.0":
if pattern in line:
# print(line)
read = 1
start = k + 1
else:
if '[correlator]' in line:
read = 1
start = k + 7 + b2b
T -= b2b
print(str(T) + " entries found.")
deltas = []
for j in range(T):
deltas.append([])
sublength = no_cfg
for j in range(T):
deltas[j].append(np.zeros(sublength))
for t in range(T):
deltas[t].append(np.zeros(no_cfg))
# we iterate through all measurement files in the path given...
if compact:
for cfg in range(no_cfg):
with open(path + '/' + item + '/' + sub_ls[cfg]) as fp:
for k, line in enumerate(fp):
lines = fp.readlines()
if(start_read + T > len(lines)):
raise Exception("EOF before end of correlator data! Maybe " + path + '/' + item + '/' + sub_ls[cfg] + " is corrupted?")
for k in range(start_read - 6, start_read + T):
if k == start_read - 5 - b2b:
if line.strip() != 'name ' + name:
raise Exception('Wrong format', sub_ls[cfg])
if lines[k].strip() != 'name ' + name:
raise Exception('Wrong format',
sub_ls[cfg])
if(k >= start_read and k < start_read + T):
floats = list(map(float, line.split()))
floats = list(map(float, lines[k].split()))
deltas[k - start_read][i][cfg] = floats[-2:][im]
else:
for cnfg, subitem in enumerate(sub_ls):
with open(path + '/' + item + '/' + subitem + '/' + name) as fp:
for k, line in enumerate(fp):
if(k >= start and k < start + T):
floats = list(map(float, line.split()))
if version == "0.0":
deltas[k - start][i][cnfg] = floats[im]
else:
deltas[k - start][i][cnfg] = floats[1 + im - single]
else:
if "files" in kwargs:
ls = kwargs.get("files")
else:
for exc in ls:
if not fnmatch.fnmatch(exc, prefix + '*.' + name):
ls = list(set(ls) - set([exc]))
ls.sort(key=lambda x: int(re.findall(r'\d+', x)[-1]))
pattern = 'name ' + name + '\nquarks ' + quarks + '\noffset ' + str(noffset) + '\nwf ' + str(wf)
if b2b:
pattern += '\nwf_2 ' + str(wf2)
for rep, file in enumerate(ls):
rep_idl = []
with open(path + '/' + file, 'r') as fp:
content = fp.readlines()
data_starts = []
for linenumber, line in enumerate(content):
if "[run]" in line:
data_starts.append(linenumber)
if len(set([data_starts[i] - data_starts[i - 1] for i in
range(1, len(data_starts))])) > 1:
raise Exception("Irregularities in file structure found, not all runs have the same output length")
chunk = content[:data_starts[1]]
for linenumber, line in enumerate(chunk):
if line.startswith("gauge_name"):
gauge_line = linenumber
elif line.startswith("[correlator]"):
corr_line = linenumber
found_pat = ""
for li in chunk[corr_line + 1:corr_line + 6 + b2b]:
found_pat += li
if re.search(pattern, found_pat):
start_read = corr_line + 7 + b2b
T = len(chunk) - 1 - start_read
if rep == 0:
deltas = []
for t in range(T):
deltas.append([])
for t in range(T):
deltas[t].append(np.zeros(len(data_starts)))
for cnfg in range(len(data_starts)):
start = data_starts[cnfg]
stop = start + data_starts[1]
chunk = content[start:stop]
try:
rep_idl.append(int(chunk[gauge_line].split("n")[-1]))
except Exception:
raise Exception("Couldn't parse idl from directroy, problem with chunk around line " + gauge_line)
found_pat = ""
for li in chunk[corr_line + 1:corr_line + 6 + b2b]:
found_pat += li
if re.search(pattern, found_pat):
for t, line in enumerate(chunk[start_read:start_read + T]):
floats = list(map(float, line.split()))
deltas[t][rep][cnfg] = floats[-2:][im]
idl.append(rep_idl)
if "check_configs" in kwargs:
print("Checking for missing configs...")
che = kwargs.get("check_configs")
if not (len(che) == len(idl)):
raise Exception("check_configs has to be the same length as replica!")
for r in range(len(idl)):
print("checking " + new_names[r])
utils.check_idl(idl[r], che[r])
print("Done")
result = []
for t in range(T):
result.append(Obs(deltas[t], new_names))
return result
def read_qtop(path, prefix, **kwargs):
"""Read qtop format from given folder structure.
Parameters
----------
target -- specifies the topological sector to be reweighted to (default 0)
full -- if true read the charge instead of the reweighting factor.
"""
if 'target' in kwargs:
target = kwargs.get('target')
else:
target = 0
if kwargs.get('full'):
full = 1
else:
full = 0
ls = []
for (dirpath, dirnames, filenames) in os.walk(path):
ls.extend(filenames)
break
if not ls:
raise Exception('Error, directory not found')
# Exclude files with different names
for exc in ls:
if not fnmatch.fnmatch(exc, prefix + '*'):
ls = list(set(ls) - set([exc]))
if len(ls) > 1:
ls.sort(key=lambda x: int(re.findall(r'\d+', x[len(prefix):])[0])) # New version, to cope with ids, etc.
replica = len(ls)
print('Read Q_top from', prefix[:-1], ',', replica, 'replica')
deltas = []
for rep in range(replica):
tmp = []
with open(path + '/' + ls[rep]) as fp:
for k, line in enumerate(fp):
floats = list(map(float, line.split()))
if full == 1:
tmp.append(floats[1])
else:
if int(floats[1]) == target:
tmp.append(1.0)
else:
tmp.append(0.0)
deltas.append(np.array(tmp))
rep_names = []
for entry in ls:
truncated_entry = entry.split('.')[0]
idx = truncated_entry.index('r')
rep_names.append(truncated_entry[:idx] + '|' + truncated_entry[idx:])
result = Obs(deltas, rep_names)
result.append(Obs(deltas[t], new_names, idl=idl))
return result
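A hedged call sketch for the unified `read_sfcf`; folder layout, labels and the config range are assumptions:
```python
# compact output of SFCF 1.0 (version string '1.0c'), correlator f_P,
# one replicum for which configs 1..1000 are expected
f_P = read_sfcf('/data/ensembleA', 'A1', 'f_P', quarks='lquark lquark',
                version='1.0c', check_configs=[range(1, 1001)])
for obs in f_P:
    obs.gamma_method()
```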

pyerrors/input/utils.py Normal file

@@ -0,0 +1,24 @@
"""Utilities for the input"""
def check_idl(idl, che):
"""Checks if list of configurations is contained in an idl
Parameters
----------
idl : range or list
idl of the current replicum
che : list
list of configurations to be checked against
"""
missing = []
for c in che:
if c not in idl:
missing.append(c)
# print missing configurations as a comma-separated list that can be pasted directly into a slurm terminal
if not (len(missing) == 0):
print(len(missing), "configs missing")
miss_str = str(missing[0])
for i in missing[1:]:
miss_str += "," + str(i)
print(miss_str)
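A tiny sketch of the helper with invented numbers:
```python
# replicum actually holds configs 1..500, but 1..505 were expected
check_idl(range(1, 501), list(range(1, 506)))
# prints: 5 configs missing
#         501,502,503,504,505
```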


@@ -4,6 +4,7 @@ import numpy as np
import autograd.numpy as anp # Thinly-wrapped numpy
from autograd import jacobian
import matplotlib.pyplot as plt
from scipy.stats import skew, skewtest, kurtosis, kurtosistest
import numdifftools as nd
from itertools import groupby
from .covobs import Covobs
@@ -564,7 +565,7 @@ class Obs:
y = np.concatenate(tmp, axis=0)
plt.errorbar(x, y, fmt='.', markersize=3)
plt.xlim(-0.5, e_N - 0.5)
plt.title(e_name)
plt.title(e_name + f', skew: {skew(y):.3f} (p={skewtest(y).pvalue:.3f}), kurtosis: {kurtosis(y):.3f} (p={kurtosistest(y).pvalue:.3f})')
plt.draw()
def plot_piechart(self):
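A sketch of how the enriched title shows up; reading the surrounding code this looks like `Obs.plot_history`, but that method name is an inference, not stated in the hunk:
```python
import pyerrors as pe

obs = pe.pseudo_Obs(1.0, 0.1, 'ens', samples=2000)
obs.gamma_method()
obs.plot_history()  # title now carries skewness and kurtosis with test p-values
```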


@@ -309,6 +309,25 @@ def test_error_band():
pe.fits.error_band(x, f, fitp.fit_parameters)
def test_ks_test():
def f(a, x):
y = a[0] + a[1] * x
return y
fit_res = []
for i in range(20):
data = []
for j in range(10):
data.append(pe.pseudo_Obs(j + np.random.normal(0.0, 0.25), 0.25, 'test'))
my_corr = pe.Corr(data)
fit_res.append(my_corr.fit(f, silent=True))
pe.fits.ks_test()
pe.fits.ks_test(fit_res)
def fit_general(x, y, func, silent=False, **kwargs):
"""Performs a non-linear fit to y = func(x) and returns a list of Obs corresponding to the fit parameters.