Merge branch 'fjosw:develop' into develop

JanNeuendorf 2022-01-18 13:21:01 +01:00 committed by GitHub
commit 39f176585e
22 changed files with 1050 additions and 1316 deletions


@@ -11,11 +11,17 @@ on:
jobs:
pytest:
runs-on: ubuntu-latest
runs-on: ${{ matrix.os }}
strategy:
fail-fast: true
matrix:
python-version: ["3.6", "3.7", "3.8", "3.9", "3.10"]
os: [ubuntu-latest]
python-version: ["3.7", "3.8", "3.9", "3.10"]
include:
- os: macos-latest
python-version: 3.9
- os: windows-latest
python-version: 3.9
steps:
- name: Checkout source


@@ -1,9 +1,9 @@
[![flake8](https://github.com/fjosw/pyerrors/actions/workflows/flake8.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/flake8.yml) [![pytest](https://github.com/fjosw/pyerrors/actions/workflows/pytest.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/pytest.yml) [![docs](https://github.com/fjosw/pyerrors/actions/workflows/docs.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/docs.yml) [![](https://img.shields.io/badge/python-3.6+-blue.svg)](https://www.python.org/downloads/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![flake8](https://github.com/fjosw/pyerrors/actions/workflows/flake8.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/flake8.yml) [![pytest](https://github.com/fjosw/pyerrors/actions/workflows/pytest.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/pytest.yml) [![docs](https://github.com/fjosw/pyerrors/actions/workflows/docs.yml/badge.svg)](https://github.com/fjosw/pyerrors/actions/workflows/docs.yml) [![](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
# pyerrors
`pyerrors` is a python package for error computation and propagation of Markov chain Monte Carlo data.
- **Documentation:** https://fjosw.github.io/pyerrors/pyerrors.html
- **Examples**: https://github.com/fjosw/pyerrors/tree/develop/examples (Do not work properly at the moment)
- **Examples**: https://github.com/fjosw/pyerrors/tree/develop/examples
- **Contributing:** https://github.com/fjosw/pyerrors/blob/develop/CONTRIBUTING.md
- **Bug reports:** https://github.com/fjosw/pyerrors/issues

4 file diffs suppressed because one or more lines are too long.


@@ -51,7 +51,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The standard matrix product can be performed with @"
"The standard matrix product can be performed with `@`"
]
},
{
@@ -101,13 +101,38 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Mathematical functions work elementwise"
"For large matrices overloading the standard operator `@` can become inefficient as pyerrors has to perform a large number of elementary opeations. For these situations pyerrors provides the function `linalg.matmul` which optimizes the required automatic differentiation. The function can take an arbitray number of operands."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[Obs[78.12099999999998] Obs[-22.909999999999997]]\n",
" [Obs[-22.909999999999997] Obs[7.1]]]\n"
]
}
],
"source": [
"print(pe.linalg.matmul(matrix, matrix, matrix)) # Equivalent to matrix @ matrix @ matrix but faster for large matrices"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Mathematical functions work elementwise"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -126,12 +126,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"For a vector of `Obs`, we again use np.asarray to end up with the correct object"
"For a vector of `Obs`, we again use `np.asarray` to end up with the correct object"
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 7,
"metadata": {},
"outputs": [
{
@@ -160,14 +160,14 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[Obs[7.2(1.7)] Obs[-1.00(45)]]\n"
"[Obs[7.2(1.7)] Obs[-1.00(46)]]\n"
]
}
],
@@ -182,40 +182,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Matrix to scalar operations\n",
"If we want to apply a numpy matrix function with a scalar return value we can use `scalar_mat_op`. __Here we need to use the autograd wrapped version of numpy__ (imported as anp) to use automatic differentiation."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"det \t Obs[3.10(28)]\n",
"trace \t Obs[5.10(20)]\n",
"norm \t Obs[4.45(19)]\n"
]
}
],
"source": [
"import autograd.numpy as anp # Thinly-wrapped numpy\n",
"funcs = [anp.linalg.det, anp.trace, anp.linalg.norm]\n",
"\n",
"for i, func in enumerate(funcs):\n",
" res = pe.linalg.scalar_mat_op(func, matrix)\n",
" res.gamma_method()\n",
" print(func.__name__, '\\t', res)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For matrix operations which are not supported by autograd we can use numerical differentiation"
"`pyerrors` provides the user with wrappers to the `numpy.linalg` functions which work on `Obs` valued matrices. We can for example calculate the determinant of the matrix via"
]
},
{
@@ -227,26 +227,21 @@
"name": "stdout",
"output_type": "stream",
"text": [
"cond \t Obs[6.23(58)]\n",
"expm_cond \t Obs[4.45(19)]\n"
"3.10(28)\n"
]
}
],
"source": [
"funcs = [np.linalg.cond, scipy.linalg.expm_cond]\n",
"\n",
"for i, func in enumerate(funcs):\n",
" res = pe.linalg.scalar_mat_op(func, matrix, num_grad=True)\n",
" res.gamma_method()\n",
" print(func.__name__, ' \\t', res)"
"det = pe.linalg.det(matrix)\n",
"det.gamma_method()\n",
"print(det)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Matrix to matrix operations\n",
"For matrix operations with a matrix as return value we can use another wrapper `mat_mat_op`. Take as an example the cholesky decompostion. __Here we need to use the autograd wrapped version of numpy__ (imported as anp) to use automatic differentiation."
"The cholesky decomposition can be obtained as follows"
]
},
{
@@ -259,12 +259,12 @@
"output_type": "stream",
"text": [
"[[Obs[2.025(49)] Obs[0.0]]\n",
" [Obs[-0.494(51)] Obs[0.870(29)]]]\n"
" [Obs[-0.494(50)] Obs[0.870(29)]]]\n"
]
}
],
"source": [
"cholesky = pe.linalg.mat_mat_op(anp.linalg.cholesky, matrix)\n",
"cholesky = pe.linalg.cholesky(matrix)\n",
"for (i, j), entry in np.ndenumerate(cholesky):\n",
" entry.gamma_method()\n",
"print(cholesky)"
@@ -321,7 +321,7 @@
}
],
"source": [
"inv = pe.linalg.mat_mat_op(anp.linalg.inv, cholesky)\n",
"inv = pe.linalg.inv(cholesky)\n",
"for (i, j), entry in np.ndenumerate(inv):\n",
" entry.gamma_method()\n",
"print(inv)\n",
@@ -334,59 +334,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Matrix to matrix operations which are not supported by autograd can also be computed with numeric differentiation"
"## Eigenvalues and eigenvectors\n",
"We can also compute eigenvalues and eigenvectors of symmetric matrices with a special wrapper `eigh`"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"orth\n",
"[[Obs[-0.9592(76)] Obs[0.283(26)]]\n",
" [Obs[0.283(26)] Obs[0.9592(76)]]]\n",
"expm\n",
"[[Obs[75(15)] Obs[-21.4(4.1)]]\n",
" [Obs[-21.4(4.1)] Obs[8.3(1.4)]]]\n",
"logm\n",
"[[Obs[1.334(57)] Obs[-0.496(61)]]\n",
" [Obs[-0.496(61)] Obs[-0.203(50)]]]\n",
"sinhm\n",
"[[Obs[37.3(7.4)] Obs[-10.8(2.1)]]\n",
" [Obs[-10.8(2.1)] Obs[3.94(68)]]]\n",
"sqrtm\n",
"[[Obs[1.996(51)] Obs[-0.341(37)]]\n",
" [Obs[-0.341(37)] Obs[0.940(14)]]]\n"
]
}
],
"source": [
"funcs = [scipy.linalg.orth, scipy.linalg.expm, scipy.linalg.logm, scipy.linalg.sinhm, scipy.linalg.sqrtm]\n",
"\n",
"for i,func in enumerate(funcs):\n",
" res = pe.linalg.mat_mat_op(func, matrix, num_grad=True)\n",
" for (i, j), entry in np.ndenumerate(res):\n",
" entry.gamma_method()\n",
" print(func.__name__)\n",
" print(res)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Eigenvalues and eigenvectors\n",
"We can also compute eigenvalues and eigenvectors of symmetric matrices with a special wrapper `eigh`"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -395,8 +395,8 @@
"Eigenvalues:\n",
"[Obs[0.705(57)] Obs[4.39(19)]]\n",
"Eigenvectors:\n",
"[[Obs[-0.283(26)] Obs[-0.9592(76)]]\n",
" [Obs[-0.9592(76)] Obs[0.283(26)]]]\n"
"[[Obs[-0.283(26)] Obs[-0.9592(75)]]\n",
" [Obs[-0.9592(75)] Obs[0.283(26)]]]\n"
]
}
],
@@ -421,7 +421,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 14,
"metadata": {},
"outputs": [
{
@@ -451,7 +451,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -465,7 +465,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
"version": "3.8.10"
}
},
"nbformat": 4,

4 binary files changed (not shown)

examples/data/f_A.json.gz Normal file (binary, not shown)

examples/data/f_P.json.gz Normal file (binary, not shown)


@@ -193,6 +193,56 @@ Make sure to check the autocorrelation time with e.g. `pyerrors.obs.Obs.plot_rho
For the full API see `pyerrors.obs.Obs`.
# Correlators
When one is not interested in single observables but correlation functions, `pyerrors` offers the `Corr` class which simplifies the corresponding error propagation and provides the user with a set of standard methods. In order to initialize a `Corr` object one needs to arrange the data as a list of `Obs`
```python
my_corr = pe.Corr([obs_0, obs_1, obs_2, obs_3])
print(my_corr)
> x0/a Corr(x0/a)
> ------------------
> 0 0.7957(80)
> 1 0.5156(51)
> 2 0.3227(33)
> 3 0.2041(21)
```
In case the correlation functions are not defined on the outermost timeslices, for example because of fixed boundary conditions, a padding can be introduced.
```python
my_corr = pe.Corr([obs_0, obs_1, obs_2, obs_3], padding_front=1, padding_back=1)
print(my_corr)
> x0/a Corr(x0/a)
> ------------------
> 0
> 1 0.7957(80)
> 2 0.5156(51)
> 3 0.3227(33)
> 4 0.2041(21)
> 5
```
The individual entries of a correlator can be accessed via slicing
```python
print(my_corr[3])
> 0.3227(33)
```
Error propagation with the `Corr` class works very similarly to that of `Obs` objects. Mathematical operations are overloaded and `Corr` objects can be combined with other `Corr` objects, `Obs` objects or real numbers.
```python
my_new_corr = 0.3 * my_corr[2] * my_corr * my_corr + 12 / my_corr
```
`pyerrors` provides the user with a set of regularly used methods for the manipulation of correlator objects:
- `Corr.gamma_method` applies the gamma method to all entries of the correlator.
- `Corr.m_eff` to construct effective masses. Various variants for periodic and fixed temporal boundary conditions are available.
- `Corr.deriv` returns the first derivative of the correlator as `Corr`. Different discretizations of the numerical derivative are available.
- `Corr.second_deriv` returns the second derivative of the correlator as `Corr`. Different discretizations of the numerical derivative are available.
- `Corr.symmetric` symmetrizes parity-even correlation functions, assuming periodic boundary conditions.
- `Corr.anti_symmetric` anti-symmetrizes parity-odd correlation functions, assuming periodic boundary conditions.
- `Corr.T_symmetry` averages a correlator with its time symmetry partner, assuming fixed boundary conditions.
- `Corr.plateau` extracts a plateau value from the correlator in a given range.
- `Corr.roll` periodically shifts the correlator.
- `Corr.reverse` reverses the time ordering of the correlator.
- `Corr.correlate` constructs a disconnected correlation function from the correlator and another `Corr` or `Obs` object.
- `Corr.reweight` reweights the correlator.
`pyerrors` can also handle matrices of correlation functions and extract energy states from these matrices via a generalized eigenvalue problem (see `pyerrors.correlators.Corr.GEVP`).
For the full API see `pyerrors.correlators.Corr`.
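These methods compose naturally. A minimal sketch, assuming the four-timeslice `my_corr` from above and an invented plateau range:
```python
my_corr.gamma_method()          # error estimation for every timeslice
m_eff = my_corr.m_eff()         # effective mass as a new Corr
m_eff.gamma_method()
mass = m_eff.plateau([1, 3])    # plateau average over a hypothetical range
mass.gamma_method()
print(mass)
```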
# Complex observables


@@ -74,7 +74,7 @@ class Corr:
@property
def reweighted(self):
bool_array = np.array([list(map(lambda x: x.reweighted, o)) for o in list(filter(None.__ne__, self.content))])
bool_array = np.array([list(map(lambda x: x.reweighted, o)) for o in [x for x in self.content if x is not None]])
if np.all(bool_array == 1):
return True
elif np.all(bool_array == 0):
@@ -614,7 +614,7 @@ class Corr:
if self.N != 1:
raise Exception("Correlator must be projected before plotting")
if x_range is None:
x_range = [0, self.T]
x_range = [0, self.T - 1]
fig = plt.figure()
ax1 = fig.add_subplot(111)


@@ -1,3 +1,4 @@
import gc
from collections.abc import Sequence
import warnings
import numpy as np
@@ -7,6 +7,8 @@ import scipy.stats
import matplotlib.pyplot as plt
from matplotlib import gridspec
from scipy.odr import ODR, Model, RealData
from scipy.stats import chi2
import iminuit
from autograd import jacobian
from autograd import elementwise_grad as egrad
@@ -45,6 +45,8 @@ class Fit_result(Sequence):
my_str += 'residual variance = ' + f'{self.residual_variance:2.6f}' + '\n'
if hasattr(self, 'chisquare_by_expected_chisquare'):
my_str += '\u03C7\u00b2/\u03C7\u00b2exp = ' + f'{self.chisquare_by_expected_chisquare:2.6f}' + '\n'
if hasattr(self, 'p_value'):
my_str += 'p-value = ' + f'{self.p_value:2.4f}' + '\n'
my_str += 'Fit parameters:\n'
for i_par, par in enumerate(self.fit_parameters):
my_str += str(i_par) + '\t' + ' ' * int(par >= 0) + str(par).rjust(int(par < 0.0)) + '\n'
@@ -306,6 +306,7 @@ def total_least_squares(x, y, func, silent=False, **kwargs):
output.odr_chisquare = odr_chisquare(np.concatenate((out.beta, out.xplus.ravel())))
output.dof = x.shape[-1] - n_parms
output.p_value = 1 - chi2.cdf(output.odr_chisquare, output.dof)
return output
@@ -619,6 +624,7 @@ def _standard_fit(x, y, func, silent=False, **kwargs):
output.chisquare = chisqfunc(fit_result.x)
output.dof = x.shape[-1] - n_parms
output.p_value = 1 - chi2.cdf(output.chisquare, output.dof)
if kwargs.get('resplot') is True:
residual_plot(x, y, func, result)
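For orientation, a sketch of how the new `p_value` attribute surfaces to the user; the data, the ensemble label and the call to `least_squares` are illustrative assumptions, not part of this diff:
```python
import numpy as np
import pyerrors as pe

x = np.arange(1, 8)
y = [pe.pseudo_Obs(0.5 * xi + 1.0 + np.random.normal(0.0, 0.1), 0.1, 'ens') for xi in x]
[o.gamma_method() for o in y]

res = pe.fits.least_squares(x, y, lambda a, x: a[0] * x + a[1])
print(res.p_value)  # 1 - chi2.cdf(chisquare, dof), as assigned above
```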
@@ -703,7 +709,7 @@ def residual_plot(x, y, func, fit_res):
ax1.plot(x, residuals, 'ko', ls='none', markersize=5)
ax1.tick_params(direction='out')
ax1.tick_params(axis="x", bottom=True, top=True, labelbottom=True)
ax1.axhline(y=0.0, ls='--', color='k')
ax1.axhline(y=0.0, ls='--', color='k', marker=" ")
ax1.fill_between(x_samples, -1.0, 1.0, alpha=0.1, facecolor='k')
ax1.set_xlim([xstart, xstop])
ax1.set_ylabel('Residuals')
@@ -740,3 +746,43 @@ def error_band(x, func, beta):
err = np.array(err)
return err
def ks_test(objects=None):
"""Performs a KolmogorovSmirnov test for the p-values of all fit object.
Parameters
----------
objects : list
List of fit results to include in the analysis (optional).
"""
if objects is None:
obs_list = []
for obj in gc.get_objects():
if isinstance(obj, Fit_result):
obs_list.append(obj)
else:
obs_list = objects
p_values = [o.p_value for o in obs_list]
bins = len(p_values)
x = np.arange(0, 1.001, 0.001)
plt.plot(x, x, 'k', zorder=1)
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.xlabel('p-value')
plt.ylabel('Cumulative probability')
plt.title(str(bins) + ' p-values')
n = np.arange(1, bins + 1) / np.float64(bins)
Xs = np.sort(p_values)
plt.step(Xs, n)
diffs = n - Xs
loc_max_diff = np.argmax(np.abs(diffs))
loc = Xs[loc_max_diff]
plt.annotate('', xy=(loc, loc), xytext=(loc, loc + diffs[loc_max_diff]), arrowprops=dict(arrowstyle='<->', shrinkA=0, shrinkB=0))
plt.draw()
print(scipy.stats.kstest(p_values, 'uniform'))
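A usage sketch for the new `ks_test`, modelled on the test added at the bottom of this commit; the fit function and ensemble label are placeholders:
```python
import numpy as np
import pyerrors as pe

def lin(a, x):
    return a[0] + a[1] * x

fit_res = []
for _ in range(20):
    data = [pe.pseudo_Obs(j + np.random.normal(0.0, 0.25), 0.25, 'test') for j in range(10)]
    fit_res.append(pe.Corr(data).fit(lin, silent=True))

pe.fits.ks_test(fit_res)  # plots the empirical CDF of the p-values and prints the KS result
```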


@@ -58,16 +58,13 @@ def read_meson_hd5(path, filestem, ens_id, meson='meson_0', tree='meson', idl=No
meson : str
label of the meson to be extracted, standard value meson_0 which
corresponds to the pseudoscalar pseudoscalar two-point function.
tree : str
Label of the upmost directory in the hdf5 file, default 'meson'
for outputs of the Meson module. Can be altered to read input
from other modules with similar structures.
idl : range
If specified only configurations in the given range are read in.
"""
files, idx = _get_files(path, filestem, idl)
tree = meson.rsplit('_')[0]
corr_data = []
infos = []
for hd5_file in files:


@@ -1,6 +1,3 @@
#!/usr/bin/env python
# coding: utf-8
import os
import fnmatch
import re
@@ -39,8 +36,9 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
if not ls:
raise Exception('Error, directory not found')
# Exclude files with different names
if 'files' in kwargs:
ls = kwargs.get('files')
else:
for exc in ls:
if not fnmatch.fnmatch(exc, prefix + '*' + postfix + '.dat'):
ls = list(set(ls) - set([exc]))
@@ -64,9 +62,9 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
else:
r_stop = [None] * replica
print('Read reweighting factors from', prefix[:-1], ',', replica, 'replica', end='')
print('Read reweighting factors from', prefix[:-1], ',',
replica, 'replica', end='')
# Adjust replica names to new bookmarking system
if names is None:
rep_names = []
for entry in ls:
@@ -85,7 +83,6 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
tmp_array = []
with open(path + '/' + ls[rep], 'rb') as fp:
# header
t = fp.read(4) # number of reweighting factors
if rep == 0:
nrw = struct.unpack('i', t)[0]
@@ -94,7 +91,7 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
for k in range(nrw):
deltas.append([])
else:
if ((nrw != struct.unpack('i', t)[0] and (not version == '2.0')) or (nrw != struct.unpack('i', t)[0] / 2 and version == '2.0')): # little weird if-clause due to the /2 operation needed.
if ((nrw != struct.unpack('i', t)[0] and (not version == '2.0')) or (nrw != struct.unpack('i', t)[0] / 2 and version == '2.0')):
raise Exception('Error: different number of reweighting factors for replicum', rep)
for k in range(nrw):
@@ -106,7 +103,6 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
for i in range(nrw):
t = fp.read(4)
nfct.append(struct.unpack('i', t)[0])
# print('nfct: ', nfct) # Hasenbusch factor, 1 for rat reweighting
else:
for i in range(nrw):
nfct.append(1)
@@ -119,7 +115,6 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
if not struct.unpack('i', fp.read(4))[0] == 0:
print('something is wrong!')
# body
while True:
t = fp.read(4)
if len(t) < 4:
@@ -135,8 +130,11 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
for j in range(tmpd['n'][0]):
tmp_nfct *= np.mean(np.exp(-np.asarray(tmp_rw[j])))
if print_err:
print(config_no, i, j, np.mean(np.exp(-np.asarray(tmp_rw[j]))), np.std(np.exp(-np.asarray(tmp_rw[j]))))
print('Sources:', np.exp(-np.asarray(tmp_rw[j])))
print(config_no, i, j,
np.mean(np.exp(-np.asarray(tmp_rw[j]))),
np.std(np.exp(-np.asarray(tmp_rw[j]))))
print('Sources:',
np.exp(-np.asarray(tmp_rw[j])))
print('Partial factor:', tmp_nfct)
elif version == '1.6' or version == '1.4':
tmp_nfct = 1.0
@@ -146,7 +144,9 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
tmp_rw = struct.unpack('d' * nsrc[i], t)
tmp_nfct *= np.mean(np.exp(-np.asarray(tmp_rw)))
if print_err:
print(config_no, i, j, np.mean(np.exp(-np.asarray(tmp_rw))), np.std(np.exp(-np.asarray(tmp_rw))))
print(config_no, i, j,
np.mean(np.exp(-np.asarray(tmp_rw))),
np.std(np.exp(-np.asarray(tmp_rw))))
print('Sources:', np.exp(-np.asarray(tmp_rw)))
print('Partial factor:', tmp_nfct)
tmp_array[i].append(tmp_nfct)
@@ -165,11 +165,14 @@ def read_rwms(path, prefix, version='2.0', names=None, **kwargs):
return result
def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwargs):
def extract_t0(path, prefix, dtr_read, xmin,
spatial_extent, fit_range=5, **kwargs):
"""Extract t0 from given .ms.dat files. Returns t0 as Obs.
It is assumed that all boundary effects have sufficiently decayed at x0=xmin.
The data around the zero crossing of t^2<E> - 0.3 is fitted with a linear function
It is assumed that all boundary effects have
sufficiently decayed at x0=xmin.
The data around the zero crossing of t^2<E> - 0.3
is fitted with a linear function
from which the exact root is extracted.
Only works with openQCD v 1.2.
@@ -180,14 +183,17 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
prefix : str
Ensemble prefix
dtr_read : int
Determines how many trajectories should be skipped when reading the ms.dat files.
Determines how many trajectories should be skipped
when reading the ms.dat files.
Corresponds to dtr_cnfg / dtr_ms in the openQCD input file.
xmin : int
First timeslice where the boundary effects have sufficiently decayed.
First timeslice where the boundary
effects have sufficiently decayed.
spatial_extent : int
spatial extent of the lattice, required for normalization.
fit_range : int
Number of data points left and right of the zero crossing to be included in the linear fit. (Default: 5)
Number of data points left and right of the zero
crossing to be included in the linear fit. (Default: 5)
r_start : list
list which contains the first config to be read for each replicum.
r_stop: list
@@ -204,7 +210,6 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
if not ls:
raise Exception('Error, directory not found')
# Exclude files with different names
for exc in ls:
if not fnmatch.fnmatch(exc, prefix + '*.ms.dat'):
ls = list(set(ls) - set([exc]))
@@ -216,7 +221,6 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
r_start = kwargs.get('r_start')
if len(r_start) != replica:
raise Exception('r_start does not match number of replicas')
# Adjust Configuration numbering to python index
r_start = [o - 1 if o else None for o in r_start]
else:
r_start = [None] * replica
@@ -235,7 +239,6 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
for rep in range(replica):
with open(path + '/' + ls[rep], 'rb') as fp:
# Read header
t = fp.read(12)
header = struct.unpack('iii', t)
if rep == 0:
@@ -254,7 +257,6 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
Ysl = []
# Read body
while True:
t = fp.read(4)
if(len(t) < 4):
@@ -273,7 +275,9 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
Ysum.append([])
for i, item in enumerate(Ysl):
Ysum[-1].append([np.mean(item[current + xmin:current + tmax - xmin]) for current in range(0, len(item), tmax)])
Ysum[-1].append([np.mean(item[current + xmin:
current + tmax - xmin])
for current in range(0, len(item), tmax)])
t2E_dict = {}
for n in range(nn + 1):
@@ -286,10 +290,13 @@ def extract_t0(path, prefix, dtr_read, xmin, spatial_extent, fit_range=5, **kwar
new_obs = Obs(samples, [(w.split('.'))[0] for w in ls])
t2E_dict[n * dn * eps] = (n * dn * eps) ** 2 * new_obs / (spatial_extent ** 3) - 0.3
zero_crossing = np.argmax(np.array([o.value for o in t2E_dict.values()]) > 0.0)
zero_crossing = np.argmax(np.array(
[o.value for o in t2E_dict.values()]) > 0.0)
x = list(t2E_dict.keys())[zero_crossing - fit_range: zero_crossing + fit_range]
y = list(t2E_dict.values())[zero_crossing - fit_range: zero_crossing + fit_range]
x = list(t2E_dict.keys())[zero_crossing - fit_range:
zero_crossing + fit_range]
y = list(t2E_dict.values())[zero_crossing - fit_range:
zero_crossing + fit_range]
[o.gamma_method() for o in y]
fit_result = fit_lin(x, y)
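Putting the hunks above together, a hedged call sketch for `extract_t0`; path, prefix and numbers are invented, only the signature is taken from the code:
```python
from pyerrors.input.openQCD import extract_t0

# hypothetical ensemble: every 2nd trajectory measured (dtr_read=2),
# boundary effects assumed decayed for x0 >= 5, spatial extent 32
t0 = extract_t0('/data/ensembleA', 'A1', dtr_read=2, xmin=5, spatial_extent=32)
t0.gamma_method()
print(t0)
```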
@@ -313,12 +320,6 @@ def _parse_array_openQCD2(d, n, size, wa, quadrupel=False):
return arr
# mimic the read_array routine of openQCD-2.0.
# fp is the opened file handle
# returns the dict array
# at this point we only parse a 2d array
# d = 2
# n = [nfct[irw], 2*nsrc[irw]]
def _read_array_openQCD2(fp):
t = fp.read(4)
d = struct.unpack('i', t)[0]
@@ -343,3 +344,232 @@ def _read_array_openQCD2(fp):
arr = _parse_array_openQCD2(d, n, size, tmp, quadrupel=True)
return {'d': d, 'n': n, 'size': size, 'arr': arr}
def read_qtop(path, prefix, c, dtr_cnfg=1, version="1.2", **kwargs):
"""Read qtop format from given folder structure.
Parameters
----------
path:
path of the measurement files
prefix:
prefix of the measurement files, e.g. <prefix>_id0_r0.ms.dat
c: double
Smearing radius in units of the lattice extent, c = sqrt(8 t0) / L
dtr_cnfg: int
(optional) parameter that specifies the number of trajectories
between two configs.
if it is not set, the distance between two measurements
in the file is assumed to be
the distance between two configurations.
steps: int
(optional) (maybe only necessary for openQCD2.0)
nt step size, guessed if not given
version: str
version string of the openQCD (sfqcd) version used to create
the ensemble
L: int
spatial length of the lattice in L/a.
HAS to be set if version != sfqcd, since openQCD does not provide
this in the header
r_start: list
offset of the first ensemble, making it easier to match
later on with other Obs
r_stop: list
last configurations that need to be read (per replicum)
files: list
specify the exact files that need to be read
from path, practical if e.g. only one replicum is needed
names: list
Alternative labeling for replicas/ensembles.
Has to have the appropriate length
"""
known_versions = ["1.0", "1.2", "1.4", "1.6", "2.0", "sfqcd"]
if version not in known_versions:
raise Exception("Unknown openQCD version.")
if "steps" in kwargs:
steps = kwargs.get("steps")
if version == "sfqcd":
if "L" in kwargs:
supposed_L = kwargs.get("L")
else:
if "L" not in kwargs:
raise Exception("This version of openQCD needs you to provide the spatial length of the lattice as parameter 'L'.")
else:
L = kwargs.get("L")
r_start = 1
if "r_start" in kwargs:
r_start = kwargs.get("r_start")
if "r_stop" in kwargs:
r_stop = kwargs.get("r_stop")
if "files" in kwargs:
files = kwargs.get("files")
else:
found = []
files = []
for (dirpath, dirnames, filenames) in os.walk(path + "/"):
# print(filenames)
found.extend(filenames)
break
for f in found:
if fnmatch.fnmatch(f, prefix + "*" + ".ms.dat"):
files.append(f)
print(files)
rep_names = []
deltas = []
idl = []
for rep, file in enumerate(files):
with open(path + "/" + file, "rb") as fp:
t = fp.read(12)
header = struct.unpack('<iii', t)
# step size in integration steps "dnms"
dn = header[0]
# number of measurements, so "ntot"/dn
nn = header[1]
# lattice T/a
tmax = header[2]
if version == "sfqcd":
t = fp.read(12)
Ls = struct.unpack('<iii', t)
if(Ls[0] == Ls[1] and Ls[1] == Ls[2]):
L = Ls[0]
if "L" in kwargs and not (supposed_L == L):
raise Exception("The spatial length 'L' you provided contradicts the one in the file header.")
else:
raise Exception("Found more than one spatial length in header!")
print('dnms:', dn)
print('nn:', nn)
print('tmax:', tmax)
t = fp.read(8)
eps = struct.unpack('d', t)[0]
print('eps:', eps)
Q = []
ncs = []
while True:
t = fp.read(4)
if(len(t) < 4):
break
ncs.append(struct.unpack('i', t)[0])
# Wsl
t = fp.read(8 * tmax * (nn + 1))
# Ysl
t = fp.read(8 * tmax * (nn + 1))
# Qsl, which is asked for in this method
t = fp.read(8 * tmax * (nn + 1))
# unpack the array of Qtops,
# on each timeslice t=0,...,tmax-1 and the
# measurement number in = 0...nn (see README.qcd1)
tmpd = struct.unpack('d' * tmax * (nn + 1), t)
Q.append(tmpd)
if len(set([ncs[i] - ncs[i - 1] for i in range(1, len(ncs))])) > 1:
raise Exception("Irregularities in stepsize found")
else:
if 'steps' in kwargs:
if steps != ncs[1] - ncs[0]:
raise Exception("steps and the found stepsize are not the same")
else:
steps = ncs[1] - ncs[0]
print(len(Q))
print('max_t:', dn * (nn) * eps)
t_aim = (c * L) ** 2 / 8
print('t_aim:', t_aim)
index_aim = round(t_aim / eps / dn)
print('index_aim:', index_aim)
Q_sum = []
for i, item in enumerate(Q):
Q_sum.append([sum(item[current:current + tmax])
for current in range(0, len(item), tmax)])
print(len(Q_sum))
print(len(Q_sum[0]))
Q_round = []
for i in range(len(Q) // dtr_cnfg):
Q_round.append(round(Q_sum[dtr_cnfg * i][index_aim]))
if len(Q_round) != len(ncs) // dtr_cnfg:
raise Exception("qtops and ncs dont have the same length")
truncated_file = file[:-7]
print(truncated_file)
idl_start = 1
if "r_start" in kwargs:
Q_round = Q_round[r_start[rep]:]
idl_start = r_start[rep]
if "r_stop" in kwargs:
Q_round = Q_round[:r_stop[rep]]
idl_stop = idl_start + len(Q_round)
# keyword "names" prevails over "ens_name"
if "names" not in kwargs:
try:
idx = truncated_file.index('r')
except Exception:
if "names" not in kwargs:
raise Exception("Automatic recognition of replicum failed, please enter the key word 'names'.")
if "ens_name" in kwargs:
ens_name = kwargs.get("ens_name")
else:
ens_name = truncated_file[:idx]
rep_names.append(ens_name + '|' + truncated_file[idx:])
else:
names = kwargs.get("names")
rep_names = names
deltas.append(np.array(Q_round))
idl.append(range(idl_start, idl_stop))
result = Obs(deltas, rep_names, idl=idl)
return result
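A hedged call sketch for the new `read_qtop`; all paths and numbers are invented:
```python
# openQCD 1.2 ensemble: smearing radius c = 0.35; for this version the
# spatial length L/a has to be passed by hand
qtop = read_qtop('/data/ensembleA', 'A1', 0.35, dtr_cnfg=2, version='1.2', L=32)
qtop.gamma_method()
print(qtop)
```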
def read_qtop_sector(target=0, **kwargs):
"""target: int
specifies the topological sector to be reweighted to (default 0)
q_top: Obs
alternatively takes args of read_qtop method as kwargs
"""
if "q_top" in kwargs:
qtop = kwargs.get("q_top")
else:
if "path" in kwargs:
path = kwargs.get("path")
del kwargs["path"]
else:
raise Exception("If you are not providing q_top, please provide path")
if "prefix" in kwargs:
prefix = kwargs.get("prefix")
del kwargs["prefix"]
else:
raise Exception("If you are not providing q_top, please provide prefix")
if "c" in kwargs:
c = kwargs.get("c")
del kwargs["c"]
else:
raise Exception("If you are not providing q_top, please provide c")
if "version" in kwargs:
version = kwargs.get("version")
del kwargs["version"]
else:
version = "1.2"
if "dtr_cnfg" in kwargs:
dtr_cnfg = kwargs.get("dtr_cnfg")
del kwargs["dtr_cnfg"]
else:
dtr_cnfg = 1
qtop = read_qtop(path, prefix, c, dtr_cnfg=dtr_cnfg,
version=version, **kwargs)
names = qtop.names
print(names)
print(qtop.deltas.keys())
proj_qtop = []
for n in qtop.deltas:
proj_qtop.append(np.array([1 if int(qtop.value + q) == target else 0 for q in qtop.deltas[n]]))
result = Obs(proj_qtop, qtop.names)
return result
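And the corresponding sector projection, reusing the charge read above; a sketch, not part of this diff:
```python
# reweighting factors for the Q = 0 sector from an existing q_top Obs
rw = read_qtop_sector(target=0, q_top=qtop)
rw.gamma_method()
```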


@@ -6,17 +6,58 @@ import fnmatch
import re
import numpy as np # Thinly-wrapped numpy
from ..obs import Obs
from . import utils
def read_sfcf(path, prefix, name, **kwargs):
"""Read sfcf C format from given folder structure.
def read_sfcf(path, prefix, name, quarks='.*', noffset=0, wf=0, wf2=0,
version="1.0c", **kwargs):
"""Read sfcf c format from given folder structure.
Parameters
----------
im -- if True, read imaginary instead of real part of the correlation function.
single -- if True, read a boundary-to-boundary correlation function with a single value
b2b -- if True, read a time-dependent boundary-to-boundary correlation function
names -- Alternative labeling for replicas/ensembles. Has to have the appropriate length
quarks: str
Label of the quarks used in the sfcf input file, e.g. "quark quark".
For version 0.0 this does NOT need to be given with the typical " - "
that is present in the output file;
this is done automatically for this version.
noffset: int
Offset of the source (only relevant when wavefunctions are used)
wf: int
ID of wave function
wf2: int
ID of the second wavefunction
(only relevant for boundary-to-boundary correlation functions)
im: bool
if True, read imaginary instead of real part
of the correlation function.
b2b: bool
if True, read a time-dependent boundary-to-boundary
correlation function
single: bool
if True, read a time-independent boundary-to-boundary
correlation function
names: list
Alternative labeling for replicas/ensembles.
Has to have the appropriate length
ens_name : str
replaces the name of the ensemble
version: str
version of SFCF, with which the measurement was done.
if the compact output option (-c) was specified,
append a "c" to the version (e.g. "1.0c")
if the append output option (-a) was specified,
append an "a" to the version. Currently supported versions
are "0.0", "1.0", "2.0", "1.0c", "2.0c", "1.0a" and "2.0a".
replica: list
list of replica to be read, default is all
files: list
list of files to be read per replica, default is all.
for non-compact output format, hand the folders to be read here.
check_configs:
list of lists of expected configs, e.g. [range(1, 1000)]
for one replicum with 1000 configs
TODO:
- what's going on with files here?
"""
if kwargs.get('im'):
im = 1
@@ -29,119 +70,43 @@ def read_sfcf(path, prefix, name, **kwargs):
b2b = 1
single = 1
else:
b2b = 0
single = 0
if kwargs.get('b2b'):
b2b = 1
else:
b2b = 0
single = 0
if "replica" in kwargs:
reps = kwargs.get("replica")
compact = True
appended = False
known_versions = ["0.0", "1.0", "2.0", "1.0c", "2.0c", "1.0a", "2.0a"]
if version not in known_versions:
raise Exception("This version is not known!")
if(version[-1] == "c"):
appended = False
compact = True
version = version[:-1]
elif(version[-1] == "a"):
appended = True
compact = False
version = version[:-1]
else:
compact = False
appended = False
read = 0
T = 0
start = 0
ls = []
if "replica" in kwargs:
ls = reps
else:
for (dirpath, dirnames, filenames) in os.walk(path):
if not appended:
ls.extend(dirnames)
break
if not ls:
raise Exception('Error, directory not found')
for exc in ls:
if fnmatch.fnmatch(exc, prefix + '*'):
ls = list(set(ls) - set(exc))
if len(ls) > 1:
ls.sort(key=lambda x: int(re.findall(r'\d+', x[len(prefix):])[0]))
replica = len(ls)
print('Read', part, 'part of', name, 'from', prefix, ',', replica, 'replica')
if 'names' in kwargs:
new_names = kwargs.get('names')
if len(new_names) != replica:
raise Exception('Names does not have the required length', replica)
else:
# Adjust replica names to new bookmarking system
new_names = []
for entry in ls:
idx = entry.index('r')
new_names.append(entry[:idx] + '|' + entry[idx:])
print(replica, 'replica')
for i, item in enumerate(ls):
print(item)
sub_ls = []
for (dirpath, dirnames, filenames) in os.walk(path + '/' + item):
sub_ls.extend(dirnames)
break
for exc in sub_ls:
if fnmatch.fnmatch(exc, 'cfg*'):
sub_ls = list(set(sub_ls) - set(exc))
sub_ls.sort(key=lambda x: int(x[3:]))
no_cfg = len(sub_ls)
print(no_cfg, 'configurations')
if i == 0:
with open(path + '/' + item + '/' + sub_ls[0] + '/' + name) as fp:
for k, line in enumerate(fp):
if read == 1 and not line.strip() and k > start + 1:
break
if read == 1 and k >= start:
T += 1
if '[correlator]' in line:
read = 1
start = k + 7 + b2b
T -= b2b
deltas = []
for j in range(T):
deltas.append([])
sublength = len(sub_ls)
for j in range(T):
deltas[j].append(np.zeros(sublength))
for cnfg, subitem in enumerate(sub_ls):
with open(path + '/' + item + '/' + subitem + '/' + name) as fp:
for k, line in enumerate(fp):
if(k >= start and k < start + T):
floats = list(map(float, line.split()))
deltas[k - start][i][cnfg] = floats[1 + im - single]
result = []
for t in range(T):
result.append(Obs(deltas[t], new_names))
return result
def read_sfcf_c(path, prefix, name, quarks='.*', noffset=0, wf=0, wf2=0, **kwargs):
"""Read sfcf c format from given folder structure.
Parameters
----------
quarks -- Label of the quarks used in the sfcf input file
noffset -- Offset of the source (only relevant when wavefunctions are used)
wf -- ID of wave function
wf2 -- ID of the second wavefunction (only relevant for boundary-to-boundary correlation functions)
im -- if True, read imaginary instead of real part of the correlation function.
b2b -- if True, read a time-dependent boundary-to-boundary correlation function
names -- Alternative labeling for replicas/ensembles. Has to have the appropriate length
ens_name : str
replaces the name of the ensemble
"""
if kwargs.get('im'):
im = 1
part = 'imaginary'
else:
im = 0
part = 'real'
if kwargs.get('b2b'):
b2b = 1
else:
b2b = 0
T = 0
ls = []
for (dirpath, dirnames, filenames) in os.walk(path):
ls.extend(dirnames)
ls.extend(filenames)
break
if not ls:
raise Exception('Error, directory not found')
@@ -150,54 +115,100 @@ def read_sfcf_c(path, prefix, name, quarks='.*', noffset=0, wf=0, wf2=0, **kwarg
if not fnmatch.fnmatch(exc, prefix + '*'):
ls = list(set(ls) - set([exc]))
if len(ls) > 1:
ls.sort(key=lambda x: int(re.findall(r'\d+', x[len(prefix):])[0])) # New version, to cope with ids, etc.
ls.sort(key=lambda x: int(re.findall(r'\d+', x[len(prefix):])[0]))
if not appended:
replica = len(ls)
else:
replica = len([file.split(".")[-1] for file in ls]) // len(set([file.split(".")[-1] for file in ls]))
print('Read', part, 'part of', name, 'from', prefix[:-1],
',', replica, 'replica')
if 'names' in kwargs:
new_names = kwargs.get('names')
if len(new_names) != len(set(new_names)):
raise Exception("names are not unique!")
if len(new_names) != replica:
raise Exception('Names does not have the required length', replica)
else:
# Adjust replica names to new bookmarking system
new_names = []
if not appended:
for entry in ls:
try:
idx = entry.index('r')
except Exception:
raise Exception("Automatic recognition of replicum failed, please enter the key word 'names'.")
if 'ens_name' in kwargs:
new_names.append(kwargs.get('ens_name') + '|' + entry[idx:])
else:
new_names.append(entry[:idx] + '|' + entry[idx:])
else:
print('Read', part, 'part of', name, 'from', prefix[:-1], ',', replica, 'replica')
for exc in ls:
if not fnmatch.fnmatch(exc, prefix + '*.' + name):
ls = list(set(ls) - set([exc]))
ls.sort(key=lambda x: int(re.findall(r'\d+', x)[-1]))
for entry in ls:
myentry = entry[:-len(name) - 1]
try:
idx = myentry.index('r')
except Exception:
raise Exception("Automatic recognition of replicum failed, please enter the key word 'names'.")
if 'ens_name' in kwargs:
new_names.append(kwargs.get('ens_name') + '|' + myentry[idx:])
else:
new_names.append(myentry[:idx] + '|' + myentry[idx:])
idl = []
if not appended:
for i, item in enumerate(ls):
sub_ls = []
if "files" in kwargs:
sub_ls = kwargs.get("files")
sub_ls.sort(key=lambda x: int(re.findall(r'\d+', x)[-1]))
else:
for (dirpath, dirnames, filenames) in os.walk(path + '/' + item):
if compact:
sub_ls.extend(filenames)
else:
sub_ls.extend(dirnames)
break
for exc in sub_ls:
if compact:
if not fnmatch.fnmatch(exc, prefix + '*'):
sub_ls = list(set(sub_ls) - set([exc]))
sub_ls.sort(key=lambda x: int(re.findall(r'\d+', x)[-1]))
first_cfg = int(re.findall(r'\d+', sub_ls[0])[-1])
last_cfg = len(sub_ls) + first_cfg - 1
for cfg in range(1, len(sub_ls)):
if int(re.findall(r'\d+', sub_ls[cfg])[-1]) != first_cfg + cfg:
last_cfg = cfg + first_cfg - 1
break
no_cfg = last_cfg - first_cfg + 1
print(item, ':', no_cfg, 'evenly spaced configurations (', first_cfg, '-', last_cfg, ') ,', len(sub_ls) - no_cfg, 'configs omitted\n')
sub_ls.sort(key=lambda x:
int(re.findall(r'\d+', x)[-1]))
else:
if not fnmatch.fnmatch(exc, 'cfg*'):
sub_ls = list(set(sub_ls) - set([exc]))
sub_ls.sort(key=lambda x: int(x[3:]))
rep_idl = []
no_cfg = len(sub_ls)
for cfg in sub_ls:
try:
if compact:
rep_idl.append(int(cfg.split("n")[-1]))
else:
rep_idl.append(int(cfg[3:]))
except Exception:
raise Exception("Couldn't parse idl from directroy, problem with file " + cfg)
rep_idl.sort()
print(item, ':', no_cfg, ' configurations')
idl.append(rep_idl)
if i == 0:
if compact:
pattern = 'name ' + name + '\nquarks ' + quarks + '\noffset ' + str(noffset) + '\nwf ' + str(wf)
if b2b:
pattern += '\nwf_2 ' + str(wf2)
with open(path + '/' + item + '/' + sub_ls[0], 'r') as file:
content = file.read()
match = re.search(pattern, content)
if match:
# the start and end point of the correlator
# in question is extracted for later use in
# the other files
start_read = content.count('\n', 0, match.start()) + 5 + b2b
end_match = re.search(r'\n\s*\n', content[match.start():])
T = content[match.start():].count('\n', 0, end_match.start()) - 4 - b2b
@@ -205,90 +216,136 @@ def read_sfcf_c(path, prefix, name, quarks='.*', noffset=0, wf=0, wf2=0, **kwarg
print(T, 'entries, starting to read in line', start_read)
else:
raise Exception('Correlator with pattern\n' + pattern + '\nnot found.')
else:
# this part does the same as above,
# but for non-compactified versions of the files
with open(path + '/' + item + '/' + sub_ls[0] + '/' + name) as fp:
for k, line in enumerate(fp):
if version == "0.0":
# check if this is really the right file
# by matching pattern similar to above
pattern = "# " + name + " : offset " + str(noffset) + ", wf " + str(wf)
# if b2b, a second wf is needed
if b2b:
pattern += ", wf_2 " + str(wf2)
qs = quarks.split(" ")
pattern += " : " + qs[0] + " - " + qs[1]
if read == 1 and not line.strip() and k > start + 1:
break
if read == 1 and k >= start:
T += 1
if version == "0.0":
if pattern in line:
# print(line)
read = 1
start = k + 1
else:
if '[correlator]' in line:
read = 1
start = k + 7 + b2b
T -= b2b
print(str(T) + " entries found.")
deltas = []
for j in range(T):
deltas.append([])
sublength = no_cfg
for j in range(T):
deltas[j].append(np.zeros(sublength))
for t in range(T):
deltas[t].append(np.zeros(no_cfg))
# we iterate through all measurement files in the path given...
if compact:
for cfg in range(no_cfg):
with open(path + '/' + item + '/' + sub_ls[cfg]) as fp:
for k, line in enumerate(fp):
lines = fp.readlines()
if(start_read + T > len(lines)):
raise Exception("EOF before end of correlator data! Maybe " + path + '/' + item + '/' + sub_ls[cfg] + " is corrupted?")
for k in range(start_read - 6, start_read + T):
if k == start_read - 5 - b2b:
if line.strip() != 'name ' + name:
raise Exception('Wrong format', sub_ls[cfg])
if lines[k].strip() != 'name ' + name:
raise Exception('Wrong format',
sub_ls[cfg])
if(k >= start_read and k < start_read + T):
floats = list(map(float, line.split()))
floats = list(map(float, lines[k].split()))
deltas[k - start_read][i][cfg] = floats[-2:][im]
else:
for cnfg, subitem in enumerate(sub_ls):
with open(path + '/' + item + '/' + subitem + '/' + name) as fp:
for k, line in enumerate(fp):
if(k >= start and k < start + T):
floats = list(map(float, line.split()))
if version == "0.0":
deltas[k - start][i][cnfg] = floats[im]
else:
deltas[k - start][i][cnfg] = floats[1 + im - single]
else:
if "files" in kwargs:
ls = kwargs.get("files")
else:
for exc in ls:
if not fnmatch.fnmatch(exc, prefix + '*.' + name):
ls = list(set(ls) - set([exc]))
ls.sort(key=lambda x: int(re.findall(r'\d+', x)[-1]))
pattern = 'name ' + name + '\nquarks ' + quarks + '\noffset ' + str(noffset) + '\nwf ' + str(wf)
if b2b:
pattern += '\nwf_2 ' + str(wf2)
for rep, file in enumerate(ls):
rep_idl = []
with open(path + '/' + file, 'r') as fp:
content = fp.readlines()
data_starts = []
for linenumber, line in enumerate(content):
if "[run]" in line:
data_starts.append(linenumber)
if len(set([data_starts[i] - data_starts[i - 1] for i in
range(1, len(data_starts))])) > 1:
raise Exception("Irregularities in file structure found, not all runs have the same output length")
chunk = content[:data_starts[1]]
for linenumber, line in enumerate(chunk):
if line.startswith("gauge_name"):
gauge_line = linenumber
elif line.startswith("[correlator]"):
corr_line = linenumber
found_pat = ""
for li in chunk[corr_line + 1:corr_line + 6 + b2b]:
found_pat += li
if re.search(pattern, found_pat):
start_read = corr_line + 7 + b2b
T = len(chunk) - 1 - start_read
if rep == 0:
deltas = []
for t in range(T):
deltas.append([])
for t in range(T):
deltas[t].append(np.zeros(len(data_starts)))
for cnfg in range(len(data_starts)):
start = data_starts[cnfg]
stop = start + data_starts[1]
chunk = content[start:stop]
try:
rep_idl.append(int(chunk[gauge_line].split("n")[-1]))
except Exception:
raise Exception("Couldn't parse idl from directroy, problem with chunk around line " + gauge_line)
found_pat = ""
for li in chunk[corr_line + 1:corr_line + 6 + b2b]:
found_pat += li
if re.search(pattern, found_pat):
for t, line in enumerate(chunk[start_read:start_read + T]):
floats = list(map(float, line.split()))
deltas[t][rep][cnfg] = floats[-2:][im]
idl.append(rep_idl)
if "check_configs" in kwargs:
print("Checking for missing configs...")
che = kwargs.get("check_configs")
if not (len(che) == len(idl)):
raise Exception("check_configs has to be the same length as replica!")
for r in range(len(idl)):
print("checking " + new_names[r])
utils.check_idl(idl[r], che[r])
print("Done")
result = []
for t in range(T):
result.append(Obs(deltas[t], new_names))
return result
def read_qtop(path, prefix, **kwargs):
"""Read qtop format from given folder structure.
Parameters
----------
target -- specifies the topological sector to be reweighted to (default 0)
full -- if true read the charge instead of the reweighting factor.
"""
if 'target' in kwargs:
target = kwargs.get('target')
else:
target = 0
if kwargs.get('full'):
full = 1
else:
full = 0
ls = []
for (dirpath, dirnames, filenames) in os.walk(path):
ls.extend(filenames)
break
if not ls:
raise Exception('Error, directory not found')
# Exclude files with different names
for exc in ls:
if not fnmatch.fnmatch(exc, prefix + '*'):
ls = list(set(ls) - set([exc]))
if len(ls) > 1:
ls.sort(key=lambda x: int(re.findall(r'\d+', x[len(prefix):])[0])) # New version, to cope with ids, etc.
replica = len(ls)
print('Read Q_top from', prefix[:-1], ',', replica, 'replica')
deltas = []
for rep in range(replica):
tmp = []
with open(path + '/' + ls[rep]) as fp:
for k, line in enumerate(fp):
floats = list(map(float, line.split()))
if full == 1:
tmp.append(floats[1])
else:
if int(floats[1]) == target:
tmp.append(1.0)
else:
tmp.append(0.0)
deltas.append(np.array(tmp))
rep_names = []
for entry in ls:
truncated_entry = entry.split('.')[0]
idx = truncated_entry.index('r')
rep_names.append(truncated_entry[:idx] + '|' + truncated_entry[idx:])
result = Obs(deltas, rep_names)
result.append(Obs(deltas[t], new_names, idl=idl))
return result
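A hedged call sketch for the unified `read_sfcf`; folder layout, labels and the config range are assumptions:
```python
# compact output of SFCF 1.0 (version string '1.0c'), correlator f_P,
# one replicum for which configs 1..1000 are expected
f_P = read_sfcf('/data/ensembleA', 'A1', 'f_P', quarks='lquark lquark',
                version='1.0c', check_configs=[range(1, 1001)])
for obs in f_P:
    obs.gamma_method()
```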

pyerrors/input/utils.py Normal file

@@ -0,0 +1,24 @@
"""Utilities for the input"""
def check_idl(idl, che):
"""Checks if list of configurations is contained in an idl
Parameters
----------
idl : range or list
idl of the current replicum
che : list
list of configurations to be checked against
"""
missing = []
for c in che:
if c not in idl:
missing.append(c)
# print missing configurations as a comma-separated list that can be pasted directly into a slurm terminal
if not (len(missing) == 0):
print(len(missing), "configs missing")
miss_str = str(missing[0])
for i in missing[1:]:
miss_str += "," + str(i)
print(miss_str)
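A tiny sketch of the helper with invented numbers:
```python
# replicum actually holds configs 1..500, but 1..505 were expected
check_idl(range(1, 501), list(range(1, 506)))
# prints: 5 configs missing
#         501,502,503,504,505
```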


@@ -4,6 +4,7 @@ import numpy as np
import autograd.numpy as anp # Thinly-wrapped numpy
from autograd import jacobian
import matplotlib.pyplot as plt
from scipy.stats import skew, skewtest, kurtosis, kurtosistest
import numdifftools as nd
from itertools import groupby
from .covobs import Covobs
@@ -564,7 +565,7 @@ class Obs:
y = np.concatenate(tmp, axis=0)
plt.errorbar(x, y, fmt='.', markersize=3)
plt.xlim(-0.5, e_N - 0.5)
plt.title(e_name)
plt.title(e_name + f', skew: {skew(y):.3f} (p={skewtest(y).pvalue:.3f}), kurtosis: {kurtosis(y):.3f} (p={kurtosistest(y).pvalue:.3f})')
plt.draw()
def plot_piechart(self):
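A sketch of how the enriched title shows up; reading the surrounding code this looks like `Obs.plot_history`, but that method name is an inference, not stated in the hunk:
```python
import pyerrors as pe

obs = pe.pseudo_Obs(1.0, 0.1, 'ens', samples=2000)
obs.gamma_method()
obs.plot_history()  # title now carries skewness and kurtosis with test p-values
```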


@@ -309,6 +309,25 @@ def test_error_band():
pe.fits.error_band(x, f, fitp.fit_parameters)
def test_ks_test():
def f(a, x):
y = a[0] + a[1] * x
return y
fit_res = []
for i in range(20):
data = []
for j in range(10):
data.append(pe.pseudo_Obs(j + np.random.normal(0.0, 0.25), 0.25, 'test'))
my_corr = pe.Corr(data)
fit_res.append(my_corr.fit(f, silent=True))
pe.fits.ks_test()
pe.fits.ks_test(fit_res)
def fit_general(x, y, func, silent=False, **kwargs):
"""Performs a non-linear fit to y = func(x) and returns a list of Obs corresponding to the fit parameters.