Basic pyerrors example
Import pyerrors, as well as numpy and matplotlib. The sys statements are not necessary if pyerrors was installed via pip.
import sys
sys.path.append('..')
import numpy as np
import matplotlib.pyplot as plt
import pyerrors as pe
We use numpy to generate some fake data:
test_sample1 = np.random.normal(2.0, 0.5, 1000)  # 1000 samples, mean 2.0, standard deviation 0.5
test_sample2 = np.random.normal(1.0, 0.1, 1000)  # 1000 samples, mean 1.0, standard deviation 0.1
From this we can construct Obs, which are the basic objects of pyerrors. For each sample we pass to an Obs, we also have to specify an ensemble/replica name. In this example we assume that both datasets originate from the same gauge field ensemble labeled 'ens1'.
obs1 = pe.Obs([test_sample1], ['ens1'])
obs2 = pe.Obs([test_sample2], ['ens1'])
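An Obs can also combine data from several replica of the same ensemble; pyerrors separates ensemble and replicum name with a '|'. A minimal sketch (the ensemble 'ens2' and its two replica are made up for illustration and not used further below):
rep1 = np.random.normal(2.0, 0.5, 500)  # fictitious replicum 1
rep2 = np.random.normal(2.0, 0.5, 500)  # fictitious replicum 2
obs_rep = pe.Obs([rep1, rep2], ['ens2|r01', 'ens2|r02'])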
We can now combine these two observables into a third one:
obs3 = np.log(obs1 ** 2 / obs2 ** 4)
pyerrors overloads all basic math operations, so the user can work with these Obs as if they were real numbers. The proper error propagation is performed in the background via automatic differentiation.
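For functions that are not covered by simple operator overloading, the same propagation can be done explicitly with pe.derived_observable, which applies a user-defined function to a list of Obs. A sketch of the call equivalent to the line above, using autograd's numpy so the derivative can be traced (hedged: the exact signature may differ between pyerrors versions):
import autograd.numpy as anp  # thinly wrapped numpy, differentiable by autograd

obs3_alt = pe.derived_observable(lambda x, **kwargs: anp.log(x[0] ** 2 / x[1] ** 4),
                                 [obs1, obs2])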
If we are now interested in the error of obs3, we can use the gamma_method to compute it and then print the object to the notebook:
obs3.gamma_method()
print(obs3)
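As a quick cross-check (added here for illustration): since the fake data is uncorrelated and Gaussian, the error of $f = \log(x^2/y^4) = 2\log x - 4\log y$ should agree with naive Gaussian error propagation in terms of the errors of the two means:
x_mean, y_mean = test_sample1.mean(), test_sample2.mean()
x_err = test_sample1.std(ddof=1) / np.sqrt(len(test_sample1))  # error of the mean
y_err = test_sample2.std(ddof=1) / np.sqrt(len(test_sample2))
naive_err = np.sqrt((2 * x_err / x_mean) ** 2 + (4 * y_err / y_mean) ** 2)
print(naive_err)  # should be close to the error reported for obs3 above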
With print level 1 we can take a look at the integrated autocorrelation time estimated by the automatic windowing procedure.
obs3.print(1)
As expected, the random data from numpy exhibits no autocorrelation: $\tau_\text{int} \approx 0.5$, consistent with the definition $\tau_\text{int}(W) = \frac{1}{2} + \sum_{t=1}^{W} \rho(t)$, since the normalized autocorrelation function $\rho(t) = \Gamma(t)/\Gamma(0)$ vanishes for $t \geq 1$. It can still be interesting to have a look at the window size dependence of the integrated autocorrelation time:
obs3.plot_tauint()
This figure shows the window size dependence of the integrated autocorrelation time. The red vertical line marks the window chosen by the automatic windowing procedure with $S=2.0$. Choosing a larger window size would not significantly alter $\tau_\text{int}$, so everything seems to be consistent here.
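For reference, the automatic windowing procedure of hep-lat/0306017 (which pyerrors is assumed to follow) picks the first window $W$ at which $g(W) = e^{-W/\tau_W} - \tau_W/\sqrt{W N}$ becomes negative, with $\tau_W = S / \log\big((2\tau_\text{int}(W) + 1)/(2\tau_\text{int}(W) - 1)\big)$ and $N$ the number of measurements; a larger $S$ therefore results in a more conservative (larger) window.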
Correlated data
We can now generate fake data with a given covariance matrix and given integrated autocorrelation times:
cov = np.array([[0.5, -0.2], [-0.2, 0.3]]) # Covariance matrix
tau = [4, 8] # Autocorrelation times
c_obs1, c_obs2 = pe.misc.gen_correlated_data([2.8, 2.1], cov, 'ens1', tau)
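As a quick sanity check (not part of the original example), the gamma method applied to the generated Obs should roughly reproduce the input autocorrelation times; the estimate is stored per ensemble in the e_tauint attribute after calling gamma_method:
for c_obs, tau_in in zip([c_obs1, c_obs2], tau):
    c_obs.gamma_method()
    print(c_obs.e_tauint['ens1'], 'vs. input', tau_in)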
We can once again combine the two Obs into a new one with arbitrary math operations:
c_obs3 = np.sin(c_obs1 / c_obs2 - 1)
c_obs3.gamma_method()
c_obs3.print()
This time we see significant autocorrelation, so it is worth having a closer look at the normalized autocorrelation function (rho) and the integrated autocorrelation time:
c_obs3.plot_rho()
c_obs3.plot_tauint()
We can now redo the error analysis and alter the value of $S$, or attach a tail to the autocorrelation function to take long-range autocorrelations into account.
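Roughly, following the prescription of arXiv:1009.5228 that pyerrors is assumed to implement for tau_exp > 0, the tail adds a term to the truncated sum, $\tau_\text{int} \approx \frac{1}{2} + \sum_{t=1}^{W} \rho(t) + \tau_\text{exp}\,|\rho(W+1)|$, accounting for slow modes that are not resolved within the summation window: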
c_obs3.gamma_method(tau_exp=20)
c_obs3.print()
c_obs3.plot_tauint()
Jackknife
For comparison and as a cross-check, we can perform a jackknife binning analysis and compare the results for different bin sizes with the result from the gamma method. Besides being more robust, the gamma method can also be shown to have a systematic error of the error that decreases faster with $N$ than in the binning approach (see hep-lat/0306017).
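To make explicit what the binning does, here is a minimal self-contained numpy sketch of a binned jackknife error of a plain mean (the helper name is made up; the jackknifing module used below handles this, including derived quantities):
def binned_jackknife_error(sample, binsize):
    # Average the data into non-overlapping bins to absorb autocorrelations.
    n_bins = len(sample) // binsize
    bins = sample[:n_bins * binsize].reshape(n_bins, binsize).mean(axis=1)
    # Leave-one-out averages and the standard jackknife variance estimate.
    jack_means = (bins.sum() - bins) / (n_bins - 1)
    return np.sqrt((n_bins - 1) * np.mean((jack_means - jack_means.mean()) ** 2))

print(binned_jackknife_error(test_sample1, 10))  # error of the mean of test_sample1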
import pyerrors.jackknifing as jn
jack1 = jn.generate_jack(c_obs1, max_binsize=120)
jack2 = jn.generate_jack(c_obs2, max_binsize=120)
jack3 = jn.derived_jack(lambda x: np.sin(x[0] / x[1] - 1), [jack1, jack2])
print('Binning analysis:')
jack3.print(binsize=25)
jack3.print(binsize=50)
jack3.print(binsize=100)
jack3.plot_tauint()
print('Result from the automatic windowing procedure for comparison:')
c_obs3.gamma_method(S=1.5)
c_obs3.print()
c_obs3.gamma_method(S=2)
c_obs3.print()
c_obs3.gamma_method(S=3)
c_obs3.print()
c_obs3.gamma_method(S=2)
c_obs3.plot_tauint()
For this specific example the binned jackknife procedure seems to underestimate the final error; the deduced integrated autocorrelation time depends strongly on the chosen bin size. The automatic windowing procedure, displayed for comparison, gives more robust results in this case.