
Basic pyerrors example

Import pyerrors, as well as autograd-wrapped numpy and matplotlib. The sys statements are not necessary if pyerrors was installed via pip.

In [1]:
import sys
sys.path.append('..')
import numpy as np
import matplotlib.pyplot as plt
import pyerrors as pe

We use numpy to generate some fake data:

In [2]:
test_sample1 = np.random.normal(2.0, 0.5, 1000)
test_sample2 = np.random.normal(1.0, 0.1, 1000)

From this we can construct Obs, the basic objects of pyerrors. For each sample we pass to an Obs, we also have to specify an ensemble/replica name. In this example we assume that both datasets originate from the same gauge field ensemble labeled 'ens1'.

In [3]:
obs1 = pe.Obs([test_sample1], ['ens1'])
obs2 = pe.Obs([test_sample2], ['ens1'])

We can now combine these two observables into a third one:

In [4]:
obs3 = np.log(obs1 ** 2 / obs2 ** 4)

pyerrors overloads all basic math operations, so the user can work with these Obs as if they were real numbers. The proper error propagation is performed in the background via automatic differentiation.
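A brief illustration of the overloading (a sketch; obs4 is just a hypothetical name, and the error estimation via gamma_method is explained right below):

obs4 = (obs1 + obs2) * np.exp(-obs2)  # arithmetic and numpy functions act directly on Obs
obs4.gamma_method()
print(obs4)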

If we are now interested in the error of obs3, we can use the gamma_method to compute it and then print the object to the notebook:

In [5]:
obs3.gamma_method()
print(obs3)
Obs[1.415(20)]
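Once gamma_method has been called, the central value and the estimated error are also accessible programmatically; a minimal sketch, assuming the Obs attributes value and dvalue:

print(obs3.value)   # central value
print(obs3.dvalue)  # error estimate, filled in by gamma_method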

With print level 1 we can take a look at the integrated autocorrelation time estimated by the automatic windowing procedure.

In [6]:
obs3.print(1)
Result	 1.41522010e+00 +/- 2.03946273e-02 +/- 1.01973136e-03 (1.441%)
  t_int	 5.07378446e-01 +/- 4.51400871e-02 S = 2.00

As expected, the random data from numpy exhibits no autocorrelation ($\tau_\text{int} \approx 0.5$). It can still be interesting to have a look at the window size dependence of the integrated autocorrelation time:

In [7]:
obs3.plot_tauint()

This figure shows the window size dependence of the integrated autocorrelation time. The red vertical line marks the window chosen by the automatic windowing procedure with $S=2.0$. Choosing a larger window size would not significantly alter $\tau_\text{int}$, so everything seems to be in order here.
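The parameter $S$ of the automatic windowing procedure can be passed directly to gamma_method, as is also done in the jackknife comparison below; a brief sketch:

obs3.gamma_method(S=3.0)  # a larger S selects a more conservative window
obs3.print(1)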

Correlated data

We can now generate fake data with given covariance matrix and integrated autocorrelation times:

In [8]:
cov = np.array([[0.5, -0.2], [-0.2, 0.3]]) # Covariance matrix
tau = [4, 8] # Autocorrelation times
c_obs1, c_obs2 = pe.misc.gen_correlated_data([2.8, 2.1], cov, 'ens1', tau)
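Before combining them, it can be instructive to check that the generated observables roughly reproduce the input covariance; a minimal sketch, under the assumption that the diagonal of cov specifies the variances of the two observables:

c_obs1.gamma_method()
c_obs2.gamma_method()
print(c_obs1)  # error should be close to sqrt(0.5) if the assumption holds
print(c_obs2)  # error should be close to sqrt(0.3) if the assumption holds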

We can once again combine the two Obs into a new one with arbitrary math operations:

In [9]:
c_obs3 = np.sin(c_obs1 / c_obs2 - 1)
c_obs3.gamma_method()
c_obs3.print()
Result	 3.27194697e-01 +/- 1.96872835e+00 +/- 3.38140198e-01 (601.699%)
  t_int	 5.41336983e+00 +/- 1.59801329e+00 S = 2.00

This time we see significant autocorrelation, so it is worth having a closer look at the normalized autocorrelation function (rho) and the integrated autocorrelation time:

In [10]:
c_obs3.plot_rho()
c_obs3.plot_tauint()

We can now redo the error analysis and alter the value of $S$, or attach an exponential tail to the autocorrelation function to take long-range autocorrelations into account:

In [11]:
c_obs3.gamma_method(tau_exp=20)
c_obs3.print()
c_obs3.plot_tauint()
Result	 3.27194697e-01 +/- 2.14280114e+00 +/- 2.48970994e-01 (654.901%)
  t_int	 6.41297945e+00 +/- 2.18167829e+00 tau_exp = 20.00,  N_sigma = 1
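The point at which the exponential tail is attached is controlled by the $N_\sigma$ criterion shown in the output above; a brief sketch, assuming gamma_method accepts it as the keyword N_sigma:

c_obs3.gamma_method(tau_exp=20, N_sigma=2)  # attach the tail where rho deviates by 2 sigma
c_obs3.print()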

Jackknife

For comparison and as a cross-check, we can perform a jackknife binning analysis and compare the results for different bin sizes with the result from the gamma method. Besides being more robust, the gamma method can also be shown to have a systematic error of the error that decreases faster with $N$ than in the binning approach (see hep-lat/0306017).

In [12]:
import pyerrors.jackknifing as jn
jack1 = jn.generate_jack(c_obs1, max_binsize=120)
jack2 = jn.generate_jack(c_obs2, max_binsize=120)
jack3 = jn.derived_jack(lambda x: np.sin(x[0] / x[1] - 1), [jack1, jack2])

print('Binning analysis:')
jack3.print(binsize=25)
jack3.print(binsize=50)
jack3.print(binsize=100)

jack3.plot_tauint()

print('Result from the automatic windowing procedure for comparison:')
c_obs3.gamma_method(S=1.5)
c_obs3.print()
c_obs3.gamma_method(S=2)
c_obs3.print()
c_obs3.gamma_method(S=3)
c_obs3.print()

c_obs3.gamma_method(S=2)
c_obs3.plot_tauint()
Binning analysis:
Result:	 3.27194697e-01 +/- 1.81819841e+00 +/- 3.98347312e-01 (555.693%)
Result:	 3.27194697e-01 +/- 1.66475180e+00 +/- 5.21149746e-01 (508.795%)
Result:	 3.27194697e-01 +/- 1.41273466e+00 +/- 6.28627238e-01 (431.772%)
Result from the automatic windowing procedure for comparison:
Result	 3.27194697e-01 +/- 2.02097394e+00 +/- 3.22723658e-01 (617.667%)
  t_int	 5.70449936e+00 +/- 1.53928442e+00 S = 1.50
Result	 3.27194697e-01 +/- 1.96872835e+00 +/- 3.38140198e-01 (601.699%)
  t_int	 5.41336983e+00 +/- 1.59801329e+00 S = 2.00
Result	 3.27194697e-01 +/- 1.89700786e+00 +/- 3.67353992e-01 (579.780%)
  t_int	 5.02613753e+00 +/- 1.69573607e+00 S = 3.00

For this specific example the binned jackknife procedure seems to underestimate the final error; the deduced integrated autocorrelation time depends strongly on the chosen bin size. The automatic windowing procedure, displayed for comparison, gives more robust results for this example.
