pyerrors/examples/07_data_management.ipynb
2022-09-29 17:11:24 +01:00


Data management

In [1]:
import numpy as np
import pandas as pd
import pyerrors as pe

For the data management example, we reuse the data from the correlator example.

In [2]:
correlator_data = pe.input.json.load_json("./data/correlator_test")
my_correlator = pe.Corr(correlator_data)
my_correlator.gamma_method()
Data has been written using pyerrors 2.0.0.
Format version 0.1
Written by fjosw on 2022-01-06 11:11:19 +0100 on host XPS139305, Linux-5.11.0-44-generic-x86_64-with-glibc2.29

Description:  Test data for the correlator example
In [3]:
import autograd.numpy as anp
def func_exp(a, x):
    return a[1] * anp.exp(-a[0] * x)

In this example we perform uncorrelated fits of a single exponential function to the correlator, varying the fit range. Each fit result can be conveniently stored in a pandas DataFrame together with the corresponding metadata.

In [4]:
rows = []
for t_start in range(12, 17):
    for t_stop in range(30, 32):
        fr = my_correlator.fit(func_exp, [t_start, t_stop], silent=True)
        fr.gamma_method()
        row = {"t_start": t_start,
               "t_stop": t_stop,
               "datapoints": t_stop - t_start + 1,
               "chisquare_by_dof": fr.chisquare_by_dof,
               "mass": fr[0]}
        rows.append(row)
my_df = pd.DataFrame(rows)
In [5]:
my_df
Out[5]:
t_start t_stop datapoints chisquare_by_dof mass
0 12 30 19 0.057872 0.2218(12)
1 12 31 20 0.063951 0.2221(11)
2 13 30 18 0.051577 0.2215(12)
3 13 31 19 0.060901 0.2219(11)
4 14 30 17 0.052349 0.2213(13)
5 14 31 18 0.063640 0.2218(13)
6 15 30 16 0.056088 0.2213(16)
7 15 31 17 0.067552 0.2218(17)
8 16 30 15 0.059969 0.2214(21)
9 16 31 16 0.070874 0.2220(20)
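Once the results are in a DataFrame, standard pandas operations apply to the metadata columns. A minimal sketch of selecting the fit with the smallest chisquare per degree of freedom, using a toy table in which plain floats (hypothetical values) stand in for the pyerrors Obs objects in the mass column:

```python
import pandas as pd

# Toy stand-in for the fit-result table above; plain floats replace the
# pyerrors Obs objects in the mass column (hypothetical values).
my_df = pd.DataFrame({
    "t_start": [12, 12, 13],
    "t_stop": [30, 31, 30],
    "chisquare_by_dof": [0.057872, 0.063951, 0.051577],
    "mass": [0.2218, 0.2221, 0.2215],
})

# Row with the smallest chisquare per degree of freedom:
best = my_df.loc[my_df["chisquare_by_dof"].idxmin()]
print(best["t_start"], best["t_stop"])  # → 13 30
```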

The content of this pandas DataFrame can be inserted into a relational database, making use of the JSON serialization of pyerrors objects. In this example we use an SQLite database.

In [6]:
pe.input.pandas.to_sql(my_df, "mass_table", "my_db.sqlite", if_exists='fail')
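The if_exists argument follows the pandas DataFrame.to_sql convention. A minimal sketch of its behavior, using pandas' own to_sql on an in-memory SQLite database with toy columns (plain floats, not the pyerrors JSON serialization):

```python
import sqlite3
import pandas as pd

# Hypothetical miniature of the fit-result table; plain floats stand in
# for the serialized pyerrors objects.
df = pd.DataFrame({"t_start": [12, 13], "mass": [0.2218, 0.2215]})

con = sqlite3.connect(":memory:")  # stands in for my_db.sqlite
df.to_sql("mass_table", con, index=False, if_exists="fail")

# A second write with if_exists='fail' refuses to overwrite the table:
try:
    df.to_sql("mass_table", con, index=False, if_exists="fail")
except ValueError as e:
    print("refused:", e)

# if_exists='append' would add the rows to the existing table instead,
# and if_exists='replace' would drop and recreate it.
```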

At a later stage of the analysis, the content of the database can be reconstructed into a DataFrame via SQL queries. In this example we extract t_start, t_stop and the fitted mass for all fits that start at t_start > 14.

In [7]:
new_df = pe.input.pandas.read_sql("SELECT t_start, t_stop, mass FROM mass_table WHERE t_start > 14",
                                  "my_db.sqlite",
                                  auto_gamma=True)
In [8]:
new_df
Out[8]:
t_start t_stop mass
0 15 30 0.2213(16)
1 15 31 0.2218(17)
2 16 30 0.2214(21)
3 16 31 0.2220(20)

The storage of intermediate analysis results in relational databases allows for a convenient and scalable way of splitting a detailed analysis into multiple independent steps.
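Since the database is a plain SQLite file, every step of such a pipeline can also query it with any SQL client, independently of pandas. A toy sketch with the standard-library sqlite3 module, where an in-memory database stands in for my_db.sqlite and a hypothetical JSON string stands in for the serialized mass column:

```python
import sqlite3

# In-memory database standing in for my_db.sqlite; the mass column holds
# a serialized JSON string (hypothetical payload, not the actual
# pyerrors format).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mass_table (t_start INTEGER, t_stop INTEGER, mass TEXT)")
con.executemany("INSERT INTO mass_table VALUES (?, ?, ?)",
                [(14, 30, '{"obs": "..."}'),
                 (15, 30, '{"obs": "..."}'),
                 (16, 31, '{"obs": "..."}')])

# The same filter as in the pandas example above, via plain SQL:
rows = con.execute("SELECT t_start, t_stop FROM mass_table WHERE t_start > 14").fetchall()
print(rows)  # → [(15, 30), (16, 31)]
```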
