Data management
import numpy as np
import pandas as pd
import pyerrors as pe
For the data management example we reuse the data from the correlator example.
correlator_data = pe.input.json.load_json("./data/correlator_test")
my_correlator = pe.Corr(correlator_data)
my_correlator.gamma_method()
As fit function we define a single exponential, using the autograd-wrapped numpy module so that the derivatives needed for error propagation can be computed automatically.
import autograd.numpy as anp

def func_exp(a, x):
    return a[1] * anp.exp(-a[0] * x)
In this example we perform uncorrelated fits of a single exponential function to the correlator and vary the range of the fit. The fit results, together with the corresponding metadata, can be conveniently collected in a pandas DataFrame.
rows = []
for t_start in range(12, 17):
    for t_stop in range(30, 32):
        fr = my_correlator.fit(func_exp, [t_start, t_stop], silent=True)
        fr.gamma_method()
        row = {"t_start": t_start,
               "t_stop": t_stop,
               "datapoints": t_stop - t_start + 1,
               "chisquare_by_dof": fr.chisquare_by_dof,
               "mass": fr[0]}
        rows.append(row)
my_df = pd.DataFrame(rows)
my_df
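Since the metadata and the fit results live in ordinary DataFrame columns, standard pandas operations apply directly. As a small sketch (using only the columns defined above), the table can be sorted by the reduced chi square to single out the most plausible fit ranges:
# Sketch: rank the fit windows by their reduced chi square
ranked = my_df.sort_values("chisquare_by_dof")
print(ranked[["t_start", "t_stop", "chisquare_by_dof"]].head())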
The content of this pandas DataFrame can be inserted into a relational database, making use of the JSON serialization of pyerrors objects. In this example we use an SQLite database.
pe.input.pandas.to_sql(my_df, "mass_table", "my_db.sqlite", if_exists='fail')
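The if_exists argument follows the pandas convention ('fail', 'replace' or 'append'); 'fail' protects an existing table from being accidentally overwritten. As a minimal sketch, assuming the table and column layout created above, a later run could append one further fit range to the same table:
# Sketch: append an additional fit range to the existing table
fr = my_correlator.fit(func_exp, [11, 31], silent=True)
fr.gamma_method()
extra_df = pd.DataFrame([{"t_start": 11,
                          "t_stop": 31,
                          "datapoints": 31 - 11 + 1,
                          "chisquare_by_dof": fr.chisquare_by_dof,
                          "mass": fr[0]}])
pe.input.pandas.to_sql(extra_df, "mass_table", "my_db.sqlite", if_exists='append')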
At a later stage of the analysis the content of the database can be reconstructed into a DataFrame via SQL queries. In this example we extract t_start, t_stop and the fitted mass for all fits which start at times larger than 14.
new_df = pe.input.pandas.read_sql("SELECT t_start, t_stop, mass FROM mass_table WHERE t_start > 14",
                                  "my_db.sqlite",
                                  auto_gamma=True)
new_df
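With auto_gamma=True the gamma_method is applied to every deserialized observable on the fly, so the errors of the reconstructed masses are immediately available. A brief sketch of continuing the analysis from the queried DataFrame:
# Sketch: the mass column again holds pyerrors observables, errors included
for _, entry in new_df.iterrows():
    print(f"t_start={entry['t_start']}, t_stop={entry['t_stop']}: mass = {entry['mass']}")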
The storage of intermediate analysis results in relational databases provides a convenient and scalable way of splitting up a detailed analysis into multiple independent steps.
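For instance, a follow-up step that selects a single fit result needs nothing but the database file. The sketch below, with an ad-hoc criterion (the reduced chi square closest to one), could live in a completely separate script:
# Sketch: independent analysis step that only depends on the SQLite file
best = pe.input.pandas.read_sql(
    "SELECT * FROM mass_table ORDER BY ABS(chisquare_by_dof - 1) LIMIT 1",
    "my_db.sqlite",
    auto_gamma=True)
print(best["mass"][0])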