MarginalDist
class MarginalDist(debug=False)
Learn/Build marginal distributions for univariate data.
Parameters
debug: boolean, default False. Whether to print debug-related outputs to console.
Notes
Reference List of Distributions
The following methods are used in the same way:
xx_dist(data, operation=<chosen operation>, new_params=<dict of params>, sample_size=<int>)
List of operations:
- fit: given some input
data, find the best parameters for chosen xx distribution - sample: given some params (given as
new_paramsor from prior fitted distribution), sample some datapoints from chosen xx distribution. Number of datapoints =sample_size - pdf : given some params (given as
new_paramsor from prior fitted distribution), return the PDF for points given indata. - cdf : given some params (given as
new_paramsor from prior fitted distribution), return the CDF for points given indata. - ppf : given some params (given as
new_paramsor from prior fitted distribution), return the PPF for “probability” given indata.
| Ref. String | Method | Parameters |
|---|---|---|
| beta | beta_dist | loc, scale, a, b |
| laplace | laplace_dist | loc, scale |
| loglaplace | loglaplace_dist | c, loc, scale |
| gamma | gamma_dist | loc, scale, a |
| gaussian | gaussian_dist | loc, scale |
| student_t | t_dist | loc, scale |
| uniform | uni_dist | loc, scale |
| emp | empirical_dist | loc, scale |
| gaussian_kde | gaussian_kde_dist | scale |
| degenerate | degenerate_dist | constant_value |
Examples
Please refer to the below pages for detailed examples:
| Example | Description |
|---|---|
| Univariate | Demonstrates use of MarginalDist to create univariate synthetic data. |
Attributes
| Attribute | Description |
|---|---|
| debug | (boolean) whether to debug or not |
| marginal_dist | (str) Type of marginal distribution the class instance is set to. |
| fitted_marginal_dist | (str) Type of marginal distribution the data is fitted to. |
| sample_size | (int) Number of samples to generate. |
| gaussian_kde_model | (obj) stats.gaussian_kde instance used. |
| fitted | (boolean) Set to True if successfully fitted. |
| params | (dict) List of parameters for fitted marginal distribution |
| sample_cdf | (array) CDF of samples used to fit the distribution |
| sample_pdf | (array) PDF of samples used to fit the distribution |
| samples | (array) x-values (samples) generated based on parameters (fitted or given) |
| cdf | (array) cumulative probability of new data input based on parameters (either fitted or given) |
| (array) probability of new data input based on parameters (either fitted or given) | |
| ppf | (array) x-value of cumulative probability of new data input |
| parametric | (list) List of parametric distributions: ["beta", "laplace", "loglaplace", "gamma", "gaussian", "student_t", "uniform"] |
| nonparametric | (list) List of non-parametric distributions: ["emp", "gaussian_kde"] |
Methods
| Method | Description |
|---|---|
| load_params([new_params, ]) | Replace MarginalDist.params with specified parameters in new_params dictionary. |
| generic_cdf(x, fn_pdf) | Compute generic CDF using integration |
| inv_CDF_fn(x, u) | Build CDF inverse function |
| fwd_CDF_fn(x, u) | Build CDF forward function |
| eCDF_fn(input, x, u, [init_val, ]) | Implements eCDF. For each element in input, it finds its best position in x, determines the corresponding cumulative probability from u, and returns the interpolated cumulative probability. |
| ecdf(x) | Computes the ECDF of x. Use only for continuous distributions. |
| beta_dist([data, operation, new_params, sample_size]) | Compute Beta Distribution related operations, including fit, sample, pdf, cdf, ppf. |
| laplace_dist([data, operation, new_params, sample_size]) | Compute Laplace Distribution related operations, including fit, sample, pdf, cdf, ppf. |
| loglaplace_dist([data, operation, new_params, sample_size]) | Compute log Laplace Distribution related operations, including fit, sample, pdf, cdf, ppf. |
| gamma_dist([data, operation, new_params, sample_size]) | Compute Gamma Distribution related operations, including fit, sample, pdf, cdf, ppf. |
| gaussian_dist([data, operation, new_params, sample_size]) | Compute Gaussian Distribution related operations, including fit, sample, pdf, cdf, ppf. |
| t_dist([data, operation, new_params, sample_size]) | Compute Student-t Distribution related operations, including fit, sample, pdf, cdf, ppf. |
| uni_dist([data, operation, new_params, sample_size]) | Compute Uniform Distribution related operations, including fit, sample, pdf, cdf, ppf. |
| degenerate_dist([data, operation, new_params, sample_size]) | Compute Degenerate Distribution related operations, including fit, sample, pdf, cdf, ppf. |
| gaussian_kde_dist([data, operation, new_params, sample_size, bw_method, weights]) | Compute Gaussian Kernel Density Estimate related operations, including fit, pdf, cdf, ppf. |
| empirical_dist([data, operation, new_params, sample_size]) | Compute Empirical distribution related operations, including fit, sample, cdf, ppf. |
| select_univariate([data, candidates]) | Evaluate and return the best univariate class for input data using scipy.stats.kstest. Use candidates to restrict the eligible distributions. |
| fit(data, [candidates, ]) | Wrapper function to fit the input data to best distribution, based on scipy.stats.kstest. Use candidates to restrict the eligible distributions. |
| pdf_wrapper(data) | Wrapper function to compute PDF given data samples. Use only when class instance has already been fitted to a distribution. |
| cdf_wrapper(data) | Wrapper function to compute CDF given data samples. Use only when class instance has already been fitted to a distribution. |
| ppf_wrapper(data) | Wrapper function to compute PPF given data samples. Use only when class instance has already been fitted to a distribution. |