MCMC sampling with funcFit tutorial

Currently, funcFit supports MCMC sampling via either the pymc or the emcee package. To this end, the model objects provide the fitMCMC method (pymc) and the fitEMCEE method (emcee). The two methods are essentially independent and can be used separately.

Although the method names start with fit, we emphasize that the purpose of the analysis carried out here is not fitting in the sense of optimization, but sampling from the posterior distribution.

Note

The Markov chains produced by fitMCMC and fitEMCEE can be analyzed using the TraceAnalysis class. See this tutorial: Analyze Markov-Chains using TraceAnalysis.

Sampling using emcee (fitEMCEE)

The emcee package relies on sampling with an ensemble of chains, the so-called walkers. In this way, the sampling automatically adapts to the scale of the problem, which simplifies obtaining reasonable acceptance rates.

Basic sampling with emcee

The following example shows a basic application of fitEMCEE. By default, the resulting Markov chain is saved to a file called ‘chain.emcee’.

# Import numpy and matplotlib
from numpy import arange, sqrt, exp, pi, random, ones
import matplotlib.pylab as plt
# ... and now the funcFit package
from PyAstronomy import funcFit as fuf

# Before we can start fitting, we need something to fit.
# So let us create some data...

# Choose some signal-to-noise ratio
snr = 25.0

# Creating a Gaussian with some noise
# Choose some parameters...
gf = fuf.GaussFit1d()
gf.assignValues({"A": -5.0, "sig": 2.5, "mu": 10.0, "off": 1.0, "lin": 0.0})
# Calculate profile
x = arange(100) - 50.0
y = gf.evaluate(x)
# Add some noise
y += random.normal(0.0, 1.0/snr, x.size)

# Define the free parameters
gf.thaw(["A", "sig", "mu", "off"])

# Start a fit (quite dispensable here)
gf.fit(x, y, yerr=ones(x.size)/snr)

# Say, we want 200 burn-in iterations and, thereafter,
# 1000 further iterations (per walker).
sampleArgs = {"iters": 1000, "burn": 200}

# Start the sampling (ps could be used to continue the sampling)
ps = gf.fitEMCEE(x, y, yerr=ones(x.size)/snr, sampleArgs=sampleArgs)

# Plot the distributions of the chains
# NOTE: the order of the parameters in the chain object is the same
#       as the order of the parameters returned by freeParamNames()
for i, p in enumerate(gf.freeParamNames()):
    plt.subplot(len(gf.freeParamNames()), 1, i+1)
    plt.hist(gf.emceeSampler.flatchain[::, i], label=p)
    plt.legend()
plt.show()
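
Because the chain has also been written to 'chain.emcee', it can be inspected with the TraceAnalysis class mentioned above. The following sketch continues the example: it relies on the gf object from above, on the acceptance_fraction attribute of the underlying emcee sampler, and on the TraceAnalysis usage demonstrated further down in this tutorial.

import numpy as np
from PyAstronomy import funcFit as fuf

# Mean acceptance fraction of the walker ensemble; values of roughly
# 0.2-0.5 are commonly considered reasonable.
print("Mean acceptance fraction: ", np.mean(gf.emceeSampler.acceptance_fraction))

# Load the chain written to 'chain.emcee' and print basic statistics
ta = fuf.TraceAnalysis("chain.emcee")
print("Available chains: ", ta.availableParameters())
for p in ta.availableParameters():
    print(p, ": mean = ", np.mean(ta[p]), ", std = ", np.std(ta[p]))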

Sampling with prior information

In this example, we use a very simple constant model. First, we compare the result from sampling with classical error estimation; note that in this simple case, Bayesian credibility intervals are indeed numerically identical to classical confidence intervals, which is, however, not generally the case. Second, we introduce (strong) prior information and repeat the sampling.

A number of ready-to-use priors are implemented here: emceePriors

from __future__ import print_function, division
# Import numpy and matplotlib
from numpy import arange, sqrt, exp, pi, random, ones
import matplotlib.pylab as plt
# ... and now the funcFit package
from PyAstronomy import funcFit as fuf
import numpy as np

# Before we can start fitting, we need something to fit.
# So let us create some data...

# Choose some signal-to-noise ratio
snr = 25.0

# Choosing an arbitrary constant and ...
c = 10.0
# ... an equally arbitrary number of data points
npoint = 10

# Define 'data'
x = arange(npoint)
y = np.ones(len(x)) * c
# Add some noise
y += random.normal(0.0, 1.0/snr, x.size)

# A funcFit object representing a constant
pf = fuf.PolyFit1d(0)
pf["c0"] = c

# The only parameter shall be free
pf.thaw("c0")

# Say, we want 200 burn-in iterations and, thereafter,
# 2500 further iterations (per walker).
sampleArgs = {"iters": 2500, "burn": 200}

# Start the sampling (ps could be used to continue the sampling)
ps = pf.fitEMCEE(x, y, yerr=ones(x.size)/snr, sampleArgs=sampleArgs)
print()

# Plot the distributions of the chains
# NOTE: the order of the parameters in the chain object is the same
#       as the order of the parameters returned by freeParamNames()
h = plt.hist(pf.emceeSampler.flatchain[::, 0], label="c0", density=True)
# Construct "data points" in the middle of the bins
xhist = (h[1][1:] + h[1][0:-1]) / 2.0
yhist = h[0]

# Fit the histogram using a Gaussian
gf = fuf.GaussFit1d()
gf.assignValues({"A": 1.0, "mu": c, "sig": 1.0/snr/np.sqrt(npoint)})
# Fitting only "mu" first is simply more stable
gf.thaw("mu")
gf.fit(xhist, yhist)
gf.thaw(["A", "sig"])
gf.fit(xhist, yhist)

print()
print("  --- Sampling results ---")
print("Posterior estimate of constant: ", np.mean(pf.emceeSampler.flatchain[::, 0]))
print("Nominal error of the mean: ", 1.0/snr/np.sqrt(npoint))
print("Estimate from Markov chain: ", np.std(pf.emceeSampler.flatchain[::, 0]), end=' ')
print(" and from Gaussian fit to distribution: ", gf["sig"])

# Evaluate best-fit model ...
xmodel = np.linspace(c - 10.0/snr, c + 10.0/snr, 250)
ymodel = gf.evaluate(xmodel)
# ... and plot
plt.plot(xhist, yhist, 'rp')
plt.plot(xmodel, ymodel, 'r--')
plt.legend()
plt.show()


# Defining a prior on c0. Prior knowledge tells us that its value
# is around 7. Let us choose the standard deviation of the prior so
# that the estimate will lie in the middle between 7 and 10. Here we
# exploit symmetry and make the prior information as strong as the
# information contained in the likelihood function.
priors = {"c0": fuf.FuFPrior("gaussian", sig=1.0/snr/np.sqrt(npoint), mu=7.0)}

# Start the sampling (ps could be used to continue the sampling)
ps = pf.fitEMCEE(x, y, yerr=ones(x.size)/snr, sampleArgs=sampleArgs, priors=priors)

print()
print("  --- Sampling results with strong prior information ---")
print("Posterior estimate of constant: ", np.mean(pf.emceeSampler.flatchain[::, 0]), end=' ')
print(" +/-", np.std(pf.emceeSampler.flatchain[::, 0]))

plt.hist(pf.emceeSampler.flatchain[::, 0], label="c0", density=True)
plt.show()
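
As a quick plausibility check (a sketch added here, not part of the original example): for a Gaussian likelihood combined with a Gaussian prior, the posterior mean is the precision-weighted average of the prior mean and the data mean. Since the prior was chosen to be exactly as informative as the likelihood, the posterior estimate should come out roughly halfway between 7 and 10.

import numpy as np

# Uses snr, npoint, and y from the example above
sig_like = 1.0/snr/np.sqrt(npoint)    # width of the likelihood for the constant
mu_like = np.mean(y)                  # estimate from the data (close to 10)
mu_prior, sig_prior = 7.0, sig_like   # prior as informative as the likelihood

# Precision-weighted combination (conjugate Gaussian case)
w1, w2 = 1./sig_prior**2, 1./sig_like**2
post_mean = (w1*mu_prior + w2*mu_like) / (w1 + w2)
post_sig = np.sqrt(1.0/(w1 + w2))
print("Expected posterior mean ~ %.3f and std ~ %.4f" % (post_mean, post_sig))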

Sampling from a specific distribution using emcee (sampleEMCEE)

The function sampleEMCEE() allows one to sample from arbitrary distributions using the emcee implementation of MCMC sampling, without the need to define a formal funcFit model. The function is only a thin wrapper around emcee functionality. The resulting chains can be studied using the TraceAnalysis class.

Sampling from a Gaussian distribution

The following example shows how to use sampleEMCEE() to obtain realizations from a Gaussian probability distribution. Of course, there are more efficient ways to do this.

from __future__ import print_function
import numpy as np
from PyAstronomy import funcFit as fuf
import matplotlib.pylab as plt


def lfGauss(v, sigma, mu):
    """
    Gaussian density

    Parameters
    ----------
    v : dictionary
        Holds current values of "x"
    sigma, mu : float
        Standard deviation and mean of the Gaussian. Specified via
        the `largs` argument.

    Returns
    -------
    lp : float
        Natural logarithm of the density.
    """
    result = 0.0
    # Log(density)
    result += -0.5*np.log(2.*np.pi*sigma**2) - (v["x"] - mu)**2/(2.*sigma**2)
    return result


# Sampling arguments
# burn: Number of burn-in steps per walker
# iters: Number of iterations per walker
sa = {"burn": 1000, "iters": 5000}

# Starting values
fv0 = {"x": 0.5}
# Specify standard deviation and mean of Gaussian
la = {"mu": 0.5, "sigma": 0.25}

# Sample from distribution
ps = fuf.sampleEMCEE(["x"], fv0, lfGauss, largs=la, sampleArgs=sa, nwalker=4, dbfile="gauss.emcee")
print()

# Use TraceAnalysis to look at chains
ta = fuf.TraceAnalysis("gauss.emcee")
print("Available chains: ", ta.availableParameters())

print("Mean and STD of chain: ", np.mean(ta["x"]), np.std(ta["x"]))

# Check distribution of chain
# Plot histogram of chain
plt.hist(ta["x"], 60, density=True)
# Overplot Gaussian model
xx = np.linspace(la["mu"]-6*la["sigma"], la["mu"]+6*la["sigma"], 1000)
yy = 1./np.sqrt(2.*np.pi*la["sigma"]**2) * np.exp(-(xx - la["mu"])**2/(2.*la["sigma"]**2))
plt.plot(xx, yy, 'r--')
plt.show()

Estimate mean and STD from Gaussian sample

MCMC sampling is frequently used to sample from posterior distributions and obtain marginal distributions. The following example demonstrates how to estimate the mean and standard deviation from a sample of Gaussian data using sampleEMCEE().

from __future__ import print_function
import numpy as np
from PyAstronomy import funcFit as fuf


def lfGaussMS(v, x=None):
    """
    Gaussian posterior with 1/sigma prior on sigma.

    Parameters
    ----------
    v : dictionary
        Holds current values of "sigma" and "mu"
    x : array
        The 'data' observed. Will be specified by the `largs` keyword.

    Returns
    -------
    lp : float
        Natural logarithm of the density.
    """
    if v["sigma"] < 0.:
        # Penalize negative standard deviations
        return -1e20*abs(v["sigma"])

    result = 0.0
    # Apply prior on sigma
    result -= np.log(v["sigma"])
    # Add log(likelihood)
    result += np.sum(-0.5*np.log(2.*np.pi*v["sigma"]**2) - (x - v["mu"])**2/(2.*v["sigma"]**2))
    return result


# Sampling arguments
# burn: Number of burn-in steps per walker
# iters: Number of iterations per walker
sa = {"burn": 1000, "iters": 5000}

# Starting values
fv0 = {"sigma": 1., "mu": 1.}
# 'Observed' data
la = {"x": np.random.normal(0., 1., 1000)}

print("Mean of 'data': ", np.mean(la["x"]))
print("Standard deviation of 'data': ", np.std(la["x"]))

# Scale width for distributing the walkers
s = {"mu": 0.01, "sigma": 0.5}
ps = fuf.sampleEMCEE(["mu", "sigma"], fv0, lfGaussMS, largs=la, sampleArgs=sa, nwalker=4,
                     scales=s, dbfile="musig.emcee")
print()

# Use TraceAnalysis to look at chains
ta = fuf.TraceAnalysis("musig.emcee")
print("Available chains: ", ta.availableParameters())

ta.plotTraceHist('mu')
ta.show()

ta.plotTraceHist('sigma')
ta.show()
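
Beyond trace and histogram plots, simple central credibility intervals can be read off the chains directly. The following sketch applies numpy percentiles to the chains loaded above; this is a generic approach and does not rely on any particular TraceAnalysis feature.

import numpy as np

# 68% central credibility intervals from the marginal distributions
for p in ["mu", "sigma"]:
    lo, med, hi = np.percentile(ta[p], [16., 50., 84.])
    print("%5s: median = %7.4f, 68%% interval = [%7.4f, %7.4f]" % (p, med, lo, hi))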

LEGACY: Sampling using pymc (fitMCMC)

The fitMCMC method provided by funcFit is not an MCMC sampler itself, but a wrapper around functionality provided by a third-party package, namely, PyMC.

pymc is a powerful Python package providing a wealth of functionality concerning Bayesian analysis. fitMCMC provides an easy-to-use interface to pymc sampling, which allows one to carry out a basic Bayesian data analysis quickly.

Note

To run these examples, pymc must be installed (check the output of funcFit.status() shown at the beginning of this tutorial to see whether this is the case on your system).
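
The status report mentioned in the note can be produced as follows; funcFit.status() prints an overview of which optional packages (such as pymc and emcee) are available on the system.

from PyAstronomy import funcFit as fuf

# Print which optional dependencies of funcFit are installed
fuf.status()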

Warning

Unfortunately, pymc is only supported in version 2.x. In particular, version 3.x is not supported.

Using the Markov-Chain Monte-Carlo (MCMC) sampler

The following example demonstrates how the funcFit interface can be used to carry out a Bayesian analysis using pymc. For a deeper understanding of the workings, adaptability, and logic of pymc, we refer the reader to its web page (PyMC).

from __future__ import print_function, division
# Import some required modules
from numpy import arange, sqrt, exp, pi, random, ones
import matplotlib.pylab as plt
import pymc
# ... and now the funcFit package
from PyAstronomy import funcFit as fuf

# Creating a Gaussian with some noise
# Choose some parameters...
gPar = {"A": -5.0, "sig": 10.0, "mu": 10.0, "off": 1.0, "lin": 0.0}
# Calculate profile
x = arange(100) - 50.0
y = gPar["off"] + gPar["A"] / sqrt(2*pi*gPar["sig"]**2) \
    * exp(-(x-gPar["mu"])**2/(2*gPar["sig"]**2))
# Add some noise
y += random.normal(0.0, 0.01, x.size)

# Now let us come to the fitting
# First, we create the Gauss1d fit object
gf = fuf.GaussFit1d()
# See what parameters are available
print("List of available parameters: ", gf.availableParameters())
# Set guess values for the parameters
gf["A"] = -10.0
gf["sig"] = 15.77
gf["off"] = 0.87
gf["mu"] = 7.5
# Let us see whether the assignment worked
print("Parameters and guess values: ", gf.parameters())

# Which parameters shall be variable during the fit?
# 'Thaw' those (the order is irrelevant)
gf.thaw(["A", "sig", "off", "mu"])

# Now start a simplex fit
gf.fit(x, y, yerr=ones(x.size)*0.01)

# Obtain the best-fit values derived by the simplex fit.
# They are to be used as start values for the MCMC sampling.
# Note that 'A' is missing - we will introduce this later.
X0 = {"sig": gf["sig"], "off": gf["off"], "mu": gf["mu"]}

# Now we specify the limits within which the individual parameters
# can be varied (for those parameters listed in the 'X0' dictionary).
Lims = {"sig": [-20., 20.], "off": [0., 2.], "mu": [5., 15.]}

# For the parameters contained in 'X0', define the step widths, which
# are to be used by the MCMC sampler. The steps are specified using
# the same scale/units as the actual parameters.
steps = {"A": 0.01, "sig": 0.1, "off": 0.1, "mu": 0.1}

# In this example, we wish to define our 'own' PyMC variable for the parameter
# 'A'. This can be useful if nonstandard behavior is desired. Note that this
# is an optional parameter and you could simply include the parameter 'A' in
# the framework of X0, Lims, and steps.
ppa = {}
ppa["A"] = pymc.Uniform("A", value=gf["A"], lower=-20.,
                        upper=10.0, doc="Amplitude")

# Start the sampling. The resulting Markov chain will be written
# to the file 'mcmcExample.tmp'. In default configuration, pickle
# is used to write that file.
# To save the chain to a compressed 'hdf5'
# file, you have to specify the dbArgs keyword; e.g., use:
#   dbArgs = {"db":"hdf5", "dbname":"mcmcExample.hdf5"}
gf.fitMCMC(x, y, X0, Lims, steps, yerr=ones(x.size)*0.01,
           pymcPars=ppa, iter=2500, burn=0, thin=1,
           dbfile="mcmcExample.tmp")

# Reload the database (here, this is actually not required, but it is
# if the Markov chain is to be analyzed later).
db = pymc.database.pickle.load('mcmcExample.tmp')
# Plot a histogram of the trace of the amplitude, 'A'.
plt.hist(db.trace("A", 0)[:])
plt.show()

Some points in this example deserve emphasis:
  • For MCMC sampling the exact same fit object is used as for “normal” fitting.

  • If the yerr keyword is specified in the call to fitMCMC, a Gaussian distribution is assumed for the data points. Otherwise a Poisson distribution is assumed.

  • We used the normal simplex fit to obtain starting values for the Markov chain. Alternatively, you may, e.g., rely on a burn-in phase to let the chains settle.

  • In the example, we demonstrated how a uniformly distributed PyMC variable is created. Normally, the fitMCMC method does this for you.

  • The result, i.e., the Markov chain, is saved to the file mcmcExample.tmp and is reloaded to obtain the trace of the amplitude.

Note

A convenient analysis of the resulting traces can be carried out using the TraceAnalysis class (see Analysis of Markov-Chains).
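
A minimal sketch of such an analysis, assuming the default pickle database written by fitMCMC above ('mcmcExample.tmp'); plotTraceHist and show are the same TraceAnalysis methods used in the emcee examples earlier in this tutorial.

from PyAstronomy import funcFit as fuf

# Load the pymc (pickle) database written by fitMCMC
ta = fuf.TraceAnalysis("mcmcExample.tmp")
print("Available chains: ", ta.availableParameters())
# Trace and histogram of the amplitude
ta.plotTraceHist("A")
ta.show()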

We emphasize that PyMC is a powerful and highly adaptable package, which can do a lot more. A more detailed introduction is, however, beyond the scope of this tutorial.

Simplifying initialization (MCMCautoParameters)

It can become cumbersome to define the starting values, steps, and ranges for the uniform priors as done in the above example. Using the “auto” methods defined in the fitting class, you can take a shortcut. However, be warned:

Warning

There is NO guarantee that the auto functions produce reasonable results. You need to check that.

from __future__ import print_function, division
from PyAstronomy import funcFit as fuf
import numpy as np
import matplotlib.pylab as plt

x = np.linspace(0, 30, 1000)
gauss = fuf.GaussFit1d()
gauss["A"] = 1
gauss["mu"] = 23.
gauss["sig"] = 0.5
# Generate some "data" to fit; the per-point uncertainty is 0.05
yerr = np.ones(len(x)) * 0.05
y = gauss.evaluate(x) + np.random.normal(0., 0.05, len(x))
# Thaw the parameters A, mu, and sig
gauss.thaw(["A", "mu", "sig"])

# Define the ranges, which are used to construct the
# uniform priors and step sizes.
# Note that for "mu", we give only a single value (3).
# In this case, the limits of the uniform prior will
# be constructed as [m0-1.5, m0+1.5], where m0 is the
# current (starting) value of the parameter
# (mu = 23. in this case).
ranges = {"A": [0, 10], "mu": 3, "sig": [0.1, 1.0]}
# Generate default input for X0, lims, and steps
X0, lims, steps = gauss.MCMCautoParameters(ranges)

# Show what happened...
print()
print("Auto-generated input parameters:")
print("X0: ", X0)
print("lims: ", lims)
print("steps: ", steps)
print()
# Call the usual sampler
gauss.fitMCMC(x, y, X0, lims, steps, yerr=yerr, iter=1000)

# and plot the results
plt.plot(x, y, 'k+')
plt.plot(x, gauss.evaluate(x), 'r--')
plt.show()

You may shorten the shortcut even further by using the autoFitMCMC method. However, note that the same warning remains valid here.

from PyAstronomy import funcFit as fuf
import numpy as np
import matplotlib.pylab as plt

x = np.linspace(0, 30, 1000)
gauss = fuf.GaussFit1d()
gauss["A"] = 1
gauss["mu"] = 23.
gauss["sig"] = 0.5
# Generate some "data" to fit; the per-point uncertainty is 0.05
yerr = np.ones(len(x)) * 0.05
y = gauss.evaluate(x) + np.random.normal(0., 0.05, len(x))

# Define the ranges, which are used to construct the
# uniform priors and step sizes.
# Note that for "mu", we give only a single value (3).
# In this case, the limits of the uniform prior will
# be constructed as [m0-1.5, m0+1.5], where m0 is the
# current (starting) value of the parameter
# (mu = 23. in this case).
ranges = {"A": [0, 10], "mu": 3, "sig": [0.1, 1.0]}

# Call the auto-sampler
# Note that we set picky to False here. In this case, the
# parameters specified in ranges will be thawed automatically.
# All parameters not mentioned there, will be frozen.
gauss.autoFitMCMC(x, y, ranges, yerr=yerr, picky=False, iter=1000)

# and plot the results
plt.plot(x, y, 'k+')
plt.plot(x, gauss.evaluate(x), 'r--')
plt.show()

Sampling with Gaussian and uniform priors

The use of prior information is inherent in Bayesian analyses. The following example demonstrates how prior information can explicitly be included in the sampling. We note, however, that some kind of prior is implicitly assumed for all parameters; in this case, a uniform one.

from PyAstronomy import funcFit as fuf
import numpy as np
import matplotlib.pylab as plt
import pymc

# Create a Gauss-fit object
gf = fuf.GaussFit1d()

# Choose some parameters
gf["A"] = -0.65
gf["mu"] = 1.0
gf["lin"] = 0.0
gf["off"] = 1.1
gf["sig"] = 0.2

# Simulate data with noise
x = np.linspace(0., 2., 100)
y = gf.evaluate(x)
y += np.random.normal(0, 0.05, len(x))

gf.thaw(["A", "off", "mu", "sig"])

# Set up a normal prior for the offset parameter.
# Note: the name (first argument) must correspond to that
#       of the parameter.
# The expectation value is set to 0.9 while the width is given
# as 0.01 (tau = 1/sigma**2). The starting value is specified
# as 1.0.
offPar = pymc.Normal("off", mu=0.9, tau=(1./0.01)**2, value=1.0)
# Use a uniform prior for mu.
muPar = pymc.Uniform("mu", lower=0.95, upper=0.97, value=0.96)

# Collect the "extra"-variables in a dictionary using
# their names as keys
pymcPars = {"mu": muPar, "off": offPar}

# Specify starting values, X0, and limits, lims, for
# those parameter distributions not given specifically.
X0 = {"A": gf["A"], "sig": gf["sig"]}
lims = {"A": [-1.0, 0.0], "sig": [0., 1.0]}
# Still, the steps dictionary has to contain all
# parameter distributions.
steps = {"A": 0.02, "sig": 0.02, "mu": 0.01, "off": 0.01}

# Carry out the MCMC sampling
gf.fitMCMC(x, y, X0, lims, steps, yerr=np.ones(len(x))*0.05,
           pymcPars=pymcPars, burn=1000, iter=3000)

# Setting parameters to mean values
for p in gf.freeParameters():
    gf[p] = gf.MCMC.trace(p)[:].mean()

# Show the "data" and model in the upper panel
plt.subplot(2, 1, 1)
plt.title("Data and model")
plt.errorbar(x, y, yerr=np.ones(len(x))*0.05, fmt="bp")
# Plot lowest deviance solution
plt.plot(x, gf.evaluate(x), 'r--')

# Show the residuals in the lower panel
plt.subplot(2, 1, 2)
plt.title("Residuals")
plt.errorbar(x, y-gf.evaluate(x), yerr=np.ones(len(x))*0.05, fmt="bp")
plt.plot([min(x), max(x)], [0.0, 0.0], 'r-')

plt.show()

Clearly, the plot shows that the solution does not fit the data well in a chi-square sense, because the prior information has a significant influence on the outcome. Whether or not this is reasonable is not a question the sampler can answer.
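
To make the influence of the priors explicit, the posterior means can be compared with the values used to simulate the data. The following short sketch continues the example and only uses the trace access already shown above.

# Values used to simulate the data; recall that the prior on 'off' was
# centered at 0.9 and 'mu' was restricted to [0.95, 0.97].
inputVals = {"A": -0.65, "mu": 1.0, "off": 1.1, "sig": 0.2}
for p in gf.freeParameters():
    tr = gf.MCMC.trace(p)[:]
    print("%3s: input = %6.3f, posterior mean = %6.3f +/- %6.3f"
          % (p, inputVals[p], tr.mean(), tr.std()))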