.. _tutMCMCSampler: .. _matplotlib: http://matplotlib.sourceforge.net/ .. _pymc: https://github.com/pymc-devs/pymc .. _SciPy: www.scipy.org/ .. _numpy: numpy.scipy.org/ .. _XSPEC: http://heasarc.nasa.gov/xanadu/xspec/ .. _emcee: http://dan.iel.fm/emcee/current/ MCMC sampling with funcFit tutorial ====================================== Currently, funcFit supports MCMC sampling either via the pymc_ or the emcee_ package. To do this, the model objects provides the `fitMCMC` method (pymc) and the `fitEMCEE` method (emcee). Both are basically independent and can be used separately. Although the method names start with *fit*, we emphasize that fitting, in the sense of optimization, is not the exact purpose of the analysis carried out here, but sampling from the posterior. .. note:: The Markov chains produced by fitMCMC and fitEMCEE can be analyzed using the TraceAnalysis class. See this tutorial: :ref:`traceAnalysisTut`. .. contents:: Sections of this tutorial Sampling using emcee (fitEMCEE) --------------------------------------------- The emcee_ package relies on sampling with an ensemble of chains, the so-called walkers. In this way, the sampling is automatically adapted to the scale of the problem, which simplifies obtained reasonable acceptance rates. Basic sampling with emcee ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The following example shows a basic application of fitEMCEE. By default, the resulting Markov chain is saved to a file called 'chain.emcee'. :: # Import numpy and matplotlib from numpy import arange, sqrt, exp, pi, random, ones import matplotlib.pylab as plt # ... and now the funcFit package from PyAstronomy import funcFit as fuf # Before we can start fitting, we need something to fit. # So let us create some data... # Choose some signal-to-noise ratio snr = 25.0 # Creating a Gaussian with some noise # Choose some parameters... gf = fuf.GaussFit1d() gf.assignValues({"A": -5.0, "sig": 2.5, "mu": 10.0, "off": 1.0, "lin": 0.0}) # Calculate profile x = arange(100) - 50.0 y = gf.evaluate(x) # Add some noise y += random.normal(0.0, 1.0/snr, x.size) # Define the free parameters gf.thaw(["A", "sig", "mu", "off"]) # Start a fit (quite dispensable here) gf.fit(x, y, yerr=ones(x.size)/snr) # Say, we want 200 burn-in iterations and, thereafter, # 1000 further iterations (per walker). sampleArgs = {"iters": 1000, "burn": 200} # Start the sampling (ps could be used to continue the sampling) ps = gf.fitEMCEE(x, y, yerr=ones(x.size)/snr, sampleArgs=sampleArgs) # Plot the distributions of the chains # NOTE: the order of the parameters in the chain object is the same # as the order of the parameters returned by freeParamNames() for i, p in enumerate(gf.freeParamNames()): plt.subplot(len(gf.freeParamNames()), 1, i+1) plt.hist(gf.emceeSampler.flatchain[::, i], label=p) plt.legend() plt.show() Sampling with prior information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In this example, we use a very simple constant model to, first, compare the result from sampling with classical error estimation; note that in this simple case Bayesian credibility intervals are, indeed, numerically identical to classical confidence intervals, which is, however, not generally the case. Second, we introduce (strong) prior information and repeat the sampling. A number of ready-to-use priors are implemented here: :ref:`emceePriors` :: from __future__ import print_function, division # Import numpy and matplotlib from numpy import arange, sqrt, exp, pi, random, ones import matplotlib.pylab as plt # ... and now the funcFit package from PyAstronomy import funcFit as fuf import numpy as np # Before we can start fitting, we need something to fit. # So let us create some data... # Choose some signal-to-noise ratio snr = 25.0 # Choosing an arbitrary constant and ... c = 10.0 # ... an equally arbitrary number of data points npoint = 10 # Define 'data' x = arange(npoint) y = np.ones(len(x)) * c # Add some noise y += random.normal(0.0, 1.0/snr, x.size) # A funcFit object representing a constant pf = fuf.PolyFit1d(0) pf["c0"] = c # The only parameter shall be free pf.thaw("c0") # Say, we want 200 burn-in iterations and, thereafter, # 2500 further iterations (per walker). sampleArgs = {"iters": 2500, "burn": 200} # Start the sampling (ps could be used to continue the sampling) ps = pf.fitEMCEE(x, y, yerr=ones(x.size)/snr, sampleArgs=sampleArgs) print() # Plot the distributions of the chains # NOTE: the order of the parameters in the chain object is the same # as the order of the parameters returned by freeParamNames() h = plt.hist(pf.emceeSampler.flatchain[::, 0], label="c0", density=True) # Construct "data points" in the middle of the bins xhist = (h[1][1:] + h[1][0:-1]) / 2.0 yhist = h[0] # Fit the histogram using a Gaussian gf = fuf.GaussFit1d() gf.assignValues({"A": 1.0, "mu": c, "sig": 1.0/snr/np.sqrt(npoint)}) # First fitting only "mu" is simply quite stable gf.thaw("mu") gf.fit(xhist, yhist) gf.thaw(["A", "sig"]) gf.fit(xhist, yhist) print() print(" --- Sampling results ---") print("Posterior estimate of constant: ", np.mean(pf.emceeSampler.flatchain[::, 0])) print("Nominal error of the mean: ", 1.0/snr/np.sqrt(npoint)) print("Estimate from Markov chain: ", np.std(pf.emceeSampler.flatchain[::, 0]), end=' ') print(" and from Gaussian fit to distribution: ", gf["sig"]) # Evaluate best-fit model ... xmodel = np.linspace(c - 10.0/snr, c + 10.0/snr, 250) ymodel = gf.evaluate(xmodel) # ... and plot plt.plot(xhist, yhist, 'rp') plt.plot(xmodel, ymodel, 'r--') plt.legend() plt.show() # Defining a prior on c0. Prior knowledge tells us that its value # is around 7. Let us choose the standard deviation of the prior so # that the estimate will lie in the middle between 7 and 10. Here we # exploit symmetry and make the prior information as strong as the # information contained in the likelihood function. priors = {"c0": fuf.FuFPrior("gaussian", sig=1.0/snr/np.sqrt(npoint), mu=7.0)} # Start the sampling (ps could be used to continue the sampling) ps = pf.fitEMCEE(x, y, yerr=ones(x.size)/snr, sampleArgs=sampleArgs, priors=priors) print() print(" --- Sampling results with strong prior information ---") print("Posterior estimate of constant: ", np.mean(pf.emceeSampler.flatchain[::, 0]), end=' ') print(" +/-", np.std(pf.emceeSampler.flatchain[::, 0])) plt.hist(pf.emceeSampler.flatchain[::, 0], label="c0", density=True) plt.show() Sampling from specific distribution using emcee (sampleEMCEE) ---------------------------------------------------------------- .. currentmodule:: PyAstronomy.funcFit The function :py:func:`sampleEMCEE` allows to sample arbitrary distributions using the emcee_ implementation of MCMC sampling without the need to define a formal funcFit model. This function is only a wrapper around emcee functionality. The resulting chains can be studied using the :py:class:`TraceAnalysis` class. Sampling from a Gaussian distribution ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The following example shows how to use :py:func:`sampleEMCEE` to obtain realizations from a Gaussian probability distribution. Of course there are more advantageous ways of doing this. :: from __future__ import print_function import numpy as np from PyAstronomy import funcFit as fuf import matplotlib.pylab as plt def lfGauss(v, sigma, mu): """ Gaussian density Parameters ---------- v : dictionary Holds current values of "x" mus, sigma : float Mean and standard deviation of the Gaussian. Specified via the `largs` argument. Returns ------- lp : float Natural logarithm of the density. """ result = 0.0 # Log(density) result += -0.5*np.log(2.*np.pi*sigma**2) - (v["x"] - mu)**2/(2.*sigma**2) return result # Sampling arguments # burn: Number of burn-in steps per walker # iters: Number of iterations per walker sa = {"burn": 1000, "iters": 5000} # Starting values fv0 = {"x": 0.5} # Specify standard deviation and mean of Gaussian la = {"mu": 0.5, "sigma": 0.25} # Sample from distribution ps = fuf.sampleEMCEE(["x"], fv0, lfGauss, largs=la, sampleArgs=sa, nwalker=4, dbfile="gauss.emcee") print() # Use TraceAnalysis to look at chains ta = fuf.TraceAnalysis("gauss.emcee") print("Available chains: ", ta.availableParameters()) print("Mean and STD of chain: ", np.mean(ta["x"]), np.std(ta["x"])) # Check distribution of chain # Plot histogram of chain plt.hist(ta["x"], 60, density=True) # Overplot Gaussian model xx = np.linspace(la["mu"]-6*la["sigma"], la["mu"]+6*la["sigma"], 1000) yy = 1./np.sqrt(2.*np.pi*la["sigma"]**2) * np.exp(-(xx - la["mu"])**2/(2.*la["sigma"]**2)) plt.plot(xx, yy, 'r--') plt.show() Estimate mean and STD from Gaussian sample ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ MCMC sampling is frequently used to sample from posterior distributions and obtain marginal distributions. The following example demonstrates how to estimate the mean and standard deviation from a sample of Gaussian data using :py:func:`sampleEMCEE`. :: from __future__ import print_function import numpy as np from PyAstronomy import funcFit as fuf def lfGaussMS(v, x=None): """ Gaussian posterior with 1/sigma prior on sigma. Parameters ---------- v : dictionary Holds current values of "sigma" and "mu" x : array The 'data' observed. Will be specified by the `largs` keyword. Returns ------- lp : float Natural logarithm of the density. """ if v["sigma"] < 0.: # Penalize negative standard deviations return -1e20*abs(v["sigma"]) result = 0.0 # Apply prior on sigma result -= np.log(v["sigma"]) # Add log(likelihood) result += np.sum(-0.5*np.log(2.*np.pi*v["sigma"]**2) - (x - v["mu"])**2/(2.*v["sigma"]**2)) return result # Sampling arguments # burn: Number of burn-in steps per walker # iters: Number of iterations per walker sa = {"burn": 1000, "iters": 5000} # Starting values fv0 = {"sigma": 1., "mu": 1.} # 'Observed' data la = {"x": np.random.normal(0., 1., 1000)} print("Mean of 'data': ", np.mean(la["x"])) print("Standard deviation of 'data': ", np.std(la["x"])) # Scale width for distributing the walkers s = {"mu": 0.01, "sigma": 0.5} ps = fuf.sampleEMCEE(["mu", "sigma"], fv0, lfGaussMS, largs=la, sampleArgs=sa, nwalker=4, scales=s, dbfile="musig.emcee") print() # Use TraceAnalysis to look at chains ta = fuf.TraceAnalysis("musig.emcee") print("Available chains: ", ta.availableParameters()) ta.plotTraceHist('mu') ta.show() ta.plotTraceHist('sigma') ta.show() LEGACY: Sampling using pymc (fitMCMC) -------------------------------------------- The *fitMCMC* method provided by funcFit is not an MCMC sampler itself, but it is a wrapper around functionality provided by a third party package, namely, PyMC_. pymc is a powerful Python package providing a wealth of functionality concerning Bayesian analysis. *fitMCMC* provides an easy to use interface to pymc sampling, which allows to carry out a basic Bayesian data analysis quickly. .. note:: To run these examples, pymc must be installed (check the output of funcFit.status() shown at the beginning of this tutorial to see whether this is the case on your system). .. warning:: Unfortunately, pymc is only supported in version 2.x. In particular, version 3.x is not supported. Using the Markov-Chain Monte-Carlo (MCMC) sampler ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The following example demonstrates how the funcFit interface can be used to carry out a Bayesian analysis using pymc. For a deeper understanding of the working, adaptability, and logic implemented by pymc, we refer the reader to their web page (PyMC_). :: from __future__ import print_function, division # Import some required modules from numpy import arange, sqrt, exp, pi, random, ones import matplotlib.pylab as plt import pymc # ... and now the funcFit package from PyAstronomy import funcFit as fuf # Creating a Gaussian with some noise # Choose some parameters... gPar = {"A": -5.0, "sig": 10.0, "mu": 10.0, "off": 1.0, "lin": 0.0} # Calculate profile x = arange(100) - 50.0 y = gPar["off"] + gPar["A"] / sqrt(2*pi*gPar["sig"]**2) \ * exp(-(x-gPar["mu"])**2/(2*gPar["sig"]**2)) # Add some noise y += random.normal(0.0, 0.01, x.size) # Now let us come to the fitting # First, we create the Gauss1d fit object gf = fuf.GaussFit1d() # See what parameters are available print("List of available parameters: ", gf.availableParameters()) # Set guess values for the parameters gf["A"] = -10.0 gf["sig"] = 15.77 gf["off"] = 0.87 gf["mu"] = 7.5 # Let us see whether the assignment worked print("Parameters and guess values: ", gf.parameters()) # Which parameters shall be variable during the fit? # 'Thaw' those (the order is irrelevant) gf.thaw(["A", "sig", "off", "mu"]) # Now start a simplex fit gf.fit(x, y, yerr=ones(x.size)*0.01) # Obtain the best-fit values derived by the simplex fit. # They are to be used as start values for the MCMC sampling. # Note that 'A' is missing - we will introduce this later. X0 = {"sig": gf["sig"], "off": gf["off"], "mu": gf["mu"]} # Now we specify the limits within which the individual parameters # can be varied (for those parameters listed in the 'X0' dictionary). Lims = {"sig": [-20., 20.], "off": [0., 2.], "mu": [5., 15.]} # For the parameters contained in 'X0', define the step widths, which # are to be used by the MCMC sampler. The steps are specified using # the same scale/units as the actual parameters. steps = {"A": 0.01, "sig": 0.1, "off": 0.1, "mu": 0.1} # In this example, we wish to define our ``own'' PyMC variable for the parameter # 'A'. This can be useful, if nonstandard behavior is desired. Note that this # is an optional parameter and you could simply include the parameter 'A' into # The framework of X0, Lims, and steps. ppa = {} ppa["A"] = pymc.Uniform("A", value=gf["A"], lower=-20., upper=10.0, doc="Amplitude") # Start the sampling. The resulting Marchov-Chain will be written # to the file 'mcmcExample.tmp'. In default configuration, pickle # is used to write that file. # To save the chain to a compressed 'hdf5' # file, you have to specify the dbArgs keyword; e.g., use: # dbArgs = {"db":"hdf5", "dbname":"mcmcExample.hdf5"} gf.fitMCMC(x, y, X0, Lims, steps, yerr=ones(x.size)*0.01, pymcPars=ppa, iter=2500, burn=0, thin=1, dbfile="mcmcExample.tmp") # Reload the database (here, this is actually not required, but it is # if the Marchov chain is to be analyzed later). db = pymc.database.pickle.load('mcmcExample.tmp') # Plot the trace of the amplitude, 'A'. plt.hist(db.trace("A", 0)[:]) plt.show() Some points shall be emphasized in this example: * For MCMC sampling the exact same fit object is used as for "normal" fitting. * If the *yerr* keyword is specified in the call to fitMCMC, a Gaussian distribution is assumed for the data points. Otherwise a Poisson distribution is assumed. * We used the normal simplex fit to obtain starting values for the Markov chain. You may also use, e.g., burn-in. * In the example, we demonstrated how a uniformly distributed PyMC variable is created. Normally, the fitMCMC method does this for you. * The result, i.e., the Markov chain, is saved to the file mcmcExample.tmp and is reloaded to obtain the trace of the amplitude. .. note:: A convenient analysis of the resulting traces can be carried out using the *TraceAnalysis* class (see :ref:`traceAnalysisClass`) We emphasize that PyMC is a powerful and highly adaptable package, which can do a lot more. A more detailed introduction is, however, beyond the scope of this tutorial. Simplifying initialization (MCMCautoParameters) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It can become cumbersome to define the starting values, steps, and ranges for uniform priors as done in the above example. Using the "auto" methods defined in the fitting class, you can take a short cut. However, be warned: .. warning:: There is NO guarantee that the auto functions produce reasonable results. You need to check that. :: from __future__ import print_function, division from PyAstronomy import funcFit as fuf import numpy as np import matplotlib.pylab as plt x = np.linspace(0, 30, 1000) gauss = fuf.GaussFit1d() gauss["A"] = 1 gauss["mu"] = 23. gauss["sig"] = 0.5 # Generate some "data" to fit yerr = np.random.normal(0., 0.05, len(x)) y = gauss.evaluate(x) + yerr # Thaw the parameters A, mu, and sig gauss.thaw(["A", "mu", "sig"]) # Define the ranges, which are used to construct the # uniform priors and step sizes. # Note that for "sig", we give only a single value. # In this case, the limits for the uniform prior will # be constructed as [m0-1.5, m0+1.5], where m0 is the # starting value interpreted as the current value of # mu (23. in this case). ranges = {"A": [0, 10], "mu": 3, "sig": [0.1, 1.0]} # Generate default input for X0, lims, and steps X0, lims, steps = gauss.MCMCautoParameters(ranges) # Show what happened... print() print("Auto-generated input parameters:") print("X0: ", X0) print("lims: ", lims) print("steps: ", steps) print() # Call the usual sampler gauss.fitMCMC(x, y, X0, lims, steps, yerr=yerr, iter=1000) # and plot the results plt.plot(x, y, 'k+') plt.plot(x, gauss.evaluate(x), 'r--') plt.show() You may even shorten the short-cut by using the `autoFitMCMC` method. **However, note that the same warning remains valid here.** :: from PyAstronomy import funcFit as fuf import numpy as np import matplotlib.pylab as plt x = np.linspace(0, 30, 1000) gauss = fuf.GaussFit1d() gauss["A"] = 1 gauss["mu"] = 23. gauss["sig"] = 0.5 # Generate some "data" to fit yerr = np.random.normal(0., 0.05, len(x)) y = gauss.evaluate(x) + yerr # Define the ranges, which are used to construct the # uniform priors and step sizes. # Note that for "sig", we give only a single value. # In this case, the limits for the uniform prior will # be constructed as [m0-1.5, m0+1.5], where m0 is the # starting value interpreted as the current value of # mu (23. in this case). ranges = {"A": [0, 10], "mu": 3, "sig": [0.1, 1.0]} # Call the auto-sampler # Note that we set picky to False here. In this case, the # parameters specified in ranges will be thawed automatically. # All parameters not mentioned there, will be frozen. gauss.autoFitMCMC(x, y, ranges, yerr=yerr, picky=False, iter=1000) # and plot the results plt.plot(x, y, 'k+') plt.plot(x, gauss.evaluate(x), 'r--') plt.show() Sampling with Gaussian and uniform priors ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The use of prior information is inherent in Bayesian analyses. The following example demonstrates how prior information can explicitly be included in the sampling. We note, however, that some kind of prior is implicitly assumed for all parameters; in this case, a uniform one. :: from PyAstronomy import funcFit as fuf import numpy as np import matplotlib.pylab as plt import pymc # Create a Gauss-fit object gf = fuf.GaussFit1d() # Choose some parameters gf["A"] = -0.65 gf["mu"] = 1.0 gf["lin"] = 0.0 gf["off"] = 1.1 gf["sig"] = 0.2 # Simulate data with noise x = np.linspace(0., 2., 100) y = gf.evaluate(x) y += np.random.normal(0, 0.05, len(x)) gf.thaw(["A", "off", "mu", "sig"]) # Set up a normal prior for the offset parameter # Note!---The name (first parameter) must correspond to that # of the parameter. # The expectation value us set to 0.9 while the width is given # as 0.01 (tau = 1/sigma**2). The starting value is specified # as 1.0. offPar = pymc.Normal("off", mu=0.9, tau=(1./0.01)**2, value=1.0) # Use a uniform prior for mu. muPar = pymc.Uniform("mu", lower=0.95, upper=0.97, value=0.96) # Collect the "extra"-variables in a dictionary using # their names as keys pymcPars = {"mu": muPar, "off": offPar} # Specify starting values, X0, and limits, lims, for # those parameter distributions not given specifically. X0 = {"A": gf["A"], "sig": gf["sig"]} lims = {"A": [-1.0, 0.0], "sig": [0., 1.0]} # Still, the steps dictionary has to contain all # parameter distributions. steps = {"A": 0.02, "sig": 0.02, "mu": 0.01, "off": 0.01} # Carry out the MCMC sampling gf.fitMCMC(x, y, X0, lims, steps, yerr=np.ones(len(x))*0.05, pymcPars=pymcPars, burn=1000, iter=3000) # Setting parameters to mean values for p in gf.freeParameters(): gf[p] = gf.MCMC.trace(p)[:].mean() # Show the "data" and model in the upper panel plt.subplot(2, 1, 1) plt.title("Data and model") plt.errorbar(x, y, yerr=np.ones(len(x))*0.05, fmt="bp") # Plot lowest deviance solution plt.plot(x, gf.evaluate(x), 'r--') # Show the residuals in the lower panel plt.subplot(2, 1, 2) plt.title("Residuals") plt.errorbar(x, y-gf.evaluate(x), yerr=np.ones(len(x))*0.05, fmt="bp") plt.plot([min(x), max(x)], [0.0, 0.0], 'r-') plt.show() Clearly, the plot shows that the solution does not fit well in a Chi-square sense, because the prior information has a significant influence on the outcome. Whether this should be considered reasonable or not is not a question the sampler could answer.