Examples of Beta Sigma Noise Estimation ===================================================== In the following, a number of examples demonstrate aspects of the :math:`\beta\sigma` procedure. Please see `the corresponding paper `_ for further discussion. - :ref:`bsOrders` - :ref:`bsRobust` - :ref:`bsEquisamp` .. currentmodule:: PyAstronomy.pyasl .. _bsOrders: Noise estimates with different orders of approximations (N) ----------------------------------------------------------------- The required order of approximation (N) depends on the level of variation in the data (or, equivalently, the sampling cadence in relation to the variability time scale of the signal). Comparing estimates obtained from various orders of approximation provides a useful cross-check for the plausibility of the result. In the example below, it is clearly seen that the zeroth order yields a too high estimate for the amplitude of noise because the signal varies too quickly. :: from __future__ import print_function, division import numpy as np import matplotlib.pylab as plt from PyAstronomy import pyasl def g(t): """ Function determining the behavior of the data. """ return 1.3 - 0.003*t + 1.3*np.sin(t/5.) * np.exp(-t/100.) # Number of data points nd = 200 ti = np.arange(float(nd)) gi = g(ti) mdiff = np.max(np.abs(gi[1:] - gi[0:-1])) print("""Maximum absolute difference between consecutive values of g(t): """, mdiff) # Standard deviation of noise istd = 0.02 print("Input standard deviation: ", istd) print("Number of 'data points': ", nd) print() # Add Gaussian noise to data yi = gi + np.random.normal(0.0, istd, nd) # Create class instance for equidistant sampling bseq = pyasl.BSEqSamp() # Specify jump parameter (j) for construction of beta sample j = 1 # Order of approximation to use Ns = [0, 1, 2, 3] # Use to store noise estimates smads, dsmads = [], [] # Loop over orders of approximation between 0 and 3 for N in Ns: # Get estimates of standard deviation based on robust (MAD-based) estimator smad, dsmad = bseq.betaSigma(yi, N, j, returnMAD=True) print("Order of approximation (N): ", N) print(" Size of beta sample: ", len(bseq.betaSample)) print(" Robust estimate of noise std: %6.3f +/- %6.3f" % (smad, dsmad)) # Save result smads.append(smad) dsmads.append(dsmad) # Plot g(t) and the synthetic data plt.subplot(2, 1, 1) plt.title("Data (top) and noise estimates (bottom)") plt.plot(ti, gi, 'b.-', label="$g(t_i)$") plt.errorbar(ti, yi, yerr=np.ones(nd)*istd, fmt='r+', label="$y_i$") plt.legend() plt.subplot(2, 1, 2) plt.title("N=0 is insufficient") plt.errorbar(Ns, smads, yerr=dsmads, fmt='k+', label="Noise estimates") plt.plot([min(Ns)-0.5, max(Ns)+0.5], [istd]*2, 'k--', label="Input value") plt.legend() plt.xlabel("Order of approximation (N)") plt.ylabel("Noise STD") plt.tight_layout() plt.show() .. _bsRobust: The effect of outliers: A case for robust estimation ------------------------------------------------------------ Outliers are a common nuisance in working with real data. They can strongly affect the :math:`\beta\sigma` noise estimate even in otherwise very benign data sets. **However, the effect of outliers on the noise estimate is much less significant, if robust estimation is used.** In many cases, the gain in precision from using a more efficient estimator is insignificant, because the main uncertainties arise from deviations from the assumption of Gaussian noise. :: from __future__ import print_function, division import numpy as np from PyAstronomy import pyasl import matplotlib.pylab as plt # Number of data points nd = 200 # Input standard deviation istd = 0.2 # Create some synthetic data (Gaussian noise) with # input standard deviation. y = np.random.normal(1.0, istd, nd) # Introduce some outliers # Number of outliers no = 15 for _ in range(no): # Choose an index (could occur more than once) index = np.random.randint(0, high=nd) # Add point from normal distribution with # STD of 50 y[index] = np.random.normal(1.0, 50.0) # Create class instance for equidistant sampling bseq = pyasl.BSEqSamp() # Specify order of approximation (N) and jump parameter (j) for # construction of beta sample N = 0 j = 1 print("Order of approximation: ", N) print("Jump parameter: ", j) print() # Get estimates of standard deviation based on MV estimator and ... smv, dsmv = bseq.betaSigma(y, N, j, returnMAD=False) # ... robust (MAD-based) estimator smad, dsmad = bseq.betaSigma(y, N, j, returnMAD=True) print("Input standard deviation: ", istd) print("Number of 'data points': ", nd) print("Size of beta sample: ", len(bseq.betaSample)) print() print("Minimum-variance estimate: %6.3f +/- %6.3f" % (smv, dsmv)) print("Robust estimate: %6.3f +/- %6.3f" % (smad, dsmad)) plt.subplot(2, 1, 1) plt.title("Synthetic data") plt.plot(y, 'bp') plt.subplot(2, 1, 2) plt.title("Histogram of $\\beta$ sample") plt.hist(bseq.betaSample, 30) plt.show() .. _bsEquisamp: Equidistant vs. arbitrary sampling -------------------------------------------- The classes :py:func:`BSEqSamp` and :py:func:`BSArbSamp` treat the cases of equidistant and arbitrary sampling of the signal explicitly. Equidistant sampling is technically more simple to treat and is often a good approximation *even if* the actual sampling is not equidistant. This is true when there is a "not too complicated" transformation relating the actual and an equidistant sampling axes. The following example demonstrates a case, where equidistant sampling can be assumed but a higher order of approximation is required. :: from __future__ import print_function, division import numpy as np import matplotlib.pylab as plt from PyAstronomy import pyasl def g(t): """ Function determining the behavior of the data. """ return 1.3 - 10.0*t # Number of data points nd = 30 # Creating non-equidistant samping axis (ti) te = np.arange(float(nd)) ti = (te**3) / float(nd**2) # Get values of g(t) gi = g(ti) # Standard deviation of noise istd = 0.3 # Add Gaussian noise to data yi = gi + np.random.normal(0.0, istd, nd) print("Input standard deviation: ", istd) print("Number of 'data points': ", nd) print() # Create class instance for equidistant sampling bseq = pyasl.BSEqSamp() # Create class instance for arbitrary sampling bsar = pyasl.BSArbSamp() # Get estimates assung equidistant and arbitrary sampling # using N = 1 and j = 1. From the definition of g(t), N = 1 # will be sufficient for the case of arbitrary sampling, but # not necessarily for (assumed) equidistant sampling. smv_es, dsmv_es = bseq.betaSigma(yi, 1, 1) smv_as, dsmv_as = bsar.betaSigma(ti, yi, 1, 1) print("Estimates for N=1 and j=1") print(" Equidistant sampling: %5.3f +/- %5.3f" % (smv_es, dsmv_es)) print(" Arbitrary sampling: %5.3f +/- %5.3f" % (smv_as, dsmv_as)) print() # Get estimates for N=2 and 3 assuming equidistant sampling smv_es2, dsmv_es2 = bseq.betaSigma(yi, 2, 1) smv_es3, dsmv_es3 = bseq.betaSigma(yi, 3, 1) print("Estimates for N=2 and 3 based on equidistant sampling") print(" N = 2: %5.3f +/- %5.3f" % (smv_es2, dsmv_es2)) print(" N = 3: %5.3f +/- %5.3f" % (smv_es3, dsmv_es3)) plt.subplot(2, 1, 1) plt.title("Data with true sampling") plt.plot(ti, gi, 'b-') plt.errorbar(ti, yi, yerr=np.ones(nd)*istd, fmt='b+') plt.subplot(2, 1, 2) plt.title("Same data assuming equidistant sampling") plt.plot(te, gi, 'r-') plt.errorbar(te, yi, yerr=np.ones(nd)*istd, fmt='r+') plt.tight_layout() plt.show()