Cross-correlation¶
- PyAstronomy.pyasl.crosscorrRV(w, f, tw, tf, rvmin, rvmax, drv, mode='doppler', skipedge=0, edgeTapering=None, weights=None, meanwvl=None)¶
Cross-correlate a spectrum with a template.
The algorithm implemented here works as follows: For each RV shift to be considered, the wavelength axis of the template is shifted, either linearly or using a proper Doppler shift depending on the mode. The shifted template is then linearly interpolated at the wavelength points of the observation (spectrum) to calculate the cross-correlation function.
If N data points \(f_i\) at wavelengths \(w_i\) are given, the cross correlation function \(CC(v_j)\), depending on the velocity \(v_j\) and optional weights \(\alpha_i\) is calculated as:
\[CC(v_j) = \sum_{i=1}^N \alpha_i \times \left(f_i \times t(w_i-\Delta_{i,j}) \right)\]If the mode is lin, the shift is implemented as \(\Delta_{i,j} = \bar{w} (v_j/c)\), where \(\bar{w}\) is the mean wavelength (i.e., the mean of the input w unless meanwvl is specified). For any velocity, the shift is a constant across the entire wavelength axis. If the mode is doppler, the shift is implemented as \(\Delta_{i,j} = w_i (v_j/c)\), which depends both on the wavelength and the velocity. The evaluation of the shifted template \(t(w_i-\Delta_{i,j})\) is carried out using linear interpolation.
If no weights are specified, \(\alpha_i = 1\) is adopted. If uncertainties, \(\sigma_i\), for the \(f_i\) are known, a useful choice for the weights is \(\alpha_i = \sigma_i^{-2}\), because this makes maximizing the CCF equivalent to maximizing the likelihood (see, e.g., Lookwood et al. 2014, ApJ 783).
- Parameters
- warray
The wavelength axis of the observation.
- farray
The flux axis of the observation.
- twarray
The wavelength axis of the template.
- tfarray
The flux axis of the template.
- rvminfloat
Minimum radial velocity for which to calculate the cross-correlation function [km/s].
- rvmaxfloat
Maximum radial velocity for which to calculate the cross-correlation function [km/s].
- drvfloat
The width of the radial-velocity steps to be applied in the calculation of the cross-correlation function [km/s].
- modestring, {lin, doppler}, optional
The mode determines how the wavelength axis will be modified to represent a RV shift. If “lin” is specified, a mean wavelength shift will be calculated based on the mean wavelength of the observation. The wavelength axis will then be shifted by that amount. If “doppler” is specified (the default), the wavelength axis will properly be Doppler-shifted.
- skipedgeint, optional
If larger zero, the specified number of bins will be skipped from the begin and end of the observation. This may be useful if the template does not provide sufficient coverage of the observation.
- edgeTaperingfloat or tuple of two floats
If not None, the method will “taper off” the edges of the observed spectrum by multiplying with a sine function. If a float number is specified, this will define the width (in wavelength units) to be used for tapering on both sides. If different tapering widths shall be used, a tuple with two (positive) numbers must be given, specifying the width to be used on the low- and high wavelength end. If a nonzero ‘skipedge’ is given, it will be applied first. Edge tapering can help to avoid edge effects (see, e.g., Gullberg and Lindegren 2002, A&A 390).
- weightsarray of floats, optional
If given, it specifies weights for individual data points (given by f). A possible choice for weights is 1/sigma_i**2 where sigma_i is the uncertainty of the i-th data point.
- meanwvlfloat, optional
If given, this value will be adopted for the mean wavelength \(\bar{w}\) to calculate shifts in ‘lin’ mode instead of the mean of the input array w.
- Returns
- dRVarray
The RV axis of the cross-correlation function. The radial velocity refer to a shift of the template, i.e., positive values indicate that the template has been red-shifted and negative numbers indicate a blue-shift of the template. The numbers are given in km/s.
- CCarray
The cross-correlation function.
See also
An algorithm for finding the extreme points by
parabolic approximation (quadExtreme()
).
Example: Cross-correlation with a Gaussian¶
from __future__ import print_function, division
from PyAstronomy import pyasl
import numpy as np
import matplotlib.pylab as plt
# Create the template
tw = np.linspace(5000, 5010, 1000)
tf = np.exp(-(tw-5004.0)**2/(2.*0.1**2))
# Create data, which are not that well sampled
dw = np.linspace(5000, 5010, 200)
df = np.exp(-(dw-5004.17)**2/(2.*0.1**2))
# Plot template and data
plt.title("Template (blue) and data (red)")
plt.plot(tw, tf, 'b.-')
plt.plot(dw, df, 'r.-')
plt.show()
# Carry out the cross-correlation.
# The RV-range is -30 - +30 km/s in steps of 0.6 km/s.
# The first and last 20 points of the data are skipped.
rv, cc = pyasl.crosscorrRV(dw, df, tw, tf, -30., 30., 30./50., skipedge=20)
# Find the index of maximum cross-correlation function
maxind = np.argmax(cc)
print("Cross-correlation function is maximized at dRV = ", rv[maxind], " km/s")
if rv[maxind] > 0.0:
print(" A red-shift with respect to the template")
else:
print(" A blue-shift with respect to the template")
plt.plot(rv, cc, 'bp-')
plt.plot(rv[maxind], cc[maxind], 'ro')
plt.show()
Example: Cross-correlation including weights¶
from __future__ import print_function, division
from PyAstronomy import pyasl
import numpy as np
import matplotlib.pylab as plt
# Create the template
tw = np.linspace(5000, 5010, 1000)
tf = np.exp(-(tw-5004.0)**2/(2.*0.1**2))
# Create data, which are not that well sampled
dw = np.linspace(5000, 5010, 200)
df = np.exp(-(dw-5004.17)**2/(2.*0.1**2))
df += 0.2*np.exp(-(dw-5004.67)**2/(2.*0.1**2))
# Define errors, use large errors for strong peak
err = np.ones_like(dw) * 0.01
err[(dw > 5004) & (dw < 5004.4)] *= 40
# Plot template and data
plt.subplot(2,1,1)
plt.title("Template (blue) and data (red)")
plt.plot(tw, tf, 'b.-', label="Template")
plt.errorbar(dw, df, yerr=err, fmt='r+-', label="Date")
plt.legend()
# Carry out the cross-correlation.
# The RV-range is -30 - +100 km/s in steps of 0.6 km/s.
# The first and last 50 points of the data are skipped.
rv, cc = pyasl.crosscorrRV(dw, df, tw, tf, -30., 100., 1., skipedge=50)
# Include weights (1/err**2)
rvw, ccw = pyasl.crosscorrRV(dw, df, tw, tf, -30., 100., 1., skipedge=50, \
weights=1/err**2)
plt.subplot(2,1,2)
plt.plot(rv, cc/max(cc), 'b-', label="normalized unweighted CCF")
plt.plot(rvw, ccw/max(ccw), 'r.-', label="normalized weighted CCF")
plt.legend()
plt.show()