Poisson errors, fitting positions (due 10 Oct 2025)

1. Numerical simulations

In this problem set, you will use numerical simulations to test some statistical ideas, some of which we already discussed in class. You need to write your own code. To help you get started, you will likely need numpy.random.poisson and scipy.optimize.leastsq.

For evaluation, you need to submit code (can be via a link) that is written to be (reasonably) self-explanatory (with docstrings, etc.), and that can be run to produce your results (graphs, a little table, etc.). This should be accompanied with a (short) document that discusses the statistical implications of the results.

Note: if you need help with writing code, do ask! A goal of this problem set is to get you to a point where you don't hesitate to do a Monte-Carlo simulation if it seems useful.

2. Poisson distribution

Often, in analyzing Poisson-distributed data, one will assume that the uncertainties are approximately normally-distributed with \(\sigma=\sqrt{N}\). This can be fine, but one has to worry about biases, in particular that values that are low by chance get larger weight. Verify that weighting can indeed be problematic, by doing a simulation in which you do \(n=100\) trials of \(m=100\) points each for Poisson-distributed data with mean \(\mu=100\), and calculate both the straight average and a weighted average. Make a histogram of these averages, and discuss in a few words why the two give inconsistent results. Also try one or two different mean values μ and look for consistencies in the deviations. (For more discussion, see Mighell 1999, ApJ, 518, 380.)

3. Fitting gaussians

Now write a routine that simulates a one-dimensional image of a star, with model data \(M(x)\) given by, \[ M(x) = C + A \frac{1}{\sqrt{2\pi}\sigma_{w}} \exp\left[-\frac12\left(\frac{x-x_c}{\sigma_w}\right)^{2}\right]. \] For definiteness, use pixels \(x=0\ldots99\), sky \(C=100\) photons, amplitude \(A=5000\) photons, and width \(\sigma_{w}=2\) pixels. Now add Poisson noise to all these data points, and do a non-linear least-squares fit with a Gaussian (assuming uncertainties of \(\sqrt{N}\), i.e., ignoring the minor deficiency arising from that assumption that you found above). Repeat 100 times and compare the typical uncertainty on \(x_{c}\) inferred from the covariance matrix with the scatter between different trials. How does the scatter relate to the width \(\sigma_{w}\) and the signal-to-noise ratio?

4. Gaussian fit to non-Gaussian PSF

On purpose, make stars that do not conform to a simple Gaussian, e.g., by replacing the above model with (feel free to do something else),

\begin{eqnarray} M(x) &=& C + A(1-\alpha)\frac{1}{\sqrt{2\pi}\sigma_{w}} \exp\left[-\frac12\left(\frac{x-x_{c}}{\sigma_{w}}\right)^{2}\right]\nonumber\\ &+&A\alpha\frac{1}{\sqrt{2\pi}\sigma_{w,2}} \exp\left[-\frac12\left(\frac{x-x_{c}-\delta_x}{\sigma_{w,2}}\right)^{2}\right]. \end{eqnarray}

This mimics a poor PSF, in which a fraction \(\alpha=0.2\) of the light is in, say, a wider PSF with width \(\sigma_{w,2}=4\) pixels, which is offset from the good ``core'' by \(\delta_x=1\) pixel. Again, add Poisson noise and fit the model star, but fit it with the simple model from the previous question (i.e., ignore the second Gaussian in the fit). Compared to the good fits, what happens to \(\chi^2\)? What happens to the uncertainty on \(x_{c}\) inferred from the covariance matter? How does the latter compare to the offset from the true position? And how does it compare to the scatter around the mean from the different trials? If you wanted to measure differences in positions on an image, what would you suggest is a good estimate for the uncertainties on these position differences?