Anke van Dyk

Career Stage
Student (postgraduate)
Poster Abstract

Name: Anke van Dyk
Presentation Title: Capturing Transients

We aim to adapt the Capture-Recapture methodology for Astronomy. Capture-Recapture is a well-established method in biostatistics, where it is used to study animal populations, and in epidemiology, where it is used to analyse the spread of disease. We hypothesise that this method can be applied successfully to transient Astronomy to estimate the underlying populations of transients. Until now, efforts in transient Astronomy have focused on finding and following up transients. This work broadens the field by attempting to measure underlying populations, and it would be complementary to population synthesis because selection effects due to the observational sampling strategy are built in. Exploration of the topic so far has focused on simulating lightcurves of X-ray binaries, following the work of Laycock (2017), and applying population estimators such as the Schnabel estimator to strategically sampled lightcurves of various cadences. These early simulations show promise of recovering the input population to an accuracy of about 20% within 10-15 observations, even with low cadences between 120 and 240 days. The work is being extended to apply these techniques to real astronomical data and to explore population recovery rates under different sampling strategies. Synoptic transient surveys, such as the upcoming LSST, will provide higher (and deeper) completeness for population studies of transients and variable stars and will be of great use for this study. We also aim to characterise the selection biases that the sampling strategy introduces into this population estimation in future projects such as the LSST.

Plain text summary
The biostatistical capture-recapture method presents us with ways of estimating the underlying population sizes of transients and/or variable stars. Recurring transients such as High Mass X-ray Binaries are of interest for this application. These populations can be defined as statistically 'closed' since the population is assumed to remain constant throughout the duration of the consecutive observations.
A simple scenario for estimating a population, which assumes that every individual has an equal probability of capture, is the Schnabel estimator. The population estimate after the i'th observation is a ratio. The numerator is the sum, over the first to the i'th observation, of the product of the total number of individuals counted at an observation and the cumulative number of unique individuals detected up to the previous observation. The denominator is the sum, over the same observations, of the number of re-encountered individuals at each observation, plus 1.
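The running Schnabel estimate described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the poster. The function name and the argument names (counts, marked_before, recaptures) are hypothetical, and the example numbers are made up purely to show the arithmetic.

```python
def schnabel_estimates(counts, marked_before, recaptures):
    """Return the running Schnabel population estimate after each observation.

    counts[t]        : total individuals detected at observation t
    marked_before[t] : unique individuals detected before observation t
    recaptures[t]    : individuals at observation t that were seen before
    (All three names are illustrative assumptions.)
    """
    estimates = []
    numerator = 0          # running sum of counts[t] * marked_before[t]
    recaptured_total = 0   # running sum of recaptures[t]
    for c, m, r in zip(counts, marked_before, recaptures):
        numerator += c * m
        recaptured_total += r
        # The +1 in the denominator follows the definition given in the text.
        estimates.append(numerator / (recaptured_total + 1))
    return estimates


# Made-up example: three observations of a hypothetical population.
example = schnabel_estimates(
    counts=[10, 12, 8],
    marked_before=[0, 10, 18],
    recaptures=[0, 4, 5],
)
print(example)  # [0.0, 24.0, 26.4]
```

The first estimate is always zero because nothing has been seen before the first observation, so the estimator only becomes informative once re-encounters start to accumulate.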
To simulate a population of High Mass X-ray Binaries, we assume that their outbursts occur at periastron. The recurrence time of the outbursts is based on the orbital period of the binary. Six orbital period distributions (A to F), shown in Figure one, are modelled by right-skewed Gaussians. The orbital periods range from one day up to about 350 days, with median values between 100 and 300 days.
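As a rough illustration of drawing orbital periods from a right-skewed Gaussian, the sketch below samples a skew-normal distribution using only the Python standard library. The location, scale and shape values are placeholders and not the parameters of Models A to F, and truncating the draws to the 1 to 350 day range by rejection is an assumption made here for simplicity.

```python
import math
import random


def sample_skewed_period(rng, location, scale, shape, low=1.0, high=350.0):
    """Draw one orbital period (days) from a truncated skew-normal distribution.

    A positive `shape` gives a right-skewed distribution. All parameter
    names and default limits are illustrative assumptions.
    """
    delta = shape / math.sqrt(1.0 + shape * shape)
    while True:
        # Standard construction of a skew-normal variate from two normals.
        z0 = abs(rng.gauss(0.0, 1.0))
        z1 = rng.gauss(0.0, 1.0)
        skewed = delta * z0 + math.sqrt(1.0 - delta * delta) * z1
        period = location + scale * skewed
        # Reject draws outside the allowed period range and try again.
        if low <= period <= high:
            return period


rng = random.Random(42)  # fixed seed so the sketch is repeatable
periods = [sample_skewed_period(rng, 100.0, 80.0, 4.0) for _ in range(2000)]
print(min(periods), max(periods))
```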
Figure two depicts the simulated light curves of a small population of Model A High Mass X-ray Binaries that have randomly scaled outburst amplitudes on a relative flux scale between zero and one. Model A has a median orbital period of about 150 days. The population is shown to be sampled concurrently but randomly at a 7 to 14 day cadence over a 400 day window. A threshold at a relative flux of 0.2 represents an arbitrary detection limit, above which we say that the transient has been detected and below which it has been missed.
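The sampling and thresholding step can be sketched as follows. The outburst profile is not specified in this summary, so the Gaussian-shaped outburst centred on periastron, the `width` parameter and all function names are assumptions made only for illustration. The 7 to 14 day cadence, the 400 day window and the 0.2 threshold are taken from the text.

```python
import math
import random


def observation_times(rng, min_gap=7.0, max_gap=14.0, window=400.0):
    """Random observation epochs with gaps drawn uniformly from the cadence range."""
    times = []
    t = rng.uniform(min_gap, max_gap)
    while t <= window:
        times.append(t)
        t += rng.uniform(min_gap, max_gap)
    return times


def relative_flux(t, period, amplitude, phase, width):
    """Relative flux at time t for an outburst recurring every `period` days.

    Assumes a Gaussian-shaped outburst of characteristic `width` (days)
    peaking at each periastron passage, the first of which is at `phase`.
    """
    dt = (t - phase) % period
    dt = min(dt, period - dt)  # time to the nearest periastron passage
    return amplitude * math.exp(-0.5 * (dt / width) ** 2)


def is_detected(flux, threshold=0.2):
    """A source counts as detected when its flux exceeds the threshold."""
    return flux > threshold


rng = random.Random(1)
times = observation_times(rng)
detections = [is_detected(relative_flux(t, 150.0, 0.8, 20.0, 10.0)) for t in times]
print(len(times), sum(detections))
```

Every source in a simulated population would be evaluated at the same list of observation times, which is what makes the sampling concurrent.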
In order to estimate the population, the detection and non-detection information for each individual at each observation is stored in a capture (a.k.a. an encounter) history. Logistic regression is employed to estimate the population size given representative sampling at each occasion. There are various estimator types that can be employed to deal with assumptions of heterogeneous and temporal capture probability.
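A capture history can be represented as one row per individual and one 0 or 1 flag per observation. The sketch below, with a hypothetical function name and a made-up history, shows how the per-observation quantities used by simple estimators (total detections, previously seen individuals, and re-encounters) could be derived from such a table. It does not implement the logistic regression mentioned above.

```python
def summarise_history(history):
    """Summarise a capture history into per-observation counts.

    history: list of rows, one per individual; each row holds a 0/1 flag
    per observation (1 = detected). Returns three lists, one entry per
    observation: total detected, unique individuals seen before this
    observation, and re-encountered individuals.
    """
    n_obs = len(history[0]) if history else 0
    seen = set()  # indices of individuals detected at any earlier observation
    counts, marked_before, recaptures = [], [], []
    for j in range(n_obs):
        detected = {i for i, row in enumerate(history) if row[j]}
        counts.append(len(detected))
        marked_before.append(len(seen))
        recaptures.append(len(detected & seen))
        seen |= detected
    return counts, marked_before, recaptures


# Made-up history: four individuals, three observations.
history = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 0],  # never detected, so invisible to the observer
]
print(summarise_history(history))  # ([2, 2, 2], [0, 2, 3], [0, 1, 2])
```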
Our simulations and estimation have built into them the parameters of amplitude and duration of outbursts as well as the brightness detection threshold.
Figure 3 shows two plots for a Model A simulated population of 100 High Mass X-ray Binaries. Seven different cadence strategies were investigated: 7 to 14, 14 to 30, 30 to 60, 60 to 90, 60 to 120, 90 to 120, and 120 to 240 days.
A relative threshold of 0.2 has been applied to all of them. On the left we have the cumulative count of individuals detected as a function of observation number. For all of the cadences it reaches 80 percent of the population after 15 or more observations. On the right-hand side, the Schnabel estimator reaches 80 percent of the population in 15 or fewer observations for all but the 7 to 14 day cadence. This shows promise of making reasonable estimates with limited data and serves as a motivator for further transient searches.
The next avenues of investigation include quantifying the robustness of the estimators with regard to variable parameters such as amplitude and duration of outburst, as well as threshold modelling. We are further investigating methods that allow for heterogeneous capture probability and its effect on the accuracy of estimation. Lastly, we are developing a methodology and science case for real astronomical data and potential other applications such as Dwarf Novae Cataclysmic Variables and Fast Radio Burst populations.
Poster Title
Capturing Transients: An application of Biostatistics to Astronomy
Tags
Astronomy
Data Science
Url
anke@saao.ac.za