Reproducing Boone 2021
Overview
The details of the ParSNIP model are documented in Boone 2021. To reproduce all of the results in that paper, follow the steps below.
Installing ParSNIP
Install the ParSNIP software package following the instructions on the Installation page.
Downloading the data
From the desired working directory, run the following scripts on the command line to download the PLAsTiCC and PS1 datasets into the ./data/ directory.
Download PS1:
$ lcdata_download_ps1
Download PLAsTiCC (warning, this can take a long time):
$ lcdata_download_plasticc
Build a combined PLAsTiCC training set for ParSNIP:
$ parsnip_build_plasticc_combined
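Before moving on to training, it can help to confirm that the dataset files used by the later steps are in place. The following is a minimal sketch, assuming the scripts above write ps1.h5, plasticc_train.h5, plasticc_test.h5 and plasticc_combined.h5 into ./data/. Those names are inferred from the training and prediction commands on this page. check_data is a hypothetical helper and is not part of ParSNIP or lcdata.

```shell
# Hypothetical helper: report which expected dataset files are present.
# The file names are an assumption, inferred from the paths used by the
# training and prediction commands on this page.
check_data() {
    data_dir="${1:-./data}"
    missing=0
    for name in ps1 plasticc_train plasticc_test plasticc_combined; do
        if [ -f "$data_dir/$name.h5" ]; then
            echo "found   $data_dir/$name.h5"
        else
            echo "missing $data_dir/$name.h5"
            missing=$((missing + 1))
        fi
    done
    # Return a non-zero status if any file is missing.
    [ "$missing" -eq 0 ]
}
```

Calling check_data ./data from the working directory prints one line per file and returns a non-zero status if any download needs to be repeated.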
Training the ParSNIP model
Note: Model training is much faster if a GPU is available. By default, ParSNIP will attempt to use the GPU if there is one and fall back to the CPU if not. This can be overridden by passing e.g. --device cpu to the parsnip_train script, where cpu is the desired PyTorch device.
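As an illustration of the override, the sketch below builds a parsnip_train command line with an optional device argument and prints it rather than running it. train_cmd is a hypothetical helper introduced here for illustration. Only the --device option itself comes from the note above.

```shell
# Hypothetical helper: print the parsnip_train command for a given device.
# An empty device argument leaves ParSNIP's default behaviour
# (GPU if available, otherwise CPU) untouched.
train_cmd() {
    model="$1"
    dataset="$2"
    device="${3:-}"
    if [ -n "$device" ]; then
        echo "parsnip_train $model $dataset --device $device"
    else
        echo "parsnip_train $model $dataset"
    fi
}

# Force CPU training for the PS1 model.
train_cmd ./models/parsnip_ps1.pt ./data/ps1.h5 cpu
```

The last line prints parsnip_train ./models/parsnip_ps1.pt ./data/ps1.h5 --device cpu, which can then be run directly.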
Train a PS1 model using the full dataset (1 hour):
$ parsnip_train \
./models/parsnip_ps1.pt \
./data/ps1.h5
Train a PS1 model with a held-out validation set (1 hour):
$ parsnip_train \
./models/parsnip_ps1_validation.pt \
./data/ps1.h5 \
--split_train_test
Train a PLAsTiCC model using the full dataset (1 day):
$ parsnip_train \
./models/parsnip_plasticc.pt \
./data/plasticc_combined.h5
Train a PLAsTiCC model with a held-out validation set (1 day):
$ parsnip_train \
./models/parsnip_plasticc_validation.pt \
./data/plasticc_combined.h5 \
--split_train_test
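Since the four training runs above take roughly two days in total, it may be convenient to queue them in a single script. The following is a sketch only. run and train_all are hypothetical helpers, DRY_RUN is a hypothetical variable, and the mkdir step assumes that the ./models directory may not exist yet. Any extra arguments, such as --device cpu, are passed through to every parsnip_train call.

```shell
# Hypothetical helper: execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run the four training jobs from this page in sequence, stopping at the
# first failure. Extra arguments are appended to every parsnip_train call.
train_all() {
    # Assumption: the output directory may need to be created first.
    run mkdir -p ./models || return 1
    run parsnip_train ./models/parsnip_ps1.pt ./data/ps1.h5 "$@" || return 1
    run parsnip_train ./models/parsnip_ps1_validation.pt ./data/ps1.h5 \
        --split_train_test "$@" || return 1
    run parsnip_train ./models/parsnip_plasticc.pt \
        ./data/plasticc_combined.h5 "$@" || return 1
    run parsnip_train ./models/parsnip_plasticc_validation.pt \
        ./data/plasticc_combined.h5 --split_train_test "$@"
}
```

Setting DRY_RUN=1 before calling train_all prints the five commands without running them, which is a cheap way to check the paths before committing to the long runs.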
Generating predictions
Generate predictions for the PS1 dataset (< 1 min):
$ parsnip_predict ./predictions/parsnip_predictions_ps1.h5 \
./models/parsnip_ps1.pt \
./data/ps1.h5
Generate predictions for the PS1 dataset with 100-fold augmentation (3 min):
$ parsnip_predict ./predictions/parsnip_predictions_ps1_aug_100.h5 \
./models/parsnip_ps1.pt \
./data/ps1.h5 \
--augments 100
Generate predictions for the PLAsTiCC combined training dataset (7 min):
$ parsnip_predict ./predictions/parsnip_predictions_plasticc_combined.h5 \
./models/parsnip_plasticc.pt \
./data/plasticc_combined.h5
Generate predictions for the PLAsTiCC training set with 100-fold augmentation (4 min):
$ parsnip_predict ./predictions/parsnip_predictions_plasticc_train_aug_100.h5 \
./models/parsnip_plasticc.pt \
./data/plasticc_train.h5 \
--augments 100
Generate predictions for the full PLAsTiCC dataset (1 hour):
$ parsnip_predict ./predictions/parsnip_predictions_plasticc_test.h5 \
./models/parsnip_plasticc.pt \
./data/plasticc_test.h5
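Because the longest prediction run takes about an hour, one option is to skip any prediction whose output file already exists when rerunning the steps above. The sketch below is one way to do that. predict_once is a hypothetical helper, and PREDICT_CMD is a hypothetical variable that defaults to parsnip_predict and exists only so the helper can be tried without ParSNIP installed. The mkdir step assumes the ./predictions directory may not exist yet.

```shell
# Hypothetical helper: run parsnip_predict only if the output file is absent.
# Usage: predict_once OUTPUT MODEL DATASET [extra parsnip_predict options]
predict_once() {
    output="$1"
    shift
    if [ -f "$output" ]; then
        echo "skip $output (already exists)"
        return 0
    fi
    # Assumption: the output directory may need to be created first.
    mkdir -p "$(dirname "$output")" || return 1
    # PREDICT_CMD is a hypothetical override, defaulting to parsnip_predict.
    ${PREDICT_CMD:-parsnip_predict} "$output" "$@"
}
```

For example, predict_once ./predictions/parsnip_predictions_ps1_aug_100.h5 ./models/parsnip_ps1.pt ./data/ps1.h5 --augments 100 corresponds to the second command above. Note that this check only looks at whether the file exists, so an output left behind by an interrupted run would need to be deleted by hand.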
Figures and analysis
All of the figures and analysis in Boone 2021 were produced with Jupyter notebooks that are available on GitHub. To rerun these notebooks, copy the notebooks folder to the working directory and run the notebooks from within that folder.
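The notebooks can be opened and run interactively. As an alternative, the sketch below executes every notebook in the folder non-interactively with jupyter nbconvert. This is an assumption about workflow rather than the procedure used for the paper: the notebooks may expect to be run in a particular order or to be stepped through by hand. run_notebooks is a hypothetical helper, and JUPYTER is a hypothetical variable that defaults to jupyter.

```shell
# Hypothetical helper: execute every notebook in a folder in place,
# in alphabetical order, from within that folder.
# The body runs in a subshell so the caller's working directory is unchanged.
run_notebooks() (
    cd "${1:-./notebooks}" || exit 1
    for nb in *.ipynb; do
        # Skip the unexpanded pattern if the folder has no notebooks.
        [ -e "$nb" ] || continue
        ${JUPYTER:-jupyter} nbconvert --to notebook --execute --inplace "$nb" \
            || exit 1
    done
)
```

Calling run_notebooks ./notebooks from the working directory keeps the relative ./data, ./models and ./predictions paths out of the picture, since each notebook is executed from inside the notebooks folder as described above.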