DESI DR1 Stream Tutorial¶
Welcome to the 2025 version of our stellar stream characterization tutorial notebook. This notebook will walk you through using data from the DESI Milky Way Survey data from the first full public release. For more information on the DESI DR1 stellar catalogue from the Milky Way Survey, see the README.md
file.
Coments like the one below highlight the pieces of code you're expected to tweak in your search for stellar streams.
#-----------------------#
Example_variable = 1000000000000e10 #Look here for stuff to change throughout the notebook!
#-----------------------#
Below, we're going to import the packages we'll be using.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import scipy as sp
import scipy.stats as stats
from astropy.io import fits
from astropy import table
import matplotlib
import matplotlib.patheffects as path_effects
import importlib
import stream_functions as stream_funcs
importlib.reload(stream_funcs)
import emcee
import corner
from astropy.table import Table
from astropy import units as u
from collections import OrderedDict
import time
from scipy import optimize, stats
import matplotlib.colors as mcolors
colors = mcolors.CSS4_COLORS
color_names = list(colors.keys())
import streamTutorial as st
import copy
importlib.reload(st);
Next, let's input our DESI MWS DR1 data.
importlib.reload(st)
# Add the path to the DESI data and STREAMFINDER data below
#-----------------------#
desi_path = '../mwsall-pix-iron.fits'
sf_path = './data/streamfinder_gaiadr3.fits'
#-----------------------#
Data = st.Data(desi_path, sf_path)
print('Now our desi data has been loaded under Data.desi_data')
# _______
# ( ^ ^ )š
# | ~ |
# |_______|
# ~ "Well done!"
Length of DESI Data before Cuts: 6372607 Length after NaN cut: 4075716 Now our desi data has been loaded under Data.desi_data
Pick Stream¶
We'll pick a stream to work with by entering it in
#-----------------------#
st.stream(Data, streamName='SoI-I21', streamNo=42)
#-----------------------#
Initializing this stream object will add an attribute to the Data
object, accessed either through
Data.confirmed_sf_and_desi
or SoI.data.confirmed_sf_and_desi
Here is the table to reference in picking your stream:
SF3 Stream Index | Stream | # SF3 Stars | # SF3 Stars in DESI | # of SF3 Stars with VHel | Galstream Stream name | Metallicities | Distance_kpc |
---|---|---|---|---|---|---|---|
1 | C-20 | 29.0 | 8.0 | 8.0 | C-20-I24 | -2.93 | |
3 | NGC | 173.0 | 3.0 | 7.0 | False | -1.32 | |
4 | ATLAS | 208.0 | 3.0 | 80.0 | AAU-ATLAS | -2.2 | |
6 | Kwando | 138.0 | 32.0 | 62.0 | Kwando-I21 | -2.29 | 8.2 |
8 | New-2 | 90.0 | 20.0 | 18.0 | New-2-I24 | -2.07 | |
10 | Gaia-12 | 46.0 | 9.0 | 5.0 | Gaia-12-I21 | -3.28 | 13.3 |
17 | New-3 | 120.0 | 3.0 | 10.0 | New-3-I24 | -2.71 | |
19 | Leiptr | 412.0 | 26.0 | 37.0 | Leiptr-I21 | -2.17 | 8.9 |
22 | New-6 | 146.0 | 12.0 | 12.0 | New-6-I24 | -2.18 | |
24 | C-25 | 121.0 | 40.0 | 25.0 | C-25-I24 | -2.3 | |
25 | C-11 | 112.0 | 33.0 | 9.0 | C-11-I24 | -2.91 | |
27 | New-7 | 37.0 | 15.0 | 11.0 | New-7-I24 | -2.23 | |
28 | New-8 | 7.0 | 1.0 | 1.0 | New-8-I24 | -1.99 | |
29 | New-9 | 10.0 | 6.0 | 7.0 | New-9-I24 | -1.95 | |
30 | C-10 | 158.0 | 49.0 | 5.0 | C-10-I24 | -0.91 | |
31 | New-10 | 47.0 | 1.0 | 7.0 | New-10-I24 | -0.99 | |
32 | New-11 | 18.0 | 7.0 | 6.0 | New-11-I24 | -2.03 | |
33 | Gaia-10 | 141.0 | 45.0 | 26.0 | Gaia-10-I21 | -1.4 | 17.4 |
34 | Gjoll | 607.0 | 18.0 | 40.0 | NGC3201-P21 | -1.63 | |
35 | C-9 | 183.0 | 70.0 | 35.0 | C-9-I24 | -0.72 | |
36 | C-24 | 244.0 | 86.0 | 27.0 | C-24-I24 | -0.93 | |
39 | Slidr | 330.0 | 148.0 | 34.0 | Slidr-I21 | -1.7 | 2.4 |
41 | Ylgr | 919.0 | 14.0 | 20.0 | Ylgr-I21 | -2.09 | 8.6 |
42 | Sylgr | 256.0 | 109.0 | 33.0 | Sylgr-I21 | -2.92 | 2.4 |
43 | New-15 | 152.0 | 74.0 | 23.0 | New-15-I24 | -2.01 | |
47 | Fjorm | 297.0 | 104.0 | 29.0 | M68-P19 | -2.23 | |
48 | Orphan | 867.0 | 113.0 | 247.0 | Orphan-K23 | -1.9 | |
49 | Gaia-1 | 200.0 | 44.0 | 28.0 | Gaia-1-I21 | -1.8 | 5.0 |
50 | C-23 | 29.0 | 12.0 | 6.0 | C-23-I24 | -2.36 | |
51 | LMS-1 | 358.0 | 89.0 | 29.0 | LMS1 | -2.09 | 16.2 |
52 | C-22 | 39.0 | 17.0 | 11.0 | C-22-I24 | -2.05 | |
53 | GD-1 | 1468.0 | 670.0 | 323.0 | GD-1-I21 | -2.49 | 9.2 |
56 | New-17 | 13.0 | 2.0 | 0.0 | New-17-I24 | -2.35 | |
57 | NGC5466 | 43.0 | 14.0 | 6.0 | NGC5466-J21 | -1.98 | 16.6 |
58 | New-18 | 107.0 | 33.0 | 1.0 | New-18-I24 | -0.94 | |
59 | Gaia-6 | 145.0 | 63.0 | 14.0 | Gaia-6-I21 | -1.53 | 8.5 |
60 | New-19 | 40.0 | 22.0 | 4.0 | New-19-I24 | -0.79 | |
61 | Pal-5 | 129.0 | 33.0 | 69.0 | Pal5 | -1.41 | |
62 | M5 | 91.0 | 33.0 | 11.0 | M5-G19 | -1.29 | 15.9 |
63 | Kshir | 141.0 | 50.0 | 7.0 | Kshir-I21 | -1.83 | 10.1 |
64 | Svol | 234.0 | 59.0 | 16.0 | Svol-I21 | -1.98 | 8.9 |
65 | Gaia-9 | 233.0 | 78.0 | 25.0 | Gaia-9-I21 | -2.21 | 3.8 |
68 | M92 | 202.0 | 140.0 | 23.0 | M92-I21 | -2.31 | 9.2 |
70 | Gaia-11 | 84.0 | 11.0 | 3.0 | Gaia-11-I21 | -1.19 | 11.5 |
71 | Hrid | 666.0 | 46.0 | 29.0 | Hrid-I21 | -1.13 | 3.2 |
74 | New-21 | 762.0 | 61.0 | 3.0 | New-21-I24 | -0.87 | |
75 | Phlegethon | 632.0 | 85.0 | 41.0 | Phlegethon-I21 | -2.19 | 3.5 |
76 | NGC7089 | 15.0 | 2.0 | 2.0 | M2-I21 | -1.65 | |
79 | New-23 | 44.0 | 4.0 | 6.0 | New-23-I24 | -2.25 | |
80 | New-24 | 56.0 | 1.0 | 7.0 | New-24-I24 | -0.64 | |
82 | New-26 | 43.0 | 11.0 | 4.0 | New-26-I24 | -0.17 | |
84 | NGC7492 | 26.0 | 1.0 | 1.0 | NGC7492-I24 | -1.78 | |
85 | C-19 | 46.0 | 6.0 | 12.0 | C-19-I21 | -3.58 | 18.0 |
86 | Jhelum | 986.0 | 1.0 | 160.0 | Jhelum-a/b | -2.12 | 10.9 |
importlib.reload(stream_funcs)
#-----------------------#
streamName= 'Sylgr-I21'#'Fjorm-I21' #'Sylgr-I21'
streamNo= 42# 47 #42
#-----------------------#
importlib.reload(st)
SoI = st.stream(Data, streamName, streamNo)
Importing galstreams module... Initializing galstreams library from master_log... 2.4107524852223685 Creating combined DataFrame of SF and DESI No original data available for comparison - cut_confirmed_sf_and_desi is empty Number of stars in SF: 256, Number of DESI and SF stars: 101 Saved merged DataFrame as self.data.confirmed_sf_and_desi Stars only in SF3: 155
Let us take a look at our stellar stream below:
Figure 1a¶
importlib.reload(st)
plt_soi = st.StreamPlotter(SoI)
plt_soi.on_sky(stream_frame=False)
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:762: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ignoring the edgecolor in favor of the facecolor. This behavior may change in the future. ax.scatter(
We also calculated the stream coordinates $\phi_1$ and $\phi_2$. This is achieved by rotating the right ascension and declination such that the length of the stream lies along $\phi_2 \sim 0$, and the center of the stream is $\phi_1 \sim 0$.
Figure 1b¶
plt_soi.on_sky(stream_frame=True)
Great! Lets move on now.
We have lots of DESI data, lets cut that down with the following steps:
- Perform an RA/DEC cut to remove stars on different parts of the sky
- Perform a distance cut to remove stars too nearby
- Trim stars that are too metal rich
- Trim stars that fall outside of the best-fit stellar population model
- Trim stars with kinematics nothing like the STREAMFINDER stars
Spatial Cuts¶
The first two bullet points can be achieved using a handy function we have stored in stream_functions.py
stream_funcs.threeD_max_min_mask
.
importlib.reload(st)
selection_fine = st.Selection(SoI.data.desi_data)
# We want to get the ra and dec from STREAMFINDER stars for this function so we can cut around it
#-----------------------#
ra_cut = 5 #deg
dec_cut = 5 #deg
#-----------------------#
selection_fine.add_mask(name='3D',
mask_func=lambda df: stream_funcs.threeD_max_min_mask(
df['TARGET_RA'],
df['TARGET_DEC'],
df['PARALLAX'],
df['PARALLAX_ERROR'],
SoI.data.SoI_streamfinder['RAdeg'],
SoI.data.SoI_streamfinder['DEdeg'],
SoI.min_dist,
ra_cut,dec_cut) #<-----------------------# dec cut, ra cut based on streamfinder star values
)
Selection object created for DataFrame with 4075716 rows. Mask added: '3D'
Once we've done all our cuts, we can get all our masks in one using mask = selection_fine.get_masks(['mask 1', ... 'mask n'])
Figure 2¶
importlib.reload(st)
# Lets take a look at our trimmed DESI data
mask = selection_fine.get_masks(['3D'])
# Using the new method that replaces the 4-line pattern
trimmed_stream = SoI.mask_stream(mask)
plt_trim = st.StreamPlotter(trimmed_stream)
plt_trim.on_sky(showStream=True, background=True, stream_frame=False)
plt_trim.on_sky(showStream=True, background=True)
...'3D' selected 215560 stars Selection for specified masks: 215560 / 4075716 stars. Created cut_confirmed_sf_and_desi with 1 stars that were filtered out Number of stars in SF: 256, Number of DESI and SF stars: 100 Saved merged DataFrame as self.data.confirmed_sf_and_desi_b Stars only in SF3: 156 /Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:762: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ignoring the edgecolor in favor of the facecolor. This behavior may change in the future. ax.scatter(
Lets also look at our parallax cut. We use the calculation of plx - 2 * plx_err < 1/min_dist. This simply gets rid of nearby stars that, after subtracting 2 times the parallax error, are still below minimum parallax.
Figure 3¶
plt_trim.plx_cut()
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:828: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ignoring the edgecolor in favor of the facecolor. This behavior may change in the future. ax.scatter(
Now that we've visualized the stream in the stream-frame, we can get a bit more specific with our on-sky cuts. We can narrow in by only keeping stars within some range of $\phi_2$ = 0.
#-----------------------#
phi2_wiggle = 5 #[deg]
#-----------------------#
selection_fine.add_mask(name='phi2',
mask_func=lambda df: (df['phi2'] < phi2_wiggle) & (df['phi2'] > -1*phi2_wiggle))
Mask added: 'phi2'
Figure 4¶
importlib.reload(st)
# Lets take a look at our trimmed DESI data
mask = selection_fine.get_masks(['3D', 'phi2'])
trimmed_stream = SoI.mask_stream(mask)
plt_trim = st.StreamPlotter(trimmed_stream)
plt_trim.on_sky(showStream=True, background=True, stream_frame=False)
plt_trim.on_sky(showStream=True, background=True)
...'3D' selected 215560 stars ...'phi2' selected 593800 stars Selection for specified masks: 120082 / 4075716 stars. Created cut_confirmed_sf_and_desi with 1 stars that were filtered out Number of stars in SF: 256, Number of DESI and SF stars: 100 Saved merged DataFrame as self.data.confirmed_sf_and_desi_b Stars only in SF3: 156 /Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:762: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ignoring the edgecolor in favor of the facecolor. This behavior may change in the future. ax.scatter(
Metallicity Cut¶
Stars in a stellar stream all share some chemical characteristics, and we can use this to cut down on the contaminents in our field. Lets grab the stream's known metallicity from STREAMFINDER and make a cut using that information.
sf3_table = pd.read_csv('./data/sf3_only_table.csv')
sf_streamname = streamName.rsplit('-', 1)[0] #If this fails, manually enter the stream name from the first table.
metallicity = sf3_table[sf3_table['Stream'] == sf_streamname]['Metallicities'].values
print(f"{sf_streamname} Metallicity: {metallicity}")
print(f'Mass Fraction (Z) Guess: {0.0181 * 10 ** metallicity}')
Sylgr Metallicity: [-2.92] Mass Fraction (Z) Guess: [2.17609863e-05]
Figure 5¶
plt_trim.plot_params['background']['alpha']=0.05
plot = plt_trim.feh_plot(showStream=True, background=True)
ax = plot[1]
ax.axhline(metallicity, color='g', linestyle='solid', label='Metallicity from SF3', alpha=0.5)
ax.legend()
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:1006: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ignoring the edgecolor in favor of the facecolor. This behavior may change in the future. ax.scatter(
<matplotlib.legend.Legend at 0x163df37d0>
Now lets trim stars that are too metal rich!
#-----------------------#
feh_cut = -1.5 # [dex]
#-----------------------#
selection_fine.add_mask(name='feh',
mask_func=lambda df: (df['FEH'] < feh_cut))
Mask added: 'feh'
Figure 6¶
#del trimmed_stream, plt_trim # clear memory
importlib.reload(st)
mask = selection_fine.get_masks(['3D', 'phi2', 'feh'])
trimmed_stream = SoI.mask_stream(mask)
plt_trim = st.StreamPlotter(trimmed_stream)
plot = plt_trim.feh_plot(showStream=True, background=True)
ax = plot[1]
ax.axhline(metallicity, color='g', linestyle='solid', label='Metallicity from SF3', alpha=0.5)
ax.axhline(feh_cut, color='red', linestyle='--', label='Cut')
ax.legend(loc='upper right')
...'3D' selected 215560 stars ...'phi2' selected 593800 stars ...'feh' selected 418490 stars Selection for specified masks: 24559 / 4075716 stars. Created cut_confirmed_sf_and_desi with 6 stars that were filtered out Number of stars in SF: 256, Number of DESI and SF stars: 95 Saved merged DataFrame as self.data.confirmed_sf_and_desi_b Stars only in SF3: 161 /Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:1006: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ignoring the edgecolor in favor of the facecolor. This behavior may change in the future. ax.scatter(
<matplotlib.legend.Legend at 0x163ab3810>
Isochrone Cut¶
When searching for stellar streams, an isochrone cut helps isolate stars that are likely part of the same stellar population. Streams originate from disrupted star clusters or dwarf galaxies, whose stars share a common age and chemical composition. By selecting only stars lying close to a theoretical stellar evolution isochrone (in colourāmagnitude space), we suppress contamination from unrelated field stars with different ages and metallicities.
The position of an isochrone in colour-magnitude space depends on the age of the stellar population (~shifts it up and down) and the metallicity (~shifts it left and right)
#-----------------------#
colour_wiggle = 0.15 # [dex]
age = 13.5 # [Gyr] -> of form 10.0, 10.1, 10.2, etc
metallicity =metallicity # may want to adjust the metallicity of the isochrone manually
#-----------------------#
SoI.isochrone(metallicity, age=age)
selection_fine.add_mask(name='iso',
mask_func=lambda df: (stream_funcs.betw(SoI.data.desi_colour_idx, SoI.isochrone_fit(SoI.data.desi_abs_mag), colour_wiggle)))
Mass Fraction (Z): [2.17609863e-05] using ./data/dotter/iso_a13.5_z0.00006.dat Using distance gradient /Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:635: RuntimeWarning: invalid value encountered in log10 R_mag = r_mag - 5 * np.log10(distances) + 5 Mask added: 'iso'
Figure 7¶
importlib.reload(st)
mask = selection_fine.get_masks(['3D', 'phi2', 'feh', 'iso'])
trimmed_stream = SoI.mask_stream(mask)
trimmed_stream.isochrone(metallicity, age=age)
plt_trim = st.StreamPlotter(trimmed_stream)
plt_trim.plot_params['background']['alpha']=0.1
plt_trim.iso_plot(wiggle=colour_wiggle, background=True, showStream=True)
...'3D' selected 215560 stars ...'phi2' selected 593800 stars ...'feh' selected 418490 stars ...'iso' selected 844332 stars Selection for specified masks: 10990 / 4075716 stars. Created cut_confirmed_sf_and_desi with 19 stars that were filtered out Number of stars in SF: 256, Number of DESI and SF stars: 82 Saved merged DataFrame as self.data.confirmed_sf_and_desi_b Stars only in SF3: 174 Mass Fraction (Z): [2.17609863e-05] using ./data/dotter/iso_a13.5_z0.00006.dat Using distance gradient /Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:1060: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ignoring the edgecolor in favor of the facecolor. This behavior may change in the future. ax.scatter(
Kinematic Cuts¶
DESI's radial velocity measurements are generally more reliable than from the Gaia RVS data (which the STREAMFINDER Catalogue is based off of). We may see some obvious outliers in the velocity data, and we will want to cut those out.
Figure 8¶
importlib.reload(st)
plt_trim.plot_params['background']['alpha']=0.08
plt_trim.kin_plot(showStream=True, background=True, show_sf_only=False)
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:885: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ignoring the edgecolor in favor of the facecolor. This behavior may change in the future. ax[0].scatter( /Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:890: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ignoring the edgecolor in favor of the facecolor. This behavior may change in the future. ax[1].scatter( /Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:895: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x'). Matplotlib is ignoring the edgecolor in favor of the facecolor. This behavior may change in the future. ax[2].scatter(
#-----------------------#
vgsr_max=-100 #[km/s]
vgsr_min=-400
#-----------------------#
selection_fine.add_mask(name='VGSR',
mask_func=lambda df: (df['VGSR'] < vgsr_max) & (df['VGSR'] > vgsr_min))
Mask added: 'VGSR'
#-----------------------#
pmra_wiggle = 8 # [mas/yr]
#-----------------------#
selection_fine.add_mask(
name='PMRA',
mask_func=lambda df: (
(
df['PMRA'] >= np.interp(df['phi1'], [SoI.data.SoI_streamfinder['phi1'].min(), SoI.data.SoI_streamfinder['phi1'].max()],
[SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmin(), 'pmRA'] - pmra_wiggle,
SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmax(), 'pmRA'] - pmra_wiggle])
) &
(
df['PMRA'] <= np.interp(df['phi1'], [SoI.data.SoI_streamfinder['phi1'].min(), SoI.data.SoI_streamfinder['phi1'].max()],
[SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmin(), 'pmRA'] + pmra_wiggle,
SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmax(), 'pmRA'] + pmra_wiggle])
) ))
Mask added: 'PMRA'
#-----------------------#
pmdec_wiggle = 8 # [mas/yr]
#-----------------------#
selection_fine.add_mask(
name='PMDEC',
mask_func=lambda df: (
(
df['PMDEC'] >= np.interp(df['phi1'], [SoI.data.SoI_streamfinder['phi1'].min(), SoI.data.SoI_streamfinder['phi1'].max()],
[SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmin(), 'pmDE'] - pmdec_wiggle,
SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmax(), 'pmDE'] - pmdec_wiggle])
) &
(
df['PMDEC'] <= np.interp(df['phi1'], [SoI.data.SoI_streamfinder['phi1'].min(), SoI.data.SoI_streamfinder['phi1'].max()],
[SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmin(), 'pmDE'] + pmdec_wiggle,
SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmax(), 'pmDE'] + pmdec_wiggle])
) ))
Mask added: 'PMDEC'
It is sometimes easier to do wider cuts in PMRA/PMDEC for finding the stream by-eye, you can use those masks instead below.
pmra_max = 0 # [mas/yr]
pmra_min = -25
pmdec_max = 0 # [mas/yr]
pmdec_min = -25
#-----------------------#
selection_fine.add_mask(name='PMRA_wide',
mask_func=lambda df: (df['PMRA'] < pmra_max) & (df['PMRA'] > pmra_min))
selection_fine.add_mask(name='PMDEC_wide',
mask_func=lambda df: (df['PMDEC'] < pmdec_max) & (df['PMDEC'] > pmdec_min))
Mask added: 'PMRA_wide' Mask added: 'PMDEC_wide'
Lets look at our cuts!¶
selection_fine.list_masks()
Active masks: - 3D - phi2 - feh - iso - VGSR - PMRA - PMDEC - PMRA_wide - PMDEC_wide
Figure 9¶
importlib.reload(st)
# Lets take a look at our trimmed DESI data
#-----------------------#
mask = selection_fine.get_masks(['3D', 'phi2', 'feh','iso', 'VGSR', 'PMRA_wide', 'PMDEC_wide']) # TIP 'PMRA_wide', 'PMDEC_wide' can be used instead if defined
#-----------------------#
# Using the new method that replaces the 4-line pattern
# VGSR is now automatically computed for confirmed_sf_not_desi within mask_stream()
trimmed_stream = SoI.mask_stream(mask)
plt_trim = st.StreamPlotter(trimmed_stream)
plt_trim.plot_params['sf_in_desi']['alpha']= 0.9
plt_trim.plot_params['background']['alpha']= 0.5
plt_trim.plot_params['background']['s'] = 10
#-----------------------#
plt_trim.kin_plot(showStream=False, show_sf_only=False, background=True) # You can change showStream to True to see where the STREAMFINDER stars are in this space.
# If there aren't many SF stars in DESI, you can show the stars not in DESI as a guide by setting show_sf_only to True
#-----------------------#
...'3D' selected 215560 stars ...'phi2' selected 593800 stars ...'feh' selected 418490 stars ...'iso' selected 844332 stars ...'VGSR' selected 655452 stars ...'PMRA_wide' selected 2644812 stars ...'PMDEC_wide' selected 3118757 stars Selection for specified masks: 1361 / 4075716 stars. Created cut_confirmed_sf_and_desi with 22 stars that were filtered out Number of stars in SF: 256, Number of DESI and SF stars: 79 Saved merged DataFrame as self.data.confirmed_sf_and_desi_b Stars only in SF3: 177
Lets look at an example of a stream (Sylgr-I21) being spotted once our initial cuts have been made. We can see lines of overdensities in all three kinematic dimensions!
Furthermore, we can overlay the STREAMFINDER stars to see that this overdensity lies on the previously disovered stream, and which SF stars we have cutout.
If you don't see a stream, there is no point in continuing
These cuts are quite restrictive for the sake of spotting a stream by-eye. We'll widen them now to have more stars to run our mixture model on.
import copy
selection_mcmc = copy.copy(selection_fine)
## Make your cuts wider than the ones above.
#-----------------------#
colour_wiggle = 0.25 # [dex]
vgsr_max= -100 # [km/s]
vgsr_min= -400
pmra_max = -0 # [mas/yr]
pmra_min = -30
pmdec_max = 0 # [mas/yr]
pmdec_min = -30
feh_wide = -1
#-----------------------#
selection_fine.add_mask(name='feh_wide',
mask_func=lambda df: (df['FEH'] < feh_wide))
selection_mcmc.add_mask(name='iso_wide',
mask_func=lambda df: (stream_funcs.betw(SoI.data.desi_colour_idx, SoI.isochrone_fit(SoI.data.desi_abs_mag), colour_wiggle)))
selection_mcmc.add_mask(name='VGSR_wide',
mask_func=lambda df: (df['VGSR'] < vgsr_max) & (df['VGSR'] > vgsr_min))
selection_mcmc.add_mask(name='PMRA_wide',
mask_func=lambda df: (df['PMRA'] < pmra_max) & (df['PMRA'] > pmra_min))
selection_mcmc.add_mask(name='PMDEC_wide',
mask_func=lambda df: (df['PMDEC'] < pmdec_max) & (df['PMDEC'] > pmdec_min))
Mask added: 'feh_wide' Mask added: 'iso_wide' Mask added: 'VGSR_wide' Mask added: 'PMRA_wide' Mask added: 'PMDEC_wide'
mcmc_mask = selection_mcmc.get_masks(['3D', 'phi2', 'feh_wide','iso_wide', 'VGSR_wide', 'PMRA_wide', 'PMDEC_wide'])
...'3D' selected 215560 stars ...'phi2' selected 593800 stars ...'feh_wide' selected 866850 stars ...'iso_wide' selected 1300579 stars ...'VGSR_wide' selected 655452 stars ...'PMRA_wide' selected 2685256 stars ...'PMDEC_wide' selected 3158780 stars Selection for specified masks: 3704 / 4075716 stars.
Below, lets look at our final selection of stars before we run our optimization.
Figure 10¶
importlib.reload(st)
tomcmc_stream = SoI.mask_stream(mcmc_mask)
plt_trim = st.StreamPlotter(tomcmc_stream)
plt_trim.plot_params['sf_in_desi']['alpha']= 0.9
plt_trim.plot_params['background']['alpha']= 0.4
plt_trim.plot_params['background']['s'] = 1
#-----------------------#
plt_trim.sixD_plot(showStream=True, show_sf_only=False, background=True) # You can change showStream to True to see where the STREAMFINDER stars are in this space.
# If there aren't many SF stars in DESI, you can show the stars not in DESI as a guide by setting show_sf_only to True
#-----------------------#
Created cut_confirmed_sf_and_desi with 21 stars that were filtered out Number of stars in SF: 256, Number of DESI and SF stars: 80 Saved merged DataFrame as self.data.confirmed_sf_and_desi_b Stars only in SF3: 176
(<Figure size 1000x1500 with 5 Axes>, array([<Axes: ylabel='$\\phi_2$'>, <Axes: ylabel='V$_{GSR}$ (km/s)'>, <Axes: ylabel='$\\mu_{\\alpha}$ [mas/yr]'>, <Axes: ylabel='$\\mu_{\\delta}$ [mas/yr]'>, <Axes: xlabel='$\\phi_1$', ylabel='[Fe/H]'>], dtype=object))
Optimizing¶
If you see a stream above, you may continue with the rest of the notebook!
Before jumping into running MCMC, we will find a good starting point using scipy.minimize
. This reduces the amount of time we'll need to run the MCMC, and generally means we'll get a better fit.
Streams aren't flat lines in $\phi_1$ dynamically, so we want to allow our fits to vary along the track.
#-----------------------#
no_of_spline_points = 5
#-----------------------#
Typically we want between 3 and 5 spline points.
importlib.reload(st)
#-----------------------#
no_of_spline_points = 5
# Optional: Set custom spline range (if not specified, uses StreamFinder data range)
phi1_min = None # Set to a specific value to override the leftmost spline location
phi1_max = None # Set to a specific value to override the rightmost spline location
#-----------------------#
# Define truncation parameters based on our selection cuts
truncation_params = {
'vgsr_min': vgsr_min, 'vgsr_max': vgsr_max,
'feh_min': -4.0, 'feh_max': feh_wide,
'pmra_min': pmra_min, 'pmra_max': pmra_max,
'pmdec_min': pmdec_min, 'pmdec_max': pmdec_max
}
MCMeta = st.MCMeta(no_of_spline_points, tomcmc_stream, trimmed_stream.data.confirmed_sf_and_desi,
truncation_params=truncation_params, phi1_min=phi1_min, phi1_max=phi1_max)
Making stream initial guess based on galstream and STREAMFINDER... Stream VGSR dispersion from trimmed SF: 8.99 km/s Stream mean metallicity from trimmed SF: -2.59 +- 0.240 dex Stream PMRA dispersion from trimmed SF: 3.03 mas/yr Stream PMDEC dispersion from trimmed SF: 2.88 mas/yr Making background initial guess... Background velocity: -156.34 +- 52.29 km/s Background metallicity: -1.61 +- 0.478 dex Background PMRA: -4.26 +- 4.63 mas/yr Background PMDEC: -6.54 +- 4.64 mas/yr
We're doing a mixture model, but what are we mixing? Lets take a look at our intial guesses below:
Figure 11¶
plt_mcmeta = st.StreamPlotter(MCMeta)
# Show our initial parameter guesses
print("Initial Parameters:")
for param, value in MCMeta.initial_params.items():
if isinstance(value, np.ndarray):
print(f"{param}: {value}")
else:
print(f"{param}: {value:.4f}")
# Plot the Gaussian mixture model based on our initial guesses
plt_mcmeta.gaussian_mixture_plot(showStream=True, background=True)
Initial Parameters: lsigvgsr: 0.9538 vgsr_spline_points: [-263.10489607 -289.85502934 -308.08839713 -317.80499943 -319.00483625] feh1: -2.5860 lsigfeh: -0.6191 lsigpmra: 0.4815 pmra_spline_points: [-24.88423358 -21.20953594 -16.99754348 -12.24825619 -6.96167408] lsigpmdec: 0.4601 pmdec_spline_points: [-24.19698943 -20.14484016 -15.82709712 -11.2437603 -6.39482971] bv: -156.3445 lsigbv: 1.7184 bfeh: -1.6103 lsigbfeh: -0.3207 bpmra: -4.2565 lsigbpmra: 0.6655 bpmdec: -6.5369 lsigbpmdec: 0.6669
(<Figure size 900x900 with 4 Axes>, array([[<Axes: xlabel='V$_{GSR}$ (km/s)'>, <Axes: xlabel='[Fe/H]'>], [<Axes: xlabel='$\\mu_{RA}$ (mas/yr)'>, <Axes: xlabel='$\\mu_{DEC}$ (mas/yr)'>]], dtype=object))
You can see our initial guess above. showing the mixture of the stream model (blue) and the background model (orange) combining to make the overall distribution (black). Depending on your cuts, you may also notice that the gaussians are truncated for some parametersāthis is done to allow us to cut down on the background stars while still modelling each population with some gaussian mean and spread.
Priors¶
An important step for our analysis is to provide strong priors about the background and stream. We will base those around our initial guess, using the STREAMFINDER catalogue to inform our priors.
importlib.reload(st)
vgsr_range_wiggle = 10
pmra_range_wiggle = 5
pmdec_range_wiggle = 10
lsigvgsr_r = (-4, 4)
vgsr_ranges = [(v - vgsr_range_wiggle, v + vgsr_range_wiggle) for v in MCMeta.initial_params['vgsr_spline_points']]
pmra_ranges = [(v - pmra_range_wiggle, v + pmra_range_wiggle) for v in MCMeta.initial_params['pmra_spline_points']]
pmdec_ranges = [(v - pmdec_range_wiggle, v + pmdec_range_wiggle) for v in MCMeta.initial_params['pmdec_spline_points']]
#-----------------------#
feh1_r = (-3,-1)
#-----------------------#
lsigfeh_r = (-2,4)
lsigpmra_r = (-5, 1)
lsigpmdec_r = (-5, 1)
bv_r = (-400, 400)
lsigbv_r = (-2,4)
bfeh_r = (-4, 4)
lsigbfeh_r = (-2,2)
bpmra_r = (-20, 20)
lsigbpmra_r = (-2,3)
bpmdec_r = (-30, 30)
lsigbpmdec_r = (-2,3)
pstream_r = (0.005, 1.0)
prior = [
pstream_r, # pstream (single value)
*vgsr_ranges, # vgsr_spline_points (multiple values)
lsigvgsr_r, # lsigvgsr (single value)
feh1_r, lsigfeh_r, # feh parameters
*pmra_ranges, # pmra_spline_points (multiple values)
lsigpmra_r, # lsigpmra (single value)
*pmdec_ranges, # pmdec_spline_points (multiple values)
lsigpmdec_r, # lsigpmdec (single value)
bv_r, lsigbv_r, bfeh_r, lsigbfeh_r, # background parameters
bpmra_r, lsigbpmra_r, bpmdec_r, lsigbpmdec_r
]
MCMeta.priors(prior)
Now that we have our priors set, lets try running a basic minimization to find a good starting point for the MCMC. This will run from 10 to 60 seconds...
MCMeta.scipy_optimize()
Running optimization... Maximum number of function evaluations has been exceeded. Optimized Parameters: pstream: 0.0556 vgsr_spline_points: [-264.21229037 -290.07980557 -318.08747266 -314.04618491 -320.15169051] sigvgsr: 18.1592 feh1: -2.6108 sigfeh: 0.1573 pmra_spline_points: [-19.88427539 -23.30238676 -17.03810652 -12.54089837 -6.08551417] sigpmra: 2.2639 pmdec_spline_points: [-33.69305732 -18.24069897 -25.82636154 -11.51744935 -4.81658078] sigpmdec: 9.9887 bv: -70.4926 sigbv: 79.6728 bfeh: -1.2113 sigbfeh: 0.5763 bpmra: 2.3098 sigbpmra: 6.5907 bpmdec: -4.8705 sigbpmdec: 5.8892
Figure 13¶
# First reload the module to get the fixed version
importlib.reload(st)
plt_optimized = st.StreamPlotter(MCMeta)
original_params = MCMeta.initial_params.copy()
optimized_for_plotting = MCMeta.sp_output.copy()
# Update MCMeta with optimized parameters for plotting
MCMeta.initial_params = optimized_for_plotting
print("\nPlotting optimized Gaussian mixture...")
# This should now work without the broadcasting error
plt_optimized.gaussian_mixture_plot(showStream=False, background=True)
# Restore original initial_params
MCMeta.initial_params = original_params
plt.suptitle('Optimized Gaussian Mixture Model', fontsize=14, y=0.98)
plt.tight_layout()
plt.show()
MCMeta.prior_validation()
Plotting optimized Gaussian mixture...
Your prior is good, you've found something! All 70 walkers initialized successfully!
The above cell automatically checks if your prior was any good. If you get a message like the following:
Try going back and adjusting the mentioned parameter.
Figure 14¶
importlib.reload(st)
plt_enhanced = st.StreamPlotter(MCMeta)
plt_enhanced.plot_params['sf_in_desi']['alpha'] = 0.9
plt_enhanced.plot_params['background']['alpha'] = 0.4
plt_enhanced.plot_params['background']['s'] = 1
fig, ax = plt_enhanced.sixD_plot(
showStream=True,
show_sf_only=False,
background=True,
show_initial_splines=True,
show_optimized_splines=True,
show_sf_errors=True # Now enable error bars
)
plt.tight_layout()
plt.show()
EMCEE¶
Once we've optimized our intial guess with Scipy, we can run our MCMC sampler.
Only run this is you are interested in the MCMC sampler. This will take up to 20 minutes unless you reduce the number of burnin's or steps taken.
importlib.reload(st)
from datetime import datetime
import os
note = 'test'
current_date = datetime.now().strftime("%y%m%d")
output_dir = f"./runs/{streamName}_{current_date}_{note}"
if not os.path.exists(output_dir):
os.makedirs(output_dir)
mcmc = st.MCMC(MCMeta, output_dir)
%%time
#-----------------------#
nproc = 32
nburnin = 5000
nstep = 5000
use_optimized_start = True # Toggle: True for optimized results, False for initial guess
#-----------------------#
mcmc.run(nproc=nproc, nburnin=nburnin, nstep=nstep, use_optimized_start=use_optimized_start)
Using optimized parameters as starting positions... Running burn-in with 5000 iterations. starting from optimized parameters... Clipping all walker positions to be within prior ranges... Took 147.3 seconds (2.5 minutes) Now sampling with 5000 iterations Took 130.7 seconds (2.2 minutes) Getting flatchain... CPU times: user 2min 18s, sys: 1min 45s, total: 4min 3s Wall time: 4min 38s
Figure 15¶
mcmc.show_chains()
Figure 16¶
mcmc.show_corner()