DESI DR1 Stream Tutorial¶

Welcome to the 2025 version of our stellar stream characterization tutorial notebook. This notebook will walk you through using data from the DESI Milky Way Survey data from the first full public release. For more information on the DESI DR1 stellar catalogue from the Milky Way Survey, see the README.md file.

Coments like the one below highlight the pieces of code you're expected to tweak in your search for stellar streams.

#-----------------------#

Example_variable = 1000000000000e10 #Look here for stuff to change throughout the notebook!

#-----------------------#

Below, we're going to import the packages we'll be using.

InĀ [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import scipy as sp
import scipy.stats as stats
from astropy.io import fits
from astropy import table
import matplotlib
import matplotlib.patheffects as path_effects
import importlib
import stream_functions as stream_funcs
importlib.reload(stream_funcs)
import emcee
import corner
from astropy.table import Table
from astropy import units as u
from collections import OrderedDict
import time
from scipy import optimize, stats
import matplotlib.colors as mcolors
colors = mcolors.CSS4_COLORS
color_names = list(colors.keys())
import streamTutorial as st
import copy
importlib.reload(st);

Next, let's input our DESI MWS DR1 data.

InĀ [2]:
importlib.reload(st)
# Add the path to the DESI data and STREAMFINDER data below
#-----------------------#
desi_path = '../mwsall-pix-iron.fits'
sf_path = './data/streamfinder_gaiadr3.fits'
#-----------------------#

Data = st.Data(desi_path, sf_path)

print('Now our desi data has been loaded under Data.desi_data')


#      _______
#    (  ^   ^  )šŸ‘
#     |   ~   |
#     |_______|
#   ~ "Well done!"
Length of DESI Data before Cuts: 6372607
Length after NaN cut: 4075716
Now our desi data has been loaded under Data.desi_data

Pick Stream¶

We'll pick a stream to work with by entering it in

#-----------------------#
st.stream(Data, streamName='SoI-I21', streamNo=42)
#-----------------------#

Initializing this stream object will add an attribute to the Data object, accessed either through

Data.confirmed_sf_and_desi or SoI.data.confirmed_sf_and_desi

Here is the table to reference in picking your stream:

SF3 Stream Index Stream # SF3 Stars # SF3 Stars in DESI # of SF3 Stars with VHel Galstream Stream name Metallicities Distance_kpc
1 C-20 29.0 8.0 8.0 C-20-I24 -2.93
3 NGC 173.0 3.0 7.0 False -1.32
4 ATLAS 208.0 3.0 80.0 AAU-ATLAS -2.2
6 Kwando 138.0 32.0 62.0 Kwando-I21 -2.29 8.2
8 New-2 90.0 20.0 18.0 New-2-I24 -2.07
10 Gaia-12 46.0 9.0 5.0 Gaia-12-I21 -3.28 13.3
17 New-3 120.0 3.0 10.0 New-3-I24 -2.71
19 Leiptr 412.0 26.0 37.0 Leiptr-I21 -2.17 8.9
22 New-6 146.0 12.0 12.0 New-6-I24 -2.18
24 C-25 121.0 40.0 25.0 C-25-I24 -2.3
25 C-11 112.0 33.0 9.0 C-11-I24 -2.91
27 New-7 37.0 15.0 11.0 New-7-I24 -2.23
28 New-8 7.0 1.0 1.0 New-8-I24 -1.99
29 New-9 10.0 6.0 7.0 New-9-I24 -1.95
30 C-10 158.0 49.0 5.0 C-10-I24 -0.91
31 New-10 47.0 1.0 7.0 New-10-I24 -0.99
32 New-11 18.0 7.0 6.0 New-11-I24 -2.03
33 Gaia-10 141.0 45.0 26.0 Gaia-10-I21 -1.4 17.4
34 Gjoll 607.0 18.0 40.0 NGC3201-P21 -1.63
35 C-9 183.0 70.0 35.0 C-9-I24 -0.72
36 C-24 244.0 86.0 27.0 C-24-I24 -0.93
39 Slidr 330.0 148.0 34.0 Slidr-I21 -1.7 2.4
41 Ylgr 919.0 14.0 20.0 Ylgr-I21 -2.09 8.6
42 Sylgr 256.0 109.0 33.0 Sylgr-I21 -2.92 2.4
43 New-15 152.0 74.0 23.0 New-15-I24 -2.01
47 Fjorm 297.0 104.0 29.0 M68-P19 -2.23
48 Orphan 867.0 113.0 247.0 Orphan-K23 -1.9
49 Gaia-1 200.0 44.0 28.0 Gaia-1-I21 -1.8 5.0
50 C-23 29.0 12.0 6.0 C-23-I24 -2.36
51 LMS-1 358.0 89.0 29.0 LMS1 -2.09 16.2
52 C-22 39.0 17.0 11.0 C-22-I24 -2.05
53 GD-1 1468.0 670.0 323.0 GD-1-I21 -2.49 9.2
56 New-17 13.0 2.0 0.0 New-17-I24 -2.35
57 NGC5466 43.0 14.0 6.0 NGC5466-J21 -1.98 16.6
58 New-18 107.0 33.0 1.0 New-18-I24 -0.94
59 Gaia-6 145.0 63.0 14.0 Gaia-6-I21 -1.53 8.5
60 New-19 40.0 22.0 4.0 New-19-I24 -0.79
61 Pal-5 129.0 33.0 69.0 Pal5 -1.41
62 M5 91.0 33.0 11.0 M5-G19 -1.29 15.9
63 Kshir 141.0 50.0 7.0 Kshir-I21 -1.83 10.1
64 Svol 234.0 59.0 16.0 Svol-I21 -1.98 8.9
65 Gaia-9 233.0 78.0 25.0 Gaia-9-I21 -2.21 3.8
68 M92 202.0 140.0 23.0 M92-I21 -2.31 9.2
70 Gaia-11 84.0 11.0 3.0 Gaia-11-I21 -1.19 11.5
71 Hrid 666.0 46.0 29.0 Hrid-I21 -1.13 3.2
74 New-21 762.0 61.0 3.0 New-21-I24 -0.87
75 Phlegethon 632.0 85.0 41.0 Phlegethon-I21 -2.19 3.5
76 NGC7089 15.0 2.0 2.0 M2-I21 -1.65
79 New-23 44.0 4.0 6.0 New-23-I24 -2.25
80 New-24 56.0 1.0 7.0 New-24-I24 -0.64
82 New-26 43.0 11.0 4.0 New-26-I24 -0.17
84 NGC7492 26.0 1.0 1.0 NGC7492-I24 -1.78
85 C-19 46.0 6.0 12.0 C-19-I21 -3.58 18.0
86 Jhelum 986.0 1.0 160.0 Jhelum-a/b -2.12 10.9
InĀ [3]:
importlib.reload(stream_funcs)

#-----------------------#
streamName= 'Hrid-I21'#'Fjorm-I21' #'Sylgr-I21'
streamNo= 71# 47 #42
#-----------------------#

importlib.reload(st)
SoI = st.stream(Data, streamName, streamNo)
Importing galstreams module...
Initializing galstreams library from master_log... 
3.1078523981609845
Creating combined DataFrame of SF and DESI
No original data available for comparison - cut_confirmed_sf_and_desi is empty
Number of stars in SF: 666, Number of DESI and SF stars: 22
Saved merged DataFrame as self.data.confirmed_sf_and_desi
Stars only in SF3: 644

Let us take a look at our stellar stream below:

Figure 1a¶

InĀ [4]:
importlib.reload(st)
plt_soi = st.StreamPlotter(SoI)

plt_soi.on_sky(stream_frame=False)
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:762: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x').  Matplotlib is ignoring the edgecolor in favor of the facecolor.  This behavior may change in the future.
  ax.scatter(

No description has been provided for this image

We also calculated the stream coordinates $\phi_1$ and $\phi_2$. This is achieved by rotating the right ascension and declination such that the length of the stream lies along $\phi_2 \sim 0$, and the center of the stream is $\phi_1 \sim 0$.

Figure 1b¶

InĀ [5]:
plt_soi.on_sky(stream_frame=True)
No description has been provided for this image

Great! Lets move on now.

We have lots of DESI data, lets cut that down with the following steps:

  • Perform an RA/DEC cut to remove stars on different parts of the sky
  • Perform a distance cut to remove stars too nearby
  • Trim stars that are too metal rich
  • Trim stars that fall outside of the best-fit stellar population model
  • Trim stars with kinematics nothing like the STREAMFINDER stars

Spatial Cuts¶

The first two bullet points can be achieved using a handy function we have stored in stream_functions.py

stream_funcs.threeD_max_min_mask.

InĀ [6]:
importlib.reload(st)
selection_fine = st.Selection(SoI.data.desi_data)

# We want to get the ra and dec from STREAMFINDER stars for this function so we can cut around it

#-----------------------#
ra_cut = 5 #deg
dec_cut = 5 #deg
#-----------------------#

selection_fine.add_mask(name='3D',
        mask_func=lambda df: stream_funcs.threeD_max_min_mask(
        df['TARGET_RA'],        
        df['TARGET_DEC'],         
        df['PARALLAX'],       
        df['PARALLAX_ERROR'],   
        SoI.data.SoI_streamfinder['RAdeg'],         
        SoI.data.SoI_streamfinder['DEdeg'],        
        SoI.min_dist,              
        ra_cut,dec_cut) #<-----------------------# dec cut, ra cut based on streamfinder star values
)
Selection object created for DataFrame with 4075716 rows.
Mask added: '3D'

Once we've done all our cuts, we can get all our masks in one using mask = selection_fine.get_masks(['mask 1', ... 'mask n'])

Figure 2¶

InĀ [7]:
importlib.reload(st)
# Lets take a look at our trimmed DESI data

mask = selection_fine.get_masks(['3D'])

# Using the new method that replaces the 4-line pattern
trimmed_stream = SoI.mask_stream(mask)

plt_trim = st.StreamPlotter(trimmed_stream)
plt_trim.on_sky(showStream=True, background=True, stream_frame=False)
plt_trim.on_sky(showStream=True, background=True)
...'3D' selected 200553 stars
Selection for specified masks: 200553 / 4075716 stars.
No stars were cut - cut_confirmed_sf_and_desi is empty
Number of stars in SF: 666, Number of DESI and SF stars: 22
Saved merged DataFrame as self.data.confirmed_sf_and_desi_b
Stars only in SF3: 644
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:762: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x').  Matplotlib is ignoring the edgecolor in favor of the facecolor.  This behavior may change in the future.
  ax.scatter(

No description has been provided for this image
No description has been provided for this image

Lets also look at our parallax cut. We use the calculation of plx - 2 * plx_err < 1/min_dist. This simply gets rid of nearby stars that, after subtracting 2 times the parallax error, are still below minimum parallax.

Figure 3¶

InĀ [8]:
plt_trim.plx_cut()
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:828: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x').  Matplotlib is ignoring the edgecolor in favor of the facecolor.  This behavior may change in the future.
  ax.scatter(

No description has been provided for this image

Now that we've visualized the stream in the stream-frame, we can get a bit more specific with our on-sky cuts. We can narrow in by only keeping stars within some range of $\phi_2$ = 0.

InĀ [9]:
#-----------------------#
phi2_wiggle = 10 #[deg]
#-----------------------#

selection_fine.add_mask(name='phi2',
        mask_func=lambda df: (df['phi2'] < phi2_wiggle) & (df['phi2'] > -1*phi2_wiggle))
Mask added: 'phi2'

Figure 4¶

InĀ [10]:
importlib.reload(st)
# Lets take a look at our trimmed DESI data
mask = selection_fine.get_masks(['3D', 'phi2'])

trimmed_stream = SoI.mask_stream(mask)

plt_trim = st.StreamPlotter(trimmed_stream)
plt_trim.on_sky(showStream=True, background=True, stream_frame=False)
plt_trim.on_sky(showStream=True, background=True)
...'3D' selected 200553 stars
...'phi2' selected 362136 stars
Selection for specified masks: 133472 / 4075716 stars.
No stars were cut - cut_confirmed_sf_and_desi is empty
Number of stars in SF: 666, Number of DESI and SF stars: 22
Saved merged DataFrame as self.data.confirmed_sf_and_desi_b
Stars only in SF3: 644
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:762: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x').  Matplotlib is ignoring the edgecolor in favor of the facecolor.  This behavior may change in the future.
  ax.scatter(

No description has been provided for this image
No description has been provided for this image

Metallicity Cut¶

Stars in a stellar stream all share some chemical characteristics, and we can use this to cut down on the contaminents in our field. Lets grab the stream's known metallicity from STREAMFINDER and make a cut using that information.

InĀ [11]:
sf3_table = pd.read_csv('./data/sf3_only_table.csv')


sf_streamname = streamName.rsplit('-', 1)[0] #If this fails, manually enter the stream name from the first table.
metallicity = sf3_table[sf3_table['Stream'] == sf_streamname]['Metallicities'].values
print(f"{sf_streamname} Metallicity: {metallicity}")
print(f'Mass Fraction (Z) Guess: {0.0181 * 10 ** metallicity}')
Hrid Metallicity: [-1.13]
Mass Fraction (Z) Guess: [0.00134177]

Figure 5¶

InĀ [12]:
plt_trim.plot_params['background']['alpha']=0.05
plot = plt_trim.feh_plot(showStream=True, background=True)
ax = plot[1]
ax.axhline(metallicity, color='g', linestyle='solid', label='Metallicity from SF3', alpha=0.5)
ax.legend()
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:1006: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x').  Matplotlib is ignoring the edgecolor in favor of the facecolor.  This behavior may change in the future.
  ax.scatter(

Out[12]:
<matplotlib.legend.Legend at 0x17636f2d0>
No description has been provided for this image

Now lets trim stars that are too metal rich!

InĀ [13]:
#-----------------------#
feh_cut = -0.5 # [dex]
#-----------------------#

selection_fine.add_mask(name='feh',
        mask_func=lambda df: (df['FEH'] < feh_cut))
Mask added: 'feh'

Figure 6¶

InĀ [14]:
#del trimmed_stream, plt_trim # clear memory
importlib.reload(st)
mask = selection_fine.get_masks(['3D', 'phi2', 'feh'])

trimmed_stream = SoI.mask_stream(mask)


plt_trim = st.StreamPlotter(trimmed_stream)
plot = plt_trim.feh_plot(showStream=True, background=True)
ax = plot[1]
ax.axhline(metallicity, color='g', linestyle='solid', label='Metallicity from SF3', alpha=0.5)
ax.axhline(feh_cut, color='red', linestyle='--', label='Cut')
ax.legend(loc='upper right')
...'3D' selected 200553 stars
...'phi2' selected 362136 stars
...'feh' selected 2238378 stars
Selection for specified masks: 79393 / 4075716 stars.
No stars were cut - cut_confirmed_sf_and_desi is empty
Number of stars in SF: 666, Number of DESI and SF stars: 22
Saved merged DataFrame as self.data.confirmed_sf_and_desi_b
Stars only in SF3: 644
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:1006: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x').  Matplotlib is ignoring the edgecolor in favor of the facecolor.  This behavior may change in the future.
  ax.scatter(

Out[14]:
<matplotlib.legend.Legend at 0x16c8ced50>
No description has been provided for this image

Isochrone Cut¶

When searching for stellar streams, an isochrone cut helps isolate stars that are likely part of the same stellar population. Streams originate from disrupted star clusters or dwarf galaxies, whose stars share a common age and chemical composition. By selecting only stars lying close to a theoretical stellar evolution isochrone (in colour–magnitude space), we suppress contamination from unrelated field stars with different ages and metallicities.

The position of an isochrone in colour-magnitude space depends on the age of the stellar population (~shifts it up and down) and the metallicity (~shifts it left and right)

InĀ [38]:
#-----------------------#
colour_wiggle = 0.1 # [dex]
age = 10.5 # [Gyr] -> of form 10.0, 10.1, 10.2, etc

metallicity =metallicity # may want to adjust the metallicity of the isochrone manually
#-----------------------#
SoI.isochrone(metallicity, age=age) 

selection_fine.add_mask(name='iso',
        mask_func=lambda df: (stream_funcs.betw(SoI.data.desi_colour_idx, SoI.isochrone_fit(SoI.data.desi_abs_mag), colour_wiggle)))
Mass Fraction (Z): [0.00134177]

using ./data/dotter/iso_a10.5_z0.00144.dat
Using distance gradient
Mask added: 'iso'

Figure 7¶

InĀ [39]:
importlib.reload(st)
mask = selection_fine.get_masks(['3D', 'phi2', 'feh', 'iso'])

trimmed_stream = SoI.mask_stream(mask)
trimmed_stream.isochrone(metallicity, age=age) 

plt_trim = st.StreamPlotter(trimmed_stream)
plt_trim.plot_params['background']['alpha']=0.1
plt_trim.iso_plot(wiggle=colour_wiggle, background=True, showStream=True)
...'3D' selected 200553 stars
...'phi2' selected 362136 stars
...'feh' selected 2238378 stars
...'iso' selected 720926 stars
Selection for specified masks: 32576 / 4075716 stars.
No stars were cut - cut_confirmed_sf_and_desi is empty
Number of stars in SF: 666, Number of DESI and SF stars: 22
Saved merged DataFrame as self.data.confirmed_sf_and_desi_b
Stars only in SF3: 644
Mass Fraction (Z): [0.00134177]

using ./data/dotter/iso_a10.5_z0.00144.dat
Using distance gradient
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:1060: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x').  Matplotlib is ignoring the edgecolor in favor of the facecolor.  This behavior may change in the future.
  ax.scatter(

No description has been provided for this image

Kinematic Cuts¶

DESI's radial velocity measurements are generally more reliable than from the Gaia RVS data (which the STREAMFINDER Catalogue is based off of). We may see some obvious outliers in the velocity data, and we will want to cut those out.

Figure 8¶

InĀ [40]:
importlib.reload(st)
plt_trim.plot_params['background']['alpha']=0.08
plt_trim.kin_plot(showStream=True, background=True, show_sf_only=False)
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:885: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x').  Matplotlib is ignoring the edgecolor in favor of the facecolor.  This behavior may change in the future.
  ax[0].scatter(

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:890: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x').  Matplotlib is ignoring the edgecolor in favor of the facecolor.  This behavior may change in the future.
  ax[1].scatter(

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/streamTutorial.py:895: UserWarning: You passed a edgecolor/edgecolors ('k') for an unfilled marker ('x').  Matplotlib is ignoring the edgecolor in favor of the facecolor.  This behavior may change in the future.
  ax[2].scatter(

No description has been provided for this image
InĀ [41]:
#-----------------------#
vgsr_max=200 #[km/s]
vgsr_min=-100
#-----------------------#

selection_fine.add_mask(name='VGSR',
        mask_func=lambda df: (df['VGSR'] < vgsr_max) & (df['VGSR'] > vgsr_min))
Mask added: 'VGSR'
InĀ [42]:
#-----------------------#
pmra_wiggle = 6 # [mas/yr]
#-----------------------#

selection_fine.add_mask(
    name='PMRA',
    mask_func=lambda df: (
        (
            df['PMRA'] >= np.interp(df['phi1'], [SoI.data.SoI_streamfinder['phi1'].min(), SoI.data.SoI_streamfinder['phi1'].max()],
                                            [SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmin(), 'pmRA'] - pmra_wiggle,
                                             SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmax(), 'pmRA'] - pmra_wiggle])
        ) &
        (
            df['PMRA'] <= np.interp(df['phi1'], [SoI.data.SoI_streamfinder['phi1'].min(), SoI.data.SoI_streamfinder['phi1'].max()],
                                            [SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmin(), 'pmRA'] + pmra_wiggle,
                                             SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmax(), 'pmRA'] + pmra_wiggle])
        ) ))
Mask added: 'PMRA'
InĀ [43]:
#-----------------------#
pmdec_wiggle = 6 # [mas/yr]
#-----------------------#


selection_fine.add_mask(
    name='PMDEC',
    mask_func=lambda df: (
        (
            df['PMDEC'] >= np.interp(df['phi1'], [SoI.data.SoI_streamfinder['phi1'].min(), SoI.data.SoI_streamfinder['phi1'].max()],
                                            [SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmin(), 'pmDE'] - pmdec_wiggle,
                                             SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmax(), 'pmDE'] - pmdec_wiggle])
        ) &
        (
            df['PMDEC'] <= np.interp(df['phi1'], [SoI.data.SoI_streamfinder['phi1'].min(), SoI.data.SoI_streamfinder['phi1'].max()],
                                            [SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmin(), 'pmDE'] + pmdec_wiggle,
                                             SoI.data.SoI_streamfinder.loc[SoI.data.SoI_streamfinder['phi1'].idxmax(), 'pmDE'] + pmdec_wiggle])
        ) ))
Mask added: 'PMDEC'

It is sometimes easier to do wider cuts in PMRA/PMDEC for finding the stream by-eye, you can use those masks instead below.

InĀ [48]:
pmra_max = 0 # [mas/yr]
pmra_min = -15

pmdec_max = 30 # [mas/yr]
pmdec_min = 15


#-----------------------#

selection_fine.add_mask(name='PMRA_wide',
        mask_func=lambda df: (df['PMRA'] < pmra_max) & (df['PMRA'] > pmra_min))

selection_fine.add_mask(name='PMDEC_wide',
        mask_func=lambda df: (df['PMDEC'] < pmdec_max) & (df['PMDEC'] > pmdec_min))
Mask added: 'PMRA_wide'
Mask added: 'PMDEC_wide'

Lets look at our cuts!¶

InĀ [49]:
selection_fine.list_masks()
Active masks:
- 3D
- phi2
- feh
- iso
- VGSR
- PMRA
- PMDEC
- PMRA_wide
- PMDEC_wide

Figure 9¶

InĀ [50]:
importlib.reload(st)
# Lets take a look at our trimmed DESI data
#-----------------------#
mask = selection_fine.get_masks(['3D', 'phi2', 'feh','iso', 'VGSR', 'PMRA_wide', 'PMDEC_wide']) # TIP 'PMRA_wide', 'PMDEC_wide' can be used instead if defined
#-----------------------#


# Using the new method that replaces the 4-line pattern
# VGSR is now automatically computed for confirmed_sf_not_desi within mask_stream()
trimmed_stream = SoI.mask_stream(mask)

plt_trim = st.StreamPlotter(trimmed_stream)
plt_trim.plot_params['sf_in_desi']['alpha']= 0.9
plt_trim.plot_params['background']['alpha']= 0.5
plt_trim.plot_params['background']['s'] = 10
#-----------------------#
plt_trim.kin_plot(showStream=False, show_sf_only=False, background=True) # You can change showStream to True to see where the STREAMFINDER stars are in this space.
                                                                       # If there aren't many SF stars in DESI, you can show the stars not in DESI as a guide by setting show_sf_only to True
#-----------------------#
...'3D' selected 200553 stars
...'phi2' selected 362136 stars
...'feh' selected 2238378 stars
...'iso' selected 720926 stars
...'VGSR' selected 3259260 stars
...'PMRA_wide' selected 2450141 stars
...'PMDEC_wide' selected 31797 stars
Selection for specified masks: 158 / 4075716 stars.
No stars were cut - cut_confirmed_sf_and_desi is empty
Number of stars in SF: 666, Number of DESI and SF stars: 22
Saved merged DataFrame as self.data.confirmed_sf_and_desi_b
Stars only in SF3: 644
No description has been provided for this image

Lets look at an example of a stream (Sylgr-I21) being spotted once our initial cuts have been made. We can see lines of overdensities in all three kinematic dimensions!

Furthermore, we can overlay the STREAMFINDER stars to see that this overdensity lies on the previously disovered stream, and which SF stars we have cutout.

Stream Image Stream Image

If you don't see a stream, there is no point in continuing

These cuts are quite restrictive for the sake of spotting a stream by-eye. We'll widen them now to have more stars to run our mixture model on.

InĀ [47]:
import copy
selection_mcmc = copy.copy(selection_fine)
InĀ [51]:
## Make your cuts wider than the ones above.
#-----------------------#
colour_wiggle = 0.2 # [dex]

vgsr_max=  200 # [km/s]
vgsr_min= -200

pmra_max = 0 # [mas/yr]
pmra_min = -15

pmdec_max = 30 # [mas/yr]
pmdec_min = 15

feh_wide = -0.25
#-----------------------#

selection_fine.add_mask(name='feh_wide',
        mask_func=lambda df: (df['FEH'] < feh_wide))

selection_mcmc.add_mask(name='iso_wide',
        mask_func=lambda df: (stream_funcs.betw(SoI.data.desi_colour_idx, SoI.isochrone_fit(SoI.data.desi_abs_mag), colour_wiggle)))

selection_mcmc.add_mask(name='VGSR_wide',
        mask_func=lambda df: (df['VGSR'] < vgsr_max) & (df['VGSR'] > vgsr_min))

selection_mcmc.add_mask(name='PMRA_wide',
        mask_func=lambda df: (df['PMRA'] < pmra_max) & (df['PMRA'] > pmra_min))

selection_mcmc.add_mask(name='PMDEC_wide',
        mask_func=lambda df: (df['PMDEC'] < pmdec_max) & (df['PMDEC'] > pmdec_min))
Mask added: 'feh_wide'
Mask added: 'iso_wide'
Mask added: 'VGSR_wide'
Mask added: 'PMRA_wide'
Mask added: 'PMDEC_wide'
InĀ [52]:
mcmc_mask = selection_mcmc.get_masks(['3D', 'phi2', 'feh_wide','iso_wide', 'VGSR_wide', 'PMRA_wide', 'PMDEC_wide'])
...'3D' selected 200553 stars
...'phi2' selected 362136 stars
...'feh_wide' selected 3183059 stars
...'iso_wide' selected 1300351 stars
...'VGSR_wide' selected 3868113 stars
...'PMRA_wide' selected 2450141 stars
...'PMDEC_wide' selected 31797 stars
Selection for specified masks: 202 / 4075716 stars.

Below, lets look at our final selection of stars before we run our optimization.

Figure 10¶

InĀ [55]:
importlib.reload(st)
tomcmc_stream = SoI.mask_stream(mcmc_mask)

plt_trim = st.StreamPlotter(tomcmc_stream)
plt_trim.plot_params['sf_in_desi']['alpha']= 0.9
plt_trim.plot_params['background']['alpha']= 0.4
plt_trim.plot_params['background']['s'] = 1
#-----------------------#
plt_trim.sixD_plot(showStream=True, show_sf_only=False, background=True) # You can change showStream to True to see where the STREAMFINDER stars are in this space.
                                                                       # If there aren't many SF stars in DESI, you can show the stars not in DESI as a guide by setting show_sf_only to True
#-----------------------#
No stars were cut - cut_confirmed_sf_and_desi is empty
Number of stars in SF: 666, Number of DESI and SF stars: 22
Saved merged DataFrame as self.data.confirmed_sf_and_desi_b
Stars only in SF3: 644
Out[55]:
(<Figure size 1000x1500 with 5 Axes>,
 array([<Axes: ylabel='$\\phi_2$'>, <Axes: ylabel='V$_{GSR}$ (km/s)'>,
        <Axes: ylabel='$\\mu_{\\alpha}$ [mas/yr]'>,
        <Axes: ylabel='$\\mu_{\\delta}$ [mas/yr]'>,
        <Axes: xlabel='$\\phi_1$', ylabel='[Fe/H]'>], dtype=object))
No description has been provided for this image

Optimizing¶

If you see a stream above, you may continue with the rest of the notebook!

Before jumping into running MCMC, we will find a good starting point using scipy.minimize. This reduces the amount of time we'll need to run the MCMC, and generally means we'll get a better fit.

Streams aren't flat lines in $\phi_1$ dynamically, so we want to allow our fits to vary along the track.

#-----------------------#
no_of_spline_points = 5
#-----------------------#

Typically we want between 3 and 5 spline points.

InĀ [70]:
importlib.reload(st)
#-----------------------#
no_of_spline_points = 5

# Optional: Set custom spline range (if not specified, uses StreamFinder data range)
phi1_min = -12  # Set to a specific value to override the leftmost spline location
phi1_max = None  # Set to a specific value to override the rightmost spline location

#-----------------------#

# Define truncation parameters based on our selection cuts
truncation_params = {
    'vgsr_min': vgsr_min, 'vgsr_max': vgsr_max,
    'feh_min': -4.0, 'feh_max': feh_wide,
    'pmra_min': pmra_min, 'pmra_max': pmra_max,
    'pmdec_min': pmdec_min, 'pmdec_max': pmdec_max
}

MCMeta = st.MCMeta(no_of_spline_points, tomcmc_stream, trimmed_stream.data.confirmed_sf_and_desi, 
                   truncation_params=truncation_params, phi1_min=phi1_min, phi1_max=phi1_max)
Making stream initial guess based on galstream and STREAMFINDER...
Stream VGSR dispersion from trimmed SF: 60.58 km/s
Stream mean metallicity from trimmed SF: -1.29 +- 0.199 dex
Stream PMRA dispersion from trimmed SF: 2.43 mas/yr
Stream PMDEC dispersion from trimmed SF: 1.64 mas/yr
Making background initial guess...
Background velocity: 60.98 +- 72.81 km/s
Background metallicity: -1.47 +- 0.380 dex
Background PMRA: -6.82 +- 3.62 mas/yr
Background PMDEC: 19.50 +- 3.47 mas/yr

We're doing a mixture model, but what are we mixing? Lets take a look at our intial guesses below:

Figure 11¶

InĀ [71]:
plt_mcmeta = st.StreamPlotter(MCMeta)

# Show our initial parameter guesses
print("Initial Parameters:")
for param, value in MCMeta.initial_params.items():
    if isinstance(value, np.ndarray):
        print(f"{param}: {value}")
    else:
        print(f"{param}: {value:.4f}")

# Plot the Gaussian mixture model based on our initial guesses
plt_mcmeta.gaussian_mixture_plot(showStream=True, background=True)
Initial Parameters:
lsigvgsr: 1.7823
vgsr_spline_points: [-48.09398781  22.70402041  95.5610612  170.47713458 247.45224052]
feh1: -1.2868
lsigfeh: -0.7002
lsigpmra: 0.3860
pmra_spline_points: [-8.41711261 -7.9875275  -6.61263906 -4.29244728 -1.02695218]
lsigpmdec: 0.2143
pmdec_spline_points: [19.57569127 22.75160513 23.24238203 21.04802198 16.16852496]
bv: 60.9772
lsigbv: 1.8622
bfeh: -1.4695
lsigbfeh: -0.4207
bpmra: -6.8221
lsigbpmra: 0.5588
bpmdec: 19.4997
lsigbpmdec: 0.5405
Out[71]:
(<Figure size 900x900 with 4 Axes>,
 array([[<Axes: xlabel='V$_{GSR}$ (km/s)'>, <Axes: xlabel='[Fe/H]'>],
        [<Axes: xlabel='$\\mu_{RA}$ (mas/yr)'>,
         <Axes: xlabel='$\\mu_{DEC}$ (mas/yr)'>]], dtype=object))
No description has been provided for this image

You can see our initial guess above. showing the mixture of the stream model (blue) and the background model (orange) combining to make the overall distribution (black). Depending on your cuts, you may also notice that the gaussians are truncated for some parameters—this is done to allow us to cut down on the background stars while still modelling each population with some gaussian mean and spread.

Priors¶

An important step for our analysis is to provide strong priors about the background and stream. We will base those around our initial guess, using the STREAMFINDER catalogue to inform our priors.

InĀ [72]:
importlib.reload(st)
vgsr_range_wiggle = 10
pmra_range_wiggle = 5
pmdec_range_wiggle = 10
lsigvgsr_r = (-4, 4)

vgsr_ranges = [(v - vgsr_range_wiggle, v + vgsr_range_wiggle) for v in MCMeta.initial_params['vgsr_spline_points']]
pmra_ranges = [(v - pmra_range_wiggle, v + pmra_range_wiggle) for v in MCMeta.initial_params['pmra_spline_points']]
pmdec_ranges = [(v - pmdec_range_wiggle, v + pmdec_range_wiggle) for v in MCMeta.initial_params['pmdec_spline_points']]

#-----------------------#
feh1_r = (-3,0)
#-----------------------#
lsigfeh_r = (-2,4)
lsigpmra_r = (-5, 1)
lsigpmdec_r = (-5, 1)
bv_r = (-400, 400)
lsigbv_r = (-2,4)
bfeh_r = (-4, 4)
lsigbfeh_r = (-2,2)
bpmra_r = (-20, 20)
lsigbpmra_r = (-2,3)
bpmdec_r = (15, 30)
lsigbpmdec_r = (-2,3)

pstream_r = (0.005, 1.0)

prior = [
    pstream_r,              # pstream (single value)
    *vgsr_ranges,           # vgsr_spline_points (multiple values)
    lsigvgsr_r,             # lsigvgsr (single value)
    feh1_r, lsigfeh_r,      # feh parameters
    *pmra_ranges,           # pmra_spline_points (multiple values)  
    lsigpmra_r,             # lsigpmra (single value)
    *pmdec_ranges,          # pmdec_spline_points (multiple values)
    lsigpmdec_r,            # lsigpmdec (single value)
    bv_r, lsigbv_r, bfeh_r, lsigbfeh_r,  # background parameters
    bpmra_r, lsigbpmra_r, bpmdec_r, lsigbpmdec_r
]

MCMeta.priors(prior)

Now that we have our priors set, lets try running a basic minimization to find a good starting point for the MCMC. This will run from 10 to 60 seconds...

InĀ [73]:
MCMeta.scipy_optimize()
Running optimization...
Maximum number of function evaluations has been exceeded.

Optimized Parameters:
pstream: 0.0993
vgsr_spline_points: [-38.09623725  18.01902313 105.54914409 168.78783883 254.97341612]
sigvgsr: 6.7969
feh1: -1.1896
sigfeh: 0.1362
pmra_spline_points: [-9.627314   -7.74611121 -7.51422379 -3.34435766 -1.70282397]
sigpmra: 2.6637
pmdec_spline_points: [19.16808223 23.15504955 22.52968389 20.70770875 25.972101  ]
sigpmdec: 1.2904
bv: 62.2959
sigbv: 80.2458
bfeh: -1.4906
sigbfeh: 0.3345
bpmra: -6.3383
sigbpmra: 4.8855
bpmdec: 15.0000
sigbpmdec: 5.6613

Figure 13¶

InĀ [74]:
# First reload the module to get the fixed version
importlib.reload(st)


plt_optimized = st.StreamPlotter(MCMeta)
original_params = MCMeta.initial_params.copy()
optimized_for_plotting = MCMeta.sp_output.copy()

# Update MCMeta with optimized parameters for plotting
MCMeta.initial_params = optimized_for_plotting

print("\nPlotting optimized Gaussian mixture...")
# This should now work without the broadcasting error
plt_optimized.gaussian_mixture_plot(showStream=False, background=True)

# Restore original initial_params
MCMeta.initial_params = original_params

plt.suptitle('Optimized Gaussian Mixture Model', fontsize=14, y=0.98)
plt.tight_layout()
plt.show()

MCMeta.prior_validation()
Plotting optimized Gaussian mixture...
No description has been provided for this image
Your prior is good, you've found something!
All 70 walkers initialized successfully!

The above cell automatically checks if your prior was any good. If you get a message like the following:

prior_error.png

Try going back and adjusting the mentioned parameter.

Figure 14¶

InĀ [75]:
importlib.reload(st)

plt_enhanced = st.StreamPlotter(MCMeta) 
plt_enhanced.plot_params['sf_in_desi']['alpha'] = 0.9
plt_enhanced.plot_params['background']['alpha'] = 0.4
plt_enhanced.plot_params['background']['s'] = 1

fig, ax = plt_enhanced.sixD_plot(
    showStream=True, 
    show_sf_only=False, 
    background=True,
    show_initial_splines=True,
    show_optimized_splines=True,
    show_sf_errors=True  # Now enable error bars
)

plt.tight_layout()
plt.show()
No description has been provided for this image

EMCEE¶

Once we've optimized our intial guess with Scipy, we can run our MCMC sampler.

Only run this is you are interested in the MCMC sampler. This will take up to 20 minutes unless you reduce the number of burnin's or steps taken.

InĀ [83]:
importlib.reload(st)
from datetime import datetime
import os
note = 'test'
current_date = datetime.now().strftime("%y%m%d")
output_dir = f"./runs/{streamName}_{current_date}_{note}"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

mcmc = st.MCMC(MCMeta, output_dir)
InĀ [84]:
%%time 
#-----------------------#
nproc = 32
nburnin = 5000
nstep = 5000
use_optimized_start = True  # Toggle: True for optimized results, False for initial guess
#-----------------------#

mcmc.run(nproc=nproc, nburnin=nburnin, nstep=nstep, use_optimized_start=use_optimized_start)
Using optimized parameters as starting positions...
Running burn-in with 5000 iterations. starting from optimized parameters...
Clipping all walker positions to be within prior ranges...
/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/opt/anaconda3/envs/streamTutorial/lib/python3.11/site-packages/emcee/moves/red_blue.py:99: RuntimeWarning: invalid value encountered in scalar subtract
  lnpdiff = f + nlp - state.log_prob[j]

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

/Users/nasserm/Documents/vscode/research/streamTut/DESI-DR1_streamTutorial/stream_functions.py:1993: RuntimeWarning: divide by zero encountered in log
  lpmra_cdf_dif = np.log(stats.norm.cdf(pmra_trunc[1], loc=bpmra, scale=scale_bg_pmra) - stats.norm.cdf(pmra_trunc[0], loc=bpmra, scale=scale_bg_pmra))

Took 66.3 seconds (1.1 minutes)
Now sampling with 5000 iterations
Took 52.0 seconds (0.9 minutes)
Getting flatchain...
CPU times: user 1min 14s, sys: 27.3 s, total: 1min 42s
Wall time: 1min 58s

Figure 15¶

InĀ [85]:
mcmc.show_chains()
No description has been provided for this image

Figure 16¶

InĀ [86]:
mcmc.show_corner()
WARNING:root:Too few points to create valid contours
No description has been provided for this image
InĀ [87]:
mcmc.print_result()
29
OrderedDict([('pstream', 0.10881946705538123), ('vgsr1', -49.70398855982656), ('vgsr2', 20.846279792657718), ('vgsr3', 98.45351053533719), ('vgsr4', 166.98611834998297), ('vgsr5', 245.90457694265598), ('lsigvgsr', 0.811794913805584), ('feh1', -1.223353276992123), ('lsigfeh', -1.1312422330359246), ('pmra1', -8.041203925157566), ('pmra2', -8.94585045562394), ('pmra3', -6.2022593169463685), ('pmra4', -6.845346347358921), ('pmra5', -1.6041224892998398), ('lsigpmra', 0.4134933070945016), ('pmdec1', 19.730360196132526), ('pmdec2', 22.735881333343897), ('pmdec3', 23.386970156808438), ('pmdec4', 20.326068271193147), ('pmdec5', 14.150116922581502), ('lsigpmdec', 0.125388765049126), ('bv', 68.61854180215634), ('lsigbv', 1.915668394497988), ('bfeh', -1.4843317183511284), ('lsigbfeh', -0.4647627531225802), ('bpmra', -5.836071638448182), ('lsigbpmra', 0.7621859773570243), ('bpmdec', 15.500152606695803), ('lsigbpmdec', 0.7324893737884188)])
param             med         em         ep   exp(med)    exp(em)    exp(ep)
--------------------------------------------------------------------------------------
pstream         0.109     -0.029      0.038
vgsr1         -49.704     -4.651      6.166
vgsr2          20.846     -3.657      3.737
vgsr3          98.454     -4.432      3.897
vgsr4         166.986     -3.630      4.317
vgsr5         245.905     -5.848      6.538
lsigvgsr        0.812     -0.139      0.169      6.483     -1.780      3.084
feh1           -1.223     -0.027      0.030
lsigfeh        -1.131     -0.344      0.246      0.074     -0.040      0.056
pmra1          -8.041     -2.249      2.117
pmra2          -8.946     -1.366      1.393
pmra3          -6.202     -1.389      1.404
pmra4          -6.845     -1.536      1.898
pmra5          -1.604     -3.158      3.687
lsigpmra        0.413     -0.092      0.100      2.591     -0.495      0.672
pmdec1         19.730     -1.458      1.290
pmdec2         22.736     -0.933      0.696
pmdec3         23.387     -0.938      1.055
pmdec4         20.326     -1.108      1.156
pmdec5         14.150     -4.703      5.749
lsigpmdec       0.125     -0.219      0.201      1.335     -0.528      0.786
bv             68.619     -8.687      7.986
lsigbv          1.916     -0.033      0.034     82.351     -5.974      6.644
bfeh           -1.484     -0.027      0.029
lsigbfeh       -0.465     -0.023      0.030      0.343     -0.018      0.025
bpmra          -5.836     -0.899      2.131
lsigbpmra       0.762     -0.089      0.215      5.783     -1.072      3.704
bpmdec         15.500     -0.363      0.625
lsigbpmdec      0.732     -0.032      0.028      5.401     -0.382      0.365

Figure 17¶

InĀ [88]:
importlib.reload(st)

# Use MCMC results for plotting
meds = mcmc.meds # Store MCMC results in MCMeta
original_params = MCMeta.initial_params.copy()

# Convert MCMC results to the format expected by the plotting function
mcmc_for_plotting = {}

# Convert individual spline points back to arrays
vgsr_points = np.array([meds['vgsr1'], meds['vgsr2'], meds['vgsr3'], meds['vgsr4'], meds['vgsr5']])
pmra_points = np.array([meds['pmra1'], meds['pmra2'], meds['pmra3'], meds['pmra4'], meds['pmra5']])
pmdec_points = np.array([meds['pmdec1'], meds['pmdec2'], meds['pmdec3'], meds['pmdec4'], meds['pmdec5']])

mcmc_for_plotting = {
	'vgsr_spline_points': vgsr_points,
	'lsigvgsr': meds['lsigvgsr'], 
	'feh1': meds['feh1'],
	'lsigfeh': meds['lsigfeh'],
	'pmra_spline_points': pmra_points,
	'lsigpmra': meds['lsigpmra'],
	'pmdec_spline_points': pmdec_points,
	'lsigpmdec': meds['lsigpmdec'],
	'bv': meds['bv'],
	'lsigbv': meds['lsigbv'],
	'bfeh': meds['bfeh'],
	'lsigbfeh': meds['lsigbfeh'],
	'bpmra': meds['bpmra'],
	'lsigbpmra': meds['lsigbpmra'],
	'bpmdec': meds['bpmdec'],
	'lsigbpmdec': meds['lsigbpmdec']
}

# Update MCMeta with MCMC results
MCMeta.initial_params = mcmc_for_plotting

print("\nPlotting MCMC-based Gaussian mixture...")
plt_mcmc = st.StreamPlotter(MCMeta)
plt_mcmc.gaussian_mixture_plot(showStream=False, background=True, show_model=True, show_total=True)

# Restore original initial_params
MCMeta.initial_params = original_params

#plt.suptitle('MCMC-based Gaussian Mixture Model', fontsize=14, y=0.98)
plt.tight_layout()
plt.show()
Plotting MCMC-based Gaussian mixture...
No description has been provided for this image

Membership Probability¶

We now have our final gaussian mixture model parameters! We can go ahead and then calculate the probability that each star belongs to the 'stream' population.

Figure 18¶

InĀ [89]:
# Calculate membership probabilities using mcmc.memprob function
stream_prob = mcmc.memprob()


# Create membership probability histogram
fig, ax = plt.subplots(1, 1, figsize=(7, 6))

#-----------------------#
min_prob = 0.5  # Change this to adjust the stream probability threshold
#-----------------------#

ax.hist(stream_prob, bins=50, alpha=0.7, color='blue', 
        label=f'All Stars ({len(stream_prob)} total)', density=True)

ax.axvline(0.5, color='black', linestyle='--', alpha=0.7, label='50% threshold')

# Format plot
ax.set_xlabel('Membership Probability', fontsize=12)
ax.set_ylabel('Density', fontsize=12)
ax.set_title(f'Membership Probability Distribution for {streamName}', fontsize=14)
ax.legend()
ax.grid(alpha=0.3)
stream_funcs.plot_form(ax)

# log yaxis
ax.set_yscale('log')

plt.tight_layout()
plt.show()

# Print statistics for different probability thresholds
thresholds = [0.3, 0.5, 0.9]
print("\nMembership statistics:")
print("Threshold | Count")
print("----------|-------")
for thresh in thresholds:
    count = len(stream_prob[stream_prob >= thresh])
    print(f"   ≄{thresh:3.1f}   | {count:5d} ")
Calculated membership probabilities for 202 stars
Membership probabilities range from 0.000 to 0.982
Mean membership probability: 0.112
Stars with >50% probability: 24
Stars with >70% probability: 23
Stars with >90% probability: 18
No description has been provided for this image
Membership statistics:
Threshold | Count
----------|-------
   ≄0.3   |    25 
   ≄0.5   |    24 
   ≄0.9   |    18 

Now lets take a look at the stream you found!

InĀ [90]:
importlib.reload(st)

plt_enhanced = st.StreamPlotter(MCMeta) 
plt_enhanced.plot_params['sf_in_desi']['alpha'] = 0.5
plt_enhanced.plot_params['sf_in_desi']['zorder'] = 10
plt_enhanced.plot_params['background']['alpha'] = 0.2
plt_enhanced.plot_params['background']['s'] = 1
plt_enhanced.plot_params['optimized_spline_line']['lw'] = 1
plt_enhanced.plot_params['initial_spline_line']['lw'] = 1

fig, ax = plt_enhanced.sixD_plot(
    showStream=False, 
    show_sf_only=False, 
    background=True,
    show_initial_splines=True,
    show_optimized_splines=True,
    show_sf_errors=False,
    show_mcmc_splines=True,
    show_membership_prob=True, stream_prob=stream_prob, min_prob=min_prob, show_residuals=False
)

#plt.tight_layout()
plt.show()
No description has been provided for this image
InĀ [91]:
importlib.reload(st)

plt_enhanced = st.StreamPlotter(MCMeta) 
plt_enhanced.plot_params['sf_in_desi']['alpha'] = 0.5
plt_enhanced.plot_params['sf_in_desi']['zorder'] = 10
plt_enhanced.plot_params['background']['alpha'] = 0.2
plt_enhanced.plot_params['background']['s'] = 1
plt_enhanced.plot_params['optimized_spline_line']['lw'] = 1
plt_enhanced.plot_params['initial_spline_line']['lw'] = 1

fig, ax = plt_enhanced.sixD_plot(
    showStream=False, 
    show_sf_only=False, 
    background=False,
    show_initial_splines=False,
    show_optimized_splines=False,
    show_sf_errors=False,
    show_mcmc_splines=True,
    show_membership_prob=True, stream_prob=stream_prob, min_prob=min_prob, show_residuals=True
)

#plt.tight_layout()
plt.show()
No description has been provided for this image
InĀ [92]:
mcmc.save_run()
Calculated membership probabilities for 202 stars
Membership probabilities range from 0.000 to 0.982
Mean membership probability: 0.112
Stars with >50% probability: 24
Stars with >70% probability: 23
Stars with >90% probability: 18
Saved 24 high-probability members to: ./runs/Hrid-I21_250811_test/Hrid-I21_phi2_spline_50%_mem.fits
Saved 202 total stars to: ./runs/Hrid-I21_250811_test/Hrid-I21_phi2_spline_all%_mem.fits