why and how we look for them, data + code (Python)

Since ancient times, man has sought to find answers to questions about his existence by turning his gaze to the sky. The study of the stars, as the key to deciphering the mysteries of the cosmos, is central to our quest to understand the origin and evolution of the universe.

Today, thanks to advanced observatories, space telescopes and missions such as Hubble, Kepler and Gaia, the study of stars and their clusters has reached a new level. These technologies make it possible not only to reach the farthest corners of space, but also to observe it in unprecedented detail. They also allow us to discover “relative stars”, that is, stars formed from the same cloud. Such objects share similar characteristics, including chemical composition, age and velocity.

Identifying stars of common origin has important implications for our attempts to understand the structure of the world at a global level. In particular, this makes it possible to:

  1. better understand the evolution of galaxies. For example, finding stars with similar characteristics and chemical compositions allows us to create a mosaic portrait of galactic processes, revealing how stars interact with their environment and what factors influence the formation of various elements;

  2. explore gravitational interactions on a galactic scale. Stars born from the same source provide a unique opportunity to analyze their relationships, revealing the laws of interaction and the dynamics of their systems;

  3. study internal processes and track the chemical elements that stars release into outer space. This is possible thanks to, for example, spectroscopy: a method that studies how matter interacts with radiation at various wavelengths, which makes it possible to determine the composition, structure and other properties of that matter;

  4. discover planets that have the potential to become, or are, the cradle of life. Similar parameters of stars born from the same mother cloud contribute to the identification of systems where the existence of life is possible, opening new horizons in the search for exoplanets and the conditions for the emergence of living organisms.

So each star becomes a kind of artifact, revealing part of the mysteries of our cosmic origin.

If you would like to try your hand at astrophysics and set off on an astronomical journey, I offer my program as a vehicle. The code performs a two-step analysis of Gaia data to identify stars of common origin: clustering with HDBSCAN, followed by photometric validation based on isochrone similarity. Everything needed for the dive, including a link to real data from the Gaia spacecraft (European Space Agency), is in my repository Astronomy Data Analysis Tool (ADAT_co) on GitHub.

Gaia is a satellite launched by the European Space Agency in 2013 with the goal of creating the most detailed map of our galaxy yet. It measures the positions of stars, and the changes in those positions, with high precision, and provides information about the distribution, motion and properties of well over a billion stars.

Technical Side: Methods and Implementation

Loading and preparing data

The program assumes that the data is in FITS (Flexible Image Transport System) format, which is widely used in astrophysics. The file is opened with astropy.io.fits, and the data is then arranged into a convenient form using numpy and pandas.

from astropy.io import fits
import numpy as np

fits_file_path = "your/path/to/file.fits"  # replace with the path to your FITS file
with fits.open(fits_file_path) as hdulist:
    data = hdulist[1].data  # the source table is in the first extension

The next step is to select the columns that may be key to identifying stars with a common origin:

selected_cols = ['ra', 'dec', 'pmra', 'pmdec', 'parallax', 'radial_velocity']
X = np.array([data.field(col) for col in selected_cols]).T

The choice of these specific coordinates was driven by the desire to highlight stars born from the same parent cloud of gas and dust.
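One practical point before clustering: the six columns have different units and ranges (degrees, mas/yr, mas, km/s), and radial_velocity is missing for many Gaia sources, so rows with missing values need to be handled and the columns put on a comparable scale. Below is a minimal sketch of one common approach, assuming you are willing to drop incomplete rows and rescale each column to zero mean and unit variance. The helper name prepare_features and the toy values are illustrative and are not part of the ADAT_co code.

```python
import numpy as np

def prepare_features(X):
    """Drop rows with non-finite values and standardize each column.

    Returns the scaled matrix and the boolean mask of rows that were kept,
    so cluster labels can later be mapped back to the original table.
    """
    X = np.asarray(X, dtype=float)
    keep = np.all(np.isfinite(X), axis=1)   # rows with no NaN/inf
    X_kept = X[keep]
    mean = X_kept.mean(axis=0)
    std = X_kept.std(axis=0)
    std[std == 0] = 1.0                     # avoid dividing by zero for constant columns
    return (X_kept - mean) / std, keep

# Toy rows: ra, dec, pmra, pmdec, parallax, radial_velocity (made-up values)
X_demo = np.array([
    [56.7, 24.1, 20.0, -45.0, 7.3, 5.0],
    [56.9, 24.3, 19.5, -45.5, 7.4, np.nan],   # missing radial velocity
    [57.1, 23.9, 20.5, -44.5, 7.2, 6.0],
])
X_scaled, keep = prepare_features(X_demo)
```

The keep mask matters because the labels returned by a clustering step will refer to the filtered rows, not to the original table. Whether to standardize, or to weight some columns more heavily, is a modelling choice rather than a rule.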

About coordinates:
Right Ascension (ra) and Declination (dec): indicate the position of an object on the sky. Stars born in the same region may have similar coordinates on the celestial sphere.
Proper Motion in Right Ascension (pmra) and Proper Motion in Declination (pmdec): describe how a star moves across the sky. Similar proper motions can indicate a common origin, since stars formed in the same region tend to move together.
Parallax (parallax): the apparent angular displacement of a star against distant background stars, caused by the movement of the Earth around the Sun. It provides information about a star’s distance and so helps identify stars with a common origin.
Radial Velocity (radial_velocity): the speed at which a star is moving away from or toward the observer. Stars born from a common cloud may have similar radial velocities.
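To make the link between these columns and physical quantities concrete, here is a small sketch that converts parallax to distance and proper motion to tangential velocity. It assumes Gaia's usual units (parallax in mas, proper motions in mas/yr, with pmra already including the cos(dec) factor) and uses the simple inverse-parallax estimate, which is only reasonable for positive, well-measured parallaxes. The function names and input values are illustrative.

```python
import numpy as np

K = 4.74047  # km/s for 1 arcsec/yr at 1 pc (standard conversion constant)

def distance_pc(parallax_mas):
    """Naive distance estimate d = 1000 / parallax; NaN where parallax <= 0."""
    p = np.atleast_1d(np.asarray(parallax_mas, dtype=float))
    d = np.full(p.shape, np.nan)
    positive = p > 0
    d[positive] = 1000.0 / p[positive]
    return d

def tangential_velocity_kms(pmra, pmdec, parallax_mas):
    """Tangential velocity from total proper motion (mas/yr) and parallax (mas)."""
    mu = np.hypot(np.atleast_1d(pmra), np.atleast_1d(pmdec))  # total proper motion
    return K * mu * distance_pc(parallax_mas) / 1000.0

# Illustrative values, roughly the size one would expect for a nearby cluster
d = distance_pc([7.3, -0.1])
v_t = tangential_velocity_kms([20.0], [-45.0], [7.3])
```

Stars that share both a distance and a tangential velocity are better candidates for a common origin than stars that merely sit close together on the sky.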

Two-step analysis

1. Clustering with HDBSCAN

The first method used is HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise), a clustering algorithm that is well suited to detecting clusters of different shapes and sizes in data. It is available through the Python library of the same name.

import hdbscan

# Each row of X is one star; labels holds one cluster id per star
clusterer = hdbscan.HDBSCAN(min_cluster_size=10, min_samples=5)
labels = clusterer.fit_predict(X)

min_cluster_size specifies the smallest group of points that will be treated as a cluster. Groupings smaller than this are not reported as clusters, which filters out insignificant clumps in the data.

min_samples specifies how many points must lie in a point’s neighborhood for it to count as part of a dense region. It controls how conservative the clustering is: larger values cause more points to be treated as noise and leave only the densest, most compact clusters.
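The labels array returned by fit_predict holds one integer per star, and the hdbscan library marks noise points with the label -1. A short sketch of how such labels might be summarized and turned into a boolean mask for one cluster is shown below; the labels here are a hand-written toy array rather than real output, and summarize_labels is an illustrative helper, not part of ADAT_co.

```python
import numpy as np

def summarize_labels(labels):
    """Return ({cluster_id: size}, number_of_noise_points) for a label array."""
    labels = np.asarray(labels)
    clustered = labels[labels >= 0]            # -1 marks noise
    ids, counts = np.unique(clustered, return_counts=True)
    sizes = dict(zip(ids.tolist(), counts.tolist()))
    return sizes, int(np.sum(labels == -1))

# Toy labels standing in for the output of clusterer.fit_predict(X)
labels_demo = np.array([0, 0, 1, -1, 0, 1, -1, -1, 0])
sizes, n_noise = summarize_labels(labels_demo)

largest = max(sizes, key=sizes.get)            # id of the most populated cluster
cluster_mask = labels_demo == largest          # boolean mask, one entry per star
```

A mask of this kind is what lets the later steps separate cluster members from background stars.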

2. Photometric validation based on similarity to isochrone

After clustering, photometric validation is applied based on isochrone similarity. An isochrone is a curve in color-magnitude space that shows the expected positions of stars of the same age and chemical composition.

Isochrones are overlaid on the plots for the various coordinate pairs to aid interpretation. As part of this step, the photometric columns are checked for valid values:

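# Keep only rows where both the color (bp_rp) and the G magnitude are finite
valid_mask = np.isfinite(data['bp_rp']) & np.isfinite(data['phot_g_mean_mag'])
# cluster_mask is assumed to be a boolean array with one entry per row of data
valid_cluster_mask = np.asarray(cluster_mask, dtype=bool) & valid_mask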

Valid Mask: marks the rows whose bp_rp and phot_g_mean_mag values are both finite. Only points where both values are present and finite are included in the mask.

Valid Cluster Mask: combines the valid mask with the cluster mask. The result is a single mask that keeps only those points that belong to the selected cluster and have passed the photometric check.
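The snippets above check that the photometry is usable but do not show how closeness to an isochrone is scored. One simple possibility, which is an assumption for illustration and not necessarily what ADAT_co does, is to measure each star’s distance to the nearest tabulated isochrone point in the color-magnitude plane and keep stars within a tolerance. The sketch below assumes the isochrone is given as arrays of color and absolute G magnitude, converts apparent to absolute magnitude with M_G = G + 5 log10(parallax in mas) - 10, and ignores extinction. All names, the toy isochrone and the 0.3 threshold are hypothetical.

```python
import numpy as np

def absolute_g_mag(phot_g_mean_mag, parallax_mas):
    """Absolute G magnitude from apparent magnitude and parallax (mas), no extinction."""
    g = np.asarray(phot_g_mean_mag, dtype=float)
    p = np.asarray(parallax_mas, dtype=float)
    return g + 5.0 * np.log10(p) - 10.0

def isochrone_distance(color, abs_mag, iso_color, iso_abs_mag):
    """Distance from each star to the nearest tabulated isochrone point."""
    dc = np.asarray(color)[:, None] - np.asarray(iso_color)[None, :]
    dm = np.asarray(abs_mag)[:, None] - np.asarray(iso_abs_mag)[None, :]
    return np.sqrt(dc ** 2 + dm ** 2).min(axis=1)

# Toy isochrone: a made-up straight "main sequence", not a real stellar model
iso_color = np.linspace(0.0, 2.0, 21)
iso_abs_mag = 1.0 + 4.0 * iso_color

# Three made-up stars at the same parallax
bp_rp = np.array([0.5, 1.0, 1.5])
g_mag = np.array([8.7, 10.7, 9.0])
parallax = np.array([7.3, 7.3, 7.3])

M_G = absolute_g_mag(g_mag, parallax)
dist = isochrone_distance(bp_rp, M_G, iso_color, iso_abs_mag)
photometric_ok = dist < 0.3   # hypothetical tolerance in the color-magnitude plane
```

Treating color and magnitude as equal axes in a Euclidean distance is crude. In practice one might weight the two axes by their measurement uncertainties or interpolate the isochrone more finely.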

Visualization

Matplotlib is used to visualize the results. Plots are created for three coordinate pairs; in each one, the points belonging to an identified cluster, that is, stars that originated from the same cloud, are highlighted.

Below is a function that draws a scatter plot. Stars are plotted with scatter (gray for background points, red for cluster members), and cluster_mask is used to separate background stars from those belonging to the cluster.

def plot_cluster_with_isochrone(ax, cluster_mask, data, iso_data, coord_pair, cluster_color, label):
    # Note: cluster_color and label are accepted but not used in this excerpt
    x, y = data[coord_pair[0]], data[coord_pair[1]]
    # Background stars in gray, cluster members in red
    ax.scatter(x[~cluster_mask], y[~cluster_mask], color="gray", label="Other Stars", s=5, alpha=0.7)
    ax.scatter(x[cluster_mask], y[cluster_mask], color="red", label="Cluster Stars", s=5, alpha=0.7)
    # Isochrone overlay, drawn from the photometric columns of iso_data
    ax.plot(iso_data['G_BPmag'], iso_data['Gmag'],
            color="darkblue", linestyle="dashed", label="Isochrone", linewidth=2)
    ax.set_xlabel(coord_pair[0])
    ax.set_ylabel(coord_pair[1])
    ax.legend()

Saving results

The resulting images are saved in TIFF format:

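import os

# os.path.join works whether or not output_directory ends with a path separator
output_filename = os.path.join(output_directory, f'cluster_{cluster_label}_isochrone.tif')
plt.savefig(output_filename, bbox_inches="tight", dpi=300)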

Thus, the entire code is organized around a clustering method followed by the use of photometric isochrone similarity validation to detect and study stars with a common origin.

Examples of results presentation and interpretation

Below are two examples of the program’s output for your reference.

The red dots are identified young stars near M45 (the Pleiades), formed at approximately the same time from the same region of interstellar matter. Gray dots are “background” stars.

Example 1

Example 2

This result points to a connection between the stars, consistent with a common origin and similar physical conditions. The example also shows an interesting effect: the isochrone curve lines up with the Gaia data in the plots of proper motion (pmra and pmdec) and of parallax against radial velocity (radial_velocity), but not in the plot of right ascension (ra) against declination (dec). This suggests that the selected isochrone model has limitations that may affect its fit to the data. Keep in mind that an isochrone is defined in color-magnitude space, so its agreement or disagreement with astrometric coordinates should be interpreted with caution.

You can experiment with a set of isochrones yourself in the file of the same name in the GitHub repository or, for example, use the isochrones library to get advanced features.

What can the results of the program tell you?

Using parallax-radial velocity and pmra-pmdec as an example:

The spread of stars in the parallax-radial velocity and pmra-pmdec plots shows how they move across the celestial sphere and hints at how they interact with each other. If objects belong to the same cluster, their velocities and angular positions are tied to the overall motion of the cluster, which usually shows up as a tight grouping in these plots. If, on the other hand, the stars form a complex structure that appears chaotic or randomly distributed, this may point to processes such as gravitational interactions within the cluster.
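The idea that cluster members share a common motion can be checked numerically by comparing the scatter of proper motions inside the cluster with the scatter of the background. The sketch below uses synthetic numbers generated with a fixed seed purely for illustration; they are not Gaia measurements, and pm_scatter is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the toy example is repeatable

# Synthetic proper motions (pmra, pmdec) in mas/yr: made-up, not real data
field_pm = rng.normal(loc=[0.0, 0.0], scale=15.0, size=(500, 2))
member_pm = rng.normal(loc=[20.0, -45.0], scale=1.0, size=(50, 2))

pm = np.vstack([field_pm, member_pm])
cluster_mask = np.concatenate([np.zeros(500, dtype=bool), np.ones(50, dtype=bool)])

def pm_scatter(pm_values):
    """Root-mean-square spread of proper motions around their mean, in mas/yr."""
    centered = pm_values - pm_values.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))

member_scatter = pm_scatter(pm[cluster_mask])    # small for a co-moving group
field_scatter = pm_scatter(pm[~cluster_mask])    # large for unrelated stars
```

With real data, a member scatter that is much smaller than the field scatter supports the interpretation that the selected stars move together.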

General remarks

While working with the program, you will be able to observe interesting scenarios that will be reflected in the graphs. This is, in particular, due to the following factors:

  1. Within a cluster, stars gravitationally interact with each other. Some of them may be ejected from the cluster as a result of interactions, causing the distance between the remaining stars to increase.

  2. Objects in the cluster may lose energy due to interactions with surrounding gas or other stars. This can slow their movement, causing the distance between objects in the cluster to increase over time.

  3. Clusters can undergo complex evolutionary processes, such as supernova explosions or the influence of gravitational disturbances from outside, which can ultimately lead to the stars in the cluster being located at great distances from each other.

Thus, you can to some extent see the evolutionary and dynamic history of the world by applying the program and observing the results.

PS: Gaia data contains a large number of stellar characteristics, but only a small subset of them is used to solve this problem.

Addition:

The isochrone concept is a simplified model of stellar evolution. Isochrones are constructed on the basis of theoretical models of stellar evolution that take into account physical processes, such as nuclear reactions, energy transfer and other basic physical phenomena that affect the luminosity and other characteristics of stars. However, there are a number of aspects that may not be sufficiently taken into account in these models, which can lead to discrepancies between isochrones and actual observations of star clusters. In particular, star clusters may have more complex dynamic processes within themselves (for example, gravitational influence), which are not always taken into account in isochrones.

I will be glad to receive feedback in the form of comments, questions, suggestions. If the topic seems interesting, I will continue it, revealing the possibilities of determining the age of stars, their rotation speed, analyzing the content of various elements and obtaining other information, including estimating sizes and shapes. If, after trying, you want to make changes to the code, for example, decide to improve the visualization, I will be glad and grateful!
