
Gaia data primarily consists of:
Image source and more at: Science at your doorstep

%%html
<div style="text-align:center;">
<iframe src="https://gea.esac.esa.int/archive/" width="960" height="540"></iframe>
</div>
We use the Basic Search in the Gaia Archive to fetch the first 2000 stars within a circle of radius 3 arcminutes around the globular cluster Messier 5. We then read this data in Python and plot the stars in RA-Dec space.
# numpy, for math (numerical calculations)
import numpy as np
# pandas, for data handling
import pandas as pd
pd.set_option('display.max_columns', None) # Display all of the columns of a DataFrame
# matplotlib, for plotting
import matplotlib.pyplot as plt
# "Magic command" to make the plots appear *inline* in the notebook
%matplotlib inline
#Now we can read the csv file into a pandas dataframe
m5 = pd.read_csv('data/m5.csv') # I renamed my csv file to 'm5.csv' and put it in the subfolder 'data'
#Checking the top few rows of the data and the number of rows and columns
print("(Rows, Columns) =", m5.shape)
m5.head()
(Rows, Columns) = (2000, 14)
| | source_id | ra | ra_error | dec | dec_error | parallax | parallax_error | phot_g_mean_mag | bp_rp | radial_velocity | radial_velocity_error | phot_variable_flag | teff_val | a_g_val |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4421572868783602304 | 229.608260 | 0.451151 | 2.080235 | 0.755487 | NaN | NaN | 18.320505 | NaN | NaN | NaN | NOT_AVAILABLE | NaN | NaN |
| 1 | 4421573315458434816 | 229.661337 | 20.658777 | 2.095936 | 35.101277 | NaN | NaN | 18.712660 | NaN | NaN | NaN | NOT_AVAILABLE | NaN | NaN |
| 2 | 4421573212376895104 | 229.626614 | 2.232215 | 2.102885 | 3.039926 | NaN | NaN | 18.354128 | NaN | NaN | NaN | NOT_AVAILABLE | NaN | NaN |
| 3 | 4421572971862343296 | 229.623267 | 2.472374 | 2.089429 | 1.216912 | NaN | NaN | 18.066124 | NaN | NaN | NaN | NOT_AVAILABLE | NaN | NaN |
| 4 | 4421572044148629760 | 229.641719 | 10.074508 | 2.056986 | 2.576733 | NaN | NaN | 18.675959 | NaN | NaN | NaN | NOT_AVAILABLE | NaN | NaN |
fig = plt.figure(figsize = [6,6]) # Defining and sizing figure
plt.scatter(m5['ra'], m5['dec'], alpha=0.7, s=10) # Creating a scatter-plot
plt.xlabel('RA (°)')
plt.ylabel('Dec (°)')
plt.title('Top 2000 stars from Gaia DR2 (ordered randomly) around Messier 5')
plt.show()
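To sanity-check the cone search, the angular distance of each star from the cluster center can be computed directly. The sketch below uses the haversine formula. The center coordinates `M5_RA` and `M5_DEC` are approximate values assumed here for illustration (they do not come from the archive query), and `angular_separation_deg` is a hypothetical helper name.

```python
import numpy as np

# Approximate center of Messier 5 in degrees (assumed values for illustration)
M5_RA, M5_DEC = 229.638, 2.081

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    a = (np.sin((dec2 - dec1) / 2) ** 2
         + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2) ** 2)
    return np.degrees(2 * np.arcsin(np.sqrt(a)))

# First two rows of the table above; separations are converted to arcminutes
ra = np.array([229.608260, 229.661337])
dec = np.array([2.080235, 2.095936])
sep_arcmin = 60 * angular_separation_deg(ra, dec, M5_RA, M5_DEC)
print(sep_arcmin)  # both values should be below the 3-arcminute search radius
```

The same function accepts DataFrame columns, e.g. `60 * angular_separation_deg(m5['ra'], m5['dec'], M5_RA, M5_DEC)`.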
You can always click on "Show query in ADQL form" below, to see what your basic query would look like in ADQL syntax!
- SELECT: the columns to fetch, written as column_name or table_name.column_name. We can also use ADQL/SQL functions or arithmetic in the SELECT part to manipulate the data before fetching it. If we want to fetch all columns from the table, we can use SELECT *.
- FROM: the table to query, e.g. FROM gaiadr2.gaia_source.
- WHERE: the conditions the rows must satisfy, e.g. WHERE gaia_source.parallax>=5 AND gaia_source.parallax_over_error>=20, where AND is a reserved keyword in ADQL/SQL signifying that both conditions must be met for the queried rows.
- ORDER BY: the column(s) to sort by, with DESC or ASC for descending or ascending order, and ORDER BY random_index for ordering randomly.

Query structure:
SELECT <columns> FROM <tables> WHERE <conditions> ORDER BY <columns>
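As a rough illustration of how the clauses in the template above fit together, the template can be filled in programmatically. `build_adql` is a hypothetical helper written for this sketch, not part of any Gaia tooling. It only assembles a string, which would then be pasted into the archive's ADQL form.

```python
def build_adql(columns, table, where=None, order_by=None, top=None):
    """Assemble an ADQL query string from its clauses (illustrative helper)."""
    select = "SELECT" + (f" TOP {top}" if top else "")
    parts = [f"{select} {', '.join(columns)}", f"FROM {table}"]
    if where:
        parts.append(f"WHERE {where}")
    if order_by:
        parts.append(f"ORDER BY {order_by}")
    return "\n".join(parts)

query = build_adql(
    columns=["source_id", "ra", "dec", "parallax"],
    table="gaiadr2.gaia_source",
    where="parallax >= 5 AND parallax_over_error >= 20",
    order_by="parallax DESC",
    top=100,
)
print(query)
```

Keeping WHERE and ORDER BY optional mirrors ADQL itself, where only SELECT and FROM are required.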
Select the 100 stars closest to Earth, i.e. the stars with the largest parallaxes:
SELECT TOP 100
source_id, ra, ra_error, dec, dec_error, parallax, parallax_error
FROM gaiadr2.gaia_source
WHERE gaia_source.parallax >= 0
ORDER BY gaia_source.parallax DESC;
Gaia Archive: https://gea.esac.esa.int/archive/
#Let's take a look at what this data looks like!
closest100 = pd.read_csv('data/closest100_result.csv')
print(closest100.shape)
closest100.head()
(100, 7)
| | source_id | ra | ra_error | dec | dec_error | parallax | parallax_error |
|---|---|---|---|---|---|---|---|
| 0 | 4062964299525805952 | 272.237829 | 1.276152 | -27.645916 | 0.830618 | 1851.882140 | 1.285094 |
| 1 | 4065202424204492928 | 274.906872 | 1.251748 | -25.255882 | 1.571499 | 1847.433349 | 1.874937 |
| 2 | 4051942623265668864 | 276.223193 | 0.682959 | -27.140479 | 0.500750 | 1686.265958 | 1.473535 |
| 3 | 4048978992784308992 | 273.112421 | 1.092637 | -31.184670 | 1.362824 | 1634.283354 | 1.971231 |
| 4 | 4059168373166457472 | 259.297177 | 1.640748 | -30.486547 | 2.069445 | 1513.989051 | 2.868580 |
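Gaia parallaxes are given in milliarcseconds, so the distance in parsecs is 1000 / parallax. Below is a minimal sketch of that conversion. The first-order error propagation is an assumption added here and is only reasonable when the relative parallax error is small. `parallax_to_distance_pc` is a hypothetical helper name.

```python
def parallax_to_distance_pc(parallax_mas, parallax_error_mas=None):
    """Convert a parallax in milliarcseconds to a distance in parsecs.

    Returns (distance, distance_error). The error uses first-order
    propagation, sigma_d = d * sigma_parallax / parallax, and is None
    when no parallax error is given.
    """
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive to invert it")
    distance = 1000.0 / parallax_mas
    if parallax_error_mas is None:
        return distance, None
    return distance, distance * parallax_error_mas / parallax_mas

# First row of the table above
d, d_err = parallax_to_distance_pc(1851.882140, 1.285094)
print(f"{d:.3f} pc +/- {d_err:.4f} pc")
```

For a whole table the same conversion is a one-liner on the column, e.g. `1000 / closest100['parallax']`.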
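Next, we write an ADQL query to get the absolute G magnitude, color (bp_rp), distance, position, radius and effective temperature of the 10,000 closest stars, and download the result in CSV format: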
SELECT TOP 10000
phot_g_mean_mag + 5 * log10(parallax/1000) + 5 AS g_abs, bp_rp, 1/(parallax/1000) AS dist,
ra, dec, radius_val, teff_val
FROM gaiadr2.gaia_source
WHERE parallax > 0
ORDER BY parallax DESC
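The expressions in the SELECT clause follow from the distance modulus. With the parallax $\varpi$ in milliarcseconds, the distance $d$ in parsecs, and $G$ the apparent magnitude `phot_g_mean_mag` (extinction is ignored, as in the query):

```latex
\begin{align}
d   &= \frac{1}{\varpi / 1000} = \frac{1000}{\varpi}, \\
M_G &= G - 5 \log_{10}\!\left(\frac{d}{10\,\mathrm{pc}}\right)
     = G - 5 \log_{10} d + 5
     = G + 5 \log_{10}\!\left(\frac{\varpi}{1000}\right) + 5 .
\end{align}
```

The first line corresponds to the `dist` column and the second to `g_abs`.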
data = pd.read_csv('data/closest10k_stars.csv')
plt.figure()
plt.scatter(data.bp_rp, data.g_abs, s=.1, color='red')
# Reverse the direction of the y axis. Max and Min of g_abs are used for the limits in y-axis
plt.ylim(data.g_abs.max(), data.g_abs.min()) # pandas max/min skip NaN values
plt.xlabel('G$_{BP}$ - G$_{RP}$')
plt.ylabel('M$_G$')
plt.show()
plt.figure(figsize = (10, 5))
# Using size as radius, color as effective temperature, and colormap RdYlBu (for mapping with star colors)
plt.scatter(data.bp_rp, data.g_abs,
s=data.radius_val, c=data.teff_val,
cmap='RdYlBu')
plt.colorbar(label='Effective Temperature') # For the colorbar to appear
plt.ylim(10, data.g_abs.min()) # Reversing the y-axis
plt.xlabel('G$_{BP}$ - G$_{RP}$')
plt.ylabel('M$_G$')
plt.show()
Plot the scatter graph using RA, Dec and distance. A Red-Yellow-Blue colormap is used, with marker sizes s given by 10 × the stellar radius and colors c given by the stellar effective temperature.
# Magic command for interactive 3D plots: %matplotlib notebook
%matplotlib inline
fig=plt.figure(figsize = (10, 6))
ax = plt.axes(projection ="3d")
scatter_plot=ax.scatter3D(data.ra, data.dec, data.dist, s=data.radius_val*10,
c=data.teff_val, cmap='RdYlBu')
ax.set_xlabel(r'RA [$\degree$]') # raw strings avoid the invalid '\d' escape
ax.set_ylabel(r'Dec [$\degree$]')
ax.set_zlabel('Distance [pc]')
plt.title('Closest Stars with known radius')
fig.colorbar(scatter_plot, label="Effective Stellar Temperature [K]")
plt.show()
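The plot above uses RA, Dec and distance directly as axes, which distorts the true geometry. As an alternative sketch (not what this notebook does), the positions could first be converted to Cartesian coordinates centered on the Sun. `spherical_to_cartesian` is a hypothetical helper.

```python
import numpy as np

def spherical_to_cartesian(ra_deg, dec_deg, dist_pc):
    """Convert RA/Dec in degrees and distance in parsecs to x, y, z in parsecs."""
    ra = np.radians(ra_deg)
    dec = np.radians(dec_deg)
    x = dist_pc * np.cos(dec) * np.cos(ra)
    y = dist_pc * np.cos(dec) * np.sin(ra)
    z = dist_pc * np.sin(dec)
    return x, y, z

# Three illustrative positions, each 10 pc away
x, y, z = spherical_to_cartesian(np.array([0.0, 90.0, 0.0]),
                                 np.array([0.0, 0.0, 90.0]),
                                 np.array([10.0, 10.0, 10.0]))
print(np.round(x, 6), np.round(y, 6), np.round(z, 6))
```

With the table loaded above, `spherical_to_cartesian(data.ra, data.dec, data.dist)` would give coordinates that can be passed to `ax.scatter3D` in place of the raw columns.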
pip install GaiaCurves
from GaiaCurves import gaia_lightcurve as gc
star='NQ Dra'
curves=gc.fetch_curves([star], output_dir = './data/')
print(curves)
{'NQ Dra': {'ID': '2154100169676165120', 'pathname': './data/2154100169676165120_data_dr2.csv', 'source': 'DR2'}}
%matplotlib inline
gc.plot_lightcurve(curves[star]['pathname'], star, curves[star]['ID'])