Exoplanets#

Exoplanet Discovery#

For the past 25+ years, NASA has used ground- and space-based methods to identify exoplanets (planets outside of our solar system). In the past ten years in particular, campaigns like Kepler, K2, and TESS have produced an explosion of results. To date, approximately 4,400 exoplanets have been identified, and over 3,000 potential exoplanet candidates have been discovered.

In this notebook, we will use Holoviews and Panel to make a dashboard visualizing the discovery of confirmed and candidate exoplanets over the years.

We’ll also include a scatterplot in our dashboard that reveals details about the relationship between mass and radius of exoplanets, as well as controls to filter the data based on whether the planets could support life, and if so, whether chemical rockets could be used to escape the planet.

import pandas as pd
import holoviews as hv
import panel as pn
import numpy as np
import colorcet as cc
import hvplot.pandas # noqa

hv.extension('bokeh')

Loading data#

For this notebook, we will be loading our exoplanet data from three different CSV files:

  • stars, a dataset of 257,000 stars identified by the European Gaia space mission,

  • exoplanets, a collection of 480 exoplanets obtained from the NASA Exoplanet Archive; and

  • candidates, a collection of approximately 3,000 candidate exoplanets collated from the Kepler and TESS campaigns.

We could load the data using pure Pandas calls like stars = pd.read_csv("data/stars.csv"), but here let’s cache the results using Panel so that new visitors to the dashboard won’t have to reload or recalculate the data:

%%time
stars      = pn.state.as_cached('stars',      lambda: pd.read_csv("data/stars.csv"))
exoplanets = pn.state.as_cached('exoplanets', lambda: pd.read_csv("data/exoplanets.csv"))
candidates = pn.state.as_cached('candidates', lambda: pd.read_csv("data/candidates.csv"))
CPU times: user 179 ms, sys: 32.3 ms, total: 211 ms
Wall time: 209 ms

The stars data includes the coordinates and brightness of the star:

stars.head()
source_id phot_g_mean_mag ra dec
0 5.781830e+18 10.113231 236.645693 -75.399888
1 6.680680e+18 12.095904 307.362343 -40.700107
2 2.065110e+18 11.209872 313.225036 40.417739
3 5.408450e+18 11.566668 151.755425 -46.530993
4 5.781850e+18 12.487245 235.555713 -75.457647

The exoplanets data includes the coordinates along with a variety of attributes about the planet:

exoplanets.head()
pl_name hostname disc_year radius temperature mass ra dec habitable
0 11 Com b 11 Com 2007 NaN NaN 6165.6000 185.178779 17.793252 False
1 11 UMi b 11 UMi 2009 NaN NaN 4684.8142 229.274595 71.823943 False
2 14 And b 14 And 2008 NaN NaN 1525.5000 352.824150 39.235837 False
3 14 Her b 14 Her 2002 NaN NaN 1481.0878 242.602101 43.816362 False
4 16 Cyg B b 16 Cyg B 1996 NaN NaN 565.7374 295.465642 50.516824 False

Candidate data is sparse and includes only a few attributes besides the coordinates:

candidates.head()
name ra dec radius year temperature
0 2170.01 164.313856 89.086923 2.667700 2020 950.000000
1 2173.01 145.799111 87.868530 3.207040 2020 1713.000000
2 1691.01 272.404118 86.860343 3.495124 2020 707.833472
3 1671.01 130.969393 86.228398 11.750000 2020 649.000000
4 2287.01 222.750790 85.895391 1.949042 2020 898.936882

Note that because of imprecise detection methods, characteristics such as temperature, mass, and radius are all estimates. For more information on the uncertainty in these measurements, see the NASA Exoplanet Archive.

Because our goal is to generate a map of the exoplanets and stars, we need a standardized coordinate system for all three of our dataframes. Here, we’ll use the equatorial coordinate system provided in the original datasets. Equatorial coordinates are given by two angles: ra (right ascension) and dec (declination). ra and dec represent the position of an object on an imaginary sphere called the celestial sphere, with the Earth at its center and an equator that’s a projection of the Earth’s equator. Right ascension measures the horizontal position of the object on that sphere, and is usually given in either hours or degrees (our datasets use degrees). Declination measures the vertical position of the object. Because these coordinates are defined based on the tilt and shape of the Earth independent of its rotation, the equatorial positions of celestial objects do not change over the course of a day or a year.

In the dataframes, dec ranges from -90º to 90º and ra ranges from 0º to 360º. To make our map more intuitive, we will place the Earth at the origin, so we’ll need to rewrite the ra coordinates in the range -180º to 180º. To do this, we’ll write a simple function recenter:

def recenter(r):
    "Convert ra coordinates so that Earth is at origin"
    return np.mod(r+180, 360)-180
# Limit to the brightest stars, if desired
#stars = stars[stars["phot_g_mean_mag"]>11]
%%time
stars['ra']      = pn.state.as_cached('stars_ra',      lambda: recenter(stars['ra']))
exoplanets['ra'] = pn.state.as_cached('exoplanets_ra', lambda: recenter(exoplanets['ra']))
candidates['ra'] = pn.state.as_cached('candidates_ra', lambda: recenter(candidates['ra']))
CPU times: user 26 µs, sys: 16 ms, total: 16 ms
Wall time: 14.8 ms

If desired, Astropy provides methods of converting to other common celestial coordinate systems that can easily be applied to the dataframes; see examples in an older version of this notebook.

The Goldilocks zone and the Tsiolkovsky rocket equation#

One of the methods used to determine whether an exoplanet could potentially support life is to check whether liquid water could exist there. For water to be present on the planet as liquid, the planet’s temperature must be within a fairly narrow range, and therefore the planet must be within a certain distance of the nearest star. Exoplanets within this range are said to be in the “Goldilocks zone.”

If intelligent life were to exist on one of these planets, would it be capable of space travel? If the hypothetical life forms used similar methods to humans — for example, hydrogen- and oxygen-powered chemical rockets — would they even be able to leave their planet? A heavier rocket requires exponentially more fuel, but more fuel means more mass. The Tsiolkovsky rocket equation makes this question more precise:

\[\Delta v = v_e\ln\left(\frac{m_0}{m_f}\right),\]

where \(\Delta v\) is the impulse per mass unit required for the rocket to travel its course, \(v_e\) is effective exhaust velocity, \(m_0\) is the initial mass of the rocket, and \(m_f\) is the final mass of the rocket (here, equal to \(m_0\) minus the mass of the fuel spent on the flight). To see the rocket equation in action, consider a planet of the same density as Earth with radius \(R\) double Earth’s and thus mass \(M\) eight times Earth’s.

(Click to expand/contract computation details)

For the purposes of this example, we’ll assume that $\(\Delta v = \sqrt{\frac{GM}{R}},\)\( where \)G\approx 6.67\cdot 10^{-11}\( (in reality, some complicating factors exist, but our formula works as an approximation at relatively low altitudes\)^*$). Then

\[\Delta v = \sqrt{\frac{6.67\cdot 10^{-11}\cdot 4.78\cdot10^{25}}{1.27\cdot10^7}}\approx 22407 \frac{\text{m}}{\text{s}}.\]

Using the highest recorded exhaust velocity of a chemical rocket, \(5320\frac{\text{m}}{\text{s}},\) and we’ll calculate the approximate percent of the rocket’s mass that would have to be fuel in order to propel the rocket to \(250\) m\(^*\):

\[22407= 5320 \ln\left(\frac{m_0}{m_f}\right),\]

so

\[\frac{m_0}{m_f}\approx 67.5.\]

\(^*\)We won’t go into detail here, but the \(\Delta v\) calulation for \(250\) m is derived from the vis-viva equation.

To make it to \(250\) m above this planet’s surface, about \(98.5\%\) of the rocket’s initial mass would need to be fuel. For comparison, the rocket with the highest initial-to-final mass ratio ever built was the Soyuz-FG rocket, which was \(91\%\) fuel by mass. Moreover, we were very generous with the conditions used to compute the mass ratio to escape our imaginary planet. The exhaust velocity we used was only ever recorded for a highly corrosive, dangerous, expensive propellant that, with the current state of technology, is not feasible for use in space travel.

Filtering by feasibility of space travel#

We can use the rocket equation to get an idea which exoplanets might be the right size to allow for space travel. Let’s assume that the hypothetical life forms on an exoplanet can make a chemical rocket with exhaust velocity at most \(5320\frac{\text{m}}{\text{s}}.\) Let’s also say that they’ve figured out how to make rockets that are up to \(95\%\) fuel by mass (so \(\frac{m_0}{m_f}=20\)). These two (generous) assumptions will allow us to make an educated guess of whether the mass and radius of the exoplanet would ever conceivably allow for space travel:

\[\sqrt{\frac{GM}{R}}\approx \Delta v \leq 5320\ln{20}.\]

We can now define a function deltav that approximates \(\Delta v\) for each exoplanet and returns True or False depending on whether that value is small enough. We’ll then add a corresponding column escapable in our dataframe and cache it.

def deltav(p):
    """
    Given a planet record or planet dataframe, determine whether delta-v is 
    sufficiently small for feasible space travel with chemical rockets
    """
    G = 6.67*(10**(-11))
    return np.logical_and(p.habitable, np.sqrt(G*p.mass/p.radius)<=5320*np.log(20))

exoplanets['escapable'] = pn.state.as_cached('escapable', lambda: deltav(exoplanets))

Filtering and plotting data#

To help a user understand the data, we’ll allow them to filter it and then plot the results.

To orient users, we’ll first create a point representing the Earth at the origin \((0,0)\):

origin = pd.DataFrame(data = {'ra':[0],'dec':[0]})

opts  = dict(x="ra", y="dec", xlabel="right ascension (deg)", ylabel="declination (deg)", responsive=True, aspect=5/2)
earth = hv.Text(0, 0, '🌎', fontsize=20)
# earth

We could plot that single point by itself, but let’s go ahead and overlay it on a background of stars as a frame of reference, setting rasterize=True to use Datashader to pre-rasterize the data so that it is practical to display in a web browser. We’ll also add an arrow with the text “You are here” pointing to the Earth for this display.

star_bg = stars.hvplot.points(
    rasterize=True, color="phot_g_mean_mag", cmap=cc.fire, colorbar=True, cnorm='eq_hist', **opts).opts(bgcolor="black")

arrow = hv.Arrow(-5, 0, 'You are here', '>').opts(arrow_color='white',text_color="white",text_baseline="top",arrow_line_width=2)

star_bg * earth * arrow

Note that since equatorial coordinates represent the view from the earth, and we are projecting onto a two-dimensional plot, we should imagine the mapped stars wrapping around the Earth spherically, with those with ra coordinates close to 180º and -180º close to each other. Note also that distances in this map are not reflective of actual distance, but instead of relative position on the celestial sphere (i.e., their relative position as seen from Earth).

One of the first things that jumps out about this plot is the dark curve separating the two clusters of stars. This dark curve represents the ecliptic, the circular tilted path the sun appears to travel in the Earth’s sky, projected onto a plane. Few stars lie on this path due to Gaia’s imaging methods, which cause the spacecraft to observe the areas near this plane less frequently than other areas.

Now let’s plot the planets at their equatorial coordinates, using the point size and color to show attributes about them. When “mass” or “temperature” is selected as the size, we’ll scale the points down to 0.5% of the corresponding numerical mass or temperature value, so that planets with large masses or high temperatures do not overwhelm the plot. The sizes are thus mainly for relative scale, though a numerical legend could be provided with a bit of work. When plots are colored by a variable, planets for which that variable is not known will be colored grey.

Note that although most of the exoplanets have radii larger than Earth’s, in our plot, the point earth is scaled to 20 times its actual size relative to the exoplanets; without this magnification, Earth wouldn’t be easily discernible among the other planets.

def planets(planets=exoplanets, size="radius", color="radius"):
    size_scale = 1 if size == "radius" else 0.005    
    return planets.hvplot.points(color=color, cmap='blues', size=size_scale*hv.dim(size), clabel=color, **opts, alpha=0.7)

planets()

Two notable features of this graphic are the large, dark blue planet and the dense cluster of smaller planets on the left. The largest planet is the gas giant HD 100546 b, with a radius of approximately 77 times that of Earth, and by far the largest exoplanet in our dataset. The clustering behavior on the left is due to the method of detection used in the Kepler mission; the spacecraft pointed at only a small section of the sky, so the detected exoplanets are concentrated in that area.

If we like, we can filter this plot by year and/or by whether it is potentially habitable:

def habitable_exoplanets_by_year(year_range=(-np.inf, np.inf), hab=False):
    "Exoplanets filtered by the given year range and (if hab=True) whether they are habitable"
    e = exoplanets
    filtered = e[(e.disc_year>=year_range[0]) & (e.disc_year<=year_range[1])]
    if hab:
        filtered = filtered[filtered.habitable == True].drop_duplicates()
    filtered = filtered.drop_duplicates()
    return filtered

planets(habitable_exoplanets_by_year((1996, 2021), True))

There aren’t a lot of potentially habitable exoplanets, and most of them are small and lie in the cluster detected by Kepler. We can also filter by whether the planet is potentially escapable with a chemical rocket, which leaves even fewer! We’ll circle those points with a large purple-colored ring so that you can spot them.

def escapable(planets=exoplanets, size="radius", color="radius"):
    escapable = planets[planets['escapable']==True]
    return escapable.hvplot.points(
        cmap='blues', size=200, alpha=0.5, color="purple", line_width=3, fill_color=None, clabel=color, **opts)
    
escapable()

Separately, we can look at the candidate exoplanets to see how their distribution compares with the confirmed exoplanets:

def unconfirmed(year_range=(-np.inf,np.inf)):
    "Filter candidate exoplanets by year range and plot them in b,l space"
    c = candidates
    filtered_candidates = c[(c.year>=year_range[0]) & (c.year<=year_range[1])]
    return filtered_candidates.hvplot.points(size=30, color="green", alpha=0.5, **opts)

unconfirmed()

Again, we see a dense cluster of candidate exoplanets on the right due to the Kepler spacecraft’s area of detection, as well as a curve indicating the ecliptic. Few candidates lie on this curve because the TESS mission has yet to scan this portion of the sky.

Given that all the above plots share the same x and y axes, we can combine them into a single plot so that you can see everything in context:

star_bg * planets() * unconfirmed() * escapable() * earth

Whew, that’s a bit of a mess! For a user to make much sense of that, we’ll need to let them decide what they want to see dynamically; otherwise it’s too difficult to see what’s what. Later on we’ll see how to define some widgets that let them do that.

Radius vs. mass#

First, though, let’s define a completely different type of visualization, a scatterplot that lets us compare various quantities against each other so that we can see how habitable and uninhabitable exoplanets relate to each other:

def scatter(x_axis="radius", y_axis="mass"):
    habitable     = exoplanets[exoplanets['habitable']==True ].dropna(subset=[x_axis, y_axis])
    uninhabitable = exoplanets[exoplanets['habitable']==False].dropna(subset=[x_axis, y_axis])
    
    habitable_points = habitable.hvplot.scatter(
        x=x_axis,y=y_axis, color="red", responsive=True, aspect=5/2,
        label="Potentially habitable", size=30, legend='top_right')
    
    uninhabitable_points = uninhabitable.hvplot.scatter(
        x=x_axis, y=y_axis, responsive=True, aspect=5/2,
        color="blue", alpha=0.5, legend='top_right',
        label="Uninhabitable", size=10)
    
    return uninhabitable_points * habitable_points.opts(
        min_height=200, max_height=400, title=f'Scatterplot of {x_axis} and {y_axis}'
    )

scatter()

In the plot above, we can see that most planets have a mass under 2000 times Earth’s and radii under 20 times Earth’s, but there are a few outliers. The exoplanet with the largest mass in our dataset is the gas giant HR 2562-b, whose mass is over 9000 times Earth’s, but whose radius is only about 12 times Earth’s.

HD 100546 b has the largest radius compared to Earth’s, dwarfing our planet by a factor of 77 (that’s the big blue planet in the planets() plot). On the other side of the scale, the smallest planet in radius is Kepler-62 c, whose radius is about half of Earth’s, but whose mass is quadruple Earth’s.

The hottest planet in our dataset is the gas giant KELT-9b, whose surface temperature is 4050 Kelvin, and the coldest is OGLE-2005-BLG-390L b, at only 50 Kelvin. Temperature and radius appear to have a positive correlation.

In terms of habitability, all the potentially habitable exoplanets have radius less than five times Earth’s, mass less than forty times Earth’s, and surface temperature between about 200 and 400 Kelvin. For comparison, Earth’s surface temperature is 288 Kelvin.

Dashboard#

Now that we have our data and plots defined, we can use Panel widgets and layouts to build a shareable dashboard where we can explore all the combinations.

First, we’ll define two dropdown menus to choose the axis variables for the scatter plot:

axis_choices={"Earth radius": "radius", "Earth mass": "mass", "Temperature": "temperature"}

x_axis = pn.widgets.Select(name='x-axis:', options=axis_choices)
y_axis = pn.widgets.Select(name='y-axis:', options=axis_choices, value='mass')

Widgets need to be bound to a function or other dynamic object for them to have any effect. Here, we’ll use pn.bind to bind the user’s axis selections to the arguments needed by our plotting function, then display the result along with the widgets:

bound_scatter = pn.bind(scatter, x_axis=x_axis, y_axis=y_axis)

pn.Column(pn.Row(x_axis, y_axis), bound_scatter)

We can add other widgets controlling the map, including a slider representing discovery year, a checkbox determining whether to show unconfirmed exoplanets, a second checkbox determining whether to display only planets in the potentially habitable zone, and two dropdown menus to determine what the size and color of the points on the plot will represent:

year_range       = pn.widgets.RangeSlider( name='Discovery year range', start=1996, end=2021)
show_unconfirmed = pn.widgets.Checkbox(    name='Show unconfirmed exoplanets')
show_habitable   = pn.widgets.Checkbox(    name='Limit to potentially habitable planets')
show_escapable   = pn.widgets.Checkbox(    name='Show which planets could be escaped with a chemical rocket')
select_size      = pn.widgets.Select(      name='Size points by:',  options=axis_choices)
select_color     = pn.widgets.Select(      name='Color points by:', options=axis_choices)

widgets = pn.Column(
    pn.Row(
        pn.Column(select_size, select_color),
        pn.Column(year_range, show_unconfirmed, show_escapable, show_habitable)))

#widgets

We’ll first bind the filtering widgets to the arguments of our filtering function to generate a dynamically filtered dataframe that will be used in multiple plots:

filtered = pn.bind(habitable_exoplanets_by_year, year_range, show_habitable)

Now we can bind this dataframe and the other widgets to the various plotting function arguments. We’ll wrap the bound functions as HoloViews DynamicMaps so that we can use the HoloViews * overlay operator on them:

overlay = (star_bg * 
           hv.DynamicMap(pn.bind(planets,     filtered, select_size, select_color)) * 
           hv.DynamicMap(pn.bind(unconfirmed, year_range)).apply.opts(visible=show_unconfirmed) *
           hv.DynamicMap(pn.bind(escapable,   filtered, select_size, select_color)).apply.opts(visible=show_escapable) * 
           earth).opts(title='Map of exoplanets', min_height=200, max_height=400)
#overlay

Putting it all together#

Finally, we create a panel from our widgets and plots to display the final dashboard using a Panel template.

explorer = pn.Row(pn.Column(widgets, overlay, pn.Row(x_axis, y_axis), bound_scatter))

dashboard = pn.template.BootstrapTemplate(title='Exoplanets Explorer')
dashboard.main.append('Filter, color, and size the displayed exoplanets using the available widgets, and explore trends in the scatterplot below.')

dashboard.main.append(explorer)
dashboard.servable()

explorer

You can use this app here in the notebook, or serve it as a standalone dashboard using a command like panel serve --show exoplanets.ipynb.