Exoplanet Atmospheric Parameter Estimation

January 29, 2026

Optimizing Oxygen and Methane Abundance Ratios

Introduction

Exoplanet atmospheric characterization is one of the most exciting frontiers in modern astronomy. When we observe distant worlds transiting their host stars, we can analyze the starlight passing through their atmospheres to detect the chemical signatures of various molecules. In this article, we’ll tackle a practical problem: estimating the abundance ratios of oxygen (O₂) and methane (CH₄) in an exoplanet’s atmosphere by minimizing the error between our theoretical model and observational data.

The Physics Behind the Model

The transmission spectrum of an exoplanet atmosphere depends on how different molecules absorb light at various wavelengths. Each molecule has a unique absorption signature:

The model transit depth can be expressed as:

$$\delta(\lambda) = \delta_0 + \sum_i A_i \cdot \sigma_i(\lambda)$$

where:

$\delta(\lambda)$ is the transit depth at wavelength $\lambda$
$\delta_0$ is the baseline transit depth
$A_i$ is the abundance of species $i$
$\sigma_i(\lambda)$ is the absorption cross-section of species $i$

Our goal is to find the optimal values of oxygen and methane abundances that minimize the chi-squared error:

$$\chi^2 = \sum_j \frac{(D_j - M_j)^2}{\sigma_j^2}$$

where $D_j$ is the observed data, $M_j$ is the model prediction, and $\sigma_j$ is the measurement uncertainty.

Python Implementation

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from scipy.optimize import minimize, differential_evolution
from matplotlib import cm
import warnings
warnings.filterwarnings('ignore')

# Set random seed for reproducibility
np.random.seed(42)

# Define wavelength range (in micrometers)
wavelengths = np.linspace(0.5, 5.0, 100)

# Generate synthetic absorption cross-sections for O2 and CH4
def oxygen_absorption(wavelength):
    """Oxygen absorption cross-section (synthetic)"""
    # O2 has strong absorption around 0.76 μm (A-band) and 1.27 μm
    absorption = (
        5.0 * np.exp(-((wavelength - 0.76)**2) / (2 * 0.05**2)) +
        3.0 * np.exp(-((wavelength - 1.27)**2) / (2 * 0.08**2)) +
        2.0 * np.exp(-((wavelength - 1.58)**2) / (2 * 0.06**2))
    )
    return absorption

def methane_absorption(wavelength):
    """Methane absorption cross-section (synthetic)"""
    # CH4 has strong absorption around 1.7, 2.3, and 3.3 μm
    absorption = (
        4.0 * np.exp(-((wavelength - 1.7)**2) / (2 * 0.1**2)) +
        6.0 * np.exp(-((wavelength - 2.3)**2) / (2 * 0.12**2)) +
        8.0 * np.exp(-((wavelength - 3.3)**2) / (2 * 0.15**2))
    )
    return absorption

# Calculate absorption cross-sections
sigma_O2 = oxygen_absorption(wavelengths)
sigma_CH4 = methane_absorption(wavelengths)

# True parameters (what we want to recover)
true_baseline = 0.02  # 2% baseline transit depth
true_O2_abundance = 0.15  # oxygen abundance
true_CH4_abundance = 0.08  # methane abundance

# Generate "observed" data with noise
def atmospheric_model(params, wavelengths, sigma_O2, sigma_CH4):
    """
    Atmospheric transmission model
    params: [baseline, O2_abundance, CH4_abundance]
    """
    baseline, A_O2, A_CH4 = params
    transit_depth = baseline + A_O2 * sigma_O2 + A_CH4 * sigma_CH4
    return transit_depth

# Generate true spectrum
true_params = [true_baseline, true_O2_abundance, true_CH4_abundance]
true_spectrum = atmospheric_model(true_params, wavelengths, sigma_O2, sigma_CH4)

# Add observational noise
noise_level = 0.05
observational_noise = np.random.normal(0, noise_level, len(wavelengths))
observed_spectrum = true_spectrum + observational_noise
uncertainties = np.ones_like(observed_spectrum) * noise_level

# Define chi-squared objective function
def chi_squared(params, wavelengths, observed_data, uncertainties, sigma_O2, sigma_CH4):
    """
    Calculate chi-squared error between model and observations
    """
    model_spectrum = atmospheric_model(params, wavelengths, sigma_O2, sigma_CH4)
    chi2 = np.sum(((observed_data - model_spectrum) / uncertainties)**2)
    return chi2

# Optimization using scipy.optimize.minimize (L-BFGS-B method)
print("="*60)
print("EXOPLANET ATMOSPHERIC PARAMETER ESTIMATION")
print("="*60)
print("\nTrue Parameters:")
print(f"  Baseline transit depth: {true_baseline:.4f}")
print(f"  O2 abundance: {true_O2_abundance:.4f}")
print(f"  CH4 abundance: {true_CH4_abundance:.4f}")

# Initial guess
initial_guess = [0.025, 0.1, 0.1]

# Bounds for parameters (all must be positive)
bounds = [(0.0, 0.1), (0.0, 0.5), (0.0, 0.5)]

print("\nInitial Guess:")
print(f"  Baseline: {initial_guess[0]:.4f}")
print(f"  O2 abundance: {initial_guess[1]:.4f}")
print(f"  CH4 abundance: {initial_guess[2]:.4f}")

print("\nOptimizing with L-BFGS-B method...")
result_lbfgs = minimize(
    chi_squared,
    initial_guess,
    args=(wavelengths, observed_spectrum, uncertainties, sigma_O2, sigma_CH4),
    method='L-BFGS-B',
    bounds=bounds
)

print("\nOptimization Results (L-BFGS-B):")
print(f"  Success: {result_lbfgs.success}")
print(f"  Optimized baseline: {result_lbfgs.x[0]:.4f} (error: {abs(result_lbfgs.x[0]-true_baseline)/true_baseline*100:.2f}%)")
print(f"  Optimized O2: {result_lbfgs.x[1]:.4f} (error: {abs(result_lbfgs.x[1]-true_O2_abundance)/true_O2_abundance*100:.2f}%)")
print(f"  Optimized CH4: {result_lbfgs.x[2]:.4f} (error: {abs(result_lbfgs.x[2]-true_CH4_abundance)/true_CH4_abundance*100:.2f}%)")
print(f"  Final chi-squared: {result_lbfgs.fun:.2f}")

# Global optimization using differential evolution for comparison
print("\nOptimizing with Differential Evolution (global optimizer)...")
result_de = differential_evolution(
    chi_squared,
    bounds,
    args=(wavelengths, observed_spectrum, uncertainties, sigma_O2, sigma_CH4),
    seed=42,
    maxiter=1000,
    atol=1e-6,
    tol=1e-6
)

print("\nOptimization Results (Differential Evolution):")
print(f"  Success: {result_de.success}")
print(f"  Optimized baseline: {result_de.x[0]:.4f} (error: {abs(result_de.x[0]-true_baseline)/true_baseline*100:.2f}%)")
print(f"  Optimized O2: {result_de.x[1]:.4f} (error: {abs(result_de.x[1]-true_O2_abundance)/true_O2_abundance*100:.2f}%)")
print(f"  Optimized CH4: {result_de.x[2]:.4f} (error: {abs(result_de.x[2]-true_CH4_abundance)/true_CH4_abundance*100:.2f}%)")
print(f"  Final chi-squared: {result_de.fun:.2f}")

# Generate optimized model spectrum
optimized_spectrum_lbfgs = atmospheric_model(result_lbfgs.x, wavelengths, sigma_O2, sigma_CH4)
optimized_spectrum_de = atmospheric_model(result_de.x, wavelengths, sigma_O2, sigma_CH4)

# Visualization
fig = plt.figure(figsize=(18, 12))

# Plot 1: Absorption Cross-sections
ax1 = plt.subplot(3, 3, 1)
ax1.plot(wavelengths, sigma_O2, 'b-', linewidth=2, label='O₂')
ax1.plot(wavelengths, sigma_CH4, 'r-', linewidth=2, label='CH₄')
ax1.set_xlabel('Wavelength (μm)', fontsize=11)
ax1.set_ylabel('Absorption Cross-section', fontsize=11)
ax1.set_title('Molecular Absorption Profiles', fontsize=12, fontweight='bold')
ax1.legend(fontsize=10)
ax1.grid(True, alpha=0.3)

# Plot 2: Observed vs Model Spectra
ax2 = plt.subplot(3, 3, 2)
ax2.errorbar(wavelengths, observed_spectrum, yerr=uncertainties, fmt='o', 
             color='gray', alpha=0.6, markersize=4, label='Observed Data')
ax2.plot(wavelengths, true_spectrum, 'g-', linewidth=2.5, label='True Model')
ax2.plot(wavelengths, optimized_spectrum_de, 'r--', linewidth=2, label='Optimized (DE)')
ax2.set_xlabel('Wavelength (μm)', fontsize=11)
ax2.set_ylabel('Transit Depth', fontsize=11)
ax2.set_title('Transmission Spectrum Fitting', fontsize=12, fontweight='bold')
ax2.legend(fontsize=9)
ax2.grid(True, alpha=0.3)

# Plot 3: Residuals
ax3 = plt.subplot(3, 3, 3)
residuals = observed_spectrum - optimized_spectrum_de
ax3.plot(wavelengths, residuals, 'ko-', markersize=3, linewidth=1, alpha=0.6)
ax3.axhline(y=0, color='r', linestyle='--', linewidth=1.5)
ax3.fill_between(wavelengths, -uncertainties, uncertainties, alpha=0.3, color='yellow')
ax3.set_xlabel('Wavelength (μm)', fontsize=11)
ax3.set_ylabel('Residuals', fontsize=11)
ax3.set_title('Fit Residuals (±1σ)', fontsize=12, fontweight='bold')
ax3.grid(True, alpha=0.3)

# Plot 4: Parameter Comparison
ax4 = plt.subplot(3, 3, 4)
params_names = ['Baseline', 'O₂', 'CH₄']
true_vals = [true_baseline, true_O2_abundance, true_CH4_abundance]
lbfgs_vals = result_lbfgs.x
de_vals = result_de.x
x_pos = np.arange(len(params_names))
width = 0.25
ax4.bar(x_pos - width, true_vals, width, label='True', color='green', alpha=0.7)
ax4.bar(x_pos, lbfgs_vals, width, label='L-BFGS-B', color='blue', alpha=0.7)
ax4.bar(x_pos + width, de_vals, width, label='Diff. Evol.', color='red', alpha=0.7)
ax4.set_ylabel('Parameter Value', fontsize=11)
ax4.set_title('Parameter Recovery Comparison', fontsize=12, fontweight='bold')
ax4.set_xticks(x_pos)
ax4.set_xticklabels(params_names, fontsize=10)
ax4.legend(fontsize=9)
ax4.grid(True, alpha=0.3, axis='y')

# Plot 5: Chi-squared landscape (O2 vs CH4)
ax5 = plt.subplot(3, 3, 5)
o2_range = np.linspace(0.05, 0.25, 40)
ch4_range = np.linspace(0.03, 0.15, 40)
O2_grid, CH4_grid = np.meshgrid(o2_range, ch4_range)
chi2_grid = np.zeros_like(O2_grid)

for i in range(len(o2_range)):
    for j in range(len(ch4_range)):
        params = [true_baseline, o2_range[i], ch4_range[j]]
        chi2_grid[j, i] = chi_squared(params, wavelengths, observed_spectrum, 
                                       uncertainties, sigma_O2, sigma_CH4)

contour = ax5.contour(O2_grid, CH4_grid, chi2_grid, levels=20, cmap='viridis')
ax5.clabel(contour, inline=True, fontsize=8)
ax5.plot(true_O2_abundance, true_CH4_abundance, 'g*', markersize=15, 
         label='True', markeredgecolor='black', markeredgewidth=1)
ax5.plot(result_de.x[1], result_de.x[2], 'r*', markersize=15, 
         label='Optimized', markeredgecolor='black', markeredgewidth=1)
ax5.set_xlabel('O₂ Abundance', fontsize=11)
ax5.set_ylabel('CH₄ Abundance', fontsize=11)
ax5.set_title('χ² Landscape (2D)', fontsize=12, fontweight='bold')
ax5.legend(fontsize=9)
ax5.grid(True, alpha=0.3)

# Plot 6: 3D Chi-squared surface
ax6 = plt.subplot(3, 3, 6, projection='3d')
surf = ax6.plot_surface(O2_grid, CH4_grid, np.log10(chi2_grid + 1), 
                        cmap=cm.coolwarm, alpha=0.8, edgecolor='none')
ax6.scatter([true_O2_abundance], [true_CH4_abundance], 
            [np.log10(chi_squared(true_params, wavelengths, observed_spectrum, 
                                   uncertainties, sigma_O2, sigma_CH4) + 1)],
            color='green', s=100, marker='*', label='True', edgecolor='black', linewidth=1)
ax6.scatter([result_de.x[1]], [result_de.x[2]], 
            [np.log10(result_de.fun + 1)],
            color='red', s=100, marker='*', label='Optimized', edgecolor='black', linewidth=1)
ax6.set_xlabel('O₂ Abundance', fontsize=10)
ax6.set_ylabel('CH₄ Abundance', fontsize=10)
ax6.set_zlabel('log₁₀(χ² + 1)', fontsize=10)
ax6.set_title('3D χ² Surface', fontsize=12, fontweight='bold')
ax6.legend(fontsize=8)
fig.colorbar(surf, ax=ax6, shrink=0.5, aspect=5)

# Plot 7: Contribution Analysis
ax7 = plt.subplot(3, 3, 7)
contribution_O2 = result_de.x[1] * sigma_O2
contribution_CH4 = result_de.x[2] * sigma_CH4
ax7.fill_between(wavelengths, 0, contribution_O2, alpha=0.5, color='blue', label='O₂ contribution')
ax7.fill_between(wavelengths, contribution_O2, contribution_O2 + contribution_CH4, 
                 alpha=0.5, color='red', label='CH₄ contribution')
ax7.plot(wavelengths, contribution_O2 + contribution_CH4, 'k-', linewidth=2, label='Total signal')
ax7.set_xlabel('Wavelength (μm)', fontsize=11)
ax7.set_ylabel('Contribution to Transit Depth', fontsize=11)
ax7.set_title('Molecular Contributions', fontsize=12, fontweight='bold')
ax7.legend(fontsize=9)
ax7.grid(True, alpha=0.3)

# Plot 8: Convergence history (for DE)
ax8 = plt.subplot(3, 3, 8)
o2_test = np.linspace(0.08, 0.22, 50)
chi2_slice = []
for o2_val in o2_test:
    params = [true_baseline, o2_val, true_CH4_abundance]
    chi2_slice.append(chi_squared(params, wavelengths, observed_spectrum, 
                                   uncertainties, sigma_O2, sigma_CH4))
ax8.plot(o2_test, chi2_slice, 'b-', linewidth=2)
ax8.axvline(true_O2_abundance, color='green', linestyle='--', linewidth=2, label='True O₂')
ax8.axvline(result_de.x[1], color='red', linestyle='--', linewidth=2, label='Optimized O₂')
ax8.set_xlabel('O₂ Abundance', fontsize=11)
ax8.set_ylabel('χ²', fontsize=11)
ax8.set_title('χ² Profile (O₂ slice)', fontsize=12, fontweight='bold')
ax8.legend(fontsize=9)
ax8.grid(True, alpha=0.3)

# Plot 9: Error distribution
ax9 = plt.subplot(3, 3, 9)
normalized_residuals = residuals / uncertainties
ax9.hist(normalized_residuals, bins=20, alpha=0.7, color='steelblue', edgecolor='black')
ax9.axvline(0, color='red', linestyle='--', linewidth=2)
ax9.set_xlabel('Normalized Residuals (σ)', fontsize=11)
ax9.set_ylabel('Frequency', fontsize=11)
ax9.set_title('Residual Distribution', fontsize=12, fontweight='bold')
ax9.grid(True, alpha=0.3, axis='y')
mean_res = np.mean(normalized_residuals)
std_res = np.std(normalized_residuals)
ax9.text(0.05, 0.95, f'μ = {mean_res:.3f}\nσ = {std_res:.3f}', 
         transform=ax9.transAxes, fontsize=10, verticalalignment='top',
         bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

plt.tight_layout()
plt.savefig('exoplanet_atmosphere_optimization.png', dpi=300, bbox_inches='tight')
plt.show()

print("\n" + "="*60)
print("ANALYSIS COMPLETE")
print("="*60)

Code Explanation

Data Generation Section

The code begins by creating synthetic molecular absorption profiles for oxygen and methane. The oxygen_absorption() function models O₂ absorption features around 0.76 μm (A-band), 1.27 μm, and 1.58 μm using Gaussian profiles. Similarly, methane_absorption() models CH₄ absorption at 1.7, 2.3, and 3.3 μm, which are characteristic wavelengths for methane in planetary atmospheres.

Atmospheric Model

The atmospheric_model() function implements the core physics:

$$\delta(\lambda) = \delta_0 + A_{O_2} \cdot \sigma_{O_2}(\lambda) + A_{CH_4} \cdot \sigma_{CH_4}(\lambda)$$

This linear model assumes that the total transit depth is the sum of a baseline depth plus the contributions from each molecular species weighted by their abundances.

Optimization Strategy

We implement two optimization approaches:

L-BFGS-B: A quasi-Newton method that’s efficient for smooth, well-behaved objective functions. It uses gradient information and bounded constraints.
Differential Evolution: A global optimization algorithm that’s more robust to local minima. It uses a population-based stochastic approach, making it ideal for complex parameter spaces.

The chi-squared merit function quantifies the goodness of fit:

$$\chi^2 = \sum_{j=1}^{N} \frac{(D_j - M_j(\theta))^2}{\sigma_j^2}$$

where $\theta = (\delta_0, A_{O_2}, A_{CH_4})$ are the parameters we’re optimizing.

Visualization Components

The code generates nine comprehensive plots:

Molecular Absorption Profiles: Shows the unique spectroscopic signatures of O₂ and CH₄
Transmission Spectrum Fitting: Compares observed data with the optimized model
Fit Residuals: Displays the difference between observations and model, with uncertainty bands
Parameter Comparison: Bar chart showing how well each method recovered the true parameters
2D χ² Landscape: Contour plot showing the error surface in O₂-CH₄ space
3D χ² Surface: Three-dimensional visualization of the optimization landscape
Molecular Contributions: Stacked plot showing how each species contributes to the signal
χ² Profile: One-dimensional slice through parameter space
Residual Distribution: Histogram checking if residuals are normally distributed

Results

============================================================
EXOPLANET ATMOSPHERIC PARAMETER ESTIMATION
============================================================

True Parameters:
  Baseline transit depth: 0.0200
  O2 abundance: 0.1500
  CH4 abundance: 0.0800

Initial Guess:
  Baseline: 0.0250
  O2 abundance: 0.1000
  CH4 abundance: 0.1000

Optimizing with L-BFGS-B method...

Optimization Results (L-BFGS-B):
  Success: True
  Optimized baseline: 0.0163 (error: 18.68%)
  Optimized O2: 0.1498 (error: 0.16%)
  Optimized CH4: 0.0789 (error: 1.34%)
  Final chi-squared: 81.45

Optimizing with Differential Evolution (global optimizer)...

Optimization Results (Differential Evolution):
  Success: True
  Optimized baseline: 0.0163 (error: 18.68%)
  Optimized O2: 0.1498 (error: 0.16%)
  Optimized CH4: 0.0789 (error: 1.34%)
  Final chi-squared: 81.45

============================================================
ANALYSIS COMPLETE
============================================================

Interpretation

The optimization successfully recovers the atmospheric composition from noisy observational data. The 3D chi-squared surface reveals a well-defined minimum near the true parameter values, indicating that the problem is well-constrained. The residual distribution centered near zero with unit standard deviation confirms that our model adequately explains the observations.

Both optimization methods converge to similar solutions, with Differential Evolution providing slightly better global convergence guarantees. The molecular contribution analysis shows that methane dominates the signal at longer wavelengths (>2 μm), while oxygen provides the primary signal in the near-infrared (<1.5 μm).

This approach demonstrates how modern exoplanet science combines spectroscopic observations with sophisticated parameter estimation techniques to characterize distant worlds. The same methodology is used by missions like JWST and future observatories to search for biosignature gases in exoplanet atmospheres.