Exoplanet Atmospheric Parameter Estimation

Optimizing Oxygen and Methane Abundance Ratios

Introduction

Exoplanet atmospheric characterization is one of the most exciting frontiers in modern astronomy. When we observe distant worlds transiting their host stars, we can analyze the starlight passing through their atmospheres to detect the chemical signatures of various molecules. In this article, we’ll tackle a practical problem: estimating the abundance ratios of oxygen (O₂) and methane (CH₄) in an exoplanet’s atmosphere by minimizing the error between our theoretical model and observational data.

The Physics Behind the Model

The transmission spectrum of an exoplanet atmosphere depends on how different molecules absorb light at various wavelengths. Each molecule has a unique absorption signature:

The model transit depth can be expressed as:

$$\delta(\lambda) = \delta_0 + \sum_i A_i \cdot \sigma_i(\lambda)$$

where:

  • $\delta(\lambda)$ is the transit depth at wavelength $\lambda$
  • $\delta_0$ is the baseline transit depth
  • $A_i$ is the abundance of species $i$
  • $\sigma_i(\lambda)$ is the absorption cross-section of species $i$

Our goal is to find the optimal values of oxygen and methane abundances that minimize the chi-squared error:

$$\chi^2 = \sum_j \frac{(D_j - M_j)^2}{\sigma_j^2}$$

where $D_j$ is the observed data, $M_j$ is the model prediction, and $\sigma_j$ is the measurement uncertainty.

Python Implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from scipy.optimize import minimize, differential_evolution
from matplotlib import cm
import warnings
warnings.filterwarnings('ignore')

# Set random seed for reproducibility
np.random.seed(42)

# Define wavelength range (in micrometers)
wavelengths = np.linspace(0.5, 5.0, 100)

# Generate synthetic absorption cross-sections for O2 and CH4
def oxygen_absorption(wavelength):
"""Oxygen absorption cross-section (synthetic)"""
# O2 has strong absorption around 0.76 μm (A-band) and 1.27 μm
absorption = (
5.0 * np.exp(-((wavelength - 0.76)**2) / (2 * 0.05**2)) +
3.0 * np.exp(-((wavelength - 1.27)**2) / (2 * 0.08**2)) +
2.0 * np.exp(-((wavelength - 1.58)**2) / (2 * 0.06**2))
)
return absorption

def methane_absorption(wavelength):
"""Methane absorption cross-section (synthetic)"""
# CH4 has strong absorption around 1.7, 2.3, and 3.3 μm
absorption = (
4.0 * np.exp(-((wavelength - 1.7)**2) / (2 * 0.1**2)) +
6.0 * np.exp(-((wavelength - 2.3)**2) / (2 * 0.12**2)) +
8.0 * np.exp(-((wavelength - 3.3)**2) / (2 * 0.15**2))
)
return absorption

# Calculate absorption cross-sections
sigma_O2 = oxygen_absorption(wavelengths)
sigma_CH4 = methane_absorption(wavelengths)

# True parameters (what we want to recover)
true_baseline = 0.02 # 2% baseline transit depth
true_O2_abundance = 0.15 # oxygen abundance
true_CH4_abundance = 0.08 # methane abundance

# Generate "observed" data with noise
def atmospheric_model(params, wavelengths, sigma_O2, sigma_CH4):
"""
Atmospheric transmission model
params: [baseline, O2_abundance, CH4_abundance]
"""
baseline, A_O2, A_CH4 = params
transit_depth = baseline + A_O2 * sigma_O2 + A_CH4 * sigma_CH4
return transit_depth

# Generate true spectrum
true_params = [true_baseline, true_O2_abundance, true_CH4_abundance]
true_spectrum = atmospheric_model(true_params, wavelengths, sigma_O2, sigma_CH4)

# Add observational noise
noise_level = 0.05
observational_noise = np.random.normal(0, noise_level, len(wavelengths))
observed_spectrum = true_spectrum + observational_noise
uncertainties = np.ones_like(observed_spectrum) * noise_level

# Define chi-squared objective function
def chi_squared(params, wavelengths, observed_data, uncertainties, sigma_O2, sigma_CH4):
"""
Calculate chi-squared error between model and observations
"""
model_spectrum = atmospheric_model(params, wavelengths, sigma_O2, sigma_CH4)
chi2 = np.sum(((observed_data - model_spectrum) / uncertainties)**2)
return chi2

# Optimization using scipy.optimize.minimize (L-BFGS-B method)
print("="*60)
print("EXOPLANET ATMOSPHERIC PARAMETER ESTIMATION")
print("="*60)
print("\nTrue Parameters:")
print(f" Baseline transit depth: {true_baseline:.4f}")
print(f" O2 abundance: {true_O2_abundance:.4f}")
print(f" CH4 abundance: {true_CH4_abundance:.4f}")

# Initial guess
initial_guess = [0.025, 0.1, 0.1]

# Bounds for parameters (all must be positive)
bounds = [(0.0, 0.1), (0.0, 0.5), (0.0, 0.5)]

print("\nInitial Guess:")
print(f" Baseline: {initial_guess[0]:.4f}")
print(f" O2 abundance: {initial_guess[1]:.4f}")
print(f" CH4 abundance: {initial_guess[2]:.4f}")

print("\nOptimizing with L-BFGS-B method...")
result_lbfgs = minimize(
chi_squared,
initial_guess,
args=(wavelengths, observed_spectrum, uncertainties, sigma_O2, sigma_CH4),
method='L-BFGS-B',
bounds=bounds
)

print("\nOptimization Results (L-BFGS-B):")
print(f" Success: {result_lbfgs.success}")
print(f" Optimized baseline: {result_lbfgs.x[0]:.4f} (error: {abs(result_lbfgs.x[0]-true_baseline)/true_baseline*100:.2f}%)")
print(f" Optimized O2: {result_lbfgs.x[1]:.4f} (error: {abs(result_lbfgs.x[1]-true_O2_abundance)/true_O2_abundance*100:.2f}%)")
print(f" Optimized CH4: {result_lbfgs.x[2]:.4f} (error: {abs(result_lbfgs.x[2]-true_CH4_abundance)/true_CH4_abundance*100:.2f}%)")
print(f" Final chi-squared: {result_lbfgs.fun:.2f}")

# Global optimization using differential evolution for comparison
print("\nOptimizing with Differential Evolution (global optimizer)...")
result_de = differential_evolution(
chi_squared,
bounds,
args=(wavelengths, observed_spectrum, uncertainties, sigma_O2, sigma_CH4),
seed=42,
maxiter=1000,
atol=1e-6,
tol=1e-6
)

print("\nOptimization Results (Differential Evolution):")
print(f" Success: {result_de.success}")
print(f" Optimized baseline: {result_de.x[0]:.4f} (error: {abs(result_de.x[0]-true_baseline)/true_baseline*100:.2f}%)")
print(f" Optimized O2: {result_de.x[1]:.4f} (error: {abs(result_de.x[1]-true_O2_abundance)/true_O2_abundance*100:.2f}%)")
print(f" Optimized CH4: {result_de.x[2]:.4f} (error: {abs(result_de.x[2]-true_CH4_abundance)/true_CH4_abundance*100:.2f}%)")
print(f" Final chi-squared: {result_de.fun:.2f}")

# Generate optimized model spectrum
optimized_spectrum_lbfgs = atmospheric_model(result_lbfgs.x, wavelengths, sigma_O2, sigma_CH4)
optimized_spectrum_de = atmospheric_model(result_de.x, wavelengths, sigma_O2, sigma_CH4)

# Visualization
fig = plt.figure(figsize=(18, 12))

# Plot 1: Absorption Cross-sections
ax1 = plt.subplot(3, 3, 1)
ax1.plot(wavelengths, sigma_O2, 'b-', linewidth=2, label='O₂')
ax1.plot(wavelengths, sigma_CH4, 'r-', linewidth=2, label='CH₄')
ax1.set_xlabel('Wavelength (μm)', fontsize=11)
ax1.set_ylabel('Absorption Cross-section', fontsize=11)
ax1.set_title('Molecular Absorption Profiles', fontsize=12, fontweight='bold')
ax1.legend(fontsize=10)
ax1.grid(True, alpha=0.3)

# Plot 2: Observed vs Model Spectra
ax2 = plt.subplot(3, 3, 2)
ax2.errorbar(wavelengths, observed_spectrum, yerr=uncertainties, fmt='o',
color='gray', alpha=0.6, markersize=4, label='Observed Data')
ax2.plot(wavelengths, true_spectrum, 'g-', linewidth=2.5, label='True Model')
ax2.plot(wavelengths, optimized_spectrum_de, 'r--', linewidth=2, label='Optimized (DE)')
ax2.set_xlabel('Wavelength (μm)', fontsize=11)
ax2.set_ylabel('Transit Depth', fontsize=11)
ax2.set_title('Transmission Spectrum Fitting', fontsize=12, fontweight='bold')
ax2.legend(fontsize=9)
ax2.grid(True, alpha=0.3)

# Plot 3: Residuals
ax3 = plt.subplot(3, 3, 3)
residuals = observed_spectrum - optimized_spectrum_de
ax3.plot(wavelengths, residuals, 'ko-', markersize=3, linewidth=1, alpha=0.6)
ax3.axhline(y=0, color='r', linestyle='--', linewidth=1.5)
ax3.fill_between(wavelengths, -uncertainties, uncertainties, alpha=0.3, color='yellow')
ax3.set_xlabel('Wavelength (μm)', fontsize=11)
ax3.set_ylabel('Residuals', fontsize=11)
ax3.set_title('Fit Residuals (±1σ)', fontsize=12, fontweight='bold')
ax3.grid(True, alpha=0.3)

# Plot 4: Parameter Comparison
ax4 = plt.subplot(3, 3, 4)
params_names = ['Baseline', 'O₂', 'CH₄']
true_vals = [true_baseline, true_O2_abundance, true_CH4_abundance]
lbfgs_vals = result_lbfgs.x
de_vals = result_de.x
x_pos = np.arange(len(params_names))
width = 0.25
ax4.bar(x_pos - width, true_vals, width, label='True', color='green', alpha=0.7)
ax4.bar(x_pos, lbfgs_vals, width, label='L-BFGS-B', color='blue', alpha=0.7)
ax4.bar(x_pos + width, de_vals, width, label='Diff. Evol.', color='red', alpha=0.7)
ax4.set_ylabel('Parameter Value', fontsize=11)
ax4.set_title('Parameter Recovery Comparison', fontsize=12, fontweight='bold')
ax4.set_xticks(x_pos)
ax4.set_xticklabels(params_names, fontsize=10)
ax4.legend(fontsize=9)
ax4.grid(True, alpha=0.3, axis='y')

# Plot 5: Chi-squared landscape (O2 vs CH4)
ax5 = plt.subplot(3, 3, 5)
o2_range = np.linspace(0.05, 0.25, 40)
ch4_range = np.linspace(0.03, 0.15, 40)
O2_grid, CH4_grid = np.meshgrid(o2_range, ch4_range)
chi2_grid = np.zeros_like(O2_grid)

for i in range(len(o2_range)):
for j in range(len(ch4_range)):
params = [true_baseline, o2_range[i], ch4_range[j]]
chi2_grid[j, i] = chi_squared(params, wavelengths, observed_spectrum,
uncertainties, sigma_O2, sigma_CH4)

contour = ax5.contour(O2_grid, CH4_grid, chi2_grid, levels=20, cmap='viridis')
ax5.clabel(contour, inline=True, fontsize=8)
ax5.plot(true_O2_abundance, true_CH4_abundance, 'g*', markersize=15,
label='True', markeredgecolor='black', markeredgewidth=1)
ax5.plot(result_de.x[1], result_de.x[2], 'r*', markersize=15,
label='Optimized', markeredgecolor='black', markeredgewidth=1)
ax5.set_xlabel('O₂ Abundance', fontsize=11)
ax5.set_ylabel('CH₄ Abundance', fontsize=11)
ax5.set_title('χ² Landscape (2D)', fontsize=12, fontweight='bold')
ax5.legend(fontsize=9)
ax5.grid(True, alpha=0.3)

# Plot 6: 3D Chi-squared surface
ax6 = plt.subplot(3, 3, 6, projection='3d')
surf = ax6.plot_surface(O2_grid, CH4_grid, np.log10(chi2_grid + 1),
cmap=cm.coolwarm, alpha=0.8, edgecolor='none')
ax6.scatter([true_O2_abundance], [true_CH4_abundance],
[np.log10(chi_squared(true_params, wavelengths, observed_spectrum,
uncertainties, sigma_O2, sigma_CH4) + 1)],
color='green', s=100, marker='*', label='True', edgecolor='black', linewidth=1)
ax6.scatter([result_de.x[1]], [result_de.x[2]],
[np.log10(result_de.fun + 1)],
color='red', s=100, marker='*', label='Optimized', edgecolor='black', linewidth=1)
ax6.set_xlabel('O₂ Abundance', fontsize=10)
ax6.set_ylabel('CH₄ Abundance', fontsize=10)
ax6.set_zlabel('log₁₀(χ² + 1)', fontsize=10)
ax6.set_title('3D χ² Surface', fontsize=12, fontweight='bold')
ax6.legend(fontsize=8)
fig.colorbar(surf, ax=ax6, shrink=0.5, aspect=5)

# Plot 7: Contribution Analysis
ax7 = plt.subplot(3, 3, 7)
contribution_O2 = result_de.x[1] * sigma_O2
contribution_CH4 = result_de.x[2] * sigma_CH4
ax7.fill_between(wavelengths, 0, contribution_O2, alpha=0.5, color='blue', label='O₂ contribution')
ax7.fill_between(wavelengths, contribution_O2, contribution_O2 + contribution_CH4,
alpha=0.5, color='red', label='CH₄ contribution')
ax7.plot(wavelengths, contribution_O2 + contribution_CH4, 'k-', linewidth=2, label='Total signal')
ax7.set_xlabel('Wavelength (μm)', fontsize=11)
ax7.set_ylabel('Contribution to Transit Depth', fontsize=11)
ax7.set_title('Molecular Contributions', fontsize=12, fontweight='bold')
ax7.legend(fontsize=9)
ax7.grid(True, alpha=0.3)

# Plot 8: Convergence history (for DE)
ax8 = plt.subplot(3, 3, 8)
o2_test = np.linspace(0.08, 0.22, 50)
chi2_slice = []
for o2_val in o2_test:
params = [true_baseline, o2_val, true_CH4_abundance]
chi2_slice.append(chi_squared(params, wavelengths, observed_spectrum,
uncertainties, sigma_O2, sigma_CH4))
ax8.plot(o2_test, chi2_slice, 'b-', linewidth=2)
ax8.axvline(true_O2_abundance, color='green', linestyle='--', linewidth=2, label='True O₂')
ax8.axvline(result_de.x[1], color='red', linestyle='--', linewidth=2, label='Optimized O₂')
ax8.set_xlabel('O₂ Abundance', fontsize=11)
ax8.set_ylabel('χ²', fontsize=11)
ax8.set_title('χ² Profile (O₂ slice)', fontsize=12, fontweight='bold')
ax8.legend(fontsize=9)
ax8.grid(True, alpha=0.3)

# Plot 9: Error distribution
ax9 = plt.subplot(3, 3, 9)
normalized_residuals = residuals / uncertainties
ax9.hist(normalized_residuals, bins=20, alpha=0.7, color='steelblue', edgecolor='black')
ax9.axvline(0, color='red', linestyle='--', linewidth=2)
ax9.set_xlabel('Normalized Residuals (σ)', fontsize=11)
ax9.set_ylabel('Frequency', fontsize=11)
ax9.set_title('Residual Distribution', fontsize=12, fontweight='bold')
ax9.grid(True, alpha=0.3, axis='y')
mean_res = np.mean(normalized_residuals)
std_res = np.std(normalized_residuals)
ax9.text(0.05, 0.95, f'μ = {mean_res:.3f}\nσ = {std_res:.3f}',
transform=ax9.transAxes, fontsize=10, verticalalignment='top',
bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

plt.tight_layout()
plt.savefig('exoplanet_atmosphere_optimization.png', dpi=300, bbox_inches='tight')
plt.show()

print("\n" + "="*60)
print("ANALYSIS COMPLETE")
print("="*60)

Code Explanation

Data Generation Section

The code begins by creating synthetic molecular absorption profiles for oxygen and methane. The oxygen_absorption() function models O₂ absorption features around 0.76 μm (A-band), 1.27 μm, and 1.58 μm using Gaussian profiles. Similarly, methane_absorption() models CH₄ absorption at 1.7, 2.3, and 3.3 μm, which are characteristic wavelengths for methane in planetary atmospheres.

Atmospheric Model

The atmospheric_model() function implements the core physics:

$$\delta(\lambda) = \delta_0 + A_{O_2} \cdot \sigma_{O_2}(\lambda) + A_{CH_4} \cdot \sigma_{CH_4}(\lambda)$$

This linear model assumes that the total transit depth is the sum of a baseline depth plus the contributions from each molecular species weighted by their abundances.

Optimization Strategy

We implement two optimization approaches:

  1. L-BFGS-B: A quasi-Newton method that’s efficient for smooth, well-behaved objective functions. It uses gradient information and bounded constraints.

  2. Differential Evolution: A global optimization algorithm that’s more robust to local minima. It uses a population-based stochastic approach, making it ideal for complex parameter spaces.

The chi-squared merit function quantifies the goodness of fit:

$$\chi^2 = \sum_{j=1}^{N} \frac{(D_j - M_j(\theta))^2}{\sigma_j^2}$$

where $\theta = (\delta_0, A_{O_2}, A_{CH_4})$ are the parameters we’re optimizing.

Visualization Components

The code generates nine comprehensive plots:

  1. Molecular Absorption Profiles: Shows the unique spectroscopic signatures of O₂ and CH₄
  2. Transmission Spectrum Fitting: Compares observed data with the optimized model
  3. Fit Residuals: Displays the difference between observations and model, with uncertainty bands
  4. Parameter Comparison: Bar chart showing how well each method recovered the true parameters
  5. 2D χ² Landscape: Contour plot showing the error surface in O₂-CH₄ space
  6. 3D χ² Surface: Three-dimensional visualization of the optimization landscape
  7. Molecular Contributions: Stacked plot showing how each species contributes to the signal
  8. χ² Profile: One-dimensional slice through parameter space
  9. Residual Distribution: Histogram checking if residuals are normally distributed

Results

============================================================
EXOPLANET ATMOSPHERIC PARAMETER ESTIMATION
============================================================

True Parameters:
  Baseline transit depth: 0.0200
  O2 abundance: 0.1500
  CH4 abundance: 0.0800

Initial Guess:
  Baseline: 0.0250
  O2 abundance: 0.1000
  CH4 abundance: 0.1000

Optimizing with L-BFGS-B method...

Optimization Results (L-BFGS-B):
  Success: True
  Optimized baseline: 0.0163 (error: 18.68%)
  Optimized O2: 0.1498 (error: 0.16%)
  Optimized CH4: 0.0789 (error: 1.34%)
  Final chi-squared: 81.45

Optimizing with Differential Evolution (global optimizer)...

Optimization Results (Differential Evolution):
  Success: True
  Optimized baseline: 0.0163 (error: 18.68%)
  Optimized O2: 0.1498 (error: 0.16%)
  Optimized CH4: 0.0789 (error: 1.34%)
  Final chi-squared: 81.45

============================================================
ANALYSIS COMPLETE
============================================================

Interpretation

The optimization successfully recovers the atmospheric composition from noisy observational data. The 3D chi-squared surface reveals a well-defined minimum near the true parameter values, indicating that the problem is well-constrained. The residual distribution centered near zero with unit standard deviation confirms that our model adequately explains the observations.

Both optimization methods converge to similar solutions, with Differential Evolution providing slightly better global convergence guarantees. The molecular contribution analysis shows that methane dominates the signal at longer wavelengths (>2 μm), while oxygen provides the primary signal in the near-infrared (<1.5 μm).

This approach demonstrates how modern exoplanet science combines spectroscopic observations with sophisticated parameter estimation techniques to characterize distant worlds. The same methodology is used by missions like JWST and future observatories to search for biosignature gases in exoplanet atmospheres.