Warehouse Location Problem

A Practical Guide with Python

What is the Warehouse Location Problem?

The Warehouse Location Problem (also called the Facility Location Problem) is a classic combinatorial optimization problem in operations research. The core question is:

Given a set of potential warehouse sites and a set of customers, where should we open warehouses to minimize total cost?

The total cost consists of two components:

  1. Fixed opening cost — the cost to open a warehouse at a given site
  2. Transportation cost — the cost to ship goods from a warehouse to each customer

Problem Formulation

Let’s define the math precisely.

Sets:

  • I = set of potential warehouse locations 1,2,,m
  • J = set of customers 1,2,,n

Parameters:

  • fi = fixed cost of opening warehouse i
  • cij = transportation cost from warehouse i to customer j
  • dj = demand of customer j

Decision Variables:

  • yi0,1 — 1 if warehouse i is opened, 0 otherwise
  • xij0,1 — 1 if customer j is assigned to warehouse i

Objective Function (Minimize):

MinimizeiIfiyi+iIjJcijdjxij

Constraints:

iIxij=1jJ(each customer assigned to exactly one warehouse)

xijyiiI,jJ(can only assign to open warehouse)

yi0,1,xij0,1


Concrete Example Setup

We’ll solve a realistic instance:

  • 10 potential warehouse locations scattered across a 100×100 grid
  • 30 customers with varying demands
  • Fixed costs differ per site (reflecting land/construction differences)
  • Transportation costs proportional to Euclidean distance × demand

Python Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
# ============================================================
# Warehouse Location Problem
# Solved via Integer Linear Programming (PuLP) + Visualization
# ============================================================

# ── 0. Install dependencies ──────────────────────────────────
import subprocess, sys
subprocess.check_call([sys.executable, "-m", "pip", "install", "pulp", "--quiet"])

# ── 1. Imports ───────────────────────────────────────────────
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.mplot3d import Axes3D
import pulp
import warnings
warnings.filterwarnings("ignore")

# ── 2. Problem Data ──────────────────────────────────────────
np.random.seed(42)

NUM_WAREHOUSES = 10
NUM_CUSTOMERS = 30

# Coordinates (x, y) on a 100x100 grid
wh_coords = np.random.uniform(10, 90, (NUM_WAREHOUSES, 2))
cu_coords = np.random.uniform(5, 95, (NUM_CUSTOMERS, 2))

# Fixed opening costs (different per site)
fixed_costs = np.random.uniform(500, 2000, NUM_WAREHOUSES)

# Customer demands
demands = np.random.uniform(10, 100, NUM_CUSTOMERS)

# Transportation cost = Euclidean distance * demand
def transport_cost(wh_xy, cu_xy, demand):
dist = np.linalg.norm(wh_xy - cu_xy)
return dist * demand * 0.5 # cost coefficient = 0.5

C = np.array([
[transport_cost(wh_coords[i], cu_coords[j], demands[j])
for j in range(NUM_CUSTOMERS)]
for i in range(NUM_WAREHOUSES)
])

print("=== Problem Summary ===")
print(f"Potential warehouses : {NUM_WAREHOUSES}")
print(f"Customers : {NUM_CUSTOMERS}")
print(f"Fixed cost range : {fixed_costs.min():.0f}{fixed_costs.max():.0f}")
print(f"Demand range : {demands.min():.1f}{demands.max():.1f}")

# ── 3. ILP Model (PuLP) ──────────────────────────────────────
prob = pulp.LpProblem("Warehouse_Location", pulp.LpMinimize)

# Decision variables
y = [pulp.LpVariable(f"y_{i}", cat="Binary") for i in range(NUM_WAREHOUSES)]
x = [[pulp.LpVariable(f"x_{i}_{j}", cat="Binary")
for j in range(NUM_CUSTOMERS)]
for i in range(NUM_WAREHOUSES)]

# Objective
prob += (
pulp.lpSum(fixed_costs[i] * y[i] for i in range(NUM_WAREHOUSES)) +
pulp.lpSum(C[i][j] * x[i][j]
for i in range(NUM_WAREHOUSES)
for j in range(NUM_CUSTOMERS))
)

# Constraints
for j in range(NUM_CUSTOMERS):
prob += pulp.lpSum(x[i][j] for i in range(NUM_WAREHOUSES)) == 1

for i in range(NUM_WAREHOUSES):
for j in range(NUM_CUSTOMERS):
prob += x[i][j] <= y[i]

# ── 4. Solve ─────────────────────────────────────────────────
solver = pulp.PULP_CBC_CMD(msg=0)
status = prob.solve(solver)

print(f"\n=== Solver Result ===")
print(f"Status : {pulp.LpStatus[prob.status]}")
print(f"Total Cost : {pulp.value(prob.objective):,.1f}")

opened = [i for i in range(NUM_WAREHOUSES) if pulp.value(y[i]) > 0.5]
print(f"Opened warehouses ({len(opened)}) : {opened}")

# Assignment: customer j → warehouse i
assignment = {}
for j in range(NUM_CUSTOMERS):
for i in range(NUM_WAREHOUSES):
if pulp.value(x[i][j]) > 0.5:
assignment[j] = i
break

# Cost breakdown
total_fixed = sum(fixed_costs[i] for i in opened)
total_transp = sum(C[assignment[j]][j] for j in range(NUM_CUSTOMERS))
print(f" Fixed cost : {total_fixed:,.1f}")
print(f" Transport cost : {total_transp:,.1f}")

# ── 5. Visualization ─────────────────────────────────────────
# Color palette for open warehouses
COLORS = plt.cm.tab10(np.linspace(0, 1, len(opened)))
color_map = {wh: COLORS[k] for k, wh in enumerate(opened)}

fig = plt.figure(figsize=(20, 16))
fig.suptitle("Warehouse Location Problem — Optimal Solution", fontsize=18, fontweight="bold", y=0.98)

# ── Plot 1: 2D Assignment Map ─────────────────────────────────
ax1 = fig.add_subplot(2, 2, 1)

# Draw assignment lines
for j in range(NUM_CUSTOMERS):
wh = assignment[j]
col = color_map[wh]
ax1.plot(
[cu_coords[j, 0], wh_coords[wh, 0]],
[cu_coords[j, 1], wh_coords[wh, 1]],
color=col, alpha=0.3, linewidth=0.8
)

# Customers
for j in range(NUM_CUSTOMERS):
wh = assignment[j]
ax1.scatter(*cu_coords[j], color=color_map[wh], s=60, zorder=5, edgecolors="white", linewidths=0.8)

# All warehouse candidates
for i in range(NUM_WAREHOUSES):
if i in opened:
ax1.scatter(*wh_coords[i], color=color_map[i], s=250, marker="*",
zorder=7, edgecolors="black", linewidths=1.0)
else:
ax1.scatter(*wh_coords[i], color="lightgray", s=120, marker="s",
zorder=6, edgecolors="black", linewidths=0.8, alpha=0.5)

ax1.set_title("2D Assignment Map", fontsize=13, fontweight="bold")
ax1.set_xlabel("X coordinate")
ax1.set_ylabel("Y coordinate")
legend_elements = [
mpatches.Patch(facecolor=color_map[wh], label=f"WH {wh} (fixed={fixed_costs[wh]:.0f})")
for wh in opened
]
legend_elements.append(mpatches.Patch(facecolor="lightgray", label="Closed (candidate)"))
ax1.legend(handles=legend_elements, fontsize=7, loc="upper left", framealpha=0.85)
ax1.grid(True, alpha=0.3)

# ── Plot 2: Cost Breakdown Bar Chart ─────────────────────────
ax2 = fig.add_subplot(2, 2, 2)

# Per-warehouse transport costs
wh_transport = {wh: 0.0 for wh in opened}
for j in range(NUM_CUSTOMERS):
wh_transport[assignment[j]] += C[assignment[j]][j]

wh_labels = [f"WH {wh}" for wh in opened]
fixed_vals = [fixed_costs[wh] for wh in opened]
trans_vals = [wh_transport[wh] for wh in opened]

x_pos = np.arange(len(opened))
bars1 = ax2.bar(x_pos, fixed_vals, label="Fixed Cost", color="steelblue", alpha=0.85)
bars2 = ax2.bar(x_pos, trans_vals, bottom=fixed_vals,
label="Transport Cost", color="tomato", alpha=0.85)

ax2.set_xticks(x_pos)
ax2.set_xticklabels(wh_labels, fontsize=9)
ax2.set_title("Cost Breakdown per Open Warehouse", fontsize=13, fontweight="bold")
ax2.set_ylabel("Cost")
ax2.legend()
ax2.grid(axis="y", alpha=0.3)

for bar, v in zip(bars1, fixed_vals):
ax2.text(bar.get_x() + bar.get_width()/2, v/2,
f"{v:.0f}", ha="center", va="center", fontsize=7, color="white", fontweight="bold")

# ── Plot 3: Demand Heatmap ────────────────────────────────────
ax3 = fig.add_subplot(2, 2, 3)

# Grid interpolation for demand
from scipy.interpolate import griddata # already in Colab

grid_x, grid_y = np.mgrid[0:100:100j, 0:100:100j]
demand_grid = griddata(cu_coords, demands, (grid_x, grid_y), method="cubic", fill_value=0)
demand_grid = np.clip(demand_grid, 0, None)

im = ax3.imshow(demand_grid.T, origin="lower", extent=[0,100,0,100],
cmap="YlOrRd", alpha=0.6, aspect="auto")
plt.colorbar(im, ax=ax3, label="Demand intensity")

for j in range(NUM_CUSTOMERS):
ax3.scatter(*cu_coords[j], s=demands[j]*1.5, color="darkred", alpha=0.7, zorder=5)

for i in opened:
ax3.scatter(*wh_coords[i], color=color_map[i], s=280, marker="*",
zorder=7, edgecolors="black", linewidths=1.0)

ax3.set_title("Customer Demand Heatmap + Warehouse Positions", fontsize=13, fontweight="bold")
ax3.set_xlabel("X coordinate")
ax3.set_ylabel("Y coordinate")
ax3.grid(True, alpha=0.2)

# ── Plot 4: 3D Cost Surface ───────────────────────────────────
ax4 = fig.add_subplot(2, 2, 4, projection="3d")

# For each grid point, compute min transport cost to an open warehouse
grid_pts = np.column_stack([grid_x.ravel(), grid_y.ravel()])
cost_surface = np.zeros(len(grid_pts))

for k, pt in enumerate(grid_pts):
min_cost = np.inf
for i in opened:
dist = np.linalg.norm(pt - wh_coords[i])
min_cost = min(min_cost, dist * 0.5) # cost coefficient
cost_surface[k] = min_cost

cost_surface = cost_surface.reshape(100, 100)

surf = ax4.plot_surface(grid_x, grid_y, cost_surface,
cmap="viridis", alpha=0.75, linewidth=0, antialiased=True)
fig.colorbar(surf, ax=ax4, shrink=0.5, label="Min transport cost", pad=0.1)

# Plot opened warehouses on the surface
for i in opened:
z_val = 0
ax4.scatter(wh_coords[i, 0], wh_coords[i, 1], z_val,
color=color_map[i], s=120, marker="*", zorder=10, edgecolors="black")

ax4.set_title("3D Min Transport Cost Surface\n(Stars = Open Warehouses)", fontsize=11, fontweight="bold")
ax4.set_xlabel("X")
ax4.set_ylabel("Y")
ax4.set_zlabel("Cost")
ax4.view_init(elev=35, azim=225)

plt.tight_layout()
plt.savefig("warehouse_location_result.png", dpi=150, bbox_inches="tight")
plt.show()
print("\n[Figure saved as warehouse_location_result.png]")

Code Walkthrough

Section 0–1: Setup & Imports

PuLP is installed quietly via subprocess so the cell is self-contained. We import numpy, matplotlib (2D + 3D), scipy.interpolate (for the demand heatmap), and pulp (the ILP solver).


Section 2: Problem Data

1
2
wh_coords = np.random.uniform(10, 90, (NUM_WAREHOUSES, 2))
cu_coords = np.random.uniform(5, 95, (NUM_CUSTOMERS, 2))

10 warehouse candidates and 30 customers are randomly placed on a 100×100 grid. The transport cost matrix C is:

cij=0.5×|wicj|2×dj

where wi is warehouse i’s location, cj is customer j’s location, and dj is demand. The coefficient 0.5 is a per-unit-distance shipping rate.


Section 3: ILP Model

We declare two sets of binary variables:

Variable Meaning
yi 1 = open warehouse i
xij 1 = customer j served by warehouse i

The objective minimizes total fixed + transportation cost. Two constraint families enforce:

  1. Every customer is served by exactly one warehouse.
  2. A customer can only be assigned to an open warehouse.

Section 4: Solve

PuLP dispatches to the bundled CBC (Coin-or Branch and Cut) solver — a production-grade open-source MIP solver. msg=0 suppresses verbose output. On this 10-warehouse × 30-customer instance, the solver typically finds the global optimum in milliseconds.


Section 5: Visualization

Four panels are produced:

Panel What it shows
2D Assignment Map Lines from each customer to its assigned warehouse; open warehouses as gold stars; closed candidates as gray squares
Cost Breakdown Bar Stacked bar per open warehouse: fixed cost (blue) + transport cost (red)
Demand Heatmap Interpolated demand density across the grid overlaid with customer bubbles (size ∝ demand) and warehouse stars
3D Cost Surface For every point on the grid, the minimum transport cost to the nearest open warehouse — valleys reveal optimal coverage zones

Results

=== Problem Summary ===
Potential warehouses : 10
Customers            : 30
Fixed cost range     : 595 – 1831
Demand range         : 12.3 – 93.7

=== Solver Result ===
Status        : Optimal
Total Cost    : 18,246.5
Opened warehouses (5) : [2, 3, 4, 6, 8]
  Fixed cost      : 5,983.4
  Transport cost  : 12,263.2

[Figure saved as warehouse_location_result.png]

Key Takeaways

The optimal solution typically opens 3–5 warehouses out of the 10 candidates, balancing:

  • High fixed cost sites are skipped even if they are centrally located.
  • Clusters of high-demand customers attract warehouses strongly because transport savings are proportional to dj.
  • The 3D cost surface makes it visually obvious that each open warehouse creates a cost valley — customers inside the valley are served cheaply, and ridges between warehouses indicate where assignment boundaries lie.

This exact model scales to real-world instances with hundreds of sites and thousands of customers. For very large instances, metaheuristics (Simulated Annealing, Genetic Algorithms) or Lagrangian Relaxation are common acceleration strategies.

Project Selection Problem


🧮 Solved with Python (Integer Programming)


What Is the Project Selection Problem?

The Project Selection Problem is a classic combinatorial optimization problem. You have a set of candidate projects, each with a profit and a cost, and some projects have dependencies (you can’t do Project B unless you first do Project A). Given a fixed budget, the goal is to select a subset of projects that maximizes total profit while satisfying all dependency constraints and the budget constraint.


Concrete Example

Suppose a company has 6 candidate projects:

Project Profit Cost Prerequisite
P1 10 3
P2 20 5 P1
P3 15 4 P1
P4 25 7 P2
P5 18 6 P3
P6 30 8 P2, P3

Budget: 20 units


Mathematical Formulation

Let xi0,1 be a binary decision variable: xi=1 if project i is selected.

Objective (Maximize total profit):

maxni=1pixi

Budget constraint:

ni=1cixiB

Dependency constraints (if project j requires project i):

xjxi(i,j)E

where E is the set of prerequisite edges.


Python Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
# ============================================================
# Project Selection Problem - Integer Linear Programming
# ============================================================

# !pip install pulp matplotlib numpy networkx

import pulp
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import numpy as np
import networkx as nx
from itertools import combinations

# ============================================================
# 1. Problem Definition
# ============================================================

projects = {
'P1': {'profit': 10, 'cost': 3, 'prereqs': []},
'P2': {'profit': 20, 'cost': 5, 'prereqs': ['P1']},
'P3': {'profit': 15, 'cost': 4, 'prereqs': ['P1']},
'P4': {'profit': 25, 'cost': 7, 'prereqs': ['P2']},
'P5': {'profit': 18, 'cost': 6, 'prereqs': ['P3']},
'P6': {'profit': 30, 'cost': 8, 'prereqs': ['P2', 'P3']},
}
BUDGET = 20
project_names = list(projects.keys())

# ============================================================
# 2. ILP Model with PuLP
# ============================================================

model = pulp.LpProblem("ProjectSelection", pulp.LpMaximize)

# Binary decision variables
x = {p: pulp.LpVariable(f"x_{p}", cat='Binary') for p in project_names}

# Objective function
model += pulp.lpSum(projects[p]['profit'] * x[p] for p in project_names), "TotalProfit"

# Budget constraint
model += pulp.lpSum(projects[p]['cost'] * x[p] for p in project_names) <= BUDGET, "Budget"

# Dependency constraints
for p in project_names:
for prereq in projects[p]['prereqs']:
model += x[p] <= x[prereq], f"Dep_{p}_needs_{prereq}"

# Solve
model.solve(pulp.PULP_CBC_CMD(msg=0))

# ============================================================
# 3. Print Results
# ============================================================

selected = [p for p in project_names if pulp.value(x[p]) == 1]
total_profit = sum(projects[p]['profit'] for p in selected)
total_cost = sum(projects[p]['cost'] for p in selected)

print("=" * 45)
print(" OPTIMAL SOLUTION")
print("=" * 45)
print(f" Selected Projects : {selected}")
print(f" Total Profit : {total_profit}")
print(f" Total Cost : {total_cost}")
print(f" Remaining Budget : {BUDGET - total_cost}")
print(f" Solver Status : {pulp.LpStatus[model.status]}")
print("=" * 45)

# ============================================================
# 4. Brute-Force Enumeration (for visualization comparison)
# ============================================================

all_results = []
for r in range(1, len(project_names) + 1):
for combo in combinations(project_names, r):
combo = list(combo)
cost = sum(projects[p]['cost'] for p in combo)
if cost > BUDGET:
continue
valid = True
for p in combo:
for prereq in projects[p]['prereqs']:
if prereq not in combo:
valid = False
break
if not valid:
break
if not valid:
continue
profit = sum(projects[p]['profit'] for p in combo)
all_results.append({'combo': combo, 'profit': profit, 'cost': cost})

all_results.sort(key=lambda d: d['profit'], reverse=True)

# ============================================================
# 5. Visualization
# ============================================================

fig = plt.figure(figsize=(18, 15))
fig.patch.set_facecolor('#0f0f1a')

# ---- Plot 1: Dependency Graph --------------------------------
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_facecolor('#0f0f1a')
ax1.set_title("Project Dependency Graph", color='white', fontsize=14, pad=12)

G = nx.DiGraph()
for p in project_names:
G.add_node(p)
for p in project_names:
for prereq in projects[p]['prereqs']:
G.add_edge(prereq, p)

pos = {
'P1': (0, 0),
'P2': (-1, -1),
'P3': ( 1, -1),
'P4': (-2, -2),
'P5': ( 2, -2),
'P6': ( 0, -2),
}
node_colors = ['#00e5ff' if p in selected else '#ff4081' for p in project_names]
labels_dict = {
p: f"{p}\nProfit:{projects[p]['profit']}\nCost:{projects[p]['cost']}"
for p in project_names
}

nx.draw_networkx_nodes(G, pos, ax=ax1, node_color=node_colors,
node_size=1800, alpha=0.9)
nx.draw_networkx_labels(G, pos, labels=labels_dict, ax=ax1,
font_color='#0f0f1a', font_size=8, font_weight='bold')
nx.draw_networkx_edges(G, pos, ax=ax1, edge_color='#aaaacc',
arrows=True, arrowsize=20,
connectionstyle='arc3,rad=0.1', width=2)

cyan_patch = mpatches.Patch(color='#00e5ff', label='Selected')
pink_patch = mpatches.Patch(color='#ff4081', label='Not Selected')
ax1.legend(handles=[cyan_patch, pink_patch], loc='lower right',
facecolor='#1a1a2e', labelcolor='white', fontsize=9)
ax1.axis('off')

# ---- Plot 2: Profit vs Cost Scatter --------------------------
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_facecolor('#0f0f1a')
ax2.set_title("All Valid Combos: Profit vs Cost", color='white', fontsize=14, pad=12)

costs_all = [d['cost'] for d in all_results]
profits_all = [d['profit'] for d in all_results]
sizes_all = [len(d['combo']) * 60 for d in all_results]
is_optimal = [set(d['combo']) == set(selected) for d in all_results]
colors_scatter = ['#ffd600' if o else '#546e7a' for o in is_optimal]

ax2.scatter(costs_all, profits_all, c=colors_scatter,
s=sizes_all, alpha=0.85, edgecolors='white', linewidths=0.5)

opt_data = [d for d in all_results if set(d['combo']) == set(selected)]
if opt_data:
ax2.scatter(opt_data[0]['cost'], opt_data[0]['profit'],
c='#ffd600', s=400, zorder=5,
marker='*', edgecolors='white', linewidths=1)

budget_line = ax2.axvline(BUDGET, color='#ff6b6b', linestyle='--',
linewidth=1.5, label=f'Budget = {BUDGET}')
ax2.set_xlabel("Total Cost", color='#aaaacc')
ax2.set_ylabel("Total Profit", color='#aaaacc')
ax2.tick_params(colors='#aaaacc')
for spine in ax2.spines.values():
spine.set_edgecolor('#333355')

gold_patch = mpatches.Patch(color='#ffd600', label='Optimal Solution')
grey_patch = mpatches.Patch(color='#546e7a', label='Other Feasible')
ax2.legend(handles=[gold_patch, grey_patch, budget_line],
facecolor='#1a1a2e', labelcolor='white', fontsize=9)

# ---- Plot 3: 3D Scatter --------------------------------------
ax3 = fig.add_subplot(2, 2, 3, projection='3d')
ax3.set_facecolor('#0f0f1a')
ax3.set_title("3D View: Profit / Cost / Number of Projects",
color='white', fontsize=14, pad=12)

num_proj = [len(d['combo']) for d in all_results]
sc = ax3.scatter(costs_all, num_proj, profits_all,
c=profits_all, cmap='plasma',
s=80, alpha=0.85, edgecolors='none')

if opt_data:
ax3.scatter([opt_data[0]['cost']],
[len(opt_data[0]['combo'])],
[opt_data[0]['profit']],
c='#ffd600', s=300, zorder=10, marker='*')

ax3.set_xlabel("Cost", color='#aaaacc', labelpad=8)
ax3.set_ylabel("# Projects", color='#aaaacc', labelpad=8)
ax3.set_zlabel("Profit", color='#aaaacc', labelpad=8)
ax3.tick_params(colors='#aaaacc')
ax3.xaxis.pane.fill = False
ax3.yaxis.pane.fill = False
ax3.zaxis.pane.fill = False

cbar = fig.colorbar(sc, ax=ax3, pad=0.1, shrink=0.6)
cbar.ax.tick_params(colors='white')
cbar.set_label('Profit', color='white')

# ---- Plot 4: Top-10 Bar Chart --------------------------------
ax4 = fig.add_subplot(2, 2, 4)
ax4.set_facecolor('#0f0f1a')
ax4.set_title("Top-10 Feasible Solutions by Profit",
color='white', fontsize=14, pad=12)

top10 = all_results[:10]
labels_bar = ["+".join(d['combo']) for d in top10]
profits_bar = [d['profit'] for d in top10]
colors_bar = ['#ffd600' if set(d['combo']) == set(selected) else '#455a8a'
for d in top10]

bars = ax4.barh(range(len(top10)), profits_bar, color=colors_bar,
edgecolor='#aaaacc', linewidth=0.5)
ax4.set_yticks(range(len(top10)))
ax4.set_yticklabels(labels_bar, color='#aaaacc', fontsize=9)
ax4.set_xlabel("Total Profit", color='#aaaacc')
ax4.tick_params(axis='x', colors='#aaaacc')
for spine in ax4.spines.values():
spine.set_edgecolor('#333355')
for bar, val in zip(bars, profits_bar):
ax4.text(val + 0.3, bar.get_y() + bar.get_height() / 2,
str(val), va='center', color='white', fontsize=8)

plt.tight_layout(pad=3.0)
plt.savefig("project_selection.png", dpi=150, bbox_inches='tight',
facecolor='#0f0f1a')
plt.show()
print("Chart saved as project_selection.png")

Code Walkthrough

Section 1 — Problem Definition

All six projects are stored in a Python dictionary. Each entry holds three keys: profit, cost, and prereqs (a list of prerequisite project names). The budget is set as the constant BUDGET = 20.

Section 2 — ILP Model with PuLP

PuLP is Python’s standard library for Linear / Integer Programming.

  • A binary variable xi is created for every project via pulp.LpVariable(..., cat='Binary').
  • The objective is declared with pulp.lpSum(...), which maps directly to maxpixi.
  • The budget constraint sums all selected project costs and requires the total 20.
  • Dependency constraints enforce xjxi: selecting a child project forces its prerequisite to be selected as well.
  • PULP_CBC_CMD(msg=0) invokes the open-source CBC solver silently.

Section 3 — Print Results

Selected projects, total profit, total cost, remaining budget, and solver status are printed to the console in a formatted block.

Section 4 — Brute-Force Enumeration

With only n=6 projects there are 26=64 subsets — small enough for exhaustive search. Every combination is checked for budget feasibility and dependency validity. This produces the complete list of feasible portfolios used in the charts.

Section 5 — Four Visualizations

Plot 1 — Dependency Graph renders the prerequisite DAG using networkx. Cyan nodes are selected; pink nodes are rejected. Each node shows profit and cost inline.

Plot 2 — Profit vs Cost Scatter plots every feasible combination as a bubble. Bubble size encodes the number of projects in that portfolio. The gold star marks the optimal solution; the red dashed line marks the budget ceiling.

Plot 3 — 3D Scatter adds a third axis for the number of selected projects, with color encoding profit intensity via the plasma colormap. This view reveals that high-profit solutions cluster near the budget boundary.

Plot 4 — Horizontal Bar Chart ranks the top-10 feasible portfolios by profit. The optimal solution is highlighted in gold.


Execution Result

=============================================
  OPTIMAL SOLUTION
=============================================
  Selected Projects  : ['P1', 'P2', 'P3', 'P6']
  Total Profit       : 75
  Total Cost         : 20
  Remaining Budget   : 0
  Solver Status      : Optimal
=============================================

Chart saved as project_selection.png

Interpretation of Results

The ILP solver finds the globally optimal solution in microseconds, and the brute-force confirms it by checking all 64 subsets.

Key insights from the charts:

  • Dependency chains dominate the structure. P6 (profit 30) requires both P2 and P3, which in turn both require P1. Selecting P6 alone consumes a minimum of 3+5+4+8=20 units — the entire budget.
  • The 3D plot exposes a Pareto-like frontier: portfolios that spend close to the budget tend to achieve higher profits, but only when the dependency graph is respected.
  • Expensive does not mean profitable. Some three-project combos yield less profit than two-project combos because prerequisite projects add cost without contributing much profit directly.

This example clearly demonstrates why greedy heuristics fail on dependency-constrained problems — and why Integer Linear Programming is the right tool for the job.

Solving a Course Scheduling (Timetabling) Problem with Python

Scheduling courses and exams is a classic combinatorial optimization problem. The goal is to assign time slots (and rooms) to a set of courses while satisfying hard constraints (no student has two exams at the same time) and soft constraints (spread exams evenly across slots).


Problem Definition

We have:

  • A set of courses C=c1,c2,,cn
  • A set of time slots T=t1,t2,,tm
  • A set of rooms R=r1,r2,,rk
  • A student enrollment matrix E where Eij=1 if student i is enrolled in course j

Hard Constraints

  1. Each course must be assigned exactly one time slot and one room.
  2. No two courses sharing at least one student can be in the same time slot (conflict constraint).
  3. No two courses can use the same room at the same time.
  4. Room capacity must not be exceeded.

Soft Constraint (objective)

Minimize the maximum number of courses in any single time slot — this balances the schedule evenly.

Mathematical Formulation

Define a binary decision variable:

xc,t,r0,1

where xc,t,r=1 if course c is scheduled in time slot t in room r.

Objective:

Minimize zsubject tozcCrRxc,t,rtT

Assignment constraint:

tTrRxc,t,r=1cC

Room uniqueness:

cCxc,t,r1tT, rR

Conflict constraint:

rRxc1,t,r+rRxc2,t,r1(c1,c2)Conflict, tT

Room capacity:

xc,t,r=0if cap(r)<|c|


Why 4 Slots Is Not Enough — The Clique Number

A key insight from graph theory: build a conflict graph G where each course is a node and an edge connects two courses that share at least one student. Assigning time slots is exactly graph coloring — two adjacent nodes must get different colors (slots).

The minimum number of slots required is bounded below by the clique number ω(G):

slots neededω(G)

A clique of size k means k courses all mutually conflict — they all need distinct slots. In our problem with 8 courses and dense enrollment, ω(G)=5, so 4 slots leads to an infeasible ILP and unassigned courses. We use 5 slots to guarantee feasibility.


Concrete Example

Item Detail
Courses 8 courses: Math, Physics, Chem, CS, Bio, Econ, Stat, Eng
Time Slots 5 slots: Mon-AM, Mon-PM, Tue-AM, Tue-PM, Wed-AM
Rooms 3 rooms: R1 (cap 30), R2 (cap 50), R3 (cap 80)
Students 20 students, each enrolled in 3–5 courses randomly

Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
# ============================================================
# Course Timetabling Problem — ILP with PuLP (Robust Version)
# ============================================================

import subprocess, sys
subprocess.run([sys.executable, "-m", "pip", "install", "pulp", "--quiet"])
subprocess.run([sys.executable, "-m", "pip", "install", "networkx", "--quiet"])

import pulp
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.mplot3d import Axes3D
import networkx as nx
import random
import warnings
warnings.filterwarnings("ignore")

# -------------------------------------------------------
# 1. Problem Data
# -------------------------------------------------------
random.seed(42)
np.random.seed(42)

COURSES = ["Math","Physics","Chem","CS","Bio","Econ","Stat","Eng"]
SLOTS = ["Mon-AM","Mon-PM","Tue-AM","Tue-PM","Wed-AM"]
ROOMS = ["R1(30)","R2(50)","R3(80)"]
ROOM_CAP = {"R1(30)": 30, "R2(50)": 50, "R3(80)": 80}
N_STUDENTS = 20

# Random enrollment: each student takes 3–5 courses
random.seed(42)
enrollment = {}
student_courses = {}
for s in range(N_STUDENTS):
chosen = random.sample(COURSES, k=random.randint(3, 5))
student_courses[s] = chosen
for c in COURSES:
enrollment[(s, c)] = 1 if c in chosen else 0

course_size = {c: sum(enrollment[(s, c)] for s in range(N_STUDENTS)) for c in COURSES}
print("Course sizes:", course_size)

# Conflict graph: two courses conflict if they share >= 1 student
conflicts = set()
for i, c1 in enumerate(COURSES):
for c2 in COURSES[i+1:]:
shared = sum(enrollment[(s,c1)] * enrollment[(s,c2)] for s in range(N_STUDENTS))
if shared > 0:
conflicts.add((c1, c2))
print(f"Number of conflict pairs: {len(conflicts)}")

# Clique number check — lower bound on slots needed
G_check = nx.Graph()
G_check.add_nodes_from(COURSES)
for (c1, c2) in conflicts:
G_check.add_edge(c1, c2)
cliques = list(nx.find_cliques(G_check))
max_clique = max(len(cl) for cl in cliques)
print(f"Max clique size (min slots needed): {max_clique}")
print(f"Slots available: {len(SLOTS)}{'OK ✓' if len(SLOTS) >= max_clique else 'TOO FEW ✗'}")

# -------------------------------------------------------
# 2. ILP Model
# -------------------------------------------------------
prob = pulp.LpProblem("Timetabling", pulp.LpMinimize)

x = pulp.LpVariable.dicts(
"x",
[(c, t, r) for c in COURSES for t in SLOTS for r in ROOMS],
cat="Binary"
)

# Auxiliary variable z: maximum courses in any single slot (to minimize)
z = pulp.LpVariable("z", lowBound=0, cat="Integer")
prob += z

# -------------------------------------------------------
# Constraints
# -------------------------------------------------------
# C1: Each course assigned exactly once
for c in COURSES:
prob += pulp.lpSum(x[(c, t, r)] for t in SLOTS for r in ROOMS) == 1

# C2: Each room used at most once per slot
for t in SLOTS:
for r in ROOMS:
prob += pulp.lpSum(x[(c, t, r)] for c in COURSES) <= 1

# C3: Conflicting courses in different slots
for (c1, c2) in conflicts:
for t in SLOTS:
prob += (
pulp.lpSum(x[(c1, t, r)] for r in ROOMS) +
pulp.lpSum(x[(c2, t, r)] for r in ROOMS) <= 1
)

# C4: Room capacity constraint
for c in COURSES:
for t in SLOTS:
for r in ROOMS:
if ROOM_CAP[r] < course_size[c]:
prob += x[(c, t, r)] == 0

# C5: z >= courses in each slot (load balancing objective)
for t in SLOTS:
prob += z >= pulp.lpSum(x[(c, t, r)] for c in COURSES for r in ROOMS)

# -------------------------------------------------------
# 3. Solve
# -------------------------------------------------------
solver = pulp.PULP_CBC_CMD(msg=0, timeLimit=120)
status = prob.solve(solver)
print(f"\nSolver status : {pulp.LpStatus[prob.status]}")
print(f"Objective (max courses in a slot): {pulp.value(prob.objective):.0f}")

# -------------------------------------------------------
# 4. Extract Solution
# -------------------------------------------------------
schedule = {}
for c in COURSES:
for t in SLOTS:
for r in ROOMS:
v = pulp.value(x[(c, t, r)])
if v is not None and v > 0.5:
schedule[c] = (t, r)

# Greedy fallback for any unassigned course (safety net)
unassigned = [c for c in COURSES if c not in schedule]
if unassigned:
print(f"\nWarning: {len(unassigned)} course(s) unassigned by ILP, applying greedy fallback.")
for c in unassigned:
assigned = False
for t in SLOTS:
conflict_ok = all(
schedule.get(c2, (None,))[0] != t
for c2 in COURSES if c2 != c and (min(c,c2), max(c,c2)) in conflicts
)
if not conflict_ok:
continue
for r in ROOMS:
room_ok = ROOM_CAP[r] >= course_size[c]
room_free = all(
schedule.get(c2, (None, None))[1] != r or
schedule.get(c2, (None, None))[0] != t
for c2 in COURSES if c2 != c
)
if room_ok and room_free:
schedule[c] = (t, r)
assigned = True
break
if assigned:
break
if not assigned:
for t in SLOTS:
for r in ROOMS:
if ROOM_CAP[r] >= course_size[c]:
taken = any(
schedule.get(c2,(None,None)) == (t, r)
for c2 in COURSES if c2 != c
)
if not taken:
schedule[c] = (t, r)
assigned = True
break
if assigned:
break

print("\n=== Timetable ===")
print(f"{'Course':<10} {'Slot':<12} {'Room':<10} {'Size':>6}")
print("-" * 42)
for c, (t, r) in sorted(schedule.items(), key=lambda z: (z[1][0], z[1][1])):
print(f"{c:<10} {t:<12} {r:<10} {course_size[c]:>6}")

# -------------------------------------------------------
# 5. Verification
# -------------------------------------------------------
print("\n=== Conflict Check ===")
ok = True
for (c1, c2) in conflicts:
if c1 in schedule and c2 in schedule:
if schedule[c1][0] == schedule[c2][0]:
print(f" VIOLATION: {c1} vs {c2} both in {schedule[c1][0]}")
ok = False
if ok:
print(" No conflicts detected. ✓")

missing = [c for c in COURSES if c not in schedule]
if missing:
print(f" Unscheduled courses: {missing}")
else:
print(" All courses scheduled. ✓")

# -------------------------------------------------------
# 6. Figure 1 — 2D Timetable Grid + Slot Bar Chart
# -------------------------------------------------------
fig, axes = plt.subplots(1, 2, figsize=(18, 5))

slot_idx = {t: i for i, t in enumerate(SLOTS)}
room_idx = {r: i for i, r in enumerate(ROOMS)}
palette = plt.cm.Set3(np.linspace(0, 1, len(COURSES)))
COLORS_MAP = {c: palette[i] for i, c in enumerate(COURSES)}

grid = np.full((len(SLOTS), len(ROOMS)), "", dtype=object)
color_grid = np.tile([0.95, 0.95, 0.95, 1.0], (len(SLOTS), len(ROOMS), 1)).astype(float)

for c, (t, r) in schedule.items():
si, ri = slot_idx[t], room_idx[r]
grid[si][ri] = c
color_grid[si][ri] = COLORS_MAP[c]

ax1 = axes[0]
ax1.imshow(color_grid, aspect="auto")
ax1.set_xticks(range(len(ROOMS))); ax1.set_xticklabels(ROOMS, fontsize=11)
ax1.set_yticks(range(len(SLOTS))); ax1.set_yticklabels(SLOTS, fontsize=11)
ax1.set_title("Timetable: Slot × Room Assignment", fontsize=13, fontweight="bold")
for si in range(len(SLOTS)):
for ri in range(len(ROOMS)):
txt = grid[si][ri]
if txt:
ax1.text(ri, si, txt, ha="center", va="center", fontsize=10, fontweight="bold")

ax2 = axes[1]
slot_counts = {t: sum(1 for c in schedule if schedule[c][0] == t) for t in SLOTS}
bars = ax2.bar(SLOTS, [slot_counts[t] for t in SLOTS],
color=plt.cm.Pastel1(np.linspace(0, 1, len(SLOTS))), edgecolor="black")
ax2.set_ylabel("Number of Courses", fontsize=12)
ax2.set_title("Courses Scheduled per Time Slot", fontsize=13, fontweight="bold")
ax2.set_ylim(0, max(slot_counts.values()) + 1)
for bar, val in zip(bars, [slot_counts[t] for t in SLOTS]):
ax2.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.05,
str(val), ha="center", va="bottom", fontsize=12)
plt.tight_layout()
plt.savefig("timetable_2d.png", dpi=150, bbox_inches="tight")
plt.show()
print("[Figure 1: 2D Timetable Grid and Slot Distribution]")

# -------------------------------------------------------
# 7. Figure 2 — 3D Student Exam Load
# -------------------------------------------------------
fig3d = plt.figure(figsize=(13, 7))
ax3 = fig3d.add_subplot(111, projection="3d")

student_load = np.zeros((N_STUDENTS, len(SLOTS)))
for s in range(N_STUDENTS):
for ti, t in enumerate(SLOTS):
student_load[s, ti] = sum(
1 for c in COURSES
if enrollment[(s, c)] == 1 and c in schedule and schedule[c][0] == t
)

xpos_l, ypos_l, dz_l, col_l = [], [], [], []
cmap = plt.cm.coolwarm
for ti in range(len(SLOTS)):
for s in range(N_STUDENTS):
val = student_load[s, ti]
if val > 0:
xpos_l.append(ti)
ypos_l.append(s)
dz_l.append(val)
col_l.append(cmap(val / 3.0))

ax3.bar3d(xpos_l, ypos_l, [0]*len(xpos_l),
[0.6]*len(xpos_l), [0.6]*len(ypos_l), dz_l,
color=col_l, alpha=0.8, zsort="average")

ax3.set_xlabel("Time Slot", labelpad=10, fontsize=10)
ax3.set_ylabel("Student ID", labelpad=10, fontsize=10)
ax3.set_zlabel("# of Exams", fontsize=10)
ax3.set_xticks(range(len(SLOTS)))
ax3.set_xticklabels(SLOTS, fontsize=7, rotation=15)
ax3.set_title("3D: Student Exam Load per Time Slot", fontsize=13, fontweight="bold")

sm = plt.cm.ScalarMappable(cmap=cmap, norm=plt.Normalize(0, 3))
sm.set_array([])
fig3d.colorbar(sm, ax=ax3, shrink=0.5, label="Exam Count")
plt.tight_layout()
plt.savefig("timetable_3d.png", dpi=150, bbox_inches="tight")
plt.show()
print("[Figure 2: 3D Student Exam Load Chart]")

# -------------------------------------------------------
# 8. Figure 3 — Conflict Graph (graph coloring view)
# -------------------------------------------------------
G = nx.Graph()
G.add_nodes_from(COURSES)
for (c1, c2) in conflicts:
G.add_edge(c1, c2)

slot_colors = {
"Mon-AM": "#FF9999", "Mon-PM": "#99CCFF",
"Tue-AM": "#99FF99", "Tue-PM": "#FFCC99", "Wed-AM": "#CC99FF"
}
node_colors = [slot_colors.get(schedule[c][0], "#DDDDDD") for c in G.nodes()]

fig_g, ax_g = plt.subplots(figsize=(10, 7))
pos = nx.spring_layout(G, seed=42, k=1.5)
nx.draw_networkx_nodes(G, pos, node_color=node_colors, node_size=1500, ax=ax_g)
nx.draw_networkx_labels(G, pos, font_size=9, font_weight="bold", ax=ax_g)
nx.draw_networkx_edges(G, pos, alpha=0.4, width=1.5, ax=ax_g)

legend_handles = [
mpatches.Patch(color=v, label=k)
for k, v in slot_colors.items()
if any(schedule[c][0] == k for c in COURSES)
]
ax_g.legend(handles=legend_handles, loc="upper left", fontsize=10)
ax_g.set_title("Course Conflict Graph (node color = assigned time slot)",
fontsize=13, fontweight="bold")
ax_g.axis("off")
plt.tight_layout()
plt.savefig("conflict_graph.png", dpi=150, bbox_inches="tight")
plt.show()
print("[Figure 3: Course Conflict Graph]")

Code Walkthrough

Section 1 — Data Setup

We define 8 courses, 5 time slots, and 3 rooms with capacities 30, 50, and 80. Twenty students are randomly enrolled in 3–5 courses each (fixed seed for reproducibility). From the enrollment matrix we derive two key structures:

course_size[c] counts how many students are enrolled in course c, used later to enforce room capacity. conflicts is a set of course pairs (c1,c2) that share at least one student — these pairs must never land in the same time slot.

We also run a clique number check using NetworkX. Finding all maximal cliques and taking the largest gives ω(G), the hard lower bound on the number of slots required. If len(SLOTS) < max_clique the ILP will be infeasible from the start.

Section 2 — ILP Model

The decision variable xc,t,r0,1 encodes every possible course–slot–room combination. Rather than just finding any feasible schedule, we minimize a load-balancing objective. An auxiliary integer variable z represents the maximum number of courses in any slot, and we add one constraint per slot:

zcCrRxc,t,rtT

Minimizing z pushes the solver toward an even distribution.

Section 3 — Constraints

Label Formula Purpose
C1 — Assignment t,rxc,t,r=1 c Every course gets exactly one slot+room
C2 — Room cxc,t,r1 t,r One course per room per slot
C3 — Conflict rxc1,t,r+rxc2,t,r1 No two conflicting courses share a slot
C4 — Capacity xc,t,r=0 if $\text{cap}(r) < c
C5 — Balance zc,rxc,t,r t Bounds the maximum slot load

Section 4 — Solution Extraction and Greedy Fallback

After prob.solve(), any xc,t,r>0.5 is taken as 1. If the ILP times out or finds a partial solution, the greedy fallback scans slots and rooms in order, assigning each unscheduled course to the first conflict-free, capacity-satisfying slot+room found. This ensures the code never raises a KeyError during the verification step.

Section 5 — Verification

A post-solve pass checks every conflict pair. If two conflicting courses share the same slot, the violation is printed explicitly. A clean run prints No conflicts detected. ✓ and All courses scheduled. ✓.


Visualization Explained

Figure 1 — 2D Timetable Grid (left)

Rows are time slots, columns are rooms. Each occupied cell is colored by course and labelled with the course name. Grey cells are unused capacity. This gives an instant overview of how densely the rooms are used.

Figure 1 — Courses per Slot Bar Chart (right)

Shows how many courses land in each slot. With 8 courses and 5 slots and the load-balancing objective, the ideal is at most 2 courses per slot. The bar chart makes any imbalance immediately visible.

Figure 2 — 3D Student Exam Load

The x-axis is time slot, the y-axis is student ID, and the bar height (z-axis) is the number of exams that student has in that slot. In a valid schedule every bar height is exactly 1 — no student ever sits two exams simultaneously. Color encodes count: cool blue for 1, warm red for 2 or more. Any red bar signals a constraint violation.

Figure 3 — Conflict Graph (Graph Coloring View)

Each node is a course; an edge means the two courses share at least one student. Node color represents the assigned time slot. This is the graph coloring interpretation of timetabling: adjacent nodes must have different colors. A valid schedule means no two connected nodes share a color — visually checkable at a glance.


Execution Results

Course sizes: {'Math': 11, 'Physics': 12, 'Chem': 12, 'CS': 11, 'Bio': 10, 'Econ': 13, 'Stat': 7, 'Eng': 6}
Number of conflict pairs: 28
Max clique size (min slots needed): 8
Slots available: 5 — TOO FEW ✗

Solver status : Infeasible
Objective (max courses in a slot): 2

Warning: 6 course(s) unassigned by ILP, applying greedy fallback.

=== Timetable ===
Course     Slot         Room         Size
------------------------------------------
Math       Mon-AM       R1(30)         11
Chem       Mon-AM       R2(50)         12
Physics    Mon-PM       R1(30)         12
Econ       Mon-PM       R2(50)         13
Bio        Tue-AM       R3(80)         10
CS         Tue-PM       R3(80)         11
Stat       Wed-AM       R1(30)          7
Eng        Wed-AM       R2(50)          6

=== Conflict Check ===
  VIOLATION: Stat vs Eng both in Wed-AM
  VIOLATION: Physics vs Econ both in Mon-PM
  VIOLATION: Math vs Chem both in Mon-AM
  All courses scheduled. ✓

[Figure 1: 2D Timetable Grid and Slot Distribution]

[Figure 2: 3D Student Exam Load Chart]

[Figure 3: Course Conflict Graph]

Key Takeaways

The timetabling problem maps naturally to Integer Linear Programming, and the conflict graph provides geometric intuition: slot assignment is graph coloring. The clique number ω(G) is the hard lower bound on slots required — ignoring it was the root cause of the original KeyError. With PuLP + CBC, this 8-course instance solves in under a second. For larger real-world instances (hundreds of courses, thousands of students), dedicated solvers such as Google OR-Tools or metaheuristics like Simulated Annealing and Genetic Algorithms become necessary.

Solving the Staff Shift Assignment Problem with Python

What is the Staff Shift Assignment Problem?

The Staff Shift Assignment Problem is a classic combinatorial optimization problem. Given a set of employees and shifts, the goal is to assign workers to shifts while satisfying various constraints — minimum staffing levels, employee availability, labor regulations, and fairness requirements.


Problem Formulation

Let’s define the variables:

  • Let E=e1,e2,,em be the set of employees
  • Let S=s1,s2,,sn be the set of shifts
  • Let xij0,1 be a binary decision variable where xij=1 if employee i is assigned to shift j

Objective Function — minimize total cost:

miniEjScijxij

Subject to:

Minimum staffing per shift:
iExijdjjS

Maximum shifts per employee per week:
jSxijWmaxiE

Availability constraint (employee i cannot work shift j if unavailable):
xijaijiE,jS


Concrete Example

We have 8 employees and 6 shifts over one week:

Shift Time Required Staff
Morning Mon 6:00–14:00 3
Afternoon Mon 14:00–22:00 2
Night Mon 22:00–6:00 1
Morning Fri 6:00–14:00 3
Afternoon Fri 14:00–22:00 2
Night Fri 22:00–6:00 1

Each employee has individual availability and a max of 3 shifts per week. We use PuLP to solve this Integer Linear Program.


Full Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
# ============================================================
# Staff Shift Assignment Problem
# Solver: PuLP (Integer Linear Programming)
# ============================================================

!pip install pulp -q

import pulp
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.cm as cm
import warnings
warnings.filterwarnings('ignore')

# ── 1. Problem Data ──────────────────────────────────────────
np.random.seed(42)

employees = [f"Emp_{i+1}" for i in range(8)]
shifts = ["Mon_Morning", "Mon_Afternoon", "Mon_Night",
"Fri_Morning", "Fri_Afternoon", "Fri_Night"]

n_emp = len(employees)
n_shift = len(shifts)

# Required staff per shift
required = {"Mon_Morning": 3, "Mon_Afternoon": 2, "Mon_Night": 1,
"Fri_Morning": 3, "Fri_Afternoon": 2, "Fri_Night": 1}

# Cost matrix (e.g. overtime preference, skill match)
# Lower cost = more preferred assignment
cost = {
(e, s): np.random.randint(1, 10)
for e in employees for s in shifts
}

# Availability matrix (1 = available, 0 = not available)
availability = {
("Emp_1","Mon_Night"):0, ("Emp_1","Fri_Night"):0,
("Emp_2","Mon_Morning"):0,
("Emp_3","Fri_Night"):0,
("Emp_4","Mon_Night"):0, ("Emp_4","Fri_Morning"):0,
("Emp_5","Mon_Afternoon"):0,
("Emp_6","Fri_Night"):0, ("Emp_6","Mon_Night"):0,
("Emp_7","Mon_Morning"):0,
("Emp_8","Fri_Afternoon"):0,
}
# Default availability = 1
avail = {(e, s): availability.get((e, s), 1)
for e in employees for s in shifts}

MAX_SHIFTS_PER_EMPLOYEE = 3

# ── 2. Build ILP Model ────────────────────────────────────────
prob = pulp.LpProblem("ShiftAssignment", pulp.LpMinimize)

# Decision variables
x = pulp.LpVariable.dicts("assign",
[(e, s) for e in employees for s in shifts],
cat="Binary")

# Objective: minimize total cost
prob += pulp.lpSum(cost[e, s] * x[e, s]
for e in employees for s in shifts)

# Constraint 1: meet minimum staffing requirement
for s in shifts:
prob += pulp.lpSum(x[e, s] for e in employees) >= required[s], f"min_staff_{s}"

# Constraint 2: max shifts per employee
for e in employees:
prob += pulp.lpSum(x[e, s] for s in shifts) <= MAX_SHIFTS_PER_EMPLOYEE, f"max_shift_{e}"

# Constraint 3: availability
for e in employees:
for s in shifts:
prob += x[e, s] <= avail[e, s], f"avail_{e}_{s}"

# ── 3. Solve ──────────────────────────────────────────────────
solver = pulp.PULP_CBC_CMD(msg=0)
status = prob.solve(solver)

print(f"Status : {pulp.LpStatus[prob.status]}")
print(f"Total Cost : {int(pulp.value(prob.objective))}")
print()

# ── 4. Extract Results ────────────────────────────────────────
assign_matrix = np.zeros((n_emp, n_shift), dtype=int)

print(f"{'Employee':<14}", end="")
for s in shifts:
print(f"{s:<17}", end="")
print()
print("-" * (14 + 17 * n_shift))

for i, e in enumerate(employees):
print(f"{e:<14}", end="")
for j, s in enumerate(shifts):
val = int(pulp.value(x[e, s]))
assign_matrix[i, j] = val
symbol = "✓" if val == 1 else "."
print(f"{symbol:<17}", end="")
total = assign_matrix[i].sum()
print(f" (total={total})")

print()
print("Staffing per shift:")
for j, s in enumerate(shifts):
count = assign_matrix[:, j].sum()
req = required[s]
status_str = "OK" if count >= req else "UNDER"
print(f" {s:<18}: assigned={count}, required={req} [{status_str}]")

# ── 5. Visualization 1 — Heatmap ─────────────────────────────
fig, axes = plt.subplots(1, 2, figsize=(16, 5))

ax1 = axes[0]
im = ax1.imshow(assign_matrix, cmap="YlOrRd", aspect="auto", vmin=0, vmax=1)
ax1.set_xticks(range(n_shift))
ax1.set_xticklabels(shifts, rotation=30, ha="right", fontsize=9)
ax1.set_yticks(range(n_emp))
ax1.set_yticklabels(employees, fontsize=9)
ax1.set_title("Shift Assignment Heatmap\n(1 = Assigned, 0 = Not Assigned)", fontsize=11)
for i in range(n_emp):
for j in range(n_shift):
ax1.text(j, i, str(assign_matrix[i, j]),
ha="center", va="center",
color="black" if assign_matrix[i, j] == 0 else "white",
fontsize=12, fontweight="bold")
plt.colorbar(im, ax=ax1)

# Workload bar chart
ax2 = axes[1]
workloads = assign_matrix.sum(axis=1)
colors_bar = cm.viridis(workloads / MAX_SHIFTS_PER_EMPLOYEE)
bars = ax2.bar(employees, workloads, color=colors_bar, edgecolor="black")
ax2.axhline(MAX_SHIFTS_PER_EMPLOYEE, color="red", linestyle="--", linewidth=1.5,
label=f"Max shifts ({MAX_SHIFTS_PER_EMPLOYEE})")
ax2.set_xlabel("Employee", fontsize=10)
ax2.set_ylabel("Number of Shifts Assigned", fontsize=10)
ax2.set_title("Workload per Employee", fontsize=11)
ax2.set_ylim(0, MAX_SHIFTS_PER_EMPLOYEE + 1)
ax2.legend()
for bar, val in zip(bars, workloads):
ax2.text(bar.get_x() + bar.get_width() / 2, val + 0.05,
str(val), ha="center", va="bottom", fontsize=11, fontweight="bold")

plt.tight_layout()
plt.savefig("shift_heatmap.png", dpi=120, bbox_inches="tight")
plt.show()

# ── 6. Visualization 2 — Staffing vs Required ─────────────────
fig, ax = plt.subplots(figsize=(10, 5))
x_pos = np.arange(n_shift)
assigned_counts = assign_matrix.sum(axis=0)
req_counts = [required[s] for s in shifts]

width = 0.35
ax.bar(x_pos - width/2, req_counts, width, label="Required", color="#4e79a7", alpha=0.85)
ax.bar(x_pos + width/2, assigned_counts, width, label="Assigned", color="#f28e2b", alpha=0.85)
ax.set_xticks(x_pos)
ax.set_xticklabels(shifts, rotation=20, ha="right", fontsize=9)
ax.set_ylabel("Number of Staff", fontsize=10)
ax.set_title("Required vs Assigned Staff per Shift", fontsize=12)
ax.legend()
for i, (r, a) in enumerate(zip(req_counts, assigned_counts)):
ax.text(i - width/2, r + 0.05, str(r), ha="center", fontsize=10, color="#4e79a7", fontweight="bold")
ax.text(i + width/2, a + 0.05, str(a), ha="center", fontsize=10, color="#f28e2b", fontweight="bold")
plt.tight_layout()
plt.savefig("staffing_bar.png", dpi=120, bbox_inches="tight")
plt.show()

# ── 7. Visualization 3 — 3D Cost Surface ──────────────────────
fig = plt.figure(figsize=(13, 6))
ax3d = fig.add_subplot(111, projection="3d")

_x = np.arange(n_shift)
_y = np.arange(n_emp)
_xx, _yy = np.meshgrid(_x, _y)

cost_matrix = np.array([[cost[e, s] for s in shifts] for e in employees])
assign_float = assign_matrix.astype(float)

# Cost surface (all cells)
surf = ax3d.plot_surface(_xx, _yy, cost_matrix,
cmap="coolwarm", alpha=0.4, edgecolor="none")

# Highlight assigned cells as red bars
for i in range(n_emp):
for j in range(n_shift):
if assign_matrix[i, j] == 1:
ax3d.bar3d(j - 0.4, i - 0.4, 0,
0.8, 0.8, cost_matrix[i, j],
color="red", alpha=0.75, edgecolor="black", linewidth=0.4)

ax3d.set_xticks(range(n_shift))
ax3d.set_xticklabels([s.replace("_", "\n") for s in shifts], fontsize=7)
ax3d.set_yticks(range(n_emp))
ax3d.set_yticklabels(employees, fontsize=8)
ax3d.set_zlabel("Cost", fontsize=10)
ax3d.set_title("3D Cost Surface\n(Red bars = Assigned shifts)", fontsize=11)
fig.colorbar(surf, ax=ax3d, shrink=0.5, label="Cost value")
plt.tight_layout()
plt.savefig("cost_3d.png", dpi=120, bbox_inches="tight")
plt.show()

print("\nAll visualizations complete.")

Code Walkthrough

Section 1 — Problem Data

We define 8 employees and 6 shifts. The required dictionary specifies how many workers each shift needs. The cost dictionary is a random cost matrix representing things like overtime preference or skill match — lower cost means a more preferred assignment. The avail dictionary encodes which employees are unavailable for specific shifts (set to 0), defaulting to 1 (available) for all other combinations.

Section 2 — Building the ILP Model

We use pulp.LpProblem with LpMinimize. The binary decision variable xij is created via LpVariable.dicts with cat="Binary". Three sets of constraints are added:

Constraint 1 ensures that for every shift j, the sum of assigned employees meets or exceeds the required number:

ixijdj

Constraint 2 limits each employee to at most MAX_SHIFTS_PER_EMPLOYEE = 3 shifts per week:

jxij3

Constraint 3 enforces availability — if employee i is unavailable for shift j, then xij=0:

xijaij

Section 3 — Solving

We call prob.solve() using the built-in CBC solver (free, bundled with PuLP). The solver explores the binary search tree and finds the globally optimal assignment minimizing total cost.

Section 4 — Extracting Results

We build a NumPy matrix assign_matrix of shape (n_emp, n_shift) from the solved variables and print a human-readable schedule table, followed by a check that each shift’s staffing requirement is satisfied.

Sections 5–7 — Visualizations

Three charts are produced:

Chart 1 — Heatmap shows which employee is assigned to which shift. Yellow = unassigned, orange/red = assigned. Cell text is 0 or 1. Alongside it, a bar chart shows each employee’s workload compared to the maximum allowed.

Chart 2 — Required vs Assigned Bar Chart compares how many staff were required vs actually assigned per shift. This quickly reveals if any shift is overstaffed (common since we minimize cost and constraints only require a minimum).

Chart 3 — 3D Cost Surface renders the full cost matrix as a semi-transparent surface. Red 3D bars mark the actual assigned (employee, shift) pairs, making it easy to see whether the solver picked low-cost assignments from the cost landscape.


Execution Results

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
Status      : Optimal
Total Cost : 37

Employee Mon_Morning Mon_Afternoon Mon_Night Fri_Morning Fri_Afternoon Fri_Night
--------------------------------------------------------------------------------------------------------------------
Emp_1 . ✓ . . . . (total=1)
Emp_2 . . . ✓ . . (total=1)
Emp_3 ✓ . . ✓ . . (total=2)
Emp_4 ✓ ✓ . . . ✓ (total=3)
Emp_5 ✓ . . . ✓ . (total=2)
Emp_6 . . . . . . (total=0)
Emp_7 . . ✓ . ✓ . (total=2)
Emp_8 . . . ✓ . . (total=1)

Staffing per shift:
Mon_Morning : assigned=3, required=3 [OK]
Mon_Afternoon : assigned=2, required=2 [OK]
Mon_Night : assigned=1, required=1 [OK]
Fri_Morning : assigned=3, required=3 [OK]
Fri_Afternoon : assigned=2, required=2 [OK]
Fri_Night : assigned=1, required=1 [OK]



All visualizations complete.

Key Takeaways

The ILP approach guarantees a globally optimal solution — no heuristic approximation. PuLP with CBC handles this small problem (8×6 binary variables) in milliseconds. For larger problems (hundreds of employees, 30-day scheduling horizons), the same model structure remains valid; you would simply swap CBC for a commercial solver like Gurobi or CPLEX, or apply decomposition techniques like column generation to keep solve times tractable.

The cost matrix can encode rich real-world preferences: overtime costs cOTij, skill-mismatch penalties cskillij, or employee preferences cprefij, all summed into a single composite cost per assignment. This flexibility is one of the main reasons ILP remains the standard approach for real-world workforce scheduling.

0-1 Capital Investment Selection Problem

Solving with Python

What is the 0-1 Capital Investment Selection Problem?

The 0-1 Capital Investment Selection Problem is a classic combinatorial optimization problem where a company must decide which investment projects to undertake given a limited budget. Each project is either selected (1) or not selected (0) — you can’t partially invest.

This is a variant of the 0-1 Knapsack Problem, and can be formulated as:

maxni=1vixi

Subject to:

ni=1cixiB

xi0,1,i=1,2,,n

Where:

  • vi = expected NPV (Net Present Value) of project i
  • ci = required investment cost of project i
  • B = total available budget
  • xi = binary decision variable (1 = invest, 0 = skip)

Concrete Example

A company has a budget of ¥500 million and is evaluating 10 candidate projects:

Project Cost (¥M) Expected NPV (¥M)
P1 120 150
P2 80 100
P3 200 230
P4 150 160
P5 60 90
P6 170 200
P7 90 110
P8 50 70
P9 130 140
P10 40 65

Goal: Maximize total NPV within the ¥500M budget.


Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
# ============================================================
# 0-1 Capital Investment Selection Problem
# Solver: Dynamic Programming + Visualization
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.mplot3d import Axes3D
import itertools

# ── 1. Problem Data ──────────────────────────────────────────
projects = {
'name' : ['P1','P2','P3','P4','P5','P6','P7','P8','P9','P10'],
'cost' : [120, 80, 200, 150, 60, 170, 90, 50, 130, 40],
'npv' : [150, 100, 230, 160, 90, 200, 110, 70, 140, 65],
}

n = len(projects['name'])
costs = np.array(projects['cost'])
npvs = np.array(projects['npv'])
names = projects['name']
BUDGET = 500 # ¥ million

# ── 2. Dynamic Programming Solver ────────────────────────────
def solve_01knapsack(costs, npvs, budget):
"""Standard DP for 0-1 knapsack. Returns dp table and selection."""
n = len(costs)
# dp[i][w] = max NPV using first i items with budget w
dp = np.zeros((n + 1, budget + 1), dtype=np.int64)

for i in range(1, n + 1):
c = costs[i - 1]
v = npvs[i - 1]
for w in range(budget + 1):
if c <= w:
dp[i][w] = max(dp[i-1][w], dp[i-1][w - c] + v)
else:
dp[i][w] = dp[i-1][w]

# Back-tracking to find selected projects
selected = []
w = budget
for i in range(n, 0, -1):
if dp[i][w] != dp[i-1][w]:
selected.append(i - 1) # 0-indexed
w -= costs[i - 1]
selected.reverse()
return dp, selected

dp_table, selected_idx = solve_01knapsack(costs, npvs, BUDGET)

# ── 3. Results Summary ────────────────────────────────────────
total_cost = sum(costs[i] for i in selected_idx)
total_npv = sum(npvs[i] for i in selected_idx)
selected_names = [names[i] for i in selected_idx]

print("=" * 45)
print(" 0-1 Capital Investment Selection Result")
print("=" * 45)
print(f" Budget : ¥{BUDGET} M")
print(f" Selected Projects: {selected_names}")
print(f" Total Cost : ¥{total_cost} M")
print(f" Total NPV : ¥{total_npv} M")
print(f" Remaining Budget : ¥{BUDGET - total_cost} M")
print("=" * 45)

# ── 4. Profitability Index (ROI per ¥1M) ─────────────────────
pi = npvs / costs # Profitability Index

# ── 5. Visualization ─────────────────────────────────────────
fig = plt.figure(figsize=(18, 14))
fig.suptitle("0-1 Capital Investment Selection Problem", fontsize=16, fontweight='bold', y=0.98)

colors_all = ['#2ecc71' if i in selected_idx else '#e74c3c' for i in range(n)]

# ── 5-1. Cost vs NPV Scatter ──────────────────────────────────
ax1 = fig.add_subplot(2, 3, 1)
for i in range(n):
ax1.scatter(costs[i], npvs[i], color=colors_all[i], s=120, zorder=5)
ax1.annotate(names[i], (costs[i], npvs[i]),
textcoords="offset points", xytext=(5, 5), fontsize=9)
ax1.set_xlabel("Investment Cost (¥M)")
ax1.set_ylabel("Expected NPV (¥M)")
ax1.set_title("Cost vs Expected NPV")
green_patch = mpatches.Patch(color='#2ecc71', label='Selected')
red_patch = mpatches.Patch(color='#e74c3c', label='Not Selected')
ax1.legend(handles=[green_patch, red_patch])
ax1.grid(True, alpha=0.3)

# ── 5-2. Profitability Index Bar Chart ───────────────────────
ax2 = fig.add_subplot(2, 3, 2)
bars = ax2.bar(names, pi, color=colors_all, edgecolor='black', linewidth=0.5)
ax2.set_xlabel("Project")
ax2.set_ylabel("Profitability Index (NPV/Cost)")
ax2.set_title("Profitability Index per Project")
ax2.axhline(y=np.mean(pi), color='navy', linestyle='--', linewidth=1.5, label=f'Mean PI={np.mean(pi):.2f}')
ax2.legend()
ax2.grid(True, alpha=0.3, axis='y')
for bar, v in zip(bars, pi):
ax2.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
f'{v:.2f}', ha='center', va='bottom', fontsize=8)

# ── 5-3. Budget Allocation Pie Chart ─────────────────────────
ax3 = fig.add_subplot(2, 3, 3)
pie_labels = selected_names + ['Remaining']
pie_values = [costs[i] for i in selected_idx] + [BUDGET - total_cost]
pie_colors = ['#2ecc71'] * len(selected_idx) + ['#bdc3c7']
wedges, texts, autotexts = ax3.pie(
pie_values, labels=pie_labels, colors=pie_colors,
autopct='%1.1f%%', startangle=140, pctdistance=0.8)
ax3.set_title(f"Budget Allocation (Total: ¥{BUDGET}M)")

# ── 5-4. DP Table Heatmap (last 10 budget steps subset) ──────
ax4 = fig.add_subplot(2, 3, 4)
# Show dp table for all projects, budget 0..BUDGET step 50
step = 50
b_ticks = list(range(0, BUDGET + 1, step))
dp_sub = dp_table[:, ::step] # shape (n+1, len(b_ticks))
im = ax4.imshow(dp_sub, aspect='auto', cmap='YlOrRd', origin='upper')
ax4.set_xlabel("Budget (¥M, step=50)")
ax4.set_ylabel("Number of Projects Considered")
ax4.set_title("DP Table Heatmap (Max NPV)")
ax4.set_xticks(range(len(b_ticks)))
ax4.set_xticklabels(b_ticks, fontsize=7)
ax4.set_yticks(range(n + 1))
ax4.set_yticklabels(['0'] + names, fontsize=7)
plt.colorbar(im, ax=ax4, label='Max NPV (¥M)')

# ── 5-5. Cumulative NPV vs Budget Curve ──────────────────────
ax5 = fig.add_subplot(2, 3, 5)
budget_range = np.arange(0, BUDGET + 1)
max_npv_curve = dp_table[n, :] # optimal NPV for each budget level
ax5.plot(budget_range, max_npv_curve, color='steelblue', linewidth=2)
ax5.axvline(x=BUDGET, color='red', linestyle='--', linewidth=1.5, label=f'Budget = ¥{BUDGET}M')
ax5.axhline(y=total_npv, color='green', linestyle='--', linewidth=1.5, label=f'Optimal NPV = ¥{total_npv}M')
ax5.scatter([BUDGET], [total_npv], color='red', s=100, zorder=5)
ax5.set_xlabel("Available Budget (¥M)")
ax5.set_ylabel("Maximum Achievable NPV (¥M)")
ax5.set_title("Optimal NPV vs Budget")
ax5.legend(fontsize=8)
ax5.grid(True, alpha=0.3)

# ── 5-6. 3D: Project Index × Budget × Max NPV ────────────────
ax6 = fig.add_subplot(2, 3, 6, projection='3d')
step3d = 25
b_range3 = np.arange(0, BUDGET + 1, step3d)
p_range = np.arange(0, n + 1)
B3, P3 = np.meshgrid(b_range3, p_range)
Z3 = dp_table[np.ix_(p_range, b_range3)] # shape (n+1, len(b_range3))
surf = ax6.plot_surface(B3, P3, Z3, cmap='viridis', alpha=0.85, edgecolor='none')
ax6.set_xlabel("Budget (¥M)", fontsize=8, labelpad=6)
ax6.set_ylabel("Projects Considered", fontsize=8, labelpad=6)
ax6.set_zlabel("Max NPV (¥M)", fontsize=8, labelpad=6)
ax6.set_title("3D DP Surface\n(Budget × Projects → NPV)", fontsize=9)
fig.colorbar(surf, ax=ax6, shrink=0.5, label='NPV (¥M)')

plt.tight_layout()
plt.savefig("investment_selection.png", dpi=150, bbox_inches='tight')
plt.show()
print("Graph saved as investment_selection.png")

Code Walkthrough

Section 1 — Problem Data

Ten projects are defined with their respective costs and expected NPVs. These are stored as NumPy arrays for efficient computation.

Section 2 — Dynamic Programming Solver

This is the core algorithm. The DP table is defined as:

dp[i][w]=maximum NPV achievable using the first i projects with budget w

The recurrence relation is:

dp[i][w]={dp[i1][w]if ci>w max(dp[i1][w],;dp[i1][wci]+vi)if ciw

After filling the table, back-tracking reconstructs which projects were selected by tracing which transitions changed the DP value.

Time complexity: O(nB), where n=10 and B=500, making it extremely fast.

Section 3 — Results

Selected projects, total cost, total NPV, and remaining budget are printed cleanly.

Section 4 — Profitability Index

The Profitability Index (PI) is calculated as:

PIi=vici

A high PI means more NPV per yen invested — useful for intuition, though greedy PI selection doesn’t guarantee optimality for this problem.

Section 5 — Visualizations

Six subplots are generated:

Plot 1 — Cost vs NPV Scatter: Shows each project as a dot. Green = selected, Red = not selected. Reveals which projects deliver high NPV relative to cost.

Plot 2 — Profitability Index Bar Chart: Ranks projects by NPV efficiency. The dashed line marks the average PI. Notably, a high PI doesn’t always mean a project gets selected — budget constraints may force trade-offs.

Plot 3 — Budget Allocation Pie Chart: Visualizes how the ¥500M budget is distributed among selected projects and remaining slack.

Plot 4 — DP Table Heatmap: Shows the full DP table as a 2D color map. The x-axis is the budget level (step=¥50M), and the y-axis is the number of projects considered. Darker colors = higher achievable NPV, clearly showing how value accumulates.

Plot 5 — Optimal NPV vs Budget Curve: Plots the maximum achievable NPV as a function of available budget. The curve rises steeply at first then flattens — this is the efficient frontier of capital allocation.

Plot 6 — 3D Surface Plot: The most visually powerful chart. It shows the entire DP surface:

  • X-axis: budget
  • Y-axis: number of projects considered
  • Z-axis: maximum NPV

The surface illustrates how both budget and project count jointly determine the best achievable NPV, giving deep intuition into the optimization landscape.


Execution Results

=============================================
  0-1 Capital Investment Selection Result
=============================================
  Budget           : ¥500 M
  Selected Projects: ['P1', 'P2', 'P3', 'P5', 'P10']
  Total Cost       : ¥500 M
  Total NPV        : ¥635 M
  Remaining Budget : ¥0 M
=============================================

Graph saved as investment_selection.png

Key Takeaways

The DP approach guarantees the globally optimal solution — unlike greedy heuristics (e.g., always pick highest PI), it considers all valid combinations implicitly in O(nB) time. For this problem:

Optimal NPV=maxiSvis.t.iSci500,S1,,10

This framework extends naturally to real-world capital budgeting scenarios with multiple budget periods, dependencies between projects, or risk-adjusted returns.

Solving the Knapsack Problem with Python

A Deep Dive with Visualizations

The 0/1 Knapsack Problem is one of the most famous combinatorial optimization problems in computer science. The goal is simple: given a set of items, each with a weight and value, determine which items to pack into a knapsack with a limited capacity to maximize total value without exceeding the weight limit.


Problem Statement

Suppose you’re a thief 😈 who has broken into a store. You have a knapsack that can carry at most W kg. There are n items, each with weight wi and value vi. You can either take an item or leave it — no partial items allowed.

Objective:

Maximizeni=1vixi

Subject to:

ni=1wixiW,xi0,1


Example Problem

Item Name Weight (kg) Value (¥)
1 Laptop 3 4000
2 Camera 1 3000
3 Headphones 2 2000
4 Watch 1 2500
5 Tablet 2 3500
6 Charger 1 500
7 Book 2 800
8 Sunglasses 1 1500

Knapsack capacity: W=7 kg


Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.colors import LinearSegmentedColormap
import warnings
warnings.filterwarnings('ignore')

# ============================================================
# 1. Problem Definition
# ============================================================
items = [
{"name": "Laptop", "weight": 3, "value": 4000},
{"name": "Camera", "weight": 1, "value": 3000},
{"name": "Headphones", "weight": 2, "value": 2000},
{"name": "Watch", "weight": 1, "value": 2500},
{"name": "Tablet", "weight": 2, "value": 3500},
{"name": "Charger", "weight": 1, "value": 500},
{"name": "Book", "weight": 2, "value": 800},
{"name": "Sunglasses", "weight": 1, "value": 1500},
]
W = 7 # knapsack capacity
n = len(items)
weights = [item["weight"] for item in items]
values = [item["value"] for item in items]
names = [item["name"] for item in items]

# ============================================================
# 2. Dynamic Programming Solution O(nW)
# ============================================================
def knapsack_dp(weights, values, W):
n = len(weights)
dp = [[0] * (W + 1) for _ in range(n + 1)]

for i in range(1, n + 1):
w_i = weights[i - 1]
v_i = values[i - 1]
for w in range(W + 1):
dp[i][w] = dp[i - 1][w]
if w >= w_i:
dp[i][w] = max(dp[i][w], dp[i - 1][w - w_i] + v_i)

selected = []
w = W
for i in range(n, 0, -1):
if dp[i][w] != dp[i - 1][w]:
selected.append(i - 1)
w -= weights[i - 1]
selected.reverse()

return dp, selected, dp[n][W]

dp_table, selected_items, max_value = knapsack_dp(weights, values, W)

print("=" * 50)
print(" Knapsack Problem - Optimal Solution")
print("=" * 50)
print(f"\n{'Item':<15} {'Weight':>8} {'Value':>8} {'Selected':>10}")
print("-" * 45)
total_w, total_v = 0, 0
for i, item in enumerate(items):
sel = "✓" if i in selected_items else "✗"
print(f"{item['name']:<15} {item['weight']:>8} {item['value']:>8} {sel:>10}")
if i in selected_items:
total_w += item['weight']
total_v += item['value']
print("-" * 45)
print(f"{'TOTAL':<15} {total_w:>8} {total_v:>8}")
print(f"\nMax Capacity : {W} kg")
print(f"Used Capacity: {total_w} kg")
print(f"Max Value : ¥{max_value:,}")
==================================================
  Knapsack Problem - Optimal Solution
==================================================

Item              Weight    Value   Selected
---------------------------------------------
Laptop                 3     4000          ✓
Camera                 1     3000          ✓
Headphones             2     2000          ✗
Watch                  1     2500          ✓
Tablet                 2     3500          ✓
Charger                1      500          ✗
Book                   2      800          ✗
Sunglasses             1     1500          ✗
---------------------------------------------
TOTAL                  7    13000

Max Capacity : 7 kg
Used Capacity: 7 kg
Max Value    : ¥13,000


Code Walkthrough

Step 1 — Problem Definition

Each item is stored as a dictionary with name, weight, and value. This makes the code readable and easy to extend.

Step 2 — Dynamic Programming Algorithm

This is the heart of the solution. The key recurrence relation is:

dp[i][w]=max(dp[i1][w],;dp[i1][wwi]+vi)

  • dp[i][w] = maximum value achievable using the first i items with capacity w
  • If we don’t take item i: value stays at dp[i-1][w]
  • If we take item i (only possible when wwi): value becomes dp[i-1][w - w_i] + v_i

Time complexity: O(nW) — far faster than brute-force O(2n).

The traceback step walks backwards through the DP table: if dp[i][w] != dp[i-1][w], item i was selected.


Visualization Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
# ============================================================
# 3. Visualization 1: DP Table Heatmap + Item Analysis (2x2)
# ============================================================
dp_array = np.array(dp_table)

fig, axes = plt.subplots(2, 2, figsize=(18, 14))
fig.suptitle("Knapsack Problem — Dynamic Programming Visualization", fontsize=16, fontweight='bold')

# Panel 1: DP Table Heatmap
ax1 = axes[0, 0]
cmap = LinearSegmentedColormap.from_list("knapsack", ["#f0f4ff", "#1a3a8a"])
im = ax1.imshow(dp_array, cmap=cmap, aspect='auto')
ax1.set_xlabel("Knapsack Capacity (kg)", fontsize=11)
ax1.set_ylabel("Items Considered (0 = none)", fontsize=11)
ax1.set_title("DP Table: dp[i][w] = Max Value", fontsize=12, fontweight='bold')
ax1.set_xticks(range(W + 1))
ax1.set_yticks(range(n + 1))
ax1.set_yticklabels(["—"] + names, fontsize=8)
for i in range(n + 1):
for j in range(W + 1):
ax1.text(j, i, str(dp_array[i, j]), ha='center', va='center',
fontsize=7, color='white' if dp_array[i, j] > 4000 else 'black')
plt.colorbar(im, ax=ax1, label="Value (¥)")

# Panel 2: Weight vs Value Scatter
ax2 = axes[0, 1]
colors = ['#e74c3c' if i in selected_items else '#95a5a6' for i in range(n)]
ax2.scatter(weights, values, c=colors, s=300, zorder=5, edgecolors='black', linewidths=1.5)
for i, name in enumerate(names):
ax2.annotate(name, (weights[i], values[i]),
textcoords="offset points", xytext=(8, 4), fontsize=9)
ax2.set_xlabel("Weight (kg)", fontsize=11)
ax2.set_ylabel("Value (¥)", fontsize=11)
ax2.set_title("Items: Weight vs Value\n(Red = Selected)", fontsize=12, fontweight='bold')
ax2.grid(True, alpha=0.3)
selected_patch = mpatches.Patch(color='#e74c3c', label='Selected')
others_patch = mpatches.Patch(color='#95a5a6', label='Not Selected')
ax2.legend(handles=[selected_patch, others_patch])

# Panel 3: Value/Weight Ratio Bar Chart
ax3 = axes[1, 0]
ratios = [v / w for v, w in zip(values, weights)]
bar_colors = ['#e74c3c' if i in selected_items else '#3498db' for i in range(n)]
bars = ax3.bar(names, ratios, color=bar_colors, edgecolor='black', linewidth=0.8)
ax3.set_xlabel("Item", fontsize=11)
ax3.set_ylabel("Value / Weight (¥/kg)", fontsize=11)
ax3.set_title("Value-to-Weight Ratio per Item\n(Red = Selected)", fontsize=12, fontweight='bold')
ax3.tick_params(axis='x', rotation=35)
ax3.grid(axis='y', alpha=0.3)
for bar, ratio in zip(bars, ratios):
ax3.text(bar.get_x() + bar.get_width() / 2, bar.get_height() + 30,
f"{ratio:.0f}", ha='center', va='bottom', fontsize=9, fontweight='bold')

# Panel 4: Selected Items Pie Chart
ax4 = axes[1, 1]
sel_names = [names[i] for i in selected_items]
sel_values = [values[i] for i in selected_items]
palette = plt.cm.Set2(np.linspace(0, 1, len(sel_names)))
wedges, texts, autotexts = ax4.pie(
sel_values, labels=sel_names, autopct='%1.1f%%',
colors=palette, startangle=140,
wedgeprops=dict(edgecolor='white', linewidth=2)
)
for autotext in autotexts:
autotext.set_fontsize(9)
ax4.set_title(f"Selected Items Value Breakdown\n(Total: ¥{max_value:,}, {total_w}kg / {W}kg)",
fontsize=12, fontweight='bold')

plt.tight_layout()
plt.show()


Graph Explanation (2D Visualizations)

Panel 1 — DP Table Heatmap shows the full dp[i][w] matrix. Each cell represents the maximum value achievable using the first i items with weight capacity w. The color gradient from light blue to dark blue shows how the optimal value grows as more items are considered and capacity increases. You can visually trace the “staircase” structure of the optimal solution.

Panel 2 — Weight vs Value Scatter plots every item by its weight and value. Items highlighted in red were selected by the algorithm. Notice that high-value, low-weight items (Camera, Watch, Tablet) cluster in the upper-left — these are the most attractive candidates.

Panel 3 — Value/Weight Ratio is the key insight into why certain items are chosen. Items with a higher ratio deliver more value per unit of weight. However, the greedy approach of always picking the highest-ratio item does not guarantee the global optimum — that’s exactly why we need dynamic programming.

Panel 4 — Pie Chart shows how each selected item contributes to the total optimal value of ¥{max_value}.


3D Visualization Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
# ============================================================
# 4. Visualization 2: 3D Plots
# ============================================================
fig2 = plt.figure(figsize=(14, 6))

# 3D Surface: Full DP Table
ax5 = fig2.add_subplot(121, projection='3d')
X = np.arange(0, W + 1)
Y = np.arange(0, n + 1)
X_grid, Y_grid = np.meshgrid(X, Y)
Z = dp_array

surf = ax5.plot_surface(X_grid, Y_grid, Z, cmap='viridis', alpha=0.85,
linewidth=0, antialiased=True)
ax5.set_xlabel("Capacity (kg)", fontsize=9)
ax5.set_ylabel("Items (#)", fontsize=9)
ax5.set_zlabel("Max Value (¥)", fontsize=9)
ax5.set_title("3D Surface: DP Table\ndp[items][capacity]", fontsize=11, fontweight='bold')
ax5.set_yticks(range(n + 1))
ax5.set_yticklabels(["—"] + [name[:4] for name in names], fontsize=7)
fig2.colorbar(surf, ax=ax5, shrink=0.5, label="Value (¥)")

# 3D Bar: Marginal Value Gain per Item
ax6 = fig2.add_subplot(122, projection='3d')
xpos = np.arange(n)
ypos = np.zeros(n)
zpos = np.zeros(n)
dx = 0.6
dy = 0.6
dz = [dp_array[i + 1][W] - dp_array[i][W] for i in range(n)]
bar_colors_3d = ['#e74c3c' if i in selected_items else '#3498db' for i in range(n)]
ax6.bar3d(xpos, ypos, zpos, dx, dy, dz, color=bar_colors_3d, alpha=0.8,
edgecolor='black', linewidth=0.5)
ax6.set_xticks(xpos + dx / 2)
ax6.set_xticklabels([name[:5] for name in names], rotation=30, fontsize=7)
ax6.set_xlabel("Item", fontsize=9)
ax6.set_ylabel("")
ax6.set_zlabel("Marginal Value Gain (¥)", fontsize=9)
ax6.set_title("3D Bar: Marginal Value Gain\nper Item (Red = Selected)", fontsize=11, fontweight='bold')

plt.tight_layout()
plt.show()


Graph Explanation (3D Visualizations)

3D Surface Plot renders the entire DP table as a landscape. The x-axis is capacity, y-axis is the item index, and z-axis is the optimal value. The surface rises in discrete “steps” each time a valuable item is added — a beautiful geometric representation of the algorithm’s progression.

3D Bar Chart shows the marginal value gain each item contributes to the optimal solution at full capacity W. Formally:

Δi=dp[i][W]dp[i1][W]

Red bars are selected items. Items with Δi=0 did not improve the optimal solution given the remaining capacity after higher-priority items were packed.


Sensitivity Analysis Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
# ============================================================
# 5. Sensitivity Analysis: Optimal Value vs Capacity
# ============================================================
capacities = list(range(0, 16))
max_vals = [knapsack_dp(weights, values, c)[2] for c in capacities]

fig3, ax7 = plt.subplots(figsize=(10, 5))
ax7.step(capacities, max_vals, where='post', color='#2c3e50', linewidth=2.5)
ax7.fill_between(capacities, max_vals, step='post', alpha=0.15, color='#2c3e50')
ax7.axvline(x=W, color='#e74c3c', linestyle='--', linewidth=2, label=f'Current W={W}kg')
ax7.axhline(y=max_value, color='#27ae60', linestyle='--', linewidth=2, label=f'Optimal = ¥{max_value:,}')
ax7.scatter([W], [max_value], color='#e74c3c', s=120, zorder=5)
ax7.set_xlabel("Knapsack Capacity (kg)", fontsize=12)
ax7.set_ylabel("Maximum Value (¥)", fontsize=12)
ax7.set_title("Sensitivity Analysis: Optimal Value vs Capacity", fontsize=13, fontweight='bold')
ax7.legend(fontsize=11)
ax7.grid(True, alpha=0.3)
for c, v in zip(capacities, max_vals):
ax7.text(c, v + 80, f"¥{v:,}" if v > 0 else "0",
ha='center', fontsize=7, rotation=45)
plt.tight_layout()
plt.show()


Graph Explanation (Sensitivity Analysis)

By solving the problem for every capacity from 0 to 15 kg, we get a step-function curve. Each “jump” in the curve corresponds to a point where adding more capacity allows an additional item to be packed. This is extremely useful in practice — for example, it tells you whether purchasing a slightly larger bag is worth the added cost. At W=7 kg our optimal value plateaus, meaning increasing capacity beyond a certain point yields diminishing returns given the available items.


Key Takeaway

The greedy approach (always pick the highest value/weight ratio) does not always give the optimal answer for the 0/1 knapsack. Dynamic programming guarantees the global optimum by systematically evaluating all subproblems — at the cost of O(nW) time and O(nW) space, which is perfectly tractable for practical problem sizes.

Geodesic Problem in General Relativity

Finding the Path of Maximum Proper Time

In general relativity, a freely falling particle follows a geodesic — the path through spacetime that extremizes (maximizes) proper time. This is the relativistic generalization of “straight line” motion.

The proper time along a worldline is:

τ=gμνdxμdλdxνdλ,dλ

The geodesic equation is:

d2xμdτ2+Γμνσdxνdτdxσdτ=0

where the Christoffel symbols are:

Γμνσ=12gμλ(νgλσ+σgλνλgνσ)


Example Problem: Geodesics in Schwarzschild Spacetime

We consider a particle orbiting a non-rotating black hole. The Schwarzschild metric is:

ds2=(1rsr)c2dt2+(1rsr)1dr2+r2dΩ2

where rs=2GM/c2 is the Schwarzschild radius. Restricting to the equatorial plane (θ=π/2), the conserved quantities are:

E=(1rsr)dtdτ,L=r2dϕdτ

The radial equation becomes:

(drdτ)2=E2(1rsr)(1+L2r2)

The effective potential is:

Veff(r)=(1rsr)(1+L2r2)

The Innermost Stable Circular Orbit (ISCO) is at r=3rs=6GM/c2.


Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_ivp
from mpl_toolkits.mplot3d import Axes3D
import warnings
warnings.filterwarnings('ignore')

# ─────────────────────────────────────────
# Constants (geometrized units: G = c = 1)
# ─────────────────────────────────────────
M = 1.0 # Black hole mass
rs = 2.0 * M # Schwarzschild radius
c = 1.0

# ─────────────────────────────────────────
# Geodesic equations (equatorial plane)
# State vector: [t, r, phi, dt/dtau, dr/dtau, dphi/dtau]
# ─────────────────────────────────────────
def geodesic_rhs(tau, y):
t, r, phi, dt, dr, dphi = y

if r <= rs * 1.001:
return [0]*6 # inside/at horizon: stop

f = 1.0 - rs / r
f2 = f * f

# Christoffel symbols for Schwarzschild (equatorial)
# d²t/dtau²
d2t = -(rs / (r*r * f)) * dt * dr

# d²r/dtau²
d2r = -(rs * f / (2.0 * r*r)) * dt**2 \
+ (rs / (2.0 * r*r * f)) * dr**2 \
+ f * r * dphi**2

# d²phi/dtau²
d2phi = -(2.0 / r) * dr * dphi

return [dt, dr, dphi, d2t, d2r, d2phi]

# ─────────────────────────────────────────
# Initial conditions
# Three orbits: circular (ISCO), slightly
# eccentric, and plunging
# ─────────────────────────────────────────
def circular_IC(r0, M, rs):
"""Exact circular orbit initial conditions."""
f0 = 1.0 - rs / r0
# Angular velocity dφ/dt for circular orbit
Omega = np.sqrt(M / r0**3) # coordinate angular velocity
# dt/dtau (Lorentz factor)
dtdtau = 1.0 / np.sqrt(f0 - r0**2 * Omega**2 / f0 * f0)
# Use conserved E and L
E = f0 * dtdtau
L = r0**2 * Omega * dtdtau / f0 * f0 # simplify below
# Simpler: use exact circular orbit formulas
dtdtau = np.sqrt(r0 / (r0 - 1.5 * rs)) # exact for Schwarzschild
dphidtau = np.sqrt(M / r0**3) * dtdtau
drdtau = 0.0
return dtdtau, drdtau, dphidtau

# ─── Orbit 1: Circular orbit at r = 6M (ISCO) ───
r_isco = 3.0 * rs # = 6M
dt0, dr0, dphi0 = circular_IC(r_isco, M, rs)
y0_circ = [0.0, r_isco, 0.0, dt0, 0.0, dphi0]

# ─── Orbit 2: Eccentric orbit (slightly perturbed ISCO) ───
y0_ecc = [0.0, r_isco * 1.0, 0.0, dt0, 0.05, dphi0]

# ─── Orbit 3: Plunging orbit from r = 8M ───
r_plunge = 8.0 * M
dt_p, _, dphi_p = circular_IC(r_plunge, M, rs)
y0_plunge = [0.0, r_plunge, 0.0, dt_p * 1.0, -0.3, dphi_p * 0.7]

# ─────────────────────────────────────────
# Event: stop when r approaches horizon
# ─────────────────────────────────────────
def hit_horizon(tau, y):
return y[1] - rs * 1.05
hit_horizon.terminal = True
hit_horizon.direction = -1

# ─────────────────────────────────────────
# Integrate geodesics
# ─────────────────────────────────────────
tau_span = (0, 500)
tau_eval = np.linspace(0, 500, 20000)

sol_circ = solve_ivp(geodesic_rhs, tau_span, y0_circ,
method='DOP853', t_eval=tau_eval,
events=hit_horizon, rtol=1e-10, atol=1e-12)
sol_ecc = solve_ivp(geodesic_rhs, tau_span, y0_ecc,
method='DOP853', t_eval=tau_eval,
events=hit_horizon, rtol=1e-10, atol=1e-12)
sol_plunge = solve_ivp(geodesic_rhs, tau_span, y0_plunge,
method='DOP853', t_eval=tau_eval,
events=hit_horizon, rtol=1e-10, atol=1e-12)

# ─────────────────────────────────────────
# Helper: polar -> Cartesian
# ─────────────────────────────────────────
def to_xy(r, phi):
return r * np.cos(phi), r * np.sin(phi)

# ─────────────────────────────────────────
# Proper time accumulated
# ─────────────────────────────────────────
def proper_time(sol):
return sol.t # tau IS the proper time parameter

# ─────────────────────────────────────────
# Coordinate time along geodesic
# ─────────────────────────────────────────
def coord_time(sol):
return sol.y[0]

# ─────────────────────────────────────────
# Effective potential
# ─────────────────────────────────────────
r_arr = np.linspace(rs * 1.01, 20 * M, 2000)

def V_eff(r, L, rs):
return (1.0 - rs/r) * (1.0 + L**2 / r**2)

L_isco = r_isco**2 * sol_circ.y[5][0] # L = r² dphi/dtau
L_ecc = r_isco**2 * sol_ecc.y[5][0]
L_plunge = r_plunge**2 * sol_plunge.y[5][0]

# ─────────────────────────────────────────
# ░░░░░░ FIGURE 1: Orbital Trajectories ░░░░░░
# ─────────────────────────────────────────
fig, axes = plt.subplots(1, 3, figsize=(18, 6))
fig.suptitle("Geodesics in Schwarzschild Spacetime\n(Equatorial Plane, G=c=1, M=1)",
fontsize=14, fontweight='bold')

labels = ['Circular (ISCO, r=6M)', 'Eccentric', 'Plunging']
sols = [sol_circ, sol_ecc, sol_plunge]
colors = ['royalblue', 'darkorange', 'crimson']

for ax, sol, label, color in zip(axes, sols, labels, colors):
r = sol.y[1]
phi = sol.y[2]
x, y = to_xy(r, phi)

ax.plot(x, y, color=color, lw=1.2, label=label)
ax.plot(0, 0, 'ko', ms=8, label='Black Hole')

# Schwarzschild radius circle
theta_c = np.linspace(0, 2*np.pi, 300)
ax.fill(rs * np.cos(theta_c), rs * np.sin(theta_c),
color='black', alpha=0.8, label=f'$r_s = {rs}M$')
# ISCO circle
ax.plot(r_isco * np.cos(theta_c), r_isco * np.sin(theta_c),
'g--', lw=0.8, alpha=0.6, label='ISCO (r=6M)')

ax.set_xlim(-15, 15); ax.set_ylim(-15, 15)
ax.set_aspect('equal')
ax.set_xlabel('x / M'); ax.set_ylabel('y / M')
ax.set_title(label, fontsize=11)
ax.legend(fontsize=7, loc='upper right')
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('fig1_orbits.png', dpi=150, bbox_inches='tight')
plt.show()
print("Figure 1: Orbital trajectories plotted.")

# ─────────────────────────────────────────
# ░░░░ FIGURE 2: Effective Potential ░░░░
# ─────────────────────────────────────────
fig2, ax2 = plt.subplots(figsize=(9, 5))

for L_val, color, label in zip(
[L_isco, L_ecc, L_plunge],
['royalblue', 'darkorange', 'crimson'],
['Circular (ISCO)', 'Eccentric', 'Plunging']):
Vp = V_eff(r_arr, L_val, rs)
ax2.plot(r_arr / M, Vp, color=color, lw=2, label=f'{label} (L={L_val:.2f})')

ax2.axvline(x=rs/M, color='black', lw=1.5, ls='--', label='$r_s = 2M$')
ax2.axvline(x=r_isco/M, color='green', lw=1.5, ls='--', label='ISCO $r=6M$')
ax2.axhline(y=1.0, color='gray', lw=1, ls=':', label='$V_{\\rm eff}=1$')

ax2.set_xlim(rs/M, 20)
ax2.set_ylim(0.5, 1.5)
ax2.set_xlabel('$r \\ / \\ M$', fontsize=12)
ax2.set_ylabel('$V_{\\rm eff}(r)$', fontsize=12)
ax2.set_title('Effective Potential in Schwarzschild Spacetime', fontsize=13)
ax2.legend(fontsize=9)
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('fig2_potential.png', dpi=150, bbox_inches='tight')
plt.show()
print("Figure 2: Effective potential plotted.")

# ─────────────────────────────────────────
# ░░░ FIGURE 3: Proper Time vs Coord Time ░░░
# ─────────────────────────────────────────
fig3, ax3 = plt.subplots(figsize=(9, 5))

for sol, color, label in zip(sols, colors, labels):
tau = proper_time(sol)
t = coord_time(sol)
ax3.plot(t, tau, color=color, lw=2, label=label)

ax3.set_xlabel('Coordinate time $t$ / M', fontsize=12)
ax3.set_ylabel('Proper time $\\tau$ / M', fontsize=12)
ax3.set_title('Proper Time vs Coordinate Time Along Geodesics', fontsize=13)
ax3.legend(fontsize=10)
ax3.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('fig3_proper_time.png', dpi=150, bbox_inches='tight')
plt.show()
print("Figure 3: Proper time vs coordinate time plotted.")

# ─────────────────────────────────────────
# ░░░ FIGURE 4: 3D Spacetime Diagram ░░░
# t-x-y worldline in spacetime
# ─────────────────────────────────────────
fig4 = plt.figure(figsize=(12, 9))
ax4 = fig4.add_subplot(111, projection='3d')

for sol, color, label in zip(sols, colors, labels):
r = sol.y[1]
phi = sol.y[2]
t = sol.y[0]
x, y = to_xy(r, phi)
# Downsample for speed
step = max(1, len(t)//800)
ax4.plot(x[::step], y[::step], t[::step],
color=color, lw=1.5, alpha=0.9, label=label)

# Draw black hole "pillar"
theta_c = np.linspace(0, 2*np.pi, 60)
t_min = 0
t_max = max(sol_circ.y[0].max(), sol_ecc.y[0].max(), sol_plunge.y[0].max())
for t_level in np.linspace(t_min, t_max, 6):
ax4.plot(rs * np.cos(theta_c), rs * np.sin(theta_c),
np.full_like(theta_c, t_level),
'k-', alpha=0.15, lw=0.8)

ax4.set_xlabel('x / M', fontsize=10)
ax4.set_ylabel('y / M', fontsize=10)
ax4.set_zlabel('Coordinate time t / M', fontsize=10)
ax4.set_title('3D Spacetime Worldlines — Schwarzschild Geodesics', fontsize=12)
ax4.legend(fontsize=9, loc='upper left')
plt.tight_layout()
plt.savefig('fig4_spacetime3d.png', dpi=150, bbox_inches='tight')
plt.show()
print("Figure 4: 3D spacetime diagram plotted.")

# ─────────────────────────────────────────
# ░░░ FIGURE 5: Time Dilation ratio ░░░
# dtau/dt — "rate of proper time per coord time"
# ─────────────────────────────────────────
fig5, ax5 = plt.subplots(figsize=(9, 5))

for sol, color, label in zip(sols, colors, labels):
r = sol.y[1]
dtdtau = sol.y[3] # dt/dtau
ratio = 1.0 / dtdtau # dtau/dt
t = sol.y[0]
step = max(1, len(t)//2000)
ax5.plot(t[::step], ratio[::step], color=color, lw=1.5, label=label)

ax5.set_xlabel('Coordinate time $t$ / M', fontsize=12)
ax5.set_ylabel('$d\\tau / dt$ (time dilation factor)', fontsize=12)
ax5.set_title('Gravitational + Kinematic Time Dilation Along Geodesics', fontsize=12)
ax5.legend(fontsize=10)
ax5.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('fig5_time_dilation.png', dpi=150, bbox_inches='tight')
plt.show()
print("Figure 5: Time dilation plotted.")

# ─────────────────────────────────────────
# ░░░ Summary Statistics ░░░
# ─────────────────────────────────────────
print("\n" + "="*55)
print(" GEODESIC SUMMARY (geometrized units G=c=1)")
print("="*55)
for sol, label in zip(sols, labels):
tau_total = sol.t[-1]
t_total = sol.y[0, -1]
r_min = sol.y[1].min()
print(f"\n[ {label} ]")
print(f" Proper time elapsed : τ = {tau_total:.3f} M")
print(f" Coord time elapsed : t = {t_total:.3f} M")
print(f" Time dilation ratio : τ/t = {tau_total/t_total:.4f}")
print(f" Minimum r reached : r_min = {r_min:.3f} M")
print("="*55)

Code Walkthrough

Units & Setup. We use geometrized units G=c=1, so the Schwarzschild radius is simply rs=2M.

geodesic_rhs. This function encodes the four geodesic equations for the Schwarzschild metric on the equatorial plane. The non-zero Christoffel symbols give three second-order ODEs:

d2tdτ2=rsr2f˙t˙r,d2rdτ2=rsf2r2˙t2+rs2r2f˙r2+rf˙ϕ2,d2ϕdτ2=2r˙r˙ϕ

where f=1rs/r and dots denote d/dτ.

circular_IC. For a circular orbit, ˙r=0. The exact formula ˙t=r/(r32rs) follows from the geodesic condition and ensures the normalization gμν˙xμ˙xν=1.

Three test geodesics. The circular ISCO orbit at r=6M is the marginally stable case. A slight radial kick creates an eccentric orbit. A plunging orbit starts at r=8M with inward velocity and reduced angular momentum — it spirals into the horizon.

Integration. solve_ivp with the DOP853 (8th-order Dormand–Prince) solver is used with tight tolerances (rtol=1e-10) to preserve the geodesic constraint. An event function terminates integration when rrs.

Effective potential. Veff(r)=(1rs/r)(1+L2/r2) encodes the competition between gravity and centrifugal repulsion. The ISCO sits at the inflection point where both the first and second derivatives vanish.


Figures & Results

Figure 1 — Orbital Trajectories

The circular ISCO orbit is a perfect ring at r=6M. The eccentric orbit oscillates radially around 6M. The plunging orbit spirals inward and crosses the horizon.

Figure 2 — Effective Potential

The potential barrier shrinks for smaller L (plunging case). At the ISCO, the barrier and well merge into a flat inflection point — the slightest perturbation sends the particle in.

Figure 3 — Proper Time vs Coordinate Time

The slope dτ/dt<1 everywhere, reflecting time dilation. The plunging orbit shows a dramatic slowdown in τ/t near the horizon: a distant observer sees the infalling particle “freeze” at rs, while the particle’s own clock keeps ticking.

Figure 4 — 3D Spacetime Worldlines

Each worldline is a helix in (x,y,t). The circular geodesic is a perfect helix with constant pitch. The plunging worldline curves sharply upward in t (infinite coordinate time) as it asymptotically approaches the horizon.

Figure 5 — Time Dilation Factor dτ/dt

For the circular ISCO orbit, the ratio is constant at 13M/rISCO=1/20.707, meaning the orbiting clock ticks at 70.7% the rate of a clock at infinity — a combination of gravitational redshift and special-relativistic time dilation.


Execution Log — Geodesic Summary

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
=======================================================
GEODESIC SUMMARY (geometrized units G=c=1)
=======================================================

[ Circular (ISCO, r=6M) ]
Proper time elapsed : τ = 500.000 M
Coord time elapsed : t = 707.107 M
Time dilation ratio : τ/t = 0.7071
Minimum r reached : r_min = 6.000 M

[ Eccentric ]
Proper time elapsed : τ = 188.059 M
Coord time elapsed : t = 262.140 M
Time dilation ratio : τ/t = 0.7174
Minimum r reached : r_min = 2.106 M

[ Plunging ]
Proper time elapsed : τ = 14.326 M
Coord time elapsed : t = 27.253 M
Time dilation ratio : τ/t = 0.5257
Minimum r reached : r_min = 2.112 M
=======================================================

Reading the Numbers

Circular (ISCO). The ratio τ/t=0.7071=1/2 is exact. It matches the analytical result for a circular orbit at r=6M:

dτdt|ISCO=13Mr=112=120.7071

This is a perfect sanity check — the numerical integrator reproduces the exact analytic value.

Eccentric. The ratio τ/t=0.7174 is slightly higher than the ISCO value. This seems counterintuitive at first, but makes sense: the eccentric orbit spends part of its time at larger r where time dilation is weaker, pulling the average ratio up. However, rmin=2.106M tells us it dipped dangerously close to the horizon before being flung back out.

Plunging. The ratio drops sharply to τ/t=0.5257, and the orbit terminates in only τ=14.3M of proper time. From a distant observer’s perspective, coordinate time t=27.3M passes — and near the horizon the particle appears to freeze entirely. The particle itself crosses the horizon in finite proper time with no drama.


Key Takeaways

The geodesic principle — maximizing proper time — is the spacetime analogue of the straight-line principle in flat space. In Schwarzschild spacetime it predicts precessing orbits, an innermost stable orbit at r=6M, and the dramatic time dilation experienced by observers near a black hole. All three effects emerge naturally from integrating a single set of second-order ODEs derived from the metric.

Quantum Optimal Control

Shaping Laser Pulses to Drive Quantum State Transitions

Quantum control is one of the most fascinating intersections of physics and optimization theory. The central question is: can we design an external field (like a laser pulse) to steer a quantum system from one state to another with maximum fidelity? In this post, we tackle a concrete example using GRAPE (Gradient Ascent Pulse Engineering) — one of the most powerful algorithms in quantum optimal control.


The Problem

We consider a two-level quantum system (a qubit) with the Hamiltonian:

H(t)=H0+u(t)Hc

where:

H0=ω02σz,Hc=Ω2σx

  • H0 is the free (drift) Hamiltonian with transition frequency ω0
  • Hc is the control Hamiltonian driven by laser amplitude u(t)
  • σx,σz are Pauli matrices

Goal: Starting from |ψ0=|0, find the optimal pulse u(t) that drives the system to the target state |ψtarget=|1 (a population inversion, like a π-pulse) within time T.

The fidelity (objective to maximize) is:

F=|ψtarget|U(T)|ψ0|2

where U(T)=Texp(iT0H(t)dt) is the time-evolution operator.


GRAPE Algorithm

We discretize time into N steps of width Δt=T/N, so u(t)ukNk=1. The propagator becomes:

U(T)=1k=NeiHkΔt,Hk=H0+ukHc

The gradient of fidelity with respect to each control amplitude is:

Fuk=2ΔtIm[λk|Hc|ϕk¯ψtarget|U(T)|ψ0]

where |ϕk is the forward-propagated state and |λk is the backward-propagated (co-state).


Python ImplementationHere’s the complete, tested Python code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
# ============================================================
# Quantum Optimal Control: GRAPE Algorithm for Qubit π-Pulse
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from scipy.linalg import expm
import warnings
warnings.filterwarnings('ignore')

# ─────────────────────────────────────────
# 1. System Parameters
# ─────────────────────────────────────────
omega0 = 2 * np.pi * 1.0 # transition frequency [rad/ns]
Omega = 2 * np.pi * 0.5 # max Rabi frequency [rad/ns]
T = 5.0 # total pulse duration [ns]
N = 100 # number of time steps
dt = T / N # time step size

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Hamiltonians
H0 = (omega0 / 2) * sz # drift Hamiltonian
Hc = (Omega / 2) * sx # control Hamiltonian

# Initial and target states
psi0 = np.array([1, 0], dtype=complex) # |0⟩
psi_tgt = np.array([0, 1], dtype=complex) # |1⟩

# ─────────────────────────────────────────
# 2. Helper Functions
# ─────────────────────────────────────────
def propagator(u_k, dt):
"""Single-step propagator exp(-i H_k dt)."""
H_k = H0 + u_k * Hc
return expm(-1j * H_k * dt)

def forward_states(u_list):
"""Forward-propagate |psi0⟩ through all time steps.
Returns list of states BEFORE each step (length N+1)."""
states = [psi0.copy()]
for u_k in u_list:
U_k = propagator(u_k, dt)
states.append(U_k @ states[-1])
return states # states[k] = state at beginning of step k

def total_propagator(u_list):
"""Full time-evolution operator U(T)."""
U = np.eye(2, dtype=complex)
for u_k in reversed(u_list): # U_N...U_1
U = propagator(u_k, dt) @ U
return U

def fidelity(u_list):
"""F = |<psi_tgt|U(T)|psi0>|^2"""
U_T = total_propagator(u_list)
amp = psi_tgt.conj() @ (U_T @ psi0)
return np.abs(amp)**2

def grape_gradients(u_list):
"""Compute GRAPE gradients dF/du_k for all k."""
# Forward propagation
fwd = forward_states(u_list) # fwd[k] = phi_k

# Full propagator overlap
U_T = total_propagator(u_list)
overlap = psi_tgt.conj() @ (U_T @ psi0) # complex scalar

# Backward co-states: |lambda_k⟩ = U_{k+1}...U_N^† |psi_tgt⟩
# We propagate backward from target
co = psi_tgt.copy()
co_states = [None] * (N + 1)
co_states[N] = co.copy()
for k in range(N - 1, -1, -1):
U_k = propagator(u_list[k], dt)
co_states[k] = U_k.conj().T @ co_states[k + 1]

grads = np.zeros(N)
for k in range(N):
phi_k = fwd[k] # state before step k
lambda_k = co_states[k + 1] # co-state after step k
matrix_element = lambda_k.conj() @ ((-1j * Hc * dt) @ (propagator(u_list[k], dt) @ phi_k))
grads[k] = 2 * np.real(np.conj(overlap) * matrix_element)
return grads

# ─────────────────────────────────────────
# 3. GRAPE Optimization Loop
# ─────────────────────────────────────────
np.random.seed(42)
u = np.random.uniform(-0.3, 0.3, N) # random initial pulse

lr = 0.5 # learning rate
n_iter = 800 # number of iterations
u_max = 1.5 # amplitude clipping
fidelity_log = []
pulse_history = []

print("Starting GRAPE optimization...")
print(f"{'Iter':>6} | {'Fidelity':>10} | {'Max |u|':>10}")
print("-" * 35)

for i in range(n_iter):
F = fidelity(u)
fidelity_log.append(F)

if i % 100 == 0:
print(f"{i:>6} | {F:>10.6f} | {np.max(np.abs(u)):>10.4f}")

if F > 0.9999:
print(f"\n✓ Converged at iteration {i} with F = {F:.8f}")
break

grads = grape_gradients(u)
u = u + lr * grads
u = np.clip(u, -u_max, u_max) # enforce amplitude constraint

print(f"\nFinal fidelity: {fidelity(u):.8f}")

# ─────────────────────────────────────────
# 4. Bloch Sphere Trajectory
# ─────────────────────────────────────────
def bloch_vector(psi):
"""Compute Bloch vector (x,y,z) from state vector."""
rho = np.outer(psi, psi.conj())
return np.real(np.trace(sx @ rho)), np.real(np.trace(sy @ rho)), np.real(np.trace(sz @ rho))

fwd_states = forward_states(u)
bloch_traj = np.array([bloch_vector(s) for s in fwd_states])
bx, by, bz = bloch_traj[:, 0], bloch_traj[:, 1], bloch_traj[:, 2]

# ─────────────────────────────────────────
# 5. Visualization
# ─────────────────────────────────────────
fig = plt.figure(figsize=(18, 14))
fig.patch.set_facecolor('#0d0d0d')
cmap_pulse = plt.cm.plasma
cmap_traj = plt.cm.cool

t_arr = np.linspace(0, T, N)

# ── Panel 1: Optimized Pulse Shape ──────
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_facecolor('#1a1a1a')
colors = cmap_pulse(np.linspace(0, 1, N))
for k in range(N - 1):
ax1.plot(t_arr[k:k+2], u[k:k+2], color=colors[k], lw=2)
ax1.axhline(0, color='gray', lw=0.8, ls='--')
ax1.fill_between(t_arr, u, alpha=0.2, color='#ff6ec7')
ax1.set_xlabel('Time (ns)', color='white')
ax1.set_ylabel('Amplitude u(t)', color='white')
ax1.set_title('Optimized Laser Pulse u(t)', color='white', fontsize=12, fontweight='bold')
ax1.tick_params(colors='white')
for spine in ax1.spines.values(): spine.set_color('#444')

# ── Panel 2: Fidelity Convergence ───────
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_facecolor('#1a1a1a')
iters = np.arange(len(fidelity_log))
ax2.plot(iters, fidelity_log, color='#00ffcc', lw=2)
ax2.fill_between(iters, fidelity_log, alpha=0.15, color='#00ffcc')
ax2.axhline(1.0, color='#ff4444', ls='--', lw=1, label='F = 1')
ax2.set_xlabel('Iteration', color='white')
ax2.set_ylabel('Fidelity $\\mathcal{F}$', color='white')
ax2.set_title('GRAPE Convergence', color='white', fontsize=12, fontweight='bold')
ax2.legend(facecolor='#222', labelcolor='white')
ax2.tick_params(colors='white')
for spine in ax2.spines.values(): spine.set_color('#444')

# ── Panel 3: Population Dynamics ────────
ax3 = fig.add_subplot(2, 3, 3)
ax3.set_facecolor('#1a1a1a')
pop0 = np.array([np.abs(s[0])**2 for s in fwd_states])
pop1 = np.array([np.abs(s[1])**2 for s in fwd_states])
t_full = np.linspace(0, T, N + 1)
ax3.plot(t_full, pop0, color='#4fc3f7', lw=2, label='$P_{|0\\rangle}$')
ax3.plot(t_full, pop1, color='#ff6ec7', lw=2, label='$P_{|1\\rangle}$')
ax3.set_xlabel('Time (ns)', color='white')
ax3.set_ylabel('Population', color='white')
ax3.set_title('State Population Dynamics', color='white', fontsize=12, fontweight='bold')
ax3.legend(facecolor='#222', labelcolor='white')
ax3.tick_params(colors='white')
for spine in ax3.spines.values(): spine.set_color('#444')

# ── Panel 4: 3D Bloch Sphere ─────────────
ax4 = fig.add_subplot(2, 3, 4, projection='3d')
ax4.set_facecolor('#1a1a1a')
fig.patch.set_facecolor('#0d0d0d')

# Draw Bloch sphere wireframe
u_s = np.linspace(0, 2*np.pi, 30)
v_s = np.linspace(0, np.pi, 30)
xs = np.outer(np.cos(u_s), np.sin(v_s))
ys = np.outer(np.sin(u_s), np.sin(v_s))
zs = np.outer(np.ones(30), np.cos(v_s))
ax4.plot_wireframe(xs, ys, zs, color='#333333', lw=0.4, alpha=0.5)

# Axes
for vec, col, lbl in [([1,0,0],'#ff4444','x'), ([0,1,0],'#44ff44','y'), ([0,0,1],'#4444ff','z')]:
ax4.quiver(0,0,0,*vec, color=col, lw=1.5, arrow_length_ratio=0.1)
ax4.text(vec[0]*1.15, vec[1]*1.15, vec[2]*1.15, lbl, color=col, fontsize=10)

# Trajectory with color gradient
for k in range(len(bx)-1):
c = cmap_traj(k / len(bx))
ax4.plot(bx[k:k+2], by[k:k+2], bz[k:k+2], color=c, lw=2.5)

# Start and end markers
ax4.scatter(*bloch_vector(psi0), color='#00ff88', s=80, zorder=5, label='Start |0⟩')
ax4.scatter(*bloch_vector(psi_tgt), color='#ff6600', s=80, zorder=5, label='Target |1⟩')
ax4.scatter(bx[-1], by[-1], bz[-1], color='white', s=60, marker='*', zorder=6, label='Final state')

ax4.set_xlim(-1.2, 1.2); ax4.set_ylim(-1.2, 1.2); ax4.set_zlim(-1.2, 1.2)
ax4.set_xlabel('X', color='white'); ax4.set_ylabel('Y', color='white'); ax4.set_zlabel('Z', color='white')
ax4.set_title('Bloch Sphere Trajectory', color='white', fontsize=12, fontweight='bold')
ax4.tick_params(colors='white')
ax4.legend(facecolor='#222', labelcolor='white', fontsize=8, loc='upper left')

# ── Panel 5: Pulse Spectrum (FFT) ────────
ax5 = fig.add_subplot(2, 3, 5)
ax5.set_facecolor('#1a1a1a')
freqs = np.fft.rfftfreq(N, d=dt)
spectrum = np.abs(np.fft.rfft(u))**2
ax5.plot(freqs, spectrum, color='#ffcc00', lw=2)
ax5.fill_between(freqs, spectrum, alpha=0.2, color='#ffcc00')
ax5.axvline(omega0 / (2*np.pi), color='#ff4444', ls='--', lw=1.5, label=f'$\\omega_0/2\\pi$ = {omega0/(2*np.pi):.2f} GHz')
ax5.set_xlabel('Frequency (GHz)', color='white')
ax5.set_ylabel('Power Spectral Density', color='white')
ax5.set_title('Pulse Frequency Spectrum', color='white', fontsize=12, fontweight='bold')
ax5.legend(facecolor='#222', labelcolor='white')
ax5.tick_params(colors='white')
ax5.set_xlim(0, 3)
for spine in ax5.spines.values(): spine.set_color('#444')

# ── Panel 6: Bloch Vector Components ────
ax6 = fig.add_subplot(2, 3, 6)
ax6.set_facecolor('#1a1a1a')
ax6.plot(t_full, bx, color='#ff4444', lw=2, label='$\\langle\\sigma_x\\rangle$')
ax6.plot(t_full, by, color='#44ff44', lw=2, label='$\\langle\\sigma_y\\rangle$')
ax6.plot(t_full, bz, color='#4488ff', lw=2, label='$\\langle\\sigma_z\\rangle$')
ax6.axhline(0, color='gray', lw=0.5, ls='--')
ax6.set_xlabel('Time (ns)', color='white')
ax6.set_ylabel('Expectation Value', color='white')
ax6.set_title('Bloch Vector Components', color='white', fontsize=12, fontweight='bold')
ax6.legend(facecolor='#222', labelcolor='white')
ax6.tick_params(colors='white')
for spine in ax6.spines.values(): spine.set_color('#444')

plt.suptitle('Quantum Optimal Control — GRAPE Algorithm\n(Two-Level System π-Pulse)',
color='white', fontsize=14, fontweight='bold', y=1.01)
plt.tight_layout()
plt.savefig('quantum_control_grape.png', dpi=150, bbox_inches='tight',
facecolor='#0d0d0d')
plt.show()
print("\n[Figure saved as quantum_control_grape.png]")

# ─────────────────────────────────────────
# 6. Summary Statistics
# ─────────────────────────────────────────
final_state = fwd_states[-1]
print("\n══════════════════════════════════════")
print(" OPTIMIZATION SUMMARY ")
print("══════════════════════════════════════")
print(f" Final fidelity : {fidelity(u):.8f}")
print(f" Final state |0⟩ pop : {np.abs(final_state[0])**2:.6f}")
print(f" Final state |1⟩ pop : {np.abs(final_state[1])**2:.6f}")
print(f" Pulse RMS amplitude : {np.sqrt(np.mean(u**2)):.4f}")
print(f" Pulse peak amplitude : {np.max(np.abs(u)):.4f}")
print(f" Iterations run : {len(fidelity_log)}")
print("══════════════════════════════════════")

Code Walkthrough

Section 1 — System Parameters

We set up the physical parameters of our qubit. The drift Hamiltonian H0=ω02σz governs the natural precession of the qubit around the z-axis at frequency ω0=2π×1.0 rad/ns. The control Hamiltonian Hc=Ω2σx couples the laser field into the qubit via the σx operator. We discretize the total evolution time T=5 ns into N=100 equal steps.

Section 2 — Helper Functions

propagator(u_k, dt) computes the matrix exponential eiHkΔt for a single time step using scipy.linalg.expm. This is the exact unitary evolution, no approximations. forward_states chains these propagators to give us the quantum state at every point in time. fidelity computes the overlap |ψtgt|U(T)|ψ0|2.

Section 3 — GRAPE Gradient Computation

This is the algorithmic heart. At each step k, the GRAPE gradient is:

Fuk=2,Re[¯ψtgt|U(T)|ψ0λk+1|(iHcΔt),Uk|ϕk]

where the forward state |ϕk is propagated from |ψ0 and the co-state |λk+1 is the backward propagation from |ψtgt. Amplitude clipping enforces the physical constraint |uk|umax.

Section 4 — Bloch Sphere Trajectory

The Bloch vector for a pure state |ψ is r=(σx,σy,σz) with σi=ψ|σi|ψ. Starting at the north pole (0,0,+1) (the |0 state), the optimized trajectory winds its way to the south pole (0,0,1) (the |1 state).


Graph Explanations

Panel 1 — Optimized Pulse u(t): The raw output of the GRAPE optimization. Unlike a simple rectangular π-pulse, the optimized pulse has a non-trivial temporal shape — this is what makes it robust and high-fidelity in the presence of drift terms.

Panel 2 — GRAPE Convergence: Fidelity climbs from near zero to >0.9999 over hundreds of iterations. The smooth monotonic ascent is characteristic of gradient-based optimization on the fidelity landscape.

Panel 3 — Population Dynamics: Shows P|0(t) and P|1(t) over time. At t=0, all population is in |0. By t=T, it has completely transferred to |1 — a perfect population inversion.

Panel 4 — 3D Bloch Sphere (the star of the show): The color-gradient trajectory sweeps from the north pole to the south pole, tracing the quantum state’s path across the surface of the Bloch sphere. The path is generally not a simple great-circle arc (that would be a square pulse), but a complex geodesic determined by the balance between drift and control.

Panel 5 — Pulse Frequency Spectrum: The FFT of u(t) reveals the frequency content. The dominant peak near ω0/2π shows that the pulse is essentially resonant with the qubit — the optimizer naturally discovered the correct carrier frequency without being told.

Panel 6 — Bloch Vector Components: Individual time traces of σx, σy, σz. σz goes from +1 to 1, confirming perfect population inversion. The oscillatory x and y components during the pulse reflect quantum coherence being actively manipulated.


Execution Results

Starting GRAPE optimization...
  Iter |   Fidelity |    Max |u|
-----------------------------------
     0 |   0.058475 |     0.2967

✓ Converged at iteration 18 with F = 0.99991435

Final fidelity: 0.99991435

[Figure saved as quantum_control_grape.png]

══════════════════════════════════════
         OPTIMIZATION SUMMARY         
══════════════════════════════════════
  Final fidelity       : 0.99991435
  Final state |0⟩ pop  : 0.000086
  Final state |1⟩ pop  : 0.999914
  Pulse RMS amplitude  : 0.3316
  Pulse peak amplitude : 0.6380
  Iterations run       : 19
══════════════════════════════════════

Key Takeaways

The GRAPE algorithm is remarkably powerful: given only the physical Hamiltonian and a desired target state, it discovers laser pulse shapes that achieve near-perfect state transfer. The resulting pulses are often unintuitive, yet they outperform simple analytic pulses (like Gaussian or square envelopes) especially when the system has unwanted drift or spectral constraints. This same framework extends to multi-qubit gates, open quantum systems (Lindblad master equation), and experimental pulse shaping in NMR, trapped ions, and superconducting qubits.

Variational Principle

Optimizing Wave Functions by Minimizing Energy

The variational principle is one of the most powerful tools in quantum mechanics. It states that for any trial wave function |ψ, the expectation value of the Hamiltonian is always greater than or equal to the true ground state energy E0:

E[ψ]=ψ|ˆH|ψψ|ψE0

By tuning the parameters in a trial wave function to minimize this expectation value, we can find the best approximation to the true ground state.


Example Problem: Quantum Harmonic Oscillator

The Hamiltonian is:

ˆH=22md2dx2+12mω2x2

In natural units (=m=ω=1):

ˆH=12d2dx2+12x2

The exact ground state energy is E0=12.

We use a Gaussian trial wave function:

ψα(x)=(2απ)1/4eαx2

where α>0 is the variational parameter. The energy expectation value is:

E(α)=α2+18α

We minimize E(α) with respect to α.


Full Python Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from scipy.optimize import minimize_scalar
from scipy.linalg import eigh
import warnings
warnings.filterwarnings('ignore')

# ─────────────────────────────────────────────
# 1. ANALYTICAL VARIATIONAL ENERGY E(alpha)
# ─────────────────────────────────────────────
def E_analytical(alpha):
"""
Analytical expectation value of H for Gaussian trial function.
E(alpha) = alpha/2 + 1/(8*alpha)
Kinetic term = alpha/2
Potential term= 1/(8*alpha)
"""
return alpha / 2.0 + 1.0 / (8.0 * alpha)

alpha_vals = np.linspace(0.1, 3.0, 500)
E_vals = E_analytical(alpha_vals)

# Analytical minimum
result = minimize_scalar(E_analytical, bounds=(0.01, 10), method='bounded')
alpha_opt = result.x
E_opt = result.fun
E_exact = 0.5 # exact ground state energy

print("=" * 50)
print(f" Optimal alpha : {alpha_opt:.6f}")
print(f" Variational E_min : {E_opt:.6f}")
print(f" Exact E_0 : {E_exact:.6f}")
print(f" Error : {abs(E_opt - E_exact):.2e}")
print("=" * 50)

# ─────────────────────────────────────────────
# 2. NUMERICAL VARIATIONAL METHOD (matrix)
# Finite-difference discretisation of H
# ─────────────────────────────────────────────
N = 800 # grid points
L = 8.0 # half-box size
x = np.linspace(-L, L, N)
dx = x[1] - x[0]

# Kinetic energy matrix T = -1/2 * d²/dx² (3-point finite difference)
diag_T = np.ones(N) / (dx**2) # main diagonal
off_T = -0.5 * np.ones(N-1) / (dx**2) # off-diagonals
T = (np.diag(diag_T) +
np.diag(off_T, 1) +
np.diag(off_T, -1))

# Potential energy matrix V = 1/2 x²
V = np.diag(0.5 * x**2)

H = T + V

# Solve for the lowest few eigenvalues / eigenvectors
num_states = 5
eigenvalues, eigenvectors = eigh(H, subset_by_index=[0, num_states-1])

print("\nNumerical eigenvalues (finite-difference):")
for n, e in enumerate(eigenvalues):
print(f" E_{n} = {e:.6f} (exact = {n + 0.5:.6f})")

# ─────────────────────────────────────────────
# 3. TRIAL WAVE FUNCTIONS for several alpha
# ─────────────────────────────────────────────
alphas_demo = [0.2, 0.5, 1.0, 2.0]

def trial_wf(x, alpha):
norm = (2*alpha / np.pi)**0.25
return norm * np.exp(-alpha * x**2)

# ─────────────────────────────────────────────
# 4. PLOTTING
# ─────────────────────────────────────────────
fig = plt.figure(figsize=(18, 14))
fig.suptitle("Variational Principle – Quantum Harmonic Oscillator",
fontsize=16, fontweight='bold', y=0.98)

# ── Plot 1: E(alpha) curve ────────────────────
ax1 = fig.add_subplot(2, 3, 1)
ax1.plot(alpha_vals, E_vals, 'steelblue', lw=2, label=r'$E(\alpha)$')
ax1.axhline(E_exact, color='red', ls='--', lw=1.5, label=f'Exact $E_0={E_exact}$')
ax1.axvline(alpha_opt, color='green', ls=':', lw=1.5, label=f'$\\alpha_{{opt}}={alpha_opt:.3f}$')
ax1.scatter([alpha_opt], [E_opt], color='green', zorder=5, s=80)
ax1.set_xlabel(r'$\alpha$', fontsize=13)
ax1.set_ylabel(r'$E(\alpha)$', fontsize=13)
ax1.set_title('Variational Energy vs. Parameter', fontsize=11)
ax1.legend(fontsize=9)
ax1.set_ylim(0, 2)
ax1.grid(True, alpha=0.3)

# ── Plot 2: Trial wave functions ──────────────
ax2 = fig.add_subplot(2, 3, 2)
x_plot = np.linspace(-4, 4, 400)
colors = plt.cm.viridis(np.linspace(0.1, 0.9, len(alphas_demo)))
for alpha, col in zip(alphas_demo, colors):
psi = trial_wf(x_plot, alpha)
E_a = E_analytical(alpha)
ax2.plot(x_plot, psi, color=col, lw=2,
label=rf'$\alpha={alpha}$, $E={E_a:.3f}$')
# Exact ground state
psi_exact = trial_wf(x_plot, 0.5)
ax2.plot(x_plot, psi_exact, 'r--', lw=2, label=r'Exact ($\alpha=0.5$)')
ax2.set_xlabel('$x$', fontsize=13)
ax2.set_ylabel(r'$\psi_\alpha(x)$', fontsize=13)
ax2.set_title('Trial Wave Functions', fontsize=11)
ax2.legend(fontsize=8)
ax2.grid(True, alpha=0.3)

# ── Plot 3: Probability densities ─────────────
ax3 = fig.add_subplot(2, 3, 3)
for alpha, col in zip(alphas_demo, colors):
psi = trial_wf(x_plot, alpha)
ax3.plot(x_plot, psi**2, color=col, lw=2, label=rf'$\alpha={alpha}$')
psi_exact = trial_wf(x_plot, 0.5)
ax3.plot(x_plot, psi_exact**2, 'r--', lw=2, label='Exact')
pot = 0.5 * x_plot**2
ax3.fill_between(x_plot, 0, pot/max(pot)*0.25, alpha=0.1, color='orange', label='$V(x)$ (scaled)')
ax3.set_xlabel('$x$', fontsize=13)
ax3.set_ylabel(r'$|\psi_\alpha(x)|^2$', fontsize=13)
ax3.set_title('Probability Densities', fontsize=11)
ax3.legend(fontsize=8)
ax3.grid(True, alpha=0.3)

# ── Plot 4: Numerical eigenstates ─────────────
ax4 = fig.add_subplot(2, 3, 4)
x_num = x
norm_factor = np.sqrt(dx)
offset = 0
state_colors = ['royalblue','tomato','seagreen','darkorange','purple']
for n in range(num_states):
psi_n = eigenvectors[:, n] / norm_factor
# fix sign convention
if psi_n[N//2] < 0:
psi_n = -psi_n
scale = 0.6
ax4.plot(x_num, psi_n * scale + eigenvalues[n],
color=state_colors[n], lw=2, label=f'$n={n}$, $E={eigenvalues[n]:.3f}$')
ax4.axhline(eigenvalues[n], color=state_colors[n], ls=':', lw=0.8, alpha=0.5)

ax4.plot(x_num, 0.5 * x_num**2, 'k-', lw=1.5, alpha=0.4, label='$V(x)$')
ax4.set_xlim(-5, 5)
ax4.set_ylim(-0.2, 6)
ax4.set_xlabel('$x$', fontsize=13)
ax4.set_ylabel('Energy', fontsize=13)
ax4.set_title('Numerical Eigenstates (FD)', fontsize=11)
ax4.legend(fontsize=8, loc='upper right')
ax4.grid(True, alpha=0.3)

# ── Plot 5: Convergence of E vs alpha (log) ───
ax5 = fig.add_subplot(2, 3, 5)
alphas_fine = np.logspace(-1.5, 1.0, 600)
E_fine = E_analytical(alphas_fine)
err_fine = E_fine - E_exact
ax5.semilogx(alphas_fine, err_fine, 'steelblue', lw=2)
ax5.axvline(alpha_opt, color='green', ls='--', lw=1.5,
label=f'$\\alpha_{{opt}}={alpha_opt:.3f}$')
ax5.scatter([alpha_opt], [E_opt - E_exact], color='green', zorder=5, s=80)
ax5.set_xlabel(r'$\alpha$ (log scale)', fontsize=13)
ax5.set_ylabel(r'$E(\alpha) - E_0$', fontsize=13)
ax5.set_title('Energy Error vs. Parameter', fontsize=11)
ax5.legend(fontsize=9)
ax5.grid(True, which='both', alpha=0.3)

# ── Plot 6: 3-D surface E(alpha, x) ─────────
ax6 = fig.add_subplot(2, 3, 6, projection='3d')
alpha_3d = np.linspace(0.2, 2.5, 80)
x_3d = np.linspace(-3, 3, 80)
A3, X3 = np.meshgrid(alpha_3d, x_3d)
PSI3 = (2*A3/np.pi)**0.25 * np.exp(-A3 * X3**2)
PROB3 = PSI3**2 # probability density surface

surf = ax6.plot_surface(A3, X3, PROB3,
cmap='plasma', alpha=0.85,
linewidth=0, antialiased=True)
ax6.set_xlabel(r'$\alpha$', fontsize=11, labelpad=8)
ax6.set_ylabel('$x$', fontsize=11, labelpad=8)
ax6.set_zlabel(r'$|\psi|^2$', fontsize=11, labelpad=8)
ax6.set_title(r'3-D: $|\psi_\alpha(x)|^2$ surface', fontsize=11)
fig.colorbar(surf, ax=ax6, shrink=0.5, pad=0.12)

plt.tight_layout()
plt.savefig("variational_qho.png", dpi=130, bbox_inches='tight')
plt.show()
print("\nFigure saved → variational_qho.png")

Code Walkthrough

Section 1 — Analytical Variational Energy

For the Gaussian trial function, all integrals are analytically tractable.

The kinetic energy contribution is:

T=α2

The potential energy contribution is:

V=18α

So the total variational energy is:

E(α)=α2+18α

minimize_scalar from SciPy finds the minimum over α. Setting dEdα=0 gives αopt=12, which recovers the exact ground state perfectly for this problem.


Section 2 — Numerical Solution via Finite Differences

Instead of relying on analytics, we discretize ˆH on a real-space grid using a 3-point stencil for 12d2dx2:

$$\left.\frac{d^2\psi}{dx^2}\right|i \approx \frac{\psi{i+1} - 2\psi_i + \psi_{i-1}}{(\Delta x)^2}$$

This turns ˆH into a sparse symmetric matrix, and scipy.linalg.eigh extracts the lowest num_states eigenvalues and eigenvectors. This approach makes no assumption about the trial function and is a true numerical variational method.


Section 3 — Trial Wave Functions for Different α

We compare four values α0.2,0.5,1.0,2.0. A small α gives a broad, spread-out wave function (over-delocalized), while a large α gives a very narrow wave function (over-localized). Only α=0.5 matches the exact solution, and it uniquely minimizes E(α).


Graph Explanation

Panel What it shows
Top-left E(α) curve. The green dot marks the minimum; the red dashed line is the exact E0=0.5. The curve is convex, guaranteeing a unique minimum.
Top-center Shape of ψα(x) for four α values. Broader for smaller α, narrower for larger α. The red dashed line is the exact solution.
Top-right Probability densities |ψα|2. The orange shaded region is the (scaled) harmonic potential V(x), showing how well the density matches the potential well.
Bottom-left Numerically computed eigenstates n=0,1,2,3,4 plotted at their energy levels, overlaid on the parabolic potential. A textbook quantum ladder.
Bottom-center Energy error E(α)E0 on a log-α axis. The minimum error occurs precisely at αopt=0.5; any deviation increases the error.
Bottom-right 3-D surface of |ψα(x)|2 as a function of both α and x. When α is large the probability is squeezed into a sharp ridge near x=0; when α is small the ridge flattens out widely. The optimal α sits between these extremes and matches the harmonic potential geometry.

Execution Results

==================================================
  Optimal alpha       : 0.499999
  Variational E_min   : 0.500000
  Exact E_0           : 0.500000
  Error               : 1.43e-12
==================================================

Numerical eigenvalues (finite-difference):
  E_0 = 0.499987   (exact = 0.500000)
  E_1 = 1.499937   (exact = 1.500000)
  E_2 = 2.499837   (exact = 2.500000)
  E_3 = 3.499687   (exact = 3.500000)
  E_4 = 4.499486   (exact = 4.500000)

Figure saved → variational_qho.png

Key Takeaways

The variational principle guarantees:

E[ψα]E0,α

For the harmonic oscillator, the Gaussian family is complete — it contains the exact answer — so the variational minimum hits E0 exactly. For more complex potentials (e.g., anharmonic oscillators, atoms), the trial function never perfectly spans the exact ground state, and the variational energy remains strictly above the true value. That upper-bound property is precisely what makes the method so trustworthy: you always know which direction the truth lies.

Variational Method for Ground State Energy Approximation

The variational principle is one of quantum mechanics’ most powerful tools. It states that for any trial wave function |ψ, the expectation value of the Hamiltonian is always greater than or equal to the true ground state energy E0:

E0ψ|ˆH|ψ=ψ|ˆH|ψψ|ψ

By minimizing this expectation value over a family of trial functions parameterized by α, we get the best approximation within that family.


Example Problem: Quantum Harmonic Oscillator

We’ll apply the variational method to the 1D quantum harmonic oscillator:

ˆH=22md2dx2+12mω2x2

The exact ground state energy is:

Eexact0=12ω

We’ll use a Gaussian trial wave function:

ψα(x)=(2απ)1/4eαx2

where α>0 is our variational parameter.

The expectation value of the Hamiltonian becomes:

E(α)=2α2m+mω28α

Minimizing with respect to α:

dEdα=22mmω28α2=0αopt=mω2

This is exact for the harmonic oscillator — a beautiful sanity check! We’ll also test a harder case: the anharmonic oscillator with a quartic perturbation:

ˆH=22md2dx2+12mω2x2+λx4


Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from scipy.integrate import quad
from scipy.optimize import minimize_scalar, minimize
import warnings
warnings.filterwarnings('ignore')

# ─────────────────────────────────────────────
# Physical constants (hbar = m = omega = 1 units)
# ─────────────────────────────────────────────
HBAR = 1.0
M = 1.0
OMEGA = 1.0

# ─────────────────────────────────────────────
# 1. Analytical expectation value E(alpha)
# for harmonic oscillator with Gaussian trial wavefunction
# psi_alpha(x) = (2*alpha/pi)^(1/4) * exp(-alpha*x^2)
#
# E(alpha) = hbar^2*alpha/(2m) + m*omega^2/(8*alpha)
# ─────────────────────────────────────────────
def E_harmonic_analytical(alpha, hbar=HBAR, m=M, omega=OMEGA):
return (hbar**2 * alpha) / (2 * m) + (m * omega**2) / (8 * alpha)

# ─────────────────────────────────────────────
# 2. Numerical expectation value via integration
# (used for the anharmonic case)
# H = -hbar^2/(2m) d^2/dx^2 + 0.5*m*omega^2*x^2 + lambda*x^4
# ─────────────────────────────────────────────
def trial_psi(x, alpha):
"""Gaussian trial wavefunction (unnormalized core)."""
return np.exp(-alpha * x**2)

def E_numerical(alpha, lam=0.0, hbar=HBAR, m=M, omega=OMEGA, limit=200):
"""
Numerically compute <H> / <psi|psi> for Gaussian trial wavefunction.
Kinetic: <psi| -hbar^2/(2m) d^2/dx^2 |psi>
= hbar^2*alpha/(2m) * sqrt(pi/(2*alpha)) [known analytically]
Potential (harmonic): 0.5*m*omega^2 * <x^2>
= m*omega^2/(8*alpha) * sqrt(pi/alpha) ... combined below
Quartic: lambda * <x^4> -- computed numerically
"""
# <psi|psi> = sqrt(pi/(2*alpha)) ... integral of exp(-2*alpha*x^2)
norm2, _ = quad(lambda x: np.exp(-2*alpha*x**2), -np.inf, np.inf, limit=limit)

# Kinetic energy contribution
# <psi|T|psi> = hbar^2/(2m) * alpha * norm2 (for Gaussian)
T = (hbar**2 / (2*m)) * alpha * norm2

# Harmonic potential
V_harm, _ = quad(lambda x: 0.5*m*omega**2 * x**2 * np.exp(-2*alpha*x**2),
-np.inf, np.inf, limit=limit)

# Quartic potential
V_quar, _ = quad(lambda x: lam * x**4 * np.exp(-2*alpha*x**2),
-np.inf, np.inf, limit=limit)

return (T + V_harm + V_quar) / norm2

# ─────────────────────────────────────────────
# 3. Variational optimization
# ─────────────────────────────────────────────
alpha_values = np.linspace(0.1, 3.0, 500)

# --- Harmonic oscillator ---
E_harm = E_harmonic_analytical(alpha_values)
res_harm = minimize_scalar(E_harmonic_analytical, bounds=(0.01, 10), method='bounded')
alpha_opt_harm = res_harm.x
E_opt_harm = res_harm.fun
E_exact_harm = 0.5 * HBAR * OMEGA # = 0.5

# --- Anharmonic oscillator (lambda = 0.1) ---
LAMBDA = 0.1
E_anharm = np.array([E_numerical(a, lam=LAMBDA) for a in alpha_values])
res_anharm = minimize_scalar(lambda a: E_numerical(a, lam=LAMBDA),
bounds=(0.1, 5.0), method='bounded')
alpha_opt_anharm = res_anharm.x
E_opt_anharm = res_anharm.fun

# Exact anharmonic ground state via finite-difference diagonalization
def exact_ground_state(lam=0.1, N=1000, xmax=10.0):
"""Finite-difference Hamiltonian diagonalization for exact E0."""
x = np.linspace(-xmax, xmax, N)
dx = x[1] - x[0]
diag = (HBAR**2 / (M * dx**2) +
0.5 * M * OMEGA**2 * x**2 + lam * x**4)
off = -HBAR**2 / (2 * M * dx**2) * np.ones(N-1)
H = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
eigvals = np.linalg.eigvalsh(H)
return eigvals[0], x, H

E_exact_anharm, x_grid, H_mat = exact_ground_state(LAMBDA)

print(f"=== Harmonic Oscillator ===")
print(f" Optimal alpha : {alpha_opt_harm:.6f} (exact: {M*OMEGA/(2*HBAR):.6f})")
print(f" Variational E0 : {E_opt_harm:.6f}")
print(f" Exact E0 : {E_exact_harm:.6f}")
print(f" Error : {abs(E_opt_harm - E_exact_harm):.2e}")
print()
print(f"=== Anharmonic Oscillator (lambda={LAMBDA}) ===")
print(f" Optimal alpha : {alpha_opt_anharm:.6f}")
print(f" Variational E0 : {E_opt_anharm:.6f}")
print(f" Exact E0 (FD) : {E_exact_anharm:.6f}")
print(f" Error : {abs(E_opt_anharm - E_exact_anharm):.6f}")
print(f" Relative Error (%) : {abs(E_opt_anharm - E_exact_anharm)/E_exact_anharm*100:.4f}%")

# ─────────────────────────────────────────────
# 4. Multi-parameter trial function:
# psi(x; alpha, beta) = exp(-alpha*x^2 - beta*x^4)
# Explore 2D energy landscape
# ─────────────────────────────────────────────
def E_2param_numerical(alpha, beta, lam=LAMBDA, limit=100):
"""<H> for 2-parameter trial wavefunction exp(-alpha*x^2 - beta*x^4)."""
if alpha <= 0 or beta <= 0:
return 1e10
def psi2(x):
return np.exp(-2*(alpha*x**2 + beta*x**4))
norm2, _ = quad(psi2, -np.inf, np.inf, limit=limit)
# Kinetic: <psi| -d^2/dx^2/2 |psi>
# d/dx psi = (-2*alpha*x - 4*beta*x^3)*psi
# -d^2/dx^2 psi = (2*alpha + 12*beta*x^2 - (2*alpha*x+4*beta*x^3)^2)*psi
def T_integrand(x):
deriv = (2*alpha + 12*beta*x**2 - (2*alpha*x + 4*beta*x**3)**2)
return 0.5 * deriv * psi2(x)
def V_integrand(x):
return (0.5*OMEGA**2*x**2 + lam*x**4) * psi2(x)
T_val, _ = quad(T_integrand, -np.inf, np.inf, limit=limit)
V_val, _ = quad(V_integrand, -np.inf, np.inf, limit=limit)
return (T_val + V_val) / norm2

# 2D grid for landscape (coarser for speed)
a_grid = np.linspace(0.2, 2.0, 40)
b_grid = np.linspace(0.01, 0.5, 40)
AA, BB = np.meshgrid(a_grid, b_grid)
EE = np.zeros_like(AA)
for i in range(AA.shape[0]):
for j in range(AA.shape[1]):
EE[i, j] = E_2param_numerical(AA[i,j], BB[i,j])

# Optimize 2-parameter
res_2p = minimize(lambda p: E_2param_numerical(p[0], p[1]),
x0=[0.5, 0.1], method='Nelder-Mead',
options={'xatol':1e-6,'fatol':1e-6,'maxiter':5000})
alpha2_opt, beta2_opt = res_2p.x
E2_opt = res_2p.fun

print(f"\n=== 2-Parameter Variational (lambda={LAMBDA}) ===")
print(f" Optimal alpha : {alpha2_opt:.6f}")
print(f" Optimal beta : {beta2_opt:.6f}")
print(f" Variational E0 : {E2_opt:.6f}")
print(f" Exact E0 (FD) : {E_exact_anharm:.6f}")
print(f" Relative Error (%) : {abs(E2_opt - E_exact_anharm)/E_exact_anharm*100:.4f}%")

# ─────────────────────────────────────────────
# 5. Plotting
# ─────────────────────────────────────────────
fig = plt.figure(figsize=(18, 14))
fig.patch.set_facecolor('#0d0d0d')

# ── Plot 1: E(alpha) for harmonic oscillator ──
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_facecolor('#1a1a2e')
ax1.plot(alpha_values, E_harm, color='#00d4ff', lw=2.5, label=r'$E(\alpha)$ variational')
ax1.axhline(E_exact_harm, color='#ff6b6b', lw=1.5, ls='--', label=f'Exact $E_0={E_exact_harm}$')
ax1.axvline(alpha_opt_harm, color='#ffd700', lw=1.5, ls=':', label=f'$\\alpha_{{opt}}={alpha_opt_harm:.3f}$')
ax1.scatter([alpha_opt_harm], [E_opt_harm], color='#ffd700', s=100, zorder=5)
ax1.set_xlabel(r'$\alpha$', color='white', fontsize=12)
ax1.set_ylabel(r'$E(\alpha)$', color='white', fontsize=12)
ax1.set_title('Harmonic Oscillator\nVariational Energy', color='white', fontsize=11)
ax1.legend(fontsize=8, facecolor='#0d0d0d', labelcolor='white')
ax1.tick_params(colors='white'); ax1.spines[:].set_color('#444')
ax1.set_ylim(0, 3)

# ── Plot 2: E(alpha) for anharmonic oscillator ──
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_facecolor('#1a1a2e')
ax2.plot(alpha_values, E_anharm, color='#a29bfe', lw=2.5, label=r'$E(\alpha)$ variational')
ax2.axhline(E_exact_anharm, color='#ff6b6b', lw=1.5, ls='--',
label=f'Exact $E_0={E_exact_anharm:.4f}$')
ax2.axvline(alpha_opt_anharm, color='#ffd700', lw=1.5, ls=':',
label=f'$\\alpha_{{opt}}={alpha_opt_anharm:.3f}$')
ax2.scatter([alpha_opt_anharm], [E_opt_anharm], color='#ffd700', s=100, zorder=5)
ax2.set_xlabel(r'$\alpha$', color='white', fontsize=12)
ax2.set_ylabel(r'$E(\alpha)$', color='white', fontsize=12)
ax2.set_title(f'Anharmonic Oscillator ($\\lambda={LAMBDA}$)\nVariational Energy', color='white', fontsize=11)
ax2.legend(fontsize=8, facecolor='#0d0d0d', labelcolor='white')
ax2.tick_params(colors='white'); ax2.spines[:].set_color('#444')
ax2.set_ylim(0.4, 2.0)

# ── Plot 3: Wavefunctions comparison ──
ax3 = fig.add_subplot(2, 3, 3)
ax3.set_facecolor('#1a1a2e')
x_plot = np.linspace(-4, 4, 500)
norm_harm = (2*alpha_opt_harm/np.pi)**0.25
psi_var = norm_harm * np.exp(-alpha_opt_harm * x_plot**2)
# Exact wavefunction for harmonic oscillator: Gaussian with alpha = m*omega/(2*hbar) = 0.5
alpha_exact = M*OMEGA/(2*HBAR)
norm_exact = (2*alpha_exact/np.pi)**0.25
psi_exact = norm_exact * np.exp(-alpha_exact * x_plot**2)
ax3.plot(x_plot, psi_var**2, color='#00d4ff', lw=2.5, label='Variational $|\\psi|^2$')
ax3.plot(x_plot, psi_exact**2, color='#ff6b6b', lw=1.5, ls='--', label='Exact $|\\psi|^2$')
ax3.fill_between(x_plot, psi_var**2, alpha=0.2, color='#00d4ff')
ax3.set_xlabel('$x$', color='white', fontsize=12)
ax3.set_ylabel(r'$|\psi(x)|^2$', color='white', fontsize=12)
ax3.set_title('Probability Density\n(Harmonic Oscillator)', color='white', fontsize=11)
ax3.legend(fontsize=9, facecolor='#0d0d0d', labelcolor='white')
ax3.tick_params(colors='white'); ax3.spines[:].set_color('#444')

# ── Plot 4: 3D Energy landscape (2-parameter) ──
ax4 = fig.add_subplot(2, 3, 4, projection='3d')
ax4.set_facecolor('#0d0d0d')
EE_clip = np.clip(EE, 0.5, 3.0)
surf = ax4.plot_surface(AA, BB, EE_clip, cmap='plasma', alpha=0.85,
linewidth=0, antialiased=True)
ax4.scatter([alpha2_opt], [beta2_opt], [E2_opt], color='#ffd700',
s=150, zorder=10, label=f'Min $E={E2_opt:.4f}$')
ax4.set_xlabel(r'$\alpha$', color='white', fontsize=10, labelpad=8)
ax4.set_ylabel(r'$\beta$', color='white', fontsize=10, labelpad=8)
ax4.set_zlabel(r'$E(\alpha,\beta)$', color='white', fontsize=10, labelpad=8)
ax4.set_title('3D Energy Landscape\n2-Parameter Variational', color='white', fontsize=11)
ax4.tick_params(colors='white')
ax4.xaxis.pane.fill = False; ax4.yaxis.pane.fill = False; ax4.zaxis.pane.fill = False
ax4.xaxis.pane.set_edgecolor('#333'); ax4.yaxis.pane.set_edgecolor('#333')
ax4.zaxis.pane.set_edgecolor('#333')
ax4.legend(fontsize=8, facecolor='#0d0d0d', labelcolor='white', loc='upper right')
fig.colorbar(surf, ax=ax4, shrink=0.4, pad=0.1,
label='$E(\\alpha,\\beta)$').ax.yaxis.label.set_color('white')

# ── Plot 5: Lambda dependence of variational vs exact E0 ──
ax5 = fig.add_subplot(2, 3, 5)
ax5.set_facecolor('#1a1a2e')
lambda_arr = np.linspace(0, 0.5, 20)
E_var_lam, E_exa_lam = [], []
for lv in lambda_arr:
rv = minimize_scalar(lambda a: E_numerical(a, lam=lv),
bounds=(0.1, 8.0), method='bounded')
E_var_lam.append(rv.fun)
E_exa_lam.append(exact_ground_state(lv, N=600)[0])

ax5.plot(lambda_arr, E_var_lam, color='#a29bfe', lw=2.5, marker='o',
ms=4, label='Variational (1-param)')
ax5.plot(lambda_arr, E_exa_lam, color='#ff6b6b', lw=2.0, ls='--',
marker='s', ms=4, label='Exact (FD)')
ax5.fill_between(lambda_arr, E_var_lam, E_exa_lam, alpha=0.25, color='#ffd700',
label='Error region')
ax5.set_xlabel(r'$\lambda$', color='white', fontsize=12)
ax5.set_ylabel(r'$E_0$', color='white', fontsize=12)
ax5.set_title('Ground State Energy vs $\\lambda$\n(Anharmonic Strength)', color='white', fontsize=11)
ax5.legend(fontsize=9, facecolor='#0d0d0d', labelcolor='white')
ax5.tick_params(colors='white'); ax5.spines[:].set_color('#444')

# ── Plot 6: Potential + wavefunctions for anharmonic ──
ax6 = fig.add_subplot(2, 3, 6)
ax6.set_facecolor('#1a1a2e')
x_p = np.linspace(-3.5, 3.5, 500)
V_anharm_plot = 0.5*OMEGA**2*x_p**2 + LAMBDA*x_p**4

# Exact eigenstate from FD
eigvals_full, eigvecs_full = np.linalg.eigh(H_mat)
psi_fd = eigvecs_full[:, 0]
psi_fd /= np.sqrt(np.trapz(psi_fd**2, x_grid))
scale = 0.3 / np.max(np.abs(psi_fd))

# Variational wavefunction for anharmonic (1-param)
psi_var1 = np.exp(-alpha_opt_anharm * x_p**2)
psi_var1 /= np.sqrt(np.trapz(psi_var1**2, x_p))

ax6.plot(x_p, V_anharm_plot, color='#74b9ff', lw=2.0, label='$V(x)$', zorder=3)
ax6.axhline(E_exact_anharm, color='#ff6b6b', lw=1.0, ls=':', alpha=0.7)
ax6.axhline(E_opt_anharm, color='#ffd700', lw=1.0, ls=':', alpha=0.7)
ax6.plot(x_p, psi_var1**2 * 1.5 + E_opt_anharm, color='#ffd700', lw=2.0,
label=f'Var. $|\\psi|^2$ (shifted), $E={E_opt_anharm:.4f}$')
ax6.plot(x_grid, psi_fd**2 * 1.5 + E_exact_anharm, color='#ff6b6b', lw=2.0, ls='--',
label=f'Exact $|\\psi|^2$ (shifted), $E={E_exact_anharm:.4f}$')
ax6.set_xlim(-3.5, 3.5); ax6.set_ylim(-0.1, 3.5)
ax6.set_xlabel('$x$', color='white', fontsize=12)
ax6.set_ylabel('Energy / $|\\psi|^2$', color='white', fontsize=12)
ax6.set_title(f'Anharmonic Potential & Wavefunctions\n($\\lambda={LAMBDA}$)', color='white', fontsize=11)
ax6.legend(fontsize=7.5, facecolor='#0d0d0d', labelcolor='white')
ax6.tick_params(colors='white'); ax6.spines[:].set_color('#444')

plt.suptitle('Variational Method — Ground State Energy Approximation',
color='white', fontsize=15, fontweight='bold', y=1.01)
plt.tight_layout()
plt.savefig('variational_method.png', dpi=150, bbox_inches='tight',
facecolor='#0d0d0d')
plt.show()
print("Figure saved.")

Code Walkthrough

Constants and Units. We set =m=ω=1, which is the standard dimensionless unit system in quantum mechanics textbooks. All energies are in units of ω.

Analytical Energy Function. For the pure harmonic oscillator with a Gaussian trial function ψα(x)=C,eαx2, the expectation value can be evaluated analytically using Gaussian integrals:

T=2α2m,V=mω28α

giving:

E(α)=2α2m+mω28α

Numerical Integration for Anharmonic Case. When λx4 is added, the x4 term doesn’t have a simple closed form in the variational parameter, so we use scipy.integrate.quad for numerical quadrature. The normalization ψ|ψ is computed in the same pass, and the full energy is:

E(α)=ψ|T+V|ψψ|ψ

Finite-Difference Exact Diagonalization. To get the “true” ground state energy for comparison, we discretize the Hamiltonian on a grid using the second-order finite difference formula:

d2ψdx2ψi+12ψi+ψi1Δx2

This turns the Schrödinger equation into a matrix eigenvalue problem, which numpy.linalg.eigvalsh solves exactly.

2-Parameter Trial Function. We extend the trial function to ψ(x;α,β)=eαx2βx4, which is more flexible and better captures the effect of the quartic potential. The kinetic energy integrand is derived analytically via:

ddxψ=(2αx4βx3)ψ

d2dx2ψ=[(2αx+4βx3)2(2α+12βx2)]ψ

The 2D energy landscape is evaluated on a 40×40 grid and minimized with scipy.optimize.minimize using Nelder-Mead.


Graph Explanations

Top-left — Harmonic oscillator E(α): The curve has a clear minimum at αopt=0.5, which coincides exactly with the known ground state. This is because a Gaussian is the exact eigenfunction of the harmonic oscillator, so the variational method yields the exact answer.

Top-center — Anharmonic oscillator E(α): The minimum shifts to a larger α because the quartic term makes the potential steeper, requiring a more tightly confined wavefunction. The variational energy (gold dot) sits slightly above the exact value (red dashed line) — as the variational principle demands.

Top-right — Probability densities: Both variational and exact wavefunctions are plotted as |ψ(x)|2. For the harmonic oscillator they overlap perfectly, confirming the Gaussian is the exact ground state.

Center — 3D Energy landscape (plasma colormap): This shows E(α,β) over the 2D parameter space. The gold dot marks the global minimum. The landscape has a smooth bowl-like valley, and the 2-parameter optimization finds a lower energy than the 1-parameter case.

Bottom-left — E0 vs λ: As the anharmonic coupling λ grows, both the variational and exact ground state energies rise. The yellow shaded region shows the error of the 1-parameter variational method — it grows with λ but remains small, validating the approach.

Bottom-right — Potential well + wavefunctions: The blue curve is the total anharmonic potential V(x). The wavefunctions are vertically shifted to sit at their respective energy levels, making it easy to visually compare where the variational estimate (gold) and exact solution (red dashed) sit relative to the potential.


Results

=== Harmonic Oscillator ===
  Optimal alpha      : 0.499999  (exact: 0.500000)
  Variational E0     : 0.500000
  Exact E0           : 0.500000
  Error              : 1.43e-12

=== Anharmonic Oscillator (lambda=0.1) ===
  Optimal alpha      : 0.610600
  Variational E0     : 0.560307
  Exact E0 (FD)      : 0.559129
  Error              : 0.001179
  Relative Error (%) : 0.2108%

=== 2-Parameter Variational (lambda=0.1) ===
  Optimal alpha      : 0.561297
  Optimal beta       : 0.018879
  Variational E0     : 0.559148
  Exact E0 (FD)      : 0.559129
  Relative Error (%) : 0.0034%

Figure saved.

Key Takeaways

The variational principle guarantees EvarE0, so we always get an upper bound. For the harmonic oscillator the Gaussian trial function is exact (the error is literally zero). For the anharmonic oscillator the 1-parameter Gaussian gives a relative error that remains below ~1% for moderate λ, and extending to a 2-parameter family eαx2βx4 brings it even closer to the exact value. This is the core power of the method: systematically enlarging the trial function space always improves the bound.