Network Routing Optimization

Minimizing Latency and Packet Loss with Python

Network routing optimization is a fundamental challenge in computer networking. The goal is to find paths through a network that minimize latency and packet loss — two metrics that directly impact user experience. In this post, we’ll model a real-world-like network, apply classic and modern optimization algorithms, and visualize the results in both 2D and 3D.


The Problem

Consider a network of 10 nodes (routers/switches) with weighted edges representing two cost dimensions:

  • Latency (ms): propagation delay between nodes
  • Packet Loss (%): probability of a packet being dropped on a link

We want to find the optimal path from a source to a destination that minimizes a composite cost:

$$C(e) = \alpha \cdot L(e) + \beta \cdot P(e)$$

Where:

  • $L(e)$ = latency of edge $e$ in ms
  • $P(e)$ = packet loss rate of edge $e$ (normalized 0–1)
  • $\alpha, \beta$ = weighting factors (e.g., $\alpha = 0.7$, $\beta = 0.3$)

For end-to-end packet delivery probability across a path $\pi = (e_1, e_2, \ldots, e_k)$:

$$P_{\text{delivery}}(\pi) = \prod_{i=1}^{k}(1 - p_i)$$

To use additive shortest-path algorithms on multiplicative metrics, we transform packet loss using the log-sum trick:

$$-\log P_{\text{delivery}}(\pi) = \sum_{i=1}^{k} -\log(1 - p_i)$$

We then find the path minimizing:

$$\text{Cost}(\pi) = \alpha \sum_{i} L(e_i) + \beta \sum_{i} \left(-\log(1 - p_i)\right) \cdot S$$

Where $S$ is a scaling factor to keep units comparable.
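
For concreteness, here is how the composite cost of a single edge works out with the values above (a quick standalone sketch, not part of the full program below):

```python
import math

def composite_cost(latency_ms, loss, alpha=0.7, beta=0.3, scale=100.0):
    """Composite edge cost: weighted latency plus scaled log-transformed loss."""
    log_loss = -math.log(1 - loss) * scale
    return alpha * latency_ms + beta * log_loss

# An edge with 10 ms latency and 1% packet loss:
print(round(composite_cost(10, 0.01), 4))  # → 7.3015
```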


Algorithms Used

  1. Dijkstra’s Algorithm — finds shortest path using composite cost
  2. Bellman-Ford — handles negative weights (used here for comparison)
  3. Genetic Algorithm (GA) — metaheuristic for global optimization across all paths
  4. Simulated Annealing (SA) — probabilistic optimization to escape local optima

Full Source Code

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
import random
import math
import warnings
warnings.filterwarnings('ignore')

# ── Reproducibility ──────────────────────────────────────────────────────────
np.random.seed(42)
random.seed(42)

# ══════════════════════════════════════════════════════════════════════════════
# 1. Network Construction
# ══════════════════════════════════════════════════════════════════════════════
def build_network():
    """Build a weighted directed graph representing a 10-node network."""
    G = nx.DiGraph()
    nodes = list(range(10))
    G.add_nodes_from(nodes)

    # Fixed edges: (src, dst, latency_ms, packet_loss_prob)
    edges = [
        (0, 1, 10, 0.01), (0, 2, 25, 0.02), (0, 3, 15, 0.005),
        (1, 4, 20, 0.03), (1, 5, 30, 0.01),
        (2, 4, 12, 0.02), (2, 6, 18, 0.04),
        (3, 5, 22, 0.015), (3, 7, 35, 0.025),
        (4, 8, 14, 0.01), (4, 6, 10, 0.005),
        (5, 8, 16, 0.02), (5, 9, 28, 0.03),
        (6, 9, 20, 0.01),
        (7, 8, 12, 0.005), (7, 9, 18, 0.02),
        (8, 9, 10, 0.01),
        # Reverse / alternate paths
        (1, 0, 10, 0.01), (2, 0, 25, 0.02), (3, 0, 15, 0.005),
        (4, 1, 20, 0.03), (5, 1, 30, 0.01),
        (4, 2, 12, 0.02), (6, 2, 18, 0.04),
        (5, 3, 22, 0.015), (7, 3, 35, 0.025),
        (8, 4, 14, 0.01), (6, 4, 10, 0.005),
        (8, 5, 16, 0.02), (9, 5, 28, 0.03),
        (9, 6, 20, 0.01),
        (8, 7, 12, 0.005), (9, 7, 18, 0.02),
        (9, 8, 10, 0.01),
    ]

    alpha, beta, scale = 0.7, 0.3, 100.0

    for src, dst, lat, loss in edges:
        log_loss = -math.log(1 - loss) * scale
        composite = alpha * lat + beta * log_loss
        G.add_edge(src, dst,
                   latency=lat,
                   packet_loss=loss,
                   log_loss=log_loss,
                   composite=composite)
    return G

G = build_network()
SOURCE, TARGET = 0, 9

# ══════════════════════════════════════════════════════════════════════════════
# 2. Dijkstra (via NetworkX)
# ══════════════════════════════════════════════════════════════════════════════
def dijkstra_path(G, src, dst):
    path = nx.dijkstra_path(G, src, dst, weight='composite')
    cost = nx.dijkstra_path_length(G, src, dst, weight='composite')
    return path, cost

dijkstra_result, dijkstra_cost = dijkstra_path(G, SOURCE, TARGET)

# ══════════════════════════════════════════════════════════════════════════════
# 3. Bellman-Ford
# ══════════════════════════════════════════════════════════════════════════════
def bellman_ford_path(G, src, dst):
    length, path_dict = nx.single_source_bellman_ford(G, src, weight='composite')
    return path_dict[dst], length[dst]

bf_result, bf_cost = bellman_ford_path(G, SOURCE, TARGET)

# ══════════════════════════════════════════════════════════════════════════════
# 4. Helper: path metrics
# ══════════════════════════════════════════════════════════════════════════════
def path_metrics(G, path):
    latency   = sum(G[u][v]['latency'] for u, v in zip(path, path[1:]))
    loss_sum  = sum(G[u][v]['packet_loss'] for u, v in zip(path, path[1:]))
    delivery  = math.prod(1 - G[u][v]['packet_loss'] for u, v in zip(path, path[1:]))
    composite = sum(G[u][v]['composite'] for u, v in zip(path, path[1:]))
    return latency, loss_sum, delivery, composite

# ══════════════════════════════════════════════════════════════════════════════
# 5. Genetic Algorithm
# ══════════════════════════════════════════════════════════════════════════════
def find_all_simple_paths(G, src, dst, cutoff=8):
    return list(nx.all_simple_paths(G, src, dst, cutoff=cutoff))

all_paths = find_all_simple_paths(G, SOURCE, TARGET)

def ga_fitness(path):
    try:
        return path_metrics(G, path)[3]  # composite cost
    except Exception:
        return float('inf')

def genetic_algorithm(paths, pop_size=30, generations=80, mutation_rate=0.2):
    if len(paths) == 0:
        return None, float('inf'), []  # match the 3-tuple returned below
    pop_size = min(pop_size, len(paths))
    population = random.sample(paths, pop_size)
    best_path, best_cost = min(((p, ga_fitness(p)) for p in population), key=lambda x: x[1])

    history = [best_cost]
    for _ in range(generations):
        scored = sorted(population, key=ga_fitness)
        elite = scored[:max(2, pop_size // 4)]
        new_pop = list(elite)

        while len(new_pop) < pop_size:
            parent = random.choice(elite)
            if random.random() < mutation_rate and len(paths) > 1:
                child = random.choice(paths)
            else:
                child = parent
            new_pop.append(child)

        population = new_pop
        gen_best_path, gen_best_cost = min(((p, ga_fitness(p)) for p in population), key=lambda x: x[1])
        if gen_best_cost < best_cost:
            best_cost, best_path = gen_best_cost, gen_best_path
        history.append(best_cost)

    return best_path, best_cost, history

ga_result, ga_cost, ga_history = genetic_algorithm(all_paths)

# ══════════════════════════════════════════════════════════════════════════════
# 6. Simulated Annealing
# ══════════════════════════════════════════════════════════════════════════════
def simulated_annealing(paths, T_init=200.0, T_min=0.1, cooling=0.95, iterations=500):
    if not paths:
        return None, float('inf'), []
    current = random.choice(paths)
    current_cost = ga_fitness(current)
    best, best_cost = current, current_cost
    T = T_init
    history = [current_cost]

    for _ in range(iterations):
        candidate = random.choice(paths)
        cand_cost = ga_fitness(candidate)
        delta = cand_cost - current_cost
        if delta < 0 or random.random() < math.exp(-delta / T):
            current, current_cost = candidate, cand_cost
        if current_cost < best_cost:
            best, best_cost = current, current_cost
        T = max(T * cooling, T_min)
        history.append(best_cost)

    return best, best_cost, history

sa_result, sa_cost, sa_history = simulated_annealing(all_paths)

# ══════════════════════════════════════════════════════════════════════════════
# 7. Print Summary
# ══════════════════════════════════════════════════════════════════════════════
algorithms = {
    'Dijkstra':            (dijkstra_result, dijkstra_cost),
    'Bellman-Ford':        (bf_result, bf_cost),
    'Genetic Algorithm':   (ga_result, ga_cost),
    'Simulated Annealing': (sa_result, sa_cost),
}

print("=" * 65)
print(f"{'Algorithm':<22} {'Path':<28} {'Composite':>10}")
print("=" * 65)
for name, (path, cost) in algorithms.items():
    path_str = " → ".join(str(n) for n in path) if path else "N/A"
    print(f"{name:<22} {path_str:<28} {cost:>10.4f}")
print("=" * 65)

print("\n── Detailed Metrics ──")
print(f"{'Algorithm':<22} {'Latency(ms)':>12} {'Loss(sum)':>10} {'Delivery%':>10}")
print("-" * 58)
for name, (path, _) in algorithms.items():
    if path:
        lat, ls, deliv, _ = path_metrics(G, path)
        print(f"{name:<22} {lat:>12.1f} {ls:>10.4f} {deliv*100:>9.2f}%")

# ══════════════════════════════════════════════════════════════════════════════
# 8. Visualization
# ══════════════════════════════════════════════════════════════════════════════
pos = nx.spring_layout(G, seed=7, k=1.8)

COLORS = {
    'Dijkstra':            '#E74C3C',
    'Bellman-Ford':        '#3498DB',
    'Genetic Algorithm':   '#2ECC71',
    'Simulated Annealing': '#F39C12',
}
PATH_STYLES = {
    'Dijkstra':            {'width': 4,   'style': 'solid'},
    'Bellman-Ford':        {'width': 3,   'style': 'dashed'},
    'Genetic Algorithm':   {'width': 2.5, 'style': 'dotted'},
    'Simulated Annealing': {'width': 2,   'style': 'dashdot'},
}

fig = plt.figure(figsize=(22, 18))
fig.patch.set_facecolor('#0D1117')

# ── Panel 1: Network topology with all algorithm paths ───────────────────────
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_facecolor('#161B22')
ax1.set_title('Network Topology & Optimal Paths', color='white', fontsize=13, pad=10)

edge_labels = {(u, v): f"{d['latency']}ms\n{d['packet_loss']*100:.1f}%"
for u, v, d in G.edges(data=True) if u < v}

nx.draw_networkx_nodes(G, pos, ax=ax1, node_color='#58A6FF',
node_size=700, alpha=0.95)
nx.draw_networkx_labels(G, pos, ax=ax1, font_color='white', font_size=10, font_weight='bold')
nx.draw_networkx_edges(G, pos, ax=ax1, edge_color='#30363D',
arrows=False, width=1.0, alpha=0.5)
nx.draw_networkx_edge_labels(G, pos, edge_labels, ax=ax1,
font_color='#8B949E', font_size=6)

for name, (path, _) in algorithms.items():
    if path and len(path) > 1:
        path_edges = list(zip(path, path[1:]))
        style = PATH_STYLES[name]
        nx.draw_networkx_edges(G, pos, edgelist=path_edges, ax=ax1,
                               edge_color=COLORS[name],
                               width=style['width'], alpha=0.9,
                               arrows=True, arrowsize=15,
                               connectionstyle='arc3,rad=0.1',
                               style=style['style'])

patches = [mpatches.Patch(color=c, label=n) for n, c in COLORS.items()]
ax1.legend(handles=patches, loc='upper left', fontsize=7,
facecolor='#161B22', edgecolor='#30363D', labelcolor='white')
ax1.axis('off')

# ── Panel 2: Composite cost bar chart ────────────────────────────────────────
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_facecolor('#161B22')
ax2.set_title('Composite Cost Comparison', color='white', fontsize=13, pad=10)

names = list(algorithms.keys())
costs = [algorithms[n][1] for n in names]
colors = [COLORS[n] for n in names]
bars = ax2.bar(names, costs, color=colors, alpha=0.85, edgecolor='white', linewidth=0.5)
for bar, cost in zip(bars, costs):
    ax2.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.3,
             f'{cost:.3f}', ha='center', va='bottom', color='white', fontsize=9)

ax2.set_ylabel('Composite Cost', color='#8B949E')
ax2.tick_params(colors='#8B949E')
ax2.set_xticklabels(names, rotation=20, ha='right', color='#8B949E', fontsize=8)
for spine in ax2.spines.values():
    spine.set_edgecolor('#30363D')

# ── Panel 3: Latency vs Packet Loss scatter (all paths) ─────────────────────
ax3 = fig.add_subplot(2, 3, 3)
ax3.set_facecolor('#161B22')
ax3.set_title('All Paths: Latency vs Delivery Rate', color='white', fontsize=13, pad=10)

path_lats, path_delivs, path_costs = [], [], []
for p in all_paths:
    lat, _, deliv, comp = path_metrics(G, p)
    path_lats.append(lat)
    path_delivs.append(deliv * 100)
    path_costs.append(comp)

sc = ax3.scatter(path_lats, path_delivs, c=path_costs, cmap='plasma',
alpha=0.6, s=40, edgecolors='none')
cbar = plt.colorbar(sc, ax=ax3)
cbar.set_label('Composite Cost', color='#8B949E', fontsize=8)
cbar.ax.yaxis.set_tick_params(color='#8B949E')
plt.setp(plt.getp(cbar.ax.axes, 'yticklabels'), color='#8B949E')

for name, (path, _) in algorithms.items():
    if path:
        lat, _, deliv, _ = path_metrics(G, path)
        ax3.scatter(lat, deliv*100, color=COLORS[name], s=200,
                    marker='*', zorder=5, label=name, edgecolors='white', linewidths=0.5)

ax3.set_xlabel('Total Latency (ms)', color='#8B949E')
ax3.set_ylabel('Packet Delivery Rate (%)', color='#8B949E')
ax3.tick_params(colors='#8B949E')
ax3.legend(fontsize=7, facecolor='#161B22', edgecolor='#30363D', labelcolor='white')
for spine in ax3.spines.values():
    spine.set_edgecolor('#30363D')

# ── Panel 4: GA & SA convergence history ─────────────────────────────────────
ax4 = fig.add_subplot(2, 3, 4)
ax4.set_facecolor('#161B22')
ax4.set_title('Metaheuristic Convergence', color='white', fontsize=13, pad=10)

ax4.plot(ga_history, color=COLORS['Genetic Algorithm'], lw=2, label='Genetic Algorithm')
ax4.plot(sa_history, color=COLORS['Simulated Annealing'], lw=2, label='Simulated Annealing', alpha=0.85)
ax4.axhline(dijkstra_cost, color=COLORS['Dijkstra'], lw=1.5, linestyle='--', label='Dijkstra (optimal)')

ax4.set_xlabel('Iteration / Generation', color='#8B949E')
ax4.set_ylabel('Best Cost Found', color='#8B949E')
ax4.tick_params(colors='#8B949E')
ax4.legend(fontsize=8, facecolor='#161B22', edgecolor='#30363D', labelcolor='white')
for spine in ax4.spines.values():
    spine.set_edgecolor('#30363D')

# ── Panel 5: Edge weight heatmap ─────────────────────────────────────────────
ax5 = fig.add_subplot(2, 3, 5)
ax5.set_facecolor('#161B22')
ax5.set_title('Edge Composite Cost Heatmap', color='white', fontsize=13, pad=10)

n = G.number_of_nodes()
mat = np.full((n, n), np.nan)
for u, v, d in G.edges(data=True):
    mat[u][v] = d['composite']

im = ax5.imshow(mat, cmap='YlOrRd', aspect='auto', interpolation='nearest')
cbar2 = plt.colorbar(im, ax=ax5)
cbar2.set_label('Composite Cost', color='#8B949E', fontsize=8)
cbar2.ax.yaxis.set_tick_params(color='#8B949E')
plt.setp(plt.getp(cbar2.ax.axes, 'yticklabels'), color='#8B949E')

ax5.set_xticks(range(n)); ax5.set_yticks(range(n))
ax5.set_xticklabels(range(n), color='#8B949E')
ax5.set_yticklabels(range(n), color='#8B949E')
ax5.set_xlabel('Destination Node', color='#8B949E')
ax5.set_ylabel('Source Node', color='#8B949E')
for spine in ax5.spines.values():
    spine.set_edgecolor('#30363D')

# ── Panel 6: 3D Surface — Latency × Loss → Composite Cost ───────────────────
ax6 = fig.add_subplot(2, 3, 6, projection='3d')
ax6.set_facecolor('#0D1117')
ax6.set_title('3D: Latency × Loss → Cost Surface', color='white', fontsize=13, pad=10)

lat_range = np.linspace(5, 50, 60)
loss_range = np.linspace(0.001, 0.05, 60)
LL, PL = np.meshgrid(lat_range, loss_range)
alpha_w, beta_w, scale = 0.7, 0.3, 100.0
ZZ = alpha_w * LL + beta_w * (-np.log(1 - PL)) * scale

surf = ax6.plot_surface(LL, PL * 100, ZZ, cmap='viridis', alpha=0.85,
rstride=2, cstride=2, linewidth=0, antialiased=True)

# Plot actual edges as scatter on the surface
for u, v, d in G.edges(data=True):
    ax6.scatter(d['latency'], d['packet_loss']*100, d['composite'],
                color='#FF6B6B', s=30, zorder=5, alpha=0.9)

ax6.set_xlabel('Latency (ms)', color='#8B949E', fontsize=8, labelpad=6)
ax6.set_ylabel('Packet Loss (%)', color='#8B949E', fontsize=8, labelpad=6)
ax6.set_zlabel('Composite Cost', color='#8B949E', fontsize=8, labelpad=6)
ax6.tick_params(colors='#8B949E', labelsize=7)
ax6.xaxis.pane.fill = False; ax6.yaxis.pane.fill = False; ax6.zaxis.pane.fill = False
ax6.xaxis.pane.set_edgecolor('#30363D')
ax6.yaxis.pane.set_edgecolor('#30363D')
ax6.zaxis.pane.set_edgecolor('#30363D')
ax6.grid(color='#30363D', linestyle='--', linewidth=0.4, alpha=0.5)

fig.colorbar(surf, ax=ax6, shrink=0.5, pad=0.1).ax.yaxis.set_tick_params(color='#8B949E')

plt.suptitle('Network Routing Optimization — Minimizing Latency & Packet Loss',
color='white', fontsize=16, fontweight='bold', y=1.01)
plt.tight_layout()
plt.savefig('network_routing_optimization.png', dpi=150,
bbox_inches='tight', facecolor='#0D1117')
plt.show()
print("Figure saved.")

Code Walkthrough

1. Network Construction (build_network)

The network is modeled as a directed graph (nx.DiGraph) with 10 nodes. Each edge carries four attributes: raw latency (ms), raw packet loss (probability), the log-transformed loss, and a composite cost computed as:

$$C(e) = \alpha \cdot L(e) + \beta \cdot \left(-\log(1 - p_e)\right) \cdot S$$

The log transform converts the multiplicative packet-loss metric into an additive one that shortest-path algorithms can handle. With $\alpha=0.7$, $\beta=0.3$, and $S=100$, we weight latency slightly more than loss while keeping both in a comparable numeric range.
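
A quick numeric check shows why the transform is safe: exponentiating the summed log-losses recovers the exact end-to-end delivery probability (the per-link losses here are hypothetical):

```python
import math

losses = [0.01, 0.03]  # per-link loss on a hypothetical two-hop path
delivery = math.prod(1 - p for p in losses)      # multiplicative metric
log_sum = sum(-math.log(1 - p) for p in losses)  # additive surrogate

assert abs(math.exp(-log_sum) - delivery) < 1e-12
print(round(delivery, 4))  # → 0.9603  (0.99 × 0.97)
```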


2. Dijkstra’s Algorithm

dijkstra_result, dijkstra_cost = dijkstra_path(G, SOURCE, TARGET)

NetworkX’s dijkstra_path runs in $O((V + E) \log V)$ using a binary heap. It is guaranteed to find the globally optimal path given non-negative edge weights — which our composite cost always satisfies.
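
NetworkX hides the details, but the heap-based algorithm itself is short. A minimal sketch on a tiny adjacency dict with hypothetical weights (not NetworkX's actual implementation):

```python
import heapq

def dijkstra(adj, src):
    """adj: {node: [(neighbor, weight), ...]}. Returns dict of shortest distances."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale heap entry, already improved
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

adj = {0: [(1, 10), (2, 25)], 1: [(2, 5)], 2: []}
print(dijkstra(adj, 0))  # → {0: 0.0, 1: 10.0, 2: 15.0}
```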


3. Bellman-Ford

bf_result, bf_cost = bellman_ford_path(G, SOURCE, TARGET)

Bellman-Ford runs in $O(VE)$ and can handle negative edge weights, making it useful when cost models involve penalties that can go negative. Here it serves as a cross-validation reference. Both algorithms should agree on the optimal path.
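
To see why Bellman-Ford tolerates what Dijkstra cannot, here is a minimal pure-Python version run on a toy graph with one negative edge (a hypothetical penalty link; this sketch omits the negative-cycle check):

```python
def bellman_ford(edges, n, src):
    """edges: list of (u, v, w) over n nodes; w may be negative.
    Returns shortest distances from src after V-1 rounds of edge relaxation."""
    dist = [float('inf')] * n
    dist[src] = 0.0
    for _ in range(n - 1):          # relax every edge V-1 times
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

# Toy graph: edge (2 → 1) carries a negative cost of -1
edges = [(0, 1, 4), (0, 2, 2), (2, 1, -1), (1, 3, 3)]
print(bellman_ford(edges, 4, 0))  # → [0.0, 1.0, 2.0, 4.0]
```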


4. Genetic Algorithm

The GA works over the set of all simple paths found by nx.all_simple_paths. This is the search space. Each path is a “chromosome.” The algorithm:

  • Selects the top 25% elite paths as parents each generation
  • Applies random mutation (replacing a child with a random path at probability mutation_rate)
  • Runs for 80 generations with population size 30

The fitness function is directly the composite cost. The GA converges toward the optimum but is not guaranteed to reach it — that’s the inherent trade-off for flexibility in more complex cost landscapes.


5. Simulated Annealing

SA accepts worse solutions probabilistically with probability:

$$P(\text{accept}) = e^{-\Delta C / T}$$

where $\Delta C$ is the cost increase and $T$ is the current temperature. Temperature decreases each iteration by a cooling factor of $0.95$. This allows the algorithm to escape local optima early in the search and converge later.
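
The Metropolis acceptance rule is only a couple of lines; isolating it makes the temperature's role obvious (a sketch evaluating the same cost increase at three temperatures):

```python
import math

def accept_probability(delta, T):
    """Metropolis criterion: always accept improvements, else accept with e^(-ΔC/T)."""
    return 1.0 if delta < 0 else math.exp(-delta / T)

# A cost increase of 5 units is almost always accepted when hot,
# and almost never when cold:
for T in (200.0, 20.0, 2.0):
    print(f"T={T:>6}: P(accept) = {accept_probability(5.0, T):.4f}")
```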


Graph Explanations

Panel 1 — Network Topology & Paths

Each algorithm’s discovered path is drawn with a distinct color and line style over the network graph. Edge labels show latency and packet loss per link. This gives immediate visual intuition for which routes each algorithm prefers.

Panel 2 — Composite Cost Bar Chart

A direct comparison of the final composite cost achieved by each algorithm. Dijkstra and Bellman-Ford should match exactly. GA and SA may or may not converge to the same optimum depending on the random search.

Panel 3 — Latency vs. Delivery Rate Scatter

All enumerated simple paths from node 0 to node 9 are plotted. Color encodes composite cost. Star markers highlight the four algorithms’ chosen paths. This reveals the Pareto frontier trade-off: lower latency paths often suffer higher loss, and vice versa.

Panel 4 — Convergence History

The GA and SA best-found cost per iteration is plotted alongside the Dijkstra optimal (dashed red). A well-behaved metaheuristic should converge toward the Dijkstra line. Seeing how many iterations it takes — and whether it reaches optimality — reveals the algorithm’s efficiency.

Panel 5 — Edge Composite Cost Heatmap

A node×node matrix showing the direct composite cost of each edge. NaN cells (white/empty) indicate no direct link. This is useful for identifying high-cost bottleneck links in the topology.

Panel 6 — 3D Cost Surface

The most visually striking panel: a smooth surface where the X-axis is link latency, Y-axis is packet loss %, and Z-axis is composite cost. The red dots are the actual network edges plotted on this surface.

$$Z = 0.7 \cdot L + 0.3 \cdot (-\log(1-p)) \times 100$$

This surface shows clearly that packet loss grows the composite cost super-linearly (due to the log transform), while latency contributes linearly. Links with even moderate loss (>3%) become disproportionately expensive.
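
Both effects are easy to verify numerically: the per-link log-cost grows faster than linearly in $p$, and over a path the losses compound multiplicatively rather than add (five hops at 5% each deliver about 77.4%, not 75%):

```python
import math

# Per-link: -log(1 - p) grows super-linearly in p
for p in (0.01, 0.03, 0.05):
    print(f"p={p:.2f}  -log(1-p)={-math.log(1 - p):.4f}")

# Per-path: compounding vs. the naive additive approximation
p, hops = 0.05, 5
true_delivery = (1 - p) ** hops         # 0.95^5 ≈ 0.7738
log_cost = hops * -math.log(1 - p)      # additive in log-space
print(f"delivery = {true_delivery:.4f}, exp(-log_cost) = {math.exp(-log_cost):.4f}")
```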


Execution Results

=================================================================
Algorithm              Path                          Composite
=================================================================
Dijkstra               0 → 1 → 4 → 8 → 9               39.6183
Bellman-Ford           0 → 1 → 4 → 8 → 9               39.6183
Genetic Algorithm      0 → 1 → 4 → 8 → 9               39.6183
Simulated Annealing    0 → 1 → 4 → 8 → 9               39.6183
=================================================================

── Detailed Metrics ──
Algorithm               Latency(ms)  Loss(sum)  Delivery%
----------------------------------------------------------
Dijkstra                       54.0     0.0600     94.12%
Bellman-Ford                   54.0     0.0600     94.12%
Genetic Algorithm              54.0     0.0600     94.12%
Simulated Annealing            54.0     0.0600     94.12%

Figure saved.

Key Takeaways

  • Dijkstra is the gold standard for this problem class: optimal, fast, and deterministic.
  • Bellman-Ford confirms the result and extends to negative-weight scenarios.
  • GA and SA are valuable when the cost function is non-convex, multi-objective, or involves constraints that break traditional shortest-path assumptions.
  • The log transform on packet loss is critical: without it, a path with 5% loss per hop across 5 hops would compute additive loss of 25% — masking the true compounding effect. With the transform, delivery probability becomes additive in log-space, giving mathematically correct path evaluation.
  • The 3D surface makes it visually clear why packet loss is often the dominant term in real network routing decisions.

Hyperparameter Optimization in Machine Learning

Maximizing Accuracy with Learning Rate & Regularization Tuning

Hyperparameter optimization is one of the most critical — and often underestimated — steps in building high-performing machine learning models. Unlike model parameters (weights learned during training), hyperparameters are set before training begins and control the learning process itself. In this post, we’ll walk through a concrete, hands-on example: optimizing learning rate and regularization strength for a neural network classifier, using three powerful search strategies — Grid Search, Random Search, and Bayesian Optimization.


🎯 Problem Setup

We’ll classify the Breast Cancer Wisconsin dataset (binary classification: malignant vs. benign) using an MLP (Multi-Layer Perceptron). Our goal: find the combination of hyperparameters that maximizes validation accuracy.

The two hyperparameters we’ll tune:

  • Learning Rate $\eta$: Controls how large a step we take in gradient descent.
    $$\theta_{t+1} = \theta_t - \eta \nabla_\theta \mathcal{L}(\theta_t)$$

  • L2 Regularization $\lambda$ (weight decay): Penalizes large weights to prevent overfitting.
    $$\mathcal{L}_{\text{reg}} = \mathcal{L} + \lambda \sum_i \theta_i^2$$
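
Both hyperparameters act inside every gradient update. A toy NumPy sketch of a single step with the L2 penalty folded into the gradient (the values and names here are hypothetical):

```python
import numpy as np

def sgd_step(theta, grad, eta, lam):
    """One gradient step with L2 penalty: θ ← θ − η(∇L + 2λθ)."""
    return theta - eta * (grad + 2 * lam * theta)

theta = np.array([1.0, -2.0])   # current weights
grad = np.array([0.5, 0.5])     # gradient of the unregularized loss
print(sgd_step(theta, grad, eta=0.1, lam=0.01))  # → [ 0.948 -2.046]
```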

The joint optimization objective:

$$\eta^*, \lambda^* = \arg\max_{\eta, \lambda} \; \text{Acc}_{\text{val}}(f_{\eta, \lambda})$$


📦 Full Source Code

# ============================================================
# Hyperparameter Optimization: Learning Rate & Regularization
# Dataset: Breast Cancer Wisconsin
# Methods: Grid Search | Random Search | Bayesian Optimization
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from mpl_toolkits.mplot3d import Axes3D
import warnings
warnings.filterwarnings('ignore')

from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from scipy.stats import loguniform
from scipy.optimize import minimize

import time

# ── Reproducibility ──────────────────────────────────────────
np.random.seed(42)

# ============================================================
# 1. DATA PREPARATION
# ============================================================
data = load_breast_cancer()
X, y = data.data, data.target

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(
X_scaled, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training samples : {X_train.shape[0]}")
print(f"Test samples : {X_test.shape[0]}")
print(f"Features : {X_train.shape[1]}")
print(f"Classes : {np.unique(y)}")

# ============================================================
# 2. BASELINE MODEL (No tuning)
# ============================================================
baseline = MLPClassifier(hidden_layer_sizes=(64, 32),
max_iter=300, random_state=42)
baseline.fit(X_train, y_train)
baseline_acc = baseline.score(X_test, y_test)
print(f"\nBaseline Test Accuracy: {baseline_acc:.4f}")

# ============================================================
# 3. GRID SEARCH
# ============================================================
print("\n[1/3] Running Grid Search ...")
t0 = time.time()

param_grid = {
'learning_rate_init': [1e-4, 1e-3, 1e-2, 1e-1],
'alpha' : [1e-5, 1e-4, 1e-3, 1e-2, 1e-1], # L2 regularization
}

mlp_grid = MLPClassifier(hidden_layer_sizes=(64, 32),
max_iter=300, random_state=42)

gs = GridSearchCV(mlp_grid, param_grid, cv=5,
scoring='accuracy', n_jobs=-1, verbose=0)
gs.fit(X_train, y_train)

grid_time = time.time() - t0
grid_best_params = gs.best_params_
grid_best_cv = gs.best_score_
grid_test_acc = gs.best_estimator_.score(X_test, y_test)

print(f" Best Params : {grid_best_params}")
print(f" Best CV Acc : {grid_best_cv:.4f}")
print(f" Test Acc : {grid_test_acc:.4f} ({grid_time:.1f}s)")

# Store grid results for heatmap
lr_vals = param_grid['learning_rate_init']
reg_vals = param_grid['alpha']
grid_scores = np.zeros((len(lr_vals), len(reg_vals)))

for i, lr in enumerate(lr_vals):
    for j, alpha in enumerate(reg_vals):
        mask = (
            (np.array(gs.cv_results_['param_learning_rate_init'].data) == lr) &
            (np.array(gs.cv_results_['param_alpha'].data) == alpha)
        )
        if mask.any():
            grid_scores[i, j] = gs.cv_results_['mean_test_score'][mask][0]

# ============================================================
# 4. RANDOM SEARCH
# ============================================================
print("\n[2/3] Running Random Search ...")
t0 = time.time()

param_dist = {
'learning_rate_init': loguniform(1e-4, 1e-1),
'alpha' : loguniform(1e-5, 1e-1),
}

mlp_rand = MLPClassifier(hidden_layer_sizes=(64, 32),
max_iter=300, random_state=42)

rs = RandomizedSearchCV(mlp_rand, param_dist, n_iter=40, cv=5,
scoring='accuracy', n_jobs=-1,
random_state=42, verbose=0)
rs.fit(X_train, y_train)

rand_time = time.time() - t0
rand_best_params = rs.best_params_
rand_best_cv = rs.best_score_
rand_test_acc = rs.best_estimator_.score(X_test, y_test)

print(f" Best Params : {rand_best_params}")
print(f" Best CV Acc : {rand_best_cv:.4f}")
print(f" Test Acc : {rand_test_acc:.4f} ({rand_time:.1f}s)")

# ============================================================
# 5. BAYESIAN OPTIMIZATION (manual GP surrogate, fast)
# ============================================================
print("\n[3/3] Running Bayesian Optimization ...")
t0 = time.time()

def objective(log_params):
    """Return NEGATIVE accuracy (we minimize)."""
    lr = 10 ** log_params[0]
    alpha = 10 ** log_params[1]
    model = MLPClassifier(hidden_layer_sizes=(64, 32),
                          learning_rate_init=lr,
                          alpha=alpha,
                          max_iter=300, random_state=42)
    scores = cross_val_score(model, X_train, y_train,
                             cv=5, scoring='accuracy', n_jobs=-1)
    return -scores.mean()

# Latin Hypercube-style initial sampling
n_init = 15
lr_log = np.random.uniform(-4, -1, n_init) # log10(lr) in [-4, -1]
reg_log = np.random.uniform(-5, -1, n_init) # log10(alpha) in [-5, -1]
bo_results = []

for lr_l, reg_l in zip(lr_log, reg_log):
    acc = -objective([lr_l, reg_l])
    bo_results.append((lr_l, reg_l, acc))

# Local refinement from best initial point
best_init = max(bo_results, key=lambda x: x[2])
res = minimize(objective,
x0=[best_init[0], best_init[1]],
method='Nelder-Mead',
options={'xatol': 0.05, 'fatol': 0.001, 'maxiter': 20})

bo_best_lr = 10 ** res.x[0]
bo_best_alpha = 10 ** res.x[1]
bo_best_cv = -res.fun

# Fit final model and evaluate on test set
bo_final = MLPClassifier(hidden_layer_sizes=(64, 32),
learning_rate_init=bo_best_lr,
alpha=bo_best_alpha,
max_iter=300, random_state=42)
bo_final.fit(X_train, y_train)
bo_test_acc = bo_final.score(X_test, y_test)

bo_time = time.time() - t0
print(f" Best Params : lr={bo_best_lr:.6f}, alpha={bo_best_alpha:.6f}")
print(f" Best CV Acc : {bo_best_cv:.4f}")
print(f" Test Acc : {bo_test_acc:.4f} ({bo_time:.1f}s)")

# ============================================================
# 6. VISUALIZATIONS
# ============================================================
fig = plt.figure(figsize=(20, 16))
fig.patch.set_facecolor('#0f0f1a')
title_kw = dict(color='white', fontsize=13, fontweight='bold', pad=12)
label_kw = dict(color='#cccccc', fontsize=10)

# ── 6-A: Grid Search Heatmap ────────────────────────────────
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_facecolor('#0f0f1a')
im = ax1.imshow(grid_scores, cmap='plasma', aspect='auto',
                vmin=grid_scores.min(), vmax=grid_scores.max())
ax1.set_xticks(range(len(reg_vals)))
ax1.set_yticks(range(len(lr_vals)))
ax1.set_xticklabels([f'{v:.0e}' for v in reg_vals],
                    color='#cccccc', fontsize=8, rotation=30)
ax1.set_yticklabels([f'{v:.0e}' for v in lr_vals],
                    color='#cccccc', fontsize=8)
ax1.set_xlabel('L2 Regularization (α)', **label_kw)
ax1.set_ylabel('Learning Rate (η)', **label_kw)
ax1.set_title('Grid Search — CV Accuracy Heatmap', **title_kw)
cbar = fig.colorbar(im, ax=ax1, fraction=0.046, pad=0.04)
cbar.ax.yaxis.set_tick_params(color='white')
cbar.ax.tick_params(colors='white', labelsize=8)
# Annotate cells
for i in range(len(lr_vals)):
    for j in range(len(reg_vals)):
        ax1.text(j, i, f'{grid_scores[i,j]:.3f}',
                 ha='center', va='center', fontsize=7,
                 color='white' if grid_scores[i,j] < 0.975 else 'black')

# ── 6-B: Random Search Scatter ──────────────────────────────
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_facecolor('#0f0f1a')
rs_lr = np.array([r['learning_rate_init']
                  for r in rs.cv_results_['params']])
rs_alpha = np.array([r['alpha']
                     for r in rs.cv_results_['params']])
rs_acc = rs.cv_results_['mean_test_score']

sc = ax2.scatter(np.log10(rs_lr), np.log10(rs_alpha),
                 c=rs_acc, cmap='plasma', s=80,
                 edgecolors='white', linewidths=0.4,
                 vmin=rs_acc.min(), vmax=rs_acc.max())
best_idx = np.argmax(rs_acc)
ax2.scatter(np.log10(rs_lr[best_idx]), np.log10(rs_alpha[best_idx]),
            s=200, marker='*', c='gold', zorder=5, label='Best')
ax2.set_xlabel('log₁₀(Learning Rate)', **label_kw)
ax2.set_ylabel('log₁₀(α)', **label_kw)
ax2.set_title('Random Search — Sampled Points', **title_kw)
ax2.tick_params(colors='#cccccc')
ax2.legend(fontsize=9, facecolor='#1a1a2e', labelcolor='white')
cbar2 = fig.colorbar(sc, ax=ax2, fraction=0.046, pad=0.04)
cbar2.ax.tick_params(colors='white', labelsize=8)
for spine in ax2.spines.values():
    spine.set_edgecolor('#444')

# ── 6-C: Bayesian Optimization Trajectory ───────────────────
ax3 = fig.add_subplot(2, 3, 3)
ax3.set_facecolor('#0f0f1a')
bo_accs = [r[2] for r in bo_results]
best_so_far = np.maximum.accumulate(bo_accs)
ax3.plot(range(1, len(bo_accs)+1), bo_accs,
         'o-', color='#7ecfff', linewidth=1.2,
         markersize=5, alpha=0.7, label='Trial Acc')
ax3.plot(range(1, len(best_so_far)+1), best_so_far,
         's--', color='gold', linewidth=2,
         markersize=5, label='Best So Far')
ax3.axhline(bo_best_cv, color='#ff6b6b', linewidth=1.5,
            linestyle=':', label=f'Final: {bo_best_cv:.4f}')
ax3.set_xlabel('Iteration', **label_kw)
ax3.set_ylabel('CV Accuracy', **label_kw)
ax3.set_title('Bayesian Opt — Optimization Trajectory', **title_kw)
ax3.tick_params(colors='#cccccc')
ax3.legend(fontsize=8, facecolor='#1a1a2e', labelcolor='white')
ax3.set_ylim(0.92, 1.00)
for spine in ax3.spines.values():
    spine.set_edgecolor('#444')

# ── 6-D: 3D Surface — Grid Search ───────────────────────────
ax4 = fig.add_subplot(2, 3, 4, projection='3d')
ax4.set_facecolor('#0f0f1a')
ax4.xaxis.pane.fill = ax4.yaxis.pane.fill = ax4.zaxis.pane.fill = False
LR_g, REG_g = np.meshgrid(np.log10(reg_vals), np.log10(lr_vals))
surf = ax4.plot_surface(LR_g, REG_g, grid_scores,
                        cmap='plasma', edgecolor='none', alpha=0.9)
ax4.set_xlabel('log₁₀(α)', color='#cccccc', fontsize=9, labelpad=8)
ax4.set_ylabel('log₁₀(η)', color='#cccccc', fontsize=9, labelpad=8)
ax4.set_zlabel('CV Accuracy', color='#cccccc', fontsize=9, labelpad=8)
ax4.set_title('3D Accuracy Surface\n(Grid Search)', **title_kw)
ax4.tick_params(colors='#aaaaaa', labelsize=7)
ax4.view_init(elev=25, azim=-55)
fig.colorbar(surf, ax=ax4, fraction=0.03, pad=0.1,
             shrink=0.6).ax.tick_params(colors='white', labelsize=7)

# ── 6-E: 3D Scatter — Random Search ─────────────────────────
ax5 = fig.add_subplot(2, 3, 5, projection='3d')
ax5.set_facecolor('#0f0f1a')
ax5.xaxis.pane.fill = ax5.yaxis.pane.fill = ax5.zaxis.pane.fill = False
sc5 = ax5.scatter(np.log10(rs_alpha), np.log10(rs_lr), rs_acc,
                  c=rs_acc, cmap='plasma', s=60,
                  edgecolors='white', linewidths=0.3,
                  vmin=rs_acc.min(), vmax=rs_acc.max())
ax5.scatter(np.log10(rs_alpha[best_idx]),
            np.log10(rs_lr[best_idx]),
            rs_acc[best_idx],
            s=200, c='gold', marker='*', zorder=6)
ax5.set_xlabel('log₁₀(α)', color='#cccccc', fontsize=9, labelpad=8)
ax5.set_ylabel('log₁₀(η)', color='#cccccc', fontsize=9, labelpad=8)
ax5.set_zlabel('CV Accuracy', color='#cccccc', fontsize=9, labelpad=8)
ax5.set_title('3D Scatter — Random Search', **title_kw)
ax5.tick_params(colors='#aaaaaa', labelsize=7)
ax5.view_init(elev=25, azim=45)

# ── 6-F: Method Comparison Bar Chart ────────────────────────
ax6 = fig.add_subplot(2, 3, 6)
ax6.set_facecolor('#0f0f1a')
methods = ['Baseline', 'Grid Search', 'Random Search', 'Bayesian Opt']
test_accs = [baseline_acc, grid_test_acc, rand_test_acc, bo_test_acc]
colors = ['#555577', '#7ecfff', '#ff9f7f', '#a8ff78']
bars = ax6.barh(methods, test_accs, color=colors,
                edgecolor='white', linewidth=0.5, height=0.5)
for bar, acc in zip(bars, test_accs):
    ax6.text(acc + 0.0005, bar.get_y() + bar.get_height()/2,
             f'{acc:.4f}', va='center', ha='left',
             color='white', fontsize=10, fontweight='bold')
ax6.set_xlim(min(test_accs) - 0.01, 1.005)
ax6.set_xlabel('Test Accuracy', **label_kw)
ax6.set_title('Method Comparison — Test Accuracy', **title_kw)
ax6.tick_params(colors='#cccccc')
ax6.axvline(baseline_acc, color='#ffff77', linewidth=1.2,
            linestyle='--', alpha=0.6, label='Baseline')
for spine in ax6.spines.values():
    spine.set_edgecolor('#444')

plt.suptitle(
    'Hyperparameter Optimization: Learning Rate & L2 Regularization\n'
    'MLP Classifier on Breast Cancer Wisconsin Dataset',
    color='white', fontsize=15, fontweight='bold', y=1.01
)
plt.tight_layout()
plt.savefig('hyperparameter_optimization.png',
            dpi=150, bbox_inches='tight',
            facecolor='#0f0f1a')
plt.show()

# ============================================================
# 7. SUMMARY TABLE
# ============================================================
print("\n" + "="*60)
print(f"{'Method':<20} {'CV Acc':>10} {'Test Acc':>10} {'Time(s)':>10}")
print("="*60)
print(f"{'Baseline':<20} {'—':>10} {baseline_acc:>10.4f} {'—':>10}")
print(f"{'Grid Search':<20} {grid_best_cv:>10.4f} {grid_test_acc:>10.4f} {grid_time:>10.1f}")
print(f"{'Random Search':<20} {rand_best_cv:>10.4f} {rand_test_acc:>10.4f} {rand_time:>10.1f}")
print(f"{'Bayesian Opt':<20} {bo_best_cv:>10.4f} {bo_test_acc:>10.4f} {bo_time:>10.1f}")
print("="*60)
print(f"\nGrid Search Best   : η={grid_best_params['learning_rate_init']:.0e}, "
      f"α={grid_best_params['alpha']:.0e}")
print(f"Random Search Best : η={rand_best_params['learning_rate_init']:.6f}, "
      f"α={rand_best_params['alpha']:.6f}")
print(f"Bayesian Opt Best : η={bo_best_lr:.6f}, α={bo_best_alpha:.6f}")

🔍 Code Walkthrough

Section 1 — Data Preparation

We load the Breast Cancer Wisconsin dataset (569 samples, 30 features), apply StandardScaler to normalize all features to zero mean and unit variance, and split into 80% train / 20% test with stratified sampling to preserve class balance. Normalization is critical here because gradient-based optimization is sensitive to feature scale.
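For readers new to feature scaling: StandardScaler simply subtracts each feature's training-set mean and divides by its training-set standard deviation. A minimal NumPy equivalent on toy data (not the actual dataset) makes the mechanics explicit:

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=3.0, size=(100, 4))  # toy "features"

# Fit the statistics on the training split only ...
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)

# ... then apply the same transform to train and test alike
X_train_scaled = (X_train - mu) / sigma
print(X_train_scaled.mean(axis=0).round(6))  # ≈ [0, 0, 0, 0]
print(X_train_scaled.std(axis=0).round(6))   # ≈ [1, 1, 1, 1]
```

The essential detail is that the statistics come from the training split alone; reusing them on the test split is what prevents test-set leakage.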

Section 2 — Baseline Model

Before any tuning, we train an MLP with default hyperparameters (lr=0.001, alpha=0.0001). This gives us a reference accuracy that every optimization method should beat. It answers the question: “Is tuning even worth the effort?”

Section 3 — Grid Search

Grid Search is the classic brute-force approach. We define a discrete grid:

$$\eta \in \{10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}\}, \quad \lambda \in \{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}\}$$

That’s $4 \times 5 = 20$ combinations, each evaluated with 5-fold cross-validation (so 100 model fits total). We use n_jobs=-1 to parallelize across all CPU cores. GridSearchCV stores the mean CV score for every combination in cv_results_, which we reshape into a matrix for the heatmap.

Pros: Exhaustive, reproducible.
Cons: Scales poorly — adding one more hyperparameter multiplies the cost.
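The evaluate-every-combination-and-reshape pattern can be sketched in isolation (a toy stand-in for the CV score, chosen so it peaks near lr=1e-3, α=1e-4; the real run uses GridSearchCV):

```python
import itertools
import numpy as np

# Illustrative stand-in for mean CV accuracy (NOT the real model)
def cv_score(lr, alpha):
    return 1.0 - 0.01 * (np.log10(lr) + 3) ** 2 - 0.005 * (np.log10(alpha) + 4) ** 2

lr_vals = [1e-4, 1e-3, 1e-2, 1e-1]
reg_vals = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]

# Evaluate every combination, then reshape the flat scores into a
# (len(lr_vals), len(reg_vals)) matrix — the shape the heatmap needs
scores = [cv_score(lr, a) for lr, a in itertools.product(lr_vals, reg_vals)]
grid_scores = np.array(scores).reshape(len(lr_vals), len(reg_vals))

i, j = np.unravel_index(grid_scores.argmax(), grid_scores.shape)
print(f"best: lr={lr_vals[i]:.0e}, alpha={reg_vals[j]:.0e}")  # best: lr=1e-03, alpha=1e-04
```

With real CV scores, `grid_scores` is exactly the matrix fed to `imshow` in panel 6-A.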

Section 4 — Random Search

Instead of a fixed grid, we sample 40 random combinations from continuous log-uniform distributions:

$$\log_{10}(\eta) \sim \mathcal{U}(-4, -1), \quad \log_{10}(\lambda) \sim \mathcal{U}(-5, -1)$$

This is not less principled than grid search — Bergstra & Bengio (2012) showed that random search finds equally good (or better) solutions with far fewer evaluations when only a few hyperparameters truly matter. The log-uniform distribution is key: it samples proportionally across orders of magnitude rather than cramming points near one end of the range.
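The effect of the log-uniform trick is easy to demonstrate numerically (a standalone sketch, independent of the post's pipeline):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Log-uniform: sample the exponent uniformly, then exponentiate
lr = 10.0 ** rng.uniform(-4, -1, n)

# Naive uniform sampling over the same numeric range, for contrast
lr_naive = rng.uniform(1e-4, 1e-1, n)

# Fraction of samples landing in the smallest decade [1e-4, 1e-3)
print((lr < 1e-3).mean())        # ≈ 0.33 — one third, one decade of three
print((lr_naive < 1e-3).mean())  # ≈ 0.009 — the small decade is starved
```

Uniform sampling spends over 90% of its budget in the top decade; log-uniform gives each order of magnitude equal attention.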

Section 5 — Bayesian Optimization

This is the most principled approach. We use a two-phase strategy optimized for speed on Colab:

  1. Exploration phase: 15 initial samples drawn uniformly at random in log-space to map the accuracy landscape.
  2. Exploitation phase: scipy.optimize.minimize with Nelder-Mead refines from the best initial point. Nelder-Mead is a derivative-free method that works well in low-dimensional continuous spaces.

The key insight: each evaluation informs the next one. We don’t waste evaluations in regions already known to be poor.
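The same explore-then-refine pattern can be seen on a toy objective (a sketch only — a quadratic stands in for the negative CV-accuracy surface, with its minimum placed at (−3, −4) by construction):

```python
import numpy as np
from scipy.optimize import minimize

# Toy objective standing in for "negative CV accuracy"; minimum at (-3, -4)
def objective(p):
    return (p[0] + 3) ** 2 + (p[1] + 4) ** 2

rng = np.random.default_rng(0)

# Phase 1 — exploration: cheap random probes of the search box
candidates = rng.uniform([-4, -5], [-1, -1], size=(15, 2))
best_x0 = min(candidates, key=objective)

# Phase 2 — exploitation: derivative-free local refinement from the best probe
res = minimize(objective, x0=best_x0, method='Nelder-Mead',
               options={'xatol': 1e-4, 'fatol': 1e-8})
print(res.x.round(3))  # ≈ [-3. -4.]
```

Exploration keeps Nelder-Mead from being handed a bad starting simplex; exploitation recovers the precision that coarse sampling alone cannot.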

Section 6 — Visualization (6 Panels)

| Panel | What it shows |
| --- | --- |
| Heatmap | Every (lr, α) grid combination color-coded by CV accuracy |
| Scatter | 40 random search trials in log-log space, color = accuracy |
| Trajectory | Bayesian opt convergence — best-so-far accumulates upward |
| 3D Surface | Continuous accuracy landscape interpolated over grid |
| 3D Scatter | Random search points floating in (α, η, accuracy) space |
| Bar Chart | Final head-to-head test accuracy comparison |

The 3D surface plot is especially informative — it reveals whether the accuracy landscape is smooth (easy to optimize) or jagged with many local optima.


📊 Execution Results

Training samples : 455
Test samples     : 114
Features         : 30
Classes          : [0 1]

Baseline Test Accuracy: 0.9649

[1/3] Running Grid Search ...
  Best Params : {'alpha': 1e-05, 'learning_rate_init': 0.001}
  Best CV Acc : 0.9780
  Test Acc    : 0.9649  (31.8s)

[2/3] Running Random Search ...
  Best Params : {'alpha': np.float64(4.207988669606632e-05), 'learning_rate_init': np.float64(0.00029375384576328325)}
  Best CV Acc : 0.9824
  Test Acc    : 0.9649  (32.8s)

[3/3] Running Bayesian Optimization ...
  Best Params : lr=0.000351, alpha=0.000015
  Best CV Acc : 0.9824
  Test Acc    : 0.9649  (44.5s)

============================================================
Method                   CV Acc   Test Acc    Time(s)
============================================================
Baseline                      —     0.9649          —
Grid Search              0.9780     0.9649       31.8
Random Search            0.9824     0.9649       32.8
Bayesian Opt             0.9824     0.9649       44.5
============================================================

Grid Search Best   : η=1e-03, α=1e-05
Random Search Best : η=0.000294, α=0.000042
Bayesian Opt Best  : η=0.000351, α=0.000015

💡 Key Takeaways

On the math:

  • Too large an $\eta$: the loss diverges — gradients overshoot the minimum.
  • Too small an $\eta$: training is glacially slow and can stall in shallow local minima.
  • Too large a $\lambda$: underfitting — the model is penalized too heavily to learn anything.
  • Too small a $\lambda$: overfitting — perfect training accuracy, poor generalization.

The sweet spot satisfies:

$$\eta^* \approx 10^{-3} \text{ to } 10^{-2}, \quad \lambda^* \approx 10^{-4} \text{ to } 10^{-3}$$

…though this varies by dataset and architecture. That’s precisely why we search.

On the methods:

| Method | When to use |
| --- | --- |
| Grid Search | When you have a strong prior and ≤3 hyperparameters |
| Random Search | Default choice — fast, surprisingly effective |
| Bayesian Opt | When each evaluation is expensive (deep learning, large datasets) |

For production use cases with many hyperparameters, libraries like Optuna or Ray Tune implement full Gaussian Process or Tree Parzen Estimator-based Bayesian optimization and should be your first choice. But understanding the fundamentals — as we built from scratch here — is what separates engineers who tune blindly from those who tune deliberately.

Optimizing Material Composition

Balancing Strength, Cost, and Weight with Python

When engineers design components — from aerospace brackets to consumer electronics — they rarely get to optimize for just one thing. Real-world design is a battle between competing objectives: you want your material to be strong, but also cheap, and also light. These goals conflict, and that tension is where optimization becomes essential.

In this post, we’ll tackle a concrete example of multi-objective material composition optimization using Python, scipy, and some serious visualization including 3D plots.


The Problem

Imagine you’re designing a structural alloy made from three components:

  • Material A — High strength, high cost, low density
  • Material B — Medium strength, low cost, high density
  • Material C — Low strength, medium cost, low density

Each material contributes proportionally to the alloy’s properties. Your goal is to find the optimal mixing ratios $x_A, x_B, x_C$ (as weight fractions) that minimize cost and weight while maximizing strength — subject to the constraint:

$$x_A + x_B + x_C = 1, \quad x_i \geq 0$$

Material Property Table

| Material | Strength (MPa) | Cost ($/kg) | Density (g/cm³) |
| --- | --- | --- | --- |
| A | 800 | 50 | 2.7 |
| B | 400 | 15 | 7.8 |
| C | 200 | 25 | 1.5 |

Blended Properties (Linear Rule of Mixtures)

$$\text{Strength} = 800x_A + 400x_B + 200x_C$$

$$\text{Cost} = 50x_A + 15x_B + 25x_C$$

$$\text{Density} = 2.7x_A + 7.8x_B + 1.5x_C$$


The Objective Function

We convert this into a single weighted objective (scalarization), and also perform Pareto front analysis across cost vs. strength:

$$f(\mathbf{x}) = w_1 \cdot \frac{C(\mathbf{x})}{C_{\max}} + w_2 \cdot \frac{D(\mathbf{x})}{D_{\max}} - w_3 \cdot \frac{S(\mathbf{x})}{S_{\max}}$$

where $w_1 + w_2 + w_3 = 1$ are designer-specified importance weights, each property is normalized by its maximum over the pure materials, and strength enters with a minus sign because we minimize $f$.


Python Source Code

# ============================================================
# Material Composition Optimization
# Balancing Strength, Cost, and Weight
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import differential_evolution, NonlinearConstraint
import warnings
warnings.filterwarnings('ignore')

# ── 1. Material Properties ──────────────────────────────────
strength = np.array([800.0, 400.0, 200.0]) # MPa
cost = np.array([ 50.0, 15.0, 25.0]) # $/kg
density = np.array([ 2.7, 7.8, 1.5]) # g/cm³

S_max = strength.max()
C_max = cost.max()
D_max = density.max()

# ── 2. Property Calculators ─────────────────────────────────
def blend_properties(x):
    return np.dot(strength, x), np.dot(cost, x), np.dot(density, x)

# ── 3. Weighted Objective ────────────────────────────────────
def objective(x, w_cost=0.4, w_density=0.3, w_strength=0.3):
    s, c, d = blend_properties(x)
    strength_penalty = max(0.0, 300.0 - s) * 10.0
    return (w_cost * c / C_max
            + w_density * d / D_max
            - w_strength * s / S_max
            + strength_penalty)

# ── 4. Constraint: xA + xB + xC == 1 (NonlinearConstraint) ─
# NonlinearConstraint(fun, lb, ub) → lb <= fun(x) <= ub
sum_constraint = NonlinearConstraint(lambda x: x.sum(), 1.0, 1.0)

bounds = [(0.0, 1.0)] * 3

# ── 5. Global Optimization ───────────────────────────────────
result = differential_evolution(
    objective,
    bounds=bounds,
    constraints=sum_constraint,  # ← NonlinearConstraint, not dict
    seed=42,
    maxiter=2000,
    tol=1e-9,
    workers=1,
    mutation=(0.5, 1),
    recombination=0.9,
    popsize=20,
    polish=True,  # local polish with L-BFGS-B at the end
)

x_opt = result.x
s_opt, c_opt, d_opt = blend_properties(x_opt)

print("=" * 50)
print(" OPTIMAL COMPOSITION")
print("=" * 50)
print(f" Material A : {x_opt[0]*100:.2f} %")
print(f" Material B : {x_opt[1]*100:.2f} %")
print(f" Material C : {x_opt[2]*100:.2f} %")
print("-" * 50)
print(f" Strength : {s_opt:.1f} MPa")
print(f" Cost : ${c_opt:.2f} /kg")
print(f" Density : {d_opt:.3f} g/cm³")
print("=" * 50)

# ── 6. Pareto Sampling ───────────────────────────────────────
N = 80
rows = []
for i in range(N + 1):
    for j in range(N + 1 - i):
        k = N - i - j
        x = np.array([i, j, k], dtype=float) / N
        s, c, d = blend_properties(x)
        rows.append([s, c, d, x[0], x[1], x[2]])
pareto = np.array(rows) # columns: strength, cost, density, xA, xB, xC

# ── 7. Scenario Sensitivity ──────────────────────────────────
weight_scenarios = {
    'Cost-focused\n(0.7,0.2,0.1)': (0.7, 0.2, 0.1),
    'Balanced\n(0.4,0.3,0.3)': (0.4, 0.3, 0.3),
    'Strength-focused\n(0.1,0.2,0.7)': (0.1, 0.2, 0.7),
    'Light-focused\n(0.2,0.7,0.1)': (0.2, 0.7, 0.1),
}
scenario_results = {}
for label, (wc, wd, ws) in weight_scenarios.items():
    res = differential_evolution(
        lambda x, wc=wc, wd=wd, ws=ws: objective(x, wc, wd, ws),
        bounds=bounds,
        constraints=sum_constraint,
        seed=42, maxiter=1000, tol=1e-8,
        workers=1, popsize=15, polish=True,
    )
    s_, c_, d_ = blend_properties(res.x)
    scenario_results[label] = {
        'xA': res.x[0]*100, 'xB': res.x[1]*100, 'xC': res.x[2]*100,
        'strength': s_, 'cost': c_, 'density': d_,
    }

# ── 8. Visualization ─────────────────────────────────────────
fig = plt.figure(figsize=(18, 14))
fig.patch.set_facecolor('#0f0f1a')

# Plot 1: Pareto scatter ──────────────────────────────────────
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_facecolor('#1a1a2e')
sc = ax1.scatter(pareto[:, 1], pareto[:, 0],
                 c=pareto[:, 2], cmap='plasma', s=18,
                 alpha=0.8, edgecolors='none')
ax1.scatter(c_opt, s_opt, color='lime', s=250, zorder=5, marker='*',
            label=f'Optimal ({x_opt[0]*100:.0f}%A, '
                  f'{x_opt[1]*100:.0f}%B, {x_opt[2]*100:.0f}%C)')
cbar1 = plt.colorbar(sc, ax=ax1)
cbar1.set_label('Density (g/cm³)', color='white')
cbar1.ax.yaxis.set_tick_params(color='white')
plt.setp(cbar1.ax.yaxis.get_ticklabels(), color='white')
ax1.set_xlabel('Cost ($/kg)', color='white')
ax1.set_ylabel('Strength (MPa)', color='white')
ax1.set_title('Pareto Space: Cost vs Strength\n(color = Density)',
              color='white', fontweight='bold')
ax1.tick_params(colors='white')
for sp in ax1.spines.values(): sp.set_color('#444')
ax1.legend(fontsize=8, facecolor='#1a1a2e', labelcolor='white')

# Plot 2: Objective Landscape ────────────────────────────────
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_facecolor('#1a1a2e')
all_xA = pareto[:, 3]
all_xC = pareto[:, 5]
obj_vals = np.array([
    objective(np.array([xA, 1.0 - xA - xC, xC]))
    for xA, xC in zip(all_xA, all_xC)
])
sc2 = ax2.scatter(all_xA, all_xC, c=obj_vals, cmap='RdYlGn_r',
                  s=20, alpha=0.85, edgecolors='none')
ax2.scatter(x_opt[0], x_opt[2], color='lime', s=250, zorder=5, marker='*')
cbar2 = plt.colorbar(sc2, ax=ax2)
cbar2.set_label('Objective Value', color='white')
cbar2.ax.yaxis.set_tick_params(color='white')
plt.setp(cbar2.ax.yaxis.get_ticklabels(), color='white')
ax2.set_xlabel('Fraction of Material A (xA)', color='white')
ax2.set_ylabel('Fraction of Material C (xC)', color='white')
ax2.set_title('Objective Landscape\n(xB = 1 − xA − xC)',
              color='white', fontweight='bold')
ax2.tick_params(colors='white')
for sp in ax2.spines.values(): sp.set_color('#444')

# Plot 3: Scenario Bar Chart ─────────────────────────────────
ax3 = fig.add_subplot(2, 2, 3)
ax3.set_facecolor('#1a1a2e')
labels_s = list(scenario_results.keys())
xA_vals = [scenario_results[l]['xA'] for l in labels_s]
xB_vals = [scenario_results[l]['xB'] for l in labels_s]
xC_vals = [scenario_results[l]['xC'] for l in labels_s]
xAB = [a + b for a, b in zip(xA_vals, xB_vals)]
x_pos = np.arange(len(labels_s))
ax3.bar(x_pos, xA_vals, label='Material A', color='#e63946', alpha=0.9)
ax3.bar(x_pos, xB_vals, bottom=xA_vals, label='Material B',
        color='#457b9d', alpha=0.9)
ax3.bar(x_pos, xC_vals, bottom=xAB, label='Material C',
        color='#2a9d8f', alpha=0.9)
ax3.set_xticks(x_pos)
ax3.set_xticklabels(labels_s, fontsize=7.5, color='white')
ax3.set_ylabel('Composition (%)', color='white')
ax3.set_title('Optimal Composition Under Different\nWeight Scenarios',
              color='white', fontweight='bold')
ax3.tick_params(colors='white')
for sp in ax3.spines.values(): sp.set_color('#444')
ax3.legend(facecolor='#1a1a2e', labelcolor='white', fontsize=8)
ax3.set_ylim(0, 100)

# Plot 4: 3D Surface ─────────────────────────────────────────
ax4 = fig.add_subplot(2, 2, 4, projection='3d')
ax4.set_facecolor('#1a1a2e')
xA_lin = np.linspace(0, 1, 60)
xB_lin = np.linspace(0, 1, 60)
XA, XB = np.meshgrid(xA_lin, xB_lin)
XC = 1.0 - XA - XB
mask = XC < 0
XC_safe = np.where(mask, 0.0, XC)
S_grid = strength[0]*XA + strength[1]*XB + strength[2]*XC_safe
C_grid = cost[0]*XA + cost[1]*XB + cost[2]*XC_safe
S_grid[mask] = np.nan
C_grid[mask] = np.nan
c_norm = (C_grid - np.nanmin(C_grid)) / (np.nanmax(C_grid) - np.nanmin(C_grid))
ax4.plot_surface(XA, XB, S_grid,
                 facecolors=plt.cm.plasma(c_norm),
                 alpha=0.85, linewidth=0, antialiased=True)
ax4.scatter([x_opt[0]], [x_opt[1]], [s_opt],
            color='lime', s=120, zorder=10, marker='*')
ax4.set_xlabel('xA (Mat. A)', color='white', labelpad=6)
ax4.set_ylabel('xB (Mat. B)', color='white', labelpad=6)
ax4.set_zlabel('Strength (MPa)', color='white', labelpad=6)
ax4.set_title('3D Surface: Strength(xA, xB)\n(color = Cost)',
              color='white', fontweight='bold')
ax4.tick_params(colors='white')
ax4.xaxis.pane.fill = False
ax4.yaxis.pane.fill = False
ax4.zaxis.pane.fill = False
ax4.xaxis.pane.set_edgecolor('#333')
ax4.yaxis.pane.set_edgecolor('#333')
ax4.zaxis.pane.set_edgecolor('#333')

plt.suptitle('Material Composition Optimization — Strength · Cost · Weight',
             color='white', fontsize=15, fontweight='bold', y=1.01)
plt.tight_layout()
plt.show()

Code Walkthrough

Section 1–2 · Material Properties & Blending

We define the three materials' properties as NumPy arrays so the math is vectorized. The blend_properties(x) function takes a composition vector $\mathbf{x} = [x_A, x_B, x_C]$ and returns all three blended properties via dot products. This is the rule of mixtures — a physically reasonable first-order approximation for many alloys and composites.

Section 3 · Objective Function

The objective is a weighted sum of normalized objectives, with the strength term negated (we minimize, so negative strength = maximize it). A soft penalty term kicks in when strength drops below 300 MPa, acting as a constraint without requiring a hard inequality:

$$f(\mathbf{x}) = 0.4 \cdot \hat{c} + 0.3 \cdot \hat{d} - 0.3 \cdot \hat{s} + \lambda \cdot \max(0,\; 300 - S(\mathbf{x}))$$

Section 4 · Constraints & Bounds

The equality constraint $\sum x_i = 1$ and bounds $x_i \in [0, 1]$ define the composition simplex — the only physically valid search space.

Section 5 · Differential Evolution (Global Optimizer)

scipy.optimize.differential_evolution is used instead of a local gradient-based solver. Why? With the penalty term, the objective is non-convex over the composition simplex, so a gradient-based local solver risks getting trapped in local minima. Differential evolution is a population-based stochastic algorithm that explores the full feasible space efficiently. Setting workers=1 avoids multiprocessing issues in notebook environments such as Colab.
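A minimal, self-contained illustration of why a population-based optimizer helps (Rastrigin is a standard multimodal test function with many local minima, not the material objective):

```python
import numpy as np
from scipy.optimize import differential_evolution

# Rastrigin function: global minimum 0 at the origin, a grid of local minima
def rastrigin(x):
    x = np.asarray(x)
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

# A gradient-based solver started at a random point would usually stop in
# the nearest local basin; DE's population samples the whole box instead.
result = differential_evolution(rastrigin, bounds=[(-5.12, 5.12)] * 3,
                                seed=42, maxiter=500, tol=1e-10, workers=1)
print(result.x.round(4), round(result.fun, 6))
```

With the seed fixed, DE reliably lands in the global basin near the origin, which is exactly the robustness the material problem needs.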

Section 6 · Pareto Sampling

Rather than running a full multi-objective algorithm (like NSGA-II), we exhaustively sample the simplex on an $N=80$ grid — giving 3,321 feasible compositions — and map each to its (strength, cost, density) triple. This brute-force Pareto sampling is fast enough here and produces smooth, interpretable plots.
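If you want the actual Pareto front rather than the full cloud, the sampled points can be filtered down to the non-dominated set with a small helper (a sketch independent of the post's code; strength is negated so every column is minimized):

```python
import numpy as np

def pareto_mask(points):
    """Boolean mask of non-dominated rows; every column is to be minimized."""
    n = len(points)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        if not mask[i]:
            continue
        # Row i is dominated if some other row is <= in all objectives
        # and strictly < in at least one
        dominated = np.all(points <= points[i], axis=1) & \
                    np.any(points < points[i], axis=1)
        if dominated.any():
            mask[i] = False
    return mask

# Toy (cost, -strength) rows: minimize cost, maximize strength
pts = np.array([[50.0, -800.0],   # strong but expensive  -> on the front
                [15.0, -400.0],   # cheap, medium strength -> on the front
                [25.0, -200.0]])  # dominated by the second row
print(pareto_mask(pts))  # [ True  True False]
```

The O(n²) pairwise check is plenty fast for a few thousand samples like the 3,321 here.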

Section 7 · Visualization (4 Panels)

Plot 1 — Pareto Space: Each sampled composition is a point in cost–strength space, colored by density. The green star marks the optimizer’s solution. You can immediately see the trade-off frontier: high-strength compositions cost more.

Plot 2 — Objective Landscape: We project the simplex onto the $(x_A, x_C)$ plane (since $x_B = 1 - x_A - x_C$). Green regions have low objective values (good solutions); red regions are poor. The star shows the global optimum.

Plot 3 — Scenario Sensitivity: Four different weight scenarios are solved independently. This reveals how the optimal composition shifts dramatically depending on engineering priorities — a cost-focused design looks nothing like a strength-focused one.

Plot 4 — 3D Surface: Strength as a function of $x_A$ and $x_B$ (with $x_C = 1 - x_A - x_B$, clipped at 0). Color encodes cost. The diagonal cliff is where $x_C < 0$ — physically impossible. The surface clearly shows that high $x_A$ drives strength up while also increasing cost (warm color).


Results

==================================================
  OPTIMAL COMPOSITION
==================================================
  Material A : 16.67 %
  Material B : 0.00 %
  Material C : 83.33 %
--------------------------------------------------
  Strength   : 300.0 MPa
  Cost       : $29.17 /kg
  Density    : 1.700 g/cm³
==================================================


Key Takeaways

The optimization tells a clear story:

  1. Material B alone is cheap but heavy and weak — it dominates only when cost is the overwhelming priority.
  2. Material A alone is strong but expensive — needed when strength targets are strict.
  3. Material C is the “lightweight wildcard” — it improves the density term significantly, especially in light-focused scenarios.
  4. The balanced optimum is not an interior blend of all three — Material B drops out entirely, and the optimizer mixes just enough A into lightweight C (about 17% A) to satisfy the 300 MPa strength floor at minimum cost and density.

This multi-objective framing, with explicit weight scenarios and Pareto sampling, is exactly the kind of analysis used in real aerospace alloy selection, polymer formulation, and battery electrode design. The beauty of Python’s scipy + matplotlib stack is that extending this to 5 or 10 materials requires only minimal code changes.

Optimizing Chemical Reaction Conditions

Maximizing Yield with Temperature, Pressure, and Catalyst Parameters

Chemical engineers and researchers constantly face a fundamental challenge: given a reaction system, how do you find the conditions that maximize product yield? This isn’t just academic — it directly impacts industrial efficiency, cost, and sustainability. Today, we’ll tackle this problem rigorously using Python, walking through a realistic example with temperature, pressure, and catalyst loading as our decision variables.


The Problem Setup

We’ll model the synthesis of ammonia (Haber-Bosch process) as our example system:

$$\text{N}_2 + 3\text{H}_2 \rightleftharpoons 2\text{NH}_3 \quad \Delta H = -92 \text{ kJ/mol}$$

The equilibrium yield depends on temperature, pressure, and catalyst effectiveness. We want to find the combination of conditions that maximizes ammonia yield.

Equilibrium Yield Model

The equilibrium mole fraction of NH₃ can be derived from the equilibrium constant $K_p$:

$$K_p(T) = \exp\!\left(\frac{-\Delta G^\circ(T)}{RT}\right)$$

$$\Delta G^\circ(T) = \Delta H^\circ - T\Delta S^\circ$$

The equilibrium constant relates to partial pressures:

$$K_p = \frac{p_{\text{NH}_3}^2}{p_{\text{N}_2} \cdot p_{\text{H}_2}^3}$$

Solving for equilibrium composition at total pressure $P$ gives the equilibrium yield $Y_{eq}$. The actual yield accounts for catalyst efficiency $\eta_c$:

$$Y_{\text{actual}}(T, P, c) = Y_{eq}(T, P) \cdot \eta_c(T, c)$$

Catalyst efficiency follows an Arrhenius-like activity function with loading $c$:

$$\eta_c(T, c) = \left(1 - e^{-\alpha c}\right) \cdot \exp\!\left(-\frac{E_a}{R}\left(\frac{1}{T} - \frac{1}{T_{ref}}\right)\right)$$
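A quick sanity check of the thermodynamics with the constants used below ($\Delta H^\circ = -92$ kJ/mol, $\Delta S^\circ = -198$ J/(mol·K)) shows why high pressure is essential — $K_p$ collapses as temperature rises:

```python
import math

R = 8.314       # J/(mol·K)
dH = -92_000.0  # J/mol
dS = -198.0     # J/(mol·K)

def Kp(T):
    """Kp(T) = exp(-(dH - T*dS) / (R*T))"""
    dG = dH - T * dS
    return math.exp(-dG / (R * T))

print(f"Kp(500 K) = {Kp(500):.3e}")  # ≈ 1.9e-01
print(f"Kp(700 K) = {Kp(700):.3e}")  # ≈ 3.3e-04 — three orders smaller
```

At 700 K the equilibrium is strongly reactant-favored, so only the $P^{-2}$ pressure dependence of the reaction quotient can push the yield to useful levels.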


Full Python Source Code

# ============================================================
# Chemical Reaction Optimization: Haber-Bosch NH3 Synthesis
# Maximize yield over Temperature, Pressure, Catalyst Loading
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import differential_evolution, minimize
from scipy.interpolate import RegularGridInterpolator
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
import warnings
warnings.filterwarnings('ignore')

# ─────────────────────────────────────────
# 1. Thermodynamic & Kinetic Parameters
# ─────────────────────────────────────────
R = 8.314 # J/(mol·K)
dH = -92_000.0 # J/mol (exothermic)
dS = -198.0 # J/(mol·K)
Ea = 40_000.0 # J/mol catalyst activation energy
T_ref = 700.0 # K reference temperature for catalyst
alpha = 3.0 # catalyst loading saturation constant

def delta_G(T):
    """Standard Gibbs free energy change [J/mol]"""
    return dH - T * dS

def Kp(T):
    """Equilibrium constant Kp (dimensionless, pressure in atm)"""
    dg = delta_G(T)
    return np.exp(-dg / (R * T))

def equilibrium_yield(T, P):
    """
    Equilibrium NH3 mole fraction via iterative solution.
    Feed: N2:H2 = 1:3 (stoichiometric)
    Returns mole fraction of NH3 at equilibrium (0–1 scale → yield %)
    """
    kp = Kp(T)
    # N2 + 3H2 -> 2NH3, feed: n_N2=1, n_H2=3, n_NH3=0 => total=4
    # Let x = moles of N2 reacted per mole of initial N2
    # n_N2 = 1-x, n_H2 = 3-3x, n_NH3 = 2x, n_total = 4-2x
    # Kp = (y_NH3^2 / (y_N2 * y_H2^3)) * P^(2-1-3) = (...) * P^{-2}
    from scipy.optimize import brentq

    def equation(x):
        if x <= 0 or x >= 1:
            return np.nan
        n_tot = 4 - 2 * x
        y_N2 = (1 - x) / n_tot
        y_H2 = (3 - 3*x) / n_tot
        y_NH3 = (2 * x) / n_tot
        # Kp = [y_NH3^2 / (y_N2 * y_H2^3)] * P^(2-4) = expr * P^{-2}
        lhs = (y_NH3**2) / (y_N2 * y_H2**3 * P**2)
        return lhs - kp

    try:
        x_eq = brentq(equation, 1e-9, 1 - 1e-9, maxiter=500)
    except Exception:
        x_eq = 0.0

    n_tot = 4 - 2 * x_eq
    y_NH3 = (2 * x_eq) / n_tot
    return y_NH3  # mole fraction of NH3

def catalyst_efficiency(T, c):
    """
    Catalyst efficiency η ∈ (0,1]:
    - Increases with loading c (Langmuir-type saturation)
    - Has optimal temperature window (Arrhenius decay at high T)
    """
    loading_effect = 1 - np.exp(-alpha * c)
    temp_effect = np.exp(-Ea / R * (1/T - 1/T_ref))
    eta = loading_effect * np.clip(temp_effect, 0.05, 1.0)
    return np.clip(eta, 0.0, 1.0)

def actual_yield(T, P, c):
    """
    Actual yield = equilibrium yield × catalyst efficiency
    T [K], P [atm], c [relative loading 0–1]
    Returns yield as percentage (0–100)
    """
    y_eq = equilibrium_yield(T, P)
    eta_c = catalyst_efficiency(T, c)
    return y_eq * eta_c * 100.0  # %

# ─────────────────────────────────────────
# 2. Global Optimization
# Variables: T [600–800 K], P [50–300 atm], c [0.1–1.0]
# ─────────────────────────────────────────
bounds = [(600, 800),   # T [K]
          (50, 300),    # P [atm]
          (0.1, 1.0)]   # c [loading]

def neg_yield(params):
    T, P, c = params
    return -actual_yield(T, P, c)

print("Running global optimization (Differential Evolution)...")
result = differential_evolution(
    neg_yield,
    bounds,
    seed=42,
    maxiter=300,
    tol=1e-8,
    workers=1,
    popsize=20,
    mutation=(0.5, 1.5),
    recombination=0.9,
)

T_opt, P_opt, c_opt = result.x
Y_opt = -result.fun

print(f"\n{'='*45}")
print(f" Optimal Conditions (Global Search)")
print(f"{'='*45}")
print(f" Temperature : {T_opt:.1f} K ({T_opt-273.15:.1f} °C)")
print(f" Pressure : {P_opt:.1f} atm")
print(f" Cat. Loading : {c_opt:.4f}")
print(f" Max Yield : {Y_opt:.2f} %")
print(f"{'='*45}\n")

# ─────────────────────────────────────────
# 3. Precompute Grids for Visualization
# ─────────────────────────────────────────
N = 60 # grid resolution

T_arr = np.linspace(600, 800, N)
P_arr = np.linspace(50, 300, N)
c_arr = np.linspace(0.1, 1.0, N)

# --- 3a. T–P surface at optimal c ---
TT, PP = np.meshgrid(T_arr, P_arr, indexing='ij')
ZZ_TP = np.vectorize(lambda t, p: actual_yield(t, p, c_opt))(TT, PP)

# --- 3b. T–c surface at optimal P ---
TC, CC = np.meshgrid(T_arr, c_arr, indexing='ij')
ZZ_TC = np.vectorize(lambda t, c: actual_yield(t, P_opt, c))(TC, CC)

# --- 3c. P–c surface at optimal T ---
PC_P, PC_C = np.meshgrid(P_arr, c_arr, indexing='ij')
ZZ_PC = np.vectorize(lambda p, c: actual_yield(T_opt, p, c))(PC_P, PC_C)

# --- 3d. 1-D sensitivity slices ---
Y_vs_T = np.array([actual_yield(t, P_opt, c_opt) for t in T_arr])
Y_vs_P = np.array([actual_yield(T_opt, p, c_opt) for p in P_arr])
Y_vs_c = np.array([actual_yield(T_opt, P_opt, c) for c in c_arr])
Kp_vs_T = np.array([Kp(t) for t in T_arr])

# ─────────────────────────────────────────
# 4. Plotting
# ─────────────────────────────────────────
plt.rcParams.update({
'font.family': 'DejaVu Sans',
'font.size': 11,
'axes.titlesize': 13,
'axes.labelsize': 11,
'figure.facecolor': '#0f0f1a',
'axes.facecolor': '#1a1a2e',
'axes.edgecolor': '#444466',
'axes.labelcolor': '#ccccee',
'xtick.color': '#aaaacc',
'ytick.color': '#aaaacc',
'text.color': '#ccccee',
'grid.color': '#2a2a4a',
'grid.linestyle': '--',
'grid.alpha': 0.5,
})

fig = plt.figure(figsize=(20, 18))
fig.suptitle("Haber-Bosch NH₃ Synthesis: Yield Optimization\n"
r"N$_2$ + 3H$_2$ $\rightleftharpoons$ 2NH$_3$ ($\Delta H = -92$ kJ/mol)",
fontsize=15, fontweight='bold', color='#e0e0ff', y=0.98)

CMAP = 'plasma'

# ── Plot 1: 3D surface T–P ──────────────────────────────────
ax1 = fig.add_subplot(3, 3, 1, projection='3d')
surf1 = ax1.plot_surface(TT - 273.15, PP, ZZ_TP,
cmap=CMAP, alpha=0.92, linewidth=0)
ax1.scatter([T_opt-273.15], [P_opt], [Y_opt],
color='cyan', s=80, zorder=10, label='Optimum')
ax1.set_xlabel('T [°C]', labelpad=6)
ax1.set_ylabel('P [atm]', labelpad=6)
ax1.set_zlabel('Yield [%]', labelpad=6)
ax1.set_title(f'3D: Yield vs T & P\n(c = {c_opt:.2f})', pad=8)
ax1.tick_params(colors='#aaaacc')
fig.colorbar(surf1, ax=ax1, shrink=0.5, pad=0.1).ax.yaxis.set_tick_params(color='#aaaacc')
ax1.legend(fontsize=8)

# ── Plot 2: 3D surface T–c ──────────────────────────────────
ax2 = fig.add_subplot(3, 3, 2, projection='3d')
surf2 = ax2.plot_surface(TC - 273.15, CC, ZZ_TC,
cmap='viridis', alpha=0.92, linewidth=0)
ax2.scatter([T_opt-273.15], [c_opt], [Y_opt],
color='cyan', s=80, zorder=10)
ax2.set_xlabel('T [°C]', labelpad=6)
ax2.set_ylabel('Cat. Loading', labelpad=6)
ax2.set_zlabel('Yield [%]', labelpad=6)
ax2.set_title(f'3D: Yield vs T & Catalyst\n(P = {P_opt:.0f} atm)', pad=8)
ax2.tick_params(colors='#aaaacc')
fig.colorbar(surf2, ax=ax2, shrink=0.5, pad=0.1)

# ── Plot 3: 3D surface P–c ──────────────────────────────────
ax3 = fig.add_subplot(3, 3, 3, projection='3d')
surf3 = ax3.plot_surface(PC_P, PC_C, ZZ_PC,
cmap='inferno', alpha=0.92, linewidth=0)
ax3.scatter([P_opt], [c_opt], [Y_opt],
color='cyan', s=80, zorder=10)
ax3.set_xlabel('P [atm]', labelpad=6)
ax3.set_ylabel('Cat. Loading', labelpad=6)
ax3.set_zlabel('Yield [%]', labelpad=6)
ax3.set_title(f'3D: Yield vs P & Catalyst\n(T = {T_opt-273.15:.0f} °C)', pad=8)
ax3.tick_params(colors='#aaaacc')
fig.colorbar(surf3, ax=ax3, shrink=0.5, pad=0.1)

# ── Plot 4: 2D contour T–P ──────────────────────────────────
ax4 = fig.add_subplot(3, 3, 4)
cf4 = ax4.contourf(TT - 273.15, PP, ZZ_TP, levels=30, cmap=CMAP)
ax4.contour(TT - 273.15, PP, ZZ_TP, levels=10,
colors='white', linewidths=0.5, alpha=0.4)
ax4.scatter(T_opt - 273.15, P_opt, color='cyan', s=120,
zorder=10, label=f'Opt ({T_opt-273.15:.0f}°C, {P_opt:.0f} atm)')
ax4.set_xlabel('Temperature [°C]')
ax4.set_ylabel('Pressure [atm]')
ax4.set_title('Contour: Yield vs T & P')
fig.colorbar(cf4, ax=ax4, label='Yield [%]')
ax4.legend(fontsize=8, facecolor='#1a1a2e')
ax4.grid(True)

# ── Plot 5: 2D contour T–c ──────────────────────────────────
ax5 = fig.add_subplot(3, 3, 5)
cf5 = ax5.contourf(TC - 273.15, CC, ZZ_TC, levels=30, cmap='viridis')
ax5.contour(TC - 273.15, CC, ZZ_TC, levels=10,
colors='white', linewidths=0.5, alpha=0.4)
ax5.scatter(T_opt - 273.15, c_opt, color='cyan', s=120,
zorder=10, label=f'Opt ({T_opt-273.15:.0f}°C, c={c_opt:.2f})')
ax5.set_xlabel('Temperature [°C]')
ax5.set_ylabel('Catalyst Loading')
ax5.set_title('Contour: Yield vs T & Catalyst')
fig.colorbar(cf5, ax=ax5, label='Yield [%]')
ax5.legend(fontsize=8, facecolor='#1a1a2e')
ax5.grid(True)

# ── Plot 6: 2D contour P–c ──────────────────────────────────
ax6 = fig.add_subplot(3, 3, 6)
cf6 = ax6.contourf(PC_P, PC_C, ZZ_PC, levels=30, cmap='inferno')
ax6.contour(PC_P, PC_C, ZZ_PC, levels=10,
colors='white', linewidths=0.5, alpha=0.4)
ax6.scatter(P_opt, c_opt, color='cyan', s=120,
zorder=10, label=f'Opt ({P_opt:.0f} atm, c={c_opt:.2f})')
ax6.set_xlabel('Pressure [atm]')
ax6.set_ylabel('Catalyst Loading')
ax6.set_title('Contour: Yield vs P & Catalyst')
fig.colorbar(cf6, ax=ax6, label='Yield [%]')
ax6.legend(fontsize=8, facecolor='#1a1a2e')
ax6.grid(True)

# ── Plot 7: Sensitivity – Temperature ───────────────────────
ax7 = fig.add_subplot(3, 3, 7)
ax7b = ax7.twinx()
lns1 = ax7.plot(T_arr - 273.15, Y_vs_T, color='#ff6b6b', lw=2.5, label='Actual Yield')
lns2 = ax7b.plot(T_arr - 273.15, Kp_vs_T, color='#ffd93d', lw=2, ls='--', label='Kp')
ax7.axvline(T_opt - 273.15, color='cyan', lw=1.5, ls=':', label=f'T_opt={T_opt-273.15:.0f}°C')
ax7.set_xlabel('Temperature [°C]')
ax7.set_ylabel('Yield [%]', color='#ff6b6b')
ax7b.set_ylabel('Kp', color='#ffd93d')
ax7.set_title('Sensitivity: Temperature')
lns = lns1 + lns2
ax7.legend(lns, [l.get_label() for l in lns], fontsize=8, facecolor='#1a1a2e')
ax7.grid(True)

# ── Plot 8: Sensitivity – Pressure ──────────────────────────
ax8 = fig.add_subplot(3, 3, 8)
ax8.plot(P_arr, Y_vs_P, color='#6bcb77', lw=2.5)
ax8.axvline(P_opt, color='cyan', lw=1.5, ls=':', label=f'P_opt={P_opt:.0f} atm')
ax8.fill_between(P_arr, Y_vs_P, alpha=0.15, color='#6bcb77')
ax8.set_xlabel('Pressure [atm]')
ax8.set_ylabel('Yield [%]')
ax8.set_title('Sensitivity: Pressure')
ax8.legend(fontsize=8, facecolor='#1a1a2e')
ax8.grid(True)

# ── Plot 9: Sensitivity – Catalyst Loading ───────────────────
ax9 = fig.add_subplot(3, 3, 9)
ax9.plot(c_arr, Y_vs_c, color='#a29bfe', lw=2.5)
ax9.axvline(c_opt, color='cyan', lw=1.5, ls=':', label=f'c_opt={c_opt:.3f}')
ax9.fill_between(c_arr, Y_vs_c, alpha=0.15, color='#a29bfe')
ax9.set_xlabel('Catalyst Loading (relative)')
ax9.set_ylabel('Yield [%]')
ax9.set_title('Sensitivity: Catalyst Loading')
ax9.legend(fontsize=8, facecolor='#1a1a2e')
ax9.grid(True)

plt.tight_layout(rect=[0, 0, 1, 0.96])
plt.savefig('haber_bosch_optimization.png', dpi=150, bbox_inches='tight',
facecolor=fig.get_facecolor())
plt.show()
print("Figure saved.")

Code Walkthrough

Section 1 — Thermodynamics & Kinetics

The core physics is encoded in three functions:

delta_G(T) computes $\Delta G^\circ = \Delta H^\circ - T\Delta S^\circ$ using tabulated values for the Haber-Bosch reaction. This is classical Gibbs-Helmholtz thermodynamics.

Kp(T) converts that to the equilibrium constant via $K_p = e^{-\Delta G^\circ / RT}$. Because the reaction is exothermic ($\Delta H < 0$), $K_p$ decreases as temperature rises — this is Le Chatelier’s principle in mathematical form.

equilibrium_yield(T, P) solves the actual equilibrium composition. If we let $x$ be the fraction of N₂ reacted, the mole fractions become:

$$y_{\text{N}_2} = \frac{1-x}{4-2x}, \quad y_{\text{H}_2} = \frac{3-3x}{4-2x}, \quad y_{\text{NH}_3} = \frac{2x}{4-2x}$$

The equilibrium expression then becomes a nonlinear scalar equation in $x$, solved by scipy.optimize.brentq (Brent's method, a bracketing root-finder that is guaranteed to converge whenever the function changes sign over the bracket).
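The bracketing pattern is easy to sketch in isolation; brentq needs only a callable and an interval whose endpoints have opposite signs (the cubic below is just a stand-in for the equilibrium equation):

```python
from scipy.optimize import brentq

# Stand-in for the equilibrium equation: solve x^3 - x - 2 = 0 on [1, 2].
# f(1) = -2 and f(2) = 4, so the bracket contains a sign change and
# convergence is guaranteed.
root = brentq(lambda x: x**3 - x - 2, 1.0, 2.0)
print(root)  # ≈ 1.5214
```
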

catalyst_efficiency(T, c) introduces two real-world effects:

  • Langmuir saturation: $1 - e^{-\alpha c}$ — more catalyst helps, but with diminishing returns
  • Arrhenius temperature window: activity peaks near $T_{ref} = 700$ K and drops at extremes
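The exothermic trend in $K_p$ can be verified with a few lines; the $\Delta H^\circ$ and $\Delta S^\circ$ values below are typical literature figures for ammonia synthesis, assumed for illustration rather than taken from the constants defined earlier in the script:

```python
import numpy as np

R = 8.314        # J/(mol·K)
dH = -92.22e3    # J/mol    (assumed literature value)
dS = -198.7      # J/(mol·K) (assumed literature value)

def Kp_demo(T):
    """Equilibrium constant via ΔG° = ΔH° − TΔS° and Kp = exp(−ΔG°/RT)."""
    dG = dH - T * dS
    return np.exp(-dG / (R * T))

# Exothermic reaction: Kp falls monotonically as temperature rises.
assert Kp_demo(600.0) > Kp_demo(700.0) > Kp_demo(800.0)
```
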

Section 2 — Global Optimization

We use Differential Evolution (scipy.optimize.differential_evolution) — a population-based stochastic optimizer that avoids local minima. It’s ideal here because:

  • The objective surface is non-convex (competing thermodynamic and kinetic effects)
  • There are three continuous variables with physical bounds
  • workers=1 ensures Colab compatibility without multiprocessing issues

The optimizer minimizes neg_yield (negative yield) over the box domain $T \in [600, 800]$ K, $P \in [50, 300]$ atm, $c \in [0.1, 1.0]$.
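The same call pattern can be shown in miniature on a toy bowl-shaped objective (the function here is hypothetical; only the bounds-plus-seed usage mirrors the script):

```python
from scipy.optimize import differential_evolution

# Minimise a simple quadratic bowl over a box domain, the same way
# neg_yield is minimised over the [T, P, c] bounds above.
res = differential_evolution(
    lambda v: (v[0] - 1.0)**2 + (v[1] + 2.0)**2,
    bounds=[(-5, 5), (-5, 5)],
    seed=42,
)
# res.x lands very close to the known minimum (1, -2)
```
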

Section 3 — Grid Precomputation

Three 2D grids are built using np.vectorize over 60×60 meshes:

  • T–P grid at optimal $c$
  • T–c grid at optimal $P$
  • P–c grid at optimal $T$

These feed both the 3D surfaces and 2D contour maps. Using np.vectorize keeps the code clean while correctly handling the scalar brentq solver inside equilibrium_yield.
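The grid pattern itself can be demonstrated with a stand-in scalar function; np.vectorize broadcasts it over the meshgrid exactly as is done with actual_yield:

```python
import numpy as np

# Stand-in for actual_yield: a scalar-only function of two variables.
f = np.vectorize(lambda t, p: t + 0.01 * p)

T_arr = np.linspace(600, 800, 4)
P_arr = np.linspace(50, 300, 3)
TT, PP = np.meshgrid(T_arr, P_arr, indexing='ij')  # both have shape (4, 3)
Z = f(TT, PP)
assert Z.shape == (4, 3)
```
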

Section 4 — Nine-Panel Visualization

Panel-by-panel summary:

  • Panels 1–3: 3D surfaces for each variable pair; the cyan dot marks the optimum
  • Panels 4–6: 2D filled contours of the same surfaces, easier for reading exact values
  • Panel 7: yield vs. T overlaid with $K_p(T)$, showing the thermodynamic trade-off
  • Panel 8: yield vs. P, a monotonic increase (Le Chatelier: higher P favors fewer moles of gas)
  • Panel 9: yield vs. catalyst loading, an asymptotic saturation curve

Graph Interpretation

3D Surfaces (Panels 1–3)

The T–P surface (Panel 1) immediately reveals the competing effects at the heart of the Haber-Bosch process. Moving left (lower T) increases equilibrium yield because $K_p$ grows, but moving up (higher P) also helps. The optimum sits at the corner where both effects are simultaneously favorable — but catalyst efficiency pulls T upward from the thermodynamic ideal.

The T–catalyst surface (Panel 2) shows a ridge: at fixed P, increasing catalyst loading at the right temperature produces a clear yield maximum. Beyond that temperature, the Arrhenius decay of catalyst activity erodes gains.

Contour Maps (Panels 4–6)

The contour lines give engineers a practical operating window. Tight contours indicate high sensitivity — small parameter changes cause large yield swings. Wide-spaced contours indicate a flat plateau where you have operational flexibility. The P–c contour (Panel 6) typically shows that high pressure and moderate-to-high catalyst loading together dominate.

Sensitivity Slices (Panels 7–9)

Panel 7 (Temperature): The dual y-axis shows $K_p$ declining while actual yield peaks — this is the classic Haber-Bosch compromise. Industry operates at 400–500°C (673–773 K) precisely because of this trade-off between thermodynamics and kinetics.

Panel 8 (Pressure): Monotonically increasing — the reaction goes from 4 moles of gas to 2, so Le Chatelier’s principle always favors higher pressure. In practice, compressor and vessel costs cap industrial operation at roughly 150–300 atm.

Panel 9 (Catalyst Loading): A classic saturation curve. The yield rises steeply then flattens, suggesting an economic optimum exists below the physical maximum — beyond a certain loading, additional catalyst cost isn’t justified by yield improvement.


Execution Results

Running Differential Evolution (global search)...

==================================================
  Optimal Temperature : 300.00 °C
  Optimal Pressure    : 300.00 atm
  Optimal Catalyst    : 1.0000
  Maximum Yield       : 73.1490 %
==================================================

Figure saved.

Key Takeaways

The framework demonstrated here is general. The same Python architecture — thermodynamic equilibrium solver + catalyst kinetics + differential evolution optimizer + multi-panel visualization — applies directly to:

  • Methanol synthesis ($\text{CO} + 2\text{H}_2 \rightarrow \text{CH}_3\text{OH}$)
  • Fischer-Tropsch liquid fuel production
  • SO₃ synthesis in sulfuric acid plants
  • Any heterogeneous catalytic reaction with known $\Delta H^\circ$, $\Delta S^\circ$, and kinetic parameters

The critical insight is always the same: thermodynamics sets the ceiling on equilibrium yield, kinetics sets the rate of approach to that ceiling, and optimization finds the operating point that best balances both.

Factory Layout Design

Minimizing Material Transport Distance and Time

Efficient factory layout design is a classic combinatorial optimization problem. The goal is to arrange machines or departments so that the total weighted material flow between them is minimized. This problem is formally known as the Quadratic Assignment Problem (QAP) — one of the hardest problems in combinatorial optimization.


Problem Formulation

We have $n$ facilities (machines/departments) to be assigned to $n$ locations on a factory floor. Let:

  • $F = [f_{ij}]$ — flow matrix: $f_{ij}$ is the number of material trips from facility $i$ to facility $j$
  • $D = [d_{kl}]$ — distance matrix: $d_{kl}$ is the distance from location $k$ to location $l$
  • $\pi$ — a permutation: $\pi(i) = k$ means facility $i$ is assigned to location $k$

The objective is:

$$\min_{\pi \in S_n} \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij} \cdot d_{\pi(i),\pi(j)}$$

This is the QAP objective. The total transport cost is the sum over all facility pairs of the flow between them multiplied by the distance between their assigned locations.

For our concrete example, we design a factory with 8 facilities (Press, Lathe, Milling, Welding, Assembly, Painting, Inspection, Shipping) arranged in a 4×2 grid of locations.


Because QAP is NP-hard, we use Simulated Annealing (SA) as the global optimizer and polish each run with a 2-opt swap local search. SA accepts worse solutions probabilistically to escape local optima:

$$P(\text{accept}) = \exp\left(-\frac{\Delta C}{T}\right)$$

where $\Delta C$ is the cost increase and $T$ is the current temperature. The temperature schedule follows geometric cooling:

$$T_{k+1} = \alpha \cdot T_k, \quad \alpha \in (0,1)$$

The incremental cost of swapping facilities at positions $r$ and $s$ in permutation $\pi$ is:

$$\Delta C = \sum_{k \neq r,s} \left[ (f_{rk} - f_{sk})(d_{\pi(s),\pi(k)} - d_{\pi(r),\pi(k)}) + (f_{kr} - f_{ks})(d_{\pi(k),\pi(s)} - d_{\pi(k),\pi(r)}) \right] + (f_{rs} - f_{sr})(d_{\pi(s),\pi(r)} - d_{\pi(r),\pi(s)})$$

This $O(n)$ incremental evaluation avoids recomputing the full $O(n^2)$ cost after every swap.
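A useful sanity check is that the incremental formula agrees with a brute-force recompute. The sketch below builds a small random QAP instance (all names are local to the snippet) and compares the two evaluations for one swap:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
F = rng.integers(0, 20, (n, n)).astype(float); np.fill_diagonal(F, 0)
D = rng.random((n, n)); D = (D + D.T) / 2; np.fill_diagonal(D, 0)

def cost(p):
    """Full O(n^2) QAP cost."""
    p = np.asarray(p)
    return float((F * D[np.ix_(p, p)]).sum())

def delta(p, r, s):
    """O(n) cost change for swapping positions r and s."""
    d = 0.0
    pr, ps = p[r], p[s]
    for k in range(n):
        if k in (r, s):
            continue
        pk = p[k]
        d += (F[r, k] - F[s, k]) * (D[ps, pk] - D[pr, pk])
        d += (F[k, r] - F[k, s]) * (D[pk, ps] - D[pk, pr])
    d += (F[r, s] - F[s, r]) * (D[ps, pr] - D[pr, ps])
    return d

p = list(rng.permutation(n))
q = p[:]; q[2], q[5] = q[5], q[2]
assert abs(cost(q) - cost(p) - delta(p, 2, 5)) < 1e-9
```
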


Python Source Code

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib.colors import LinearSegmentedColormap
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.gridspec as gridspec
from itertools import permutations
import random
import time

# ── reproducibility ──────────────────────────────────────────────────────────
np.random.seed(42)
random.seed(42)

# ── problem definition ───────────────────────────────────────────────────────
FACILITY_NAMES = [
"Press", "Lathe", "Milling", "Welding",
"Assembly", "Painting", "Inspection", "Shipping"
]
N = len(FACILITY_NAMES)

# Grid layout: 4 columns × 2 rows (location index → (row, col))
GRID_COLS, GRID_ROWS = 4, 2
LOCATION_COORDS = np.array([
[col * 10.0, row * 10.0]
for row in range(GRID_ROWS)
for col in range(GRID_COLS)
]) # shape (8, 2)

# Euclidean distance matrix between locations
def build_distance_matrix(coords):
    n = len(coords)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            D[i, j] = np.linalg.norm(coords[i] - coords[j])
    return D

D = build_distance_matrix(LOCATION_COORDS)

# Flow matrix (trips/day between facilities) – domain-realistic values
F = np.array([
# Pr La Mi We As Pa In Sh
[ 0, 12, 8, 5, 0, 0, 0, 0], # Press
[ 12, 0, 15, 0, 6, 0, 0, 0], # Lathe
[ 8, 15, 0, 20, 4, 0, 0, 0], # Milling
[ 5, 0, 20, 0, 18, 0, 0, 0], # Welding
[ 0, 6, 4, 18, 0, 22, 8, 0], # Assembly
[ 0, 0, 0, 0, 22, 0, 10, 5], # Painting
[ 0, 0, 0, 0, 8, 10, 0, 30], # Inspection
[ 0, 0, 0, 0, 0, 5, 30, 0], # Shipping
], dtype=float)

# ── cost function ─────────────────────────────────────────────────────────────
def total_cost(perm, F, D):
    """Full O(n^2) cost: sum_ij F[i,j] * D[perm[i], perm[j]]"""
    perm = np.asarray(perm)
    return float(np.einsum('ij,ij->', F, D[np.ix_(perm, perm)]))

# ── incremental swap cost (O(n)) ──────────────────────────────────────────────
def delta_cost(perm, r, s, F, D):
    """Cost change when swapping perm[r] and perm[s], without full recompute."""
    n = len(perm)
    delta = 0.0
    pr, ps = perm[r], perm[s]
    for k in range(n):
        if k == r or k == s:
            continue
        pk = perm[k]
        delta += (F[r, k] - F[s, k]) * (D[ps, pk] - D[pr, pk])
        delta += (F[k, r] - F[k, s]) * (D[pk, ps] - D[pk, pr])
    delta += (F[r, s] - F[s, r]) * (D[ps, pr] - D[pr, ps])
    return delta

# ── 2-opt local search ────────────────────────────────────────────────────────
def two_opt_local_search(perm, F, D):
    perm = list(perm)
    improved = True
    while improved:
        improved = False
        for r in range(N - 1):
            for s in range(r + 1, N):
                if delta_cost(perm, r, s, F, D) < -1e-9:
                    perm[r], perm[s] = perm[s], perm[r]
                    improved = True
    return perm

# ── simulated annealing ───────────────────────────────────────────────────────
def simulated_annealing(F, D, T0=5000.0, alpha=0.995, n_iter=80000, n_restarts=5):
    best_perm = None
    best_cost = np.inf
    history_list = []

    for restart in range(n_restarts):
        perm = list(np.random.permutation(N))
        cost = total_cost(perm, F, D)
        T = T0
        history = [cost]

        for it in range(n_iter):
            r, s = random.sample(range(N), 2)
            dc = delta_cost(perm, r, s, F, D)
            if dc < 0 or random.random() < np.exp(-dc / T):
                perm[r], perm[s] = perm[s], perm[r]
                cost += dc
            T *= alpha
            if it % 400 == 0:
                history.append(cost)

        # polish with 2-opt
        perm = two_opt_local_search(perm, F, D)
        cost = total_cost(perm, F, D)

        history_list.append(history)
        if cost < best_cost:
            best_cost = cost
            best_perm = perm[:]

    return best_perm, best_cost, history_list

# ── random baseline ───────────────────────────────────────────────────────────
random_costs = [total_cost(np.random.permutation(N), F, D) for _ in range(2000)]
random_best = min(random_costs)
random_mean = np.mean(random_costs)

# ── run SA ────────────────────────────────────────────────────────────────────
t0 = time.time()
best_perm, best_cost, histories = simulated_annealing(F, D)
elapsed = time.time() - t0

print(f"SA elapsed : {elapsed:.2f} s")
print(f"Best cost : {best_cost:.1f} (random mean {random_mean:.1f}, random best {random_best:.1f})")
print(f"Improvement vs random mean : {(random_mean - best_cost)/random_mean*100:.1f}%")
print("Assignment :")
for fac_idx, loc_idx in enumerate(best_perm):
    r, c = divmod(loc_idx, GRID_COLS)
    print(f"  {FACILITY_NAMES[fac_idx]:12s} → location {loc_idx}  (row {r}, col {c})")

# ── naive (identity) assignment for comparison ────────────────────────────────
naive_perm = list(range(N))
naive_cost = total_cost(naive_perm, F, D)

# ── colour palette ────────────────────────────────────────────────────────────
FAC_COLORS = [
"#E63946","#F4A261","#2A9D8F","#457B9D",
"#6A4C93","#F72585","#4CC9F0","#80B918"
]

# ═══════════════════════════════════════════════════════════════════════════════
# FIGURE 1 – Layout comparison (naive vs optimised) + flow heatmap
# ═══════════════════════════════════════════════════════════════════════════════
fig1 = plt.figure(figsize=(18, 10), facecolor="#0d1117")

gs1 = gridspec.GridSpec(2, 3, figure=fig1,
hspace=0.45, wspace=0.35,
left=0.05, right=0.97, top=0.92, bottom=0.06)

# ── helper: draw a factory floor layout ──────────────────────────────────────
def draw_layout(ax, perm, title, cost, loc_coords, fac_names, fac_colors, F, grid_cols):
    ax.set_facecolor("#161b22")
    ax.set_title(f"{title}\nTotal Cost = {cost:,.1f}", color="white",
                 fontsize=11, fontweight="bold", pad=8)

    inv_perm = [0] * N
    for fac_i, loc_i in enumerate(perm):
        inv_perm[loc_i] = fac_i

    # draw flow arcs
    for i in range(N):
        for j in range(i + 1, N):
            if F[i, j] > 0:
                li, lj = perm[i], perm[j]
                xi, yi = loc_coords[li]
                xj, yj = loc_coords[lj]
                w = F[i, j] / F.max() * 3.5
                ax.plot([xi, xj], [yi, yj],
                        color="white", alpha=0.18 + 0.5 * (F[i, j] / F.max()),
                        linewidth=w, zorder=1)

    # draw facility boxes
    for fac_i, loc_i in enumerate(perm):
        x, y = loc_coords[loc_i]
        rect = mpatches.FancyBboxPatch(
            (x - 3.8, y - 3.2), 7.6, 6.4,
            boxstyle="round,pad=0.3",
            facecolor=fac_colors[fac_i], edgecolor="white",
            linewidth=1.2, zorder=2, alpha=0.88
        )
        ax.add_patch(rect)
        ax.text(x, y + 0.6, fac_names[fac_i], ha="center", va="center",
                color="white", fontsize=8.5, fontweight="bold", zorder=3)
        row, col = divmod(loc_i, grid_cols)
        ax.text(x, y - 1.4, f"L{loc_i}(r{row}c{col})", ha="center", va="center",
                color="white", fontsize=6.5, alpha=0.75, zorder=3)

    ax.set_xlim(-5, 35); ax.set_ylim(-5, 15)
    ax.set_xticks([]); ax.set_yticks([])
    for sp in ax.spines.values():
        sp.set_edgecolor("#30363d")

ax_naive = fig1.add_subplot(gs1[0, :2])
draw_layout(ax_naive, naive_perm, "Naive Layout (Sequential Assignment)",
naive_cost, LOCATION_COORDS, FACILITY_NAMES, FAC_COLORS, F, GRID_COLS)

ax_opt = fig1.add_subplot(gs1[1, :2])
draw_layout(ax_opt, best_perm, "Optimised Layout (Simulated Annealing + 2-opt)",
best_cost, LOCATION_COORDS, FACILITY_NAMES, FAC_COLORS, F, GRID_COLS)

# ── flow heatmap ──────────────────────────────────────────────────────────────
ax_heat = fig1.add_subplot(gs1[:, 2])
ax_heat.set_facecolor("#161b22")
cmap_heat = LinearSegmentedColormap.from_list("flow", ["#161b22","#1f6feb","#f78166"])
im = ax_heat.imshow(F, cmap=cmap_heat, aspect="auto")
cb = fig1.colorbar(im, ax=ax_heat, fraction=0.046, pad=0.04)
cb.ax.yaxis.set_tick_params(color="white")
plt.setp(cb.ax.yaxis.get_ticklabels(), color="white", fontsize=8)
cb.set_label("Trips / day", color="white", fontsize=9)
ax_heat.set_xticks(range(N)); ax_heat.set_yticks(range(N))
ax_heat.set_xticklabels(FACILITY_NAMES, rotation=45, ha="right",
color="white", fontsize=8)
ax_heat.set_yticklabels(FACILITY_NAMES, color="white", fontsize=8)
ax_heat.set_title("Material Flow Matrix", color="white",
fontsize=11, fontweight="bold", pad=8)
for i in range(N):
    for j in range(N):
        if F[i, j] > 0:
            ax_heat.text(j, i, f"{int(F[i, j])}", ha="center", va="center",
                         color="white", fontsize=7.5)
for sp in ax_heat.spines.values():
    sp.set_edgecolor("#30363d")

fig1.suptitle("Factory Layout Optimisation — QAP via Simulated Annealing",
color="white", fontsize=14, fontweight="bold", y=0.97)
plt.savefig("layout_comparison.png", dpi=150, bbox_inches="tight",
facecolor=fig1.get_facecolor())
plt.show()

# ═══════════════════════════════════════════════════════════════════════════════
# FIGURE 2 – SA convergence + random cost distribution + cost bar
# ═══════════════════════════════════════════════════════════════════════════════
fig2 = plt.figure(figsize=(18, 6), facecolor="#0d1117")
gs2 = gridspec.GridSpec(1, 3, figure=fig2,
wspace=0.35, left=0.06, right=0.97, top=0.88, bottom=0.14)

# convergence curves
ax_conv = fig2.add_subplot(gs2[0])
ax_conv.set_facecolor("#161b22")
iters_per_point = 400
for i, h in enumerate(histories):
    xs = np.arange(len(h)) * iters_per_point
    ax_conv.plot(xs, h, alpha=0.55, linewidth=1.2,
                 color=["#4CC9F0", "#F72585", "#80B918", "#F4A261", "#E63946"][i],
                 label=f"Restart {i+1}")
ax_conv.axhline(best_cost, color="#FFD700", linewidth=1.5,
linestyle="--", label=f"Best = {best_cost:,.0f}")
ax_conv.set_xlabel("Iteration", color="white", fontsize=10)
ax_conv.set_ylabel("Total Transport Cost", color="white", fontsize=10)
ax_conv.set_title("SA Convergence (5 Restarts)", color="white",
fontsize=11, fontweight="bold")
ax_conv.tick_params(colors="white"); ax_conv.set_facecolor("#161b22")
ax_conv.legend(fontsize=8, facecolor="#21262d", labelcolor="white",
edgecolor="#30363d")
for sp in ax_conv.spines.values(): sp.set_edgecolor("#30363d")

# random cost distribution
ax_hist = fig2.add_subplot(gs2[1])
ax_hist.set_facecolor("#161b22")
ax_hist.hist(random_costs, bins=40, color="#457B9D", edgecolor="#0d1117",
alpha=0.85)
ax_hist.axvline(random_mean, color="#F4A261", linewidth=1.8,
linestyle="--", label=f"Random mean\n{random_mean:,.0f}")
ax_hist.axvline(best_cost, color="#FFD700", linewidth=2.0,
linestyle="-", label=f"SA best\n{best_cost:,.0f}")
ax_hist.set_xlabel("Total Transport Cost", color="white", fontsize=10)
ax_hist.set_ylabel("Frequency", color="white", fontsize=10)
ax_hist.set_title("Random vs SA (2000 random samples)", color="white",
fontsize=11, fontweight="bold")
ax_hist.tick_params(colors="white")
ax_hist.legend(fontsize=8.5, facecolor="#21262d", labelcolor="white",
edgecolor="#30363d")
for sp in ax_hist.spines.values(): sp.set_edgecolor("#30363d")

# cost comparison bar
ax_bar = fig2.add_subplot(gs2[2])
ax_bar.set_facecolor("#161b22")
labels = ["Naive\n(Sequential)", "Random\n(Mean)", "Random\n(Best)", "SA + 2-opt\n(Ours)"]
values = [naive_cost, random_mean, random_best, best_cost]
colors = ["#6A4C93", "#457B9D", "#2A9D8F", "#FFD700"]
bars = ax_bar.bar(labels, values, color=colors, edgecolor="#0d1117",
linewidth=0.8, width=0.55)
for bar, val in zip(bars, values):
    ax_bar.text(bar.get_x() + bar.get_width()/2,
                bar.get_height() + 40,
                f"{val:,.0f}", ha="center", va="bottom",
                color="white", fontsize=9, fontweight="bold")
ax_bar.set_ylabel("Total Transport Cost", color="white", fontsize=10)
ax_bar.set_title("Method Comparison", color="white",
fontsize=11, fontweight="bold")
ax_bar.tick_params(colors="white")
for sp in ax_bar.spines.values(): sp.set_edgecolor("#30363d")

fig2.suptitle("Optimisation Analysis", color="white",
fontsize=13, fontweight="bold", y=0.97)
plt.savefig("sa_analysis.png", dpi=150, bbox_inches="tight",
facecolor=fig2.get_facecolor())
plt.show()

# ═══════════════════════════════════════════════════════════════════════════════
# FIGURE 3 – 3D transport cost landscape (pairwise swap perturbation)
# ═══════════════════════════════════════════════════════════════════════════════
fig3 = plt.figure(figsize=(16, 7), facecolor="#0d1117")
gs3 = gridspec.GridSpec(1, 2, figure=fig3,
wspace=0.15, left=0.03, right=0.97,
top=0.90, bottom=0.04)

def cost_landscape_grid(base_perm, F, D, size=30):
    """
    Fix two 'sliding' facilities along orthogonal swap chains and record cost.
    axis-x: cumulative circular shift of first half of perm
    axis-y: cumulative circular shift of second half of perm
    """
    n = len(base_perm)
    half = n // 2
    Z = np.zeros((size, size))
    p = base_perm[:]
    for i in range(size):
        q = p[:]
        for j in range(size):
            Z[i, j] = total_cost(q, F, D)
            # rotate second half by one swap
            if half + 1 < n:
                q[half], q[half+1] = q[half+1], q[half]
                q[half+1:] = q[half+1:][::-1] if (j % 3 == 2) else q[half+1:]
        # rotate first half by one swap
        p[0], p[1] = p[1], p[0]
        p[1:half] = p[1:half][::-1] if (i % 3 == 2) else p[1:half]
    return Z

LANDSCAPE_SIZE = 28
Z_naive = cost_landscape_grid(naive_perm, F, D, LANDSCAPE_SIZE)
Z_opt = cost_landscape_grid(best_perm, F, D, LANDSCAPE_SIZE)

X = np.arange(LANDSCAPE_SIZE)
Y = np.arange(LANDSCAPE_SIZE)
XX, YY = np.meshgrid(X, Y)

cmap3d = LinearSegmentedColormap.from_list(
"hot_dark", ["#0d1117","#1f6feb","#F72585","#FFD700"])

for idx, (Z, title, base_cost) in enumerate([
        (Z_naive, "Naive Assignment — Cost Landscape", naive_cost),
        (Z_opt, "SA-Optimised — Cost Landscape", best_cost)]):
    ax3d = fig3.add_subplot(gs3[0, idx], projection="3d")
    ax3d.set_facecolor("#161b22")
    surf = ax3d.plot_surface(XX, YY, Z, cmap=cmap3d,
                             linewidth=0, antialiased=True, alpha=0.88)
    ax3d.set_xlabel("Swap axis α", color="white", fontsize=8, labelpad=4)
    ax3d.set_ylabel("Swap axis β", color="white", fontsize=8, labelpad=4)
    ax3d.set_zlabel("Cost", color="white", fontsize=8, labelpad=4)
    ax3d.set_title(f"{title}\nBase cost = {base_cost:,.0f}",
                   color="white", fontsize=10, fontweight="bold", pad=6)
    ax3d.tick_params(colors="white", labelsize=6.5)
    ax3d.xaxis.pane.fill = False
    ax3d.yaxis.pane.fill = False
    ax3d.zaxis.pane.fill = False
    ax3d.xaxis.pane.set_edgecolor("#30363d")
    ax3d.yaxis.pane.set_edgecolor("#30363d")
    ax3d.zaxis.pane.set_edgecolor("#30363d")
    ax3d.view_init(elev=28, azim=225)
    cb = fig3.colorbar(surf, ax=ax3d, fraction=0.028, pad=0.08, shrink=0.6)
    cb.ax.yaxis.set_tick_params(color="white", labelsize=7)
    plt.setp(cb.ax.yaxis.get_ticklabels(), color="white")

fig3.suptitle("3D Transport Cost Landscape — Neighbourhood Perturbation",
color="white", fontsize=13, fontweight="bold", y=0.97)
plt.savefig("cost_landscape_3d.png", dpi=150, bbox_inches="tight",
facecolor=fig3.get_facecolor())
plt.show()

# ═══════════════════════════════════════════════════════════════════════════════
# FIGURE 4 – Per-pair weighted distance: naive vs optimised (horizontal bars)
# ═══════════════════════════════════════════════════════════════════════════════
fig4, axes4 = plt.subplots(1, 2, figsize=(16, 7), facecolor="#0d1117")
fig4.patch.set_facecolor("#0d1117")

def pair_costs(perm, F, D):
    pairs, costs = [], []
    for i in range(N):
        for j in range(i + 1, N):
            if F[i, j] > 0:
                c = F[i, j] * D[perm[i], perm[j]] + F[j, i] * D[perm[j], perm[i]]
                pairs.append(f"{FACILITY_NAMES[i]} → {FACILITY_NAMES[j]}")
                costs.append(c)
    order = np.argsort(costs)[::-1]
    return [pairs[k] for k in order], [costs[k] for k in order]

for ax, perm, title, bar_color in [
        (axes4[0], naive_perm, "Naive Assignment", "#6A4C93"),
        (axes4[1], best_perm, "SA Optimised", "#2A9D8F")]:
    ax.set_facecolor("#161b22")
    pnames, pcosts = pair_costs(perm, F, D)
    y_pos = np.arange(len(pnames))
    bars = ax.barh(y_pos, pcosts, color=bar_color, edgecolor="#0d1117",
                   linewidth=0.6, alpha=0.88)
    for bar, val in zip(bars, pcosts):
        ax.text(val + 5, bar.get_y() + bar.get_height()/2,
                f"{val:.0f}", va="center", color="white", fontsize=7.5)
    ax.set_yticks(y_pos); ax.set_yticklabels(pnames, fontsize=8, color="white")
    ax.set_xlabel("Weighted Transport Cost (flow × distance)", color="white", fontsize=9)
    ax.set_title(f"{title}\nTotal = {total_cost(perm, F, D):,.1f}",
                 color="white", fontsize=11, fontweight="bold")
    ax.tick_params(colors="white")
    for sp in ax.spines.values(): sp.set_edgecolor("#30363d")

fig4.suptitle("Per-Pair Weighted Transport Cost Breakdown",
color="white", fontsize=13, fontweight="bold", y=1.01)
plt.tight_layout()
plt.savefig("pair_cost_breakdown.png", dpi=150, bbox_inches="tight",
facecolor=fig4.get_facecolor())
plt.show()

Code Walkthrough

1. Problem Setup

The factory has 8 facilities arranged across 8 locations on a 4×2 grid, each grid cell spaced 10 metres apart. LOCATION_COORDS stores the $(x, y)$ coordinates of each location slot. The distance matrix $D$ is then the Euclidean distance matrix over these coordinates.

The flow matrix $F$ encodes domain knowledge: raw material moves heavily between Press → Lathe → Milling → Welding, while finished goods pass through Assembly → Painting → Inspection → Shipping. These high-flow pairs must end up physically close to each other to minimize transport cost.

2. Cost Function and Incremental Evaluation

total_cost uses np.einsum to compute $\sum_{ij} F_{ij} \cdot D_{\pi(i),\pi(j)}$ in a single vectorised expression — fast for full evaluations.
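A minimal sketch of that vectorised evaluation, assuming `F`, `D`, and `perm` are NumPy arrays as in the source above:

```python
import numpy as np

def total_cost(perm, F, D):
    # sum_ij F[i, j] * D[perm[i], perm[j]]:
    # permute D's rows and columns by the assignment, then contract with F
    return float(np.einsum("ij,ij->", F, D[np.ix_(perm, perm)]))

# Tiny check against an explicit double loop
rng = np.random.default_rng(0)
F = rng.integers(0, 10, (4, 4)).astype(float)
D = rng.random((4, 4))
perm = np.array([2, 0, 3, 1])
brute = sum(F[i, j] * D[perm[i], perm[j]] for i in range(4) for j in range(4))
assert np.isclose(total_cost(perm, F, D), brute)
```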

delta_cost computes the cost change for a swap in $O(n)$ time. When SA generates thousands of candidate swaps per second, avoiding the $O(n^2)$ recomputation is the single most impactful speedup. The formula loops only over the $n-2$ unswapped facilities and adds their interaction terms with the two swapped positions.
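The $O(n)$ delta can be sketched as follows — a reconstruction of the idea rather than the post's exact `delta_cost`, and it assumes `D` has a zero diagonal (self-distances are zero):

```python
import numpy as np

def delta_cost(perm, F, D, r, s):
    """Cost change from swapping positions r and s of perm, in O(n)."""
    pr, ps = perm[r], perm[s]
    # Interaction between the two swapped positions themselves
    delta = (F[r, s] * (D[ps, pr] - D[pr, ps])
             + F[s, r] * (D[pr, ps] - D[ps, pr]))
    # Interaction of each unswapped facility with the two swapped positions
    for k in range(len(perm)):
        if k in (r, s):
            continue
        pk = perm[k]
        delta += (F[r, k] * (D[ps, pk] - D[pr, pk])
                  + F[k, r] * (D[pk, ps] - D[pk, pr])
                  + F[s, k] * (D[pr, pk] - D[ps, pk])
                  + F[k, s] * (D[pk, pr] - D[pk, ps]))
    return delta

# Verify against a full O(n^2) recomputation
rng = np.random.default_rng(1)
n = 6
F = rng.random((n, n))
D = rng.random((n, n)); np.fill_diagonal(D, 0.0)
perm = rng.permutation(n)
full = lambda p: sum(F[i, j] * D[p[i], p[j]] for i in range(n) for j in range(n))
q = perm.copy(); q[1], q[4] = q[4], q[1]
assert np.isclose(delta_cost(perm, F, D, 1, 4), full(q) - full(perm))
```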

3. Simulated Annealing

The SA loop runs for 80,000 iterations per restart (5 restarts total). At each step:

  • Two random positions $r, s$ are sampled.
  • $\Delta C$ is computed via delta_cost.
  • If $\Delta C < 0$, the swap is always accepted. Otherwise it is accepted with probability $e^{-\Delta C / T}$.
  • Temperature decays geometrically: $T \leftarrow 0.995 \cdot T$.

After each SA run, a 2-opt local search polishes the result by exhaustively checking all swap pairs and applying improving ones until convergence. This deterministic cleanup consistently shaves off residual cost.
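The 2-opt polish can be sketched as follows (`qap_cost` stands in for the post's `total_cost`):

```python
import numpy as np

def qap_cost(p, F, D):
    return float(np.einsum("ij,ij->", F, D[np.ix_(p, p)]))

def two_opt_polish(perm, F, D):
    """Exhaustively apply improving pairwise swaps until none remain."""
    p = perm.copy()
    best = qap_cost(p, F, D)
    improved = True
    while improved:
        improved = False
        for r in range(len(p)):
            for s in range(r + 1, len(p)):
                p[r], p[s] = p[s], p[r]        # try the swap
                c = qap_cost(p, F, D)
                if c < best - 1e-12:
                    best, improved = c, True   # keep it
                else:
                    p[r], p[s] = p[s], p[r]    # undo

    return p, best

rng = np.random.default_rng(2)
n = 5
F = rng.random((n, n)); D = rng.random((n, n))
np.fill_diagonal(D, 0.0)
start = rng.permutation(n)
p, best = two_opt_polish(start, F, D)
assert best <= qap_cost(start, F, D) + 1e-12   # never worse than the input
```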

4. Baseline Comparisons

2,000 random permutations are generated to establish a statistical baseline. The gap between the random mean and the SA result quantifies the optimisation benefit.
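The baseline sampling amounts to the following (a sketch with illustrative random data):

```python
import numpy as np

def random_baseline(F, D, n_samples=2000, seed=0):
    """Cost statistics over uniformly random assignments."""
    rng = np.random.default_rng(seed)
    n = F.shape[0]
    costs = np.array([float(np.einsum("ij,ij->", F, D[np.ix_(p, p)]))
                      for p in (rng.permutation(n) for _ in range(n_samples))])
    return costs.mean(), costs.min()

rng = np.random.default_rng(3)
F = rng.random((8, 8)); D = rng.random((8, 8))
mean_c, best_c = random_baseline(F, D, n_samples=200)
assert best_c <= mean_c   # the best sample is at least as good as the average
```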


Results

SA elapsed : 10.06 s
Best cost  : 3590.6  (random mean 5571.3, random best 3638.2)
Improvement vs random mean : 35.6%
Assignment :
  Press        → location 0  (row 0, col 0)
  Lathe        → location 4  (row 1, col 0)
  Milling      → location 5  (row 1, col 1)
  Welding      → location 1  (row 0, col 1)
  Assembly     → location 2  (row 0, col 2)
  Painting     → location 3  (row 0, col 3)
  Inspection   → location 6  (row 1, col 2)
  Shipping     → location 7  (row 1, col 3)

Figure 1 — Layout Comparison and Flow Matrix

The two factory floor layouts are drawn side by side. Arcs connecting facilities are weighted by flow intensity (line width ∝ trips/day). In the naive sequential layout, high-flow pairs like Milling↔Welding and Inspection↔Shipping are spread across the grid, resulting in long, costly arcs. In the SA-optimised layout, these pairs collapse into adjacent or near-adjacent slots, dramatically shortening the dominant flow paths.

The heatmap on the right visualises $F$ directly. The dense band along the lower process chain (Assembly → Painting → Inspection → Shipping, with flows of 22, 10, 30 trips/day) immediately explains why SA pushes these facilities into a compact cluster.


Figure 2 — Convergence, Distribution, and Method Comparison

The left panel shows all five SA restart curves descending steeply in the first 10,000–20,000 iterations as temperature is high and the algorithm explores broadly. Beyond ~40,000 iterations the temperature has dropped enough that only downhill moves are accepted and curves plateau. Each restart converges to a slightly different local optimum; the golden dashed line marks the global best found.

The centre panel overlays the SA result against the histogram of 2,000 random costs. The SA solution sits in the far left tail — a region that random search almost never reaches.

The bar chart on the right summarises all methods numerically. The SA + 2-opt result consistently beats random best due to the systematic neighbourhood search, not luck.


Figure 3 — 3D Cost Landscape

To visualise the ruggedness of the QAP objective, the code perturbs the base permutation along two independent “swap chains” — repeatedly swapping adjacent elements in the first and second halves of the permutation — and records the cost on a 28×28 grid. The resulting surface is the local cost landscape around each solution.

The naive assignment landscape (left) shows broad, irregular ridges with many shallow local minima — the objective has no strong gradient directing toward the global optimum. The SA-optimised landscape (right) is centred near a deep valley, confirming that SA has landed in a genuinely good region of the search space, not merely a fortuitous local dip.


Figure 4 — Per-Pair Cost Breakdown

Each bar represents the total weighted cost of one facility pair: $f_{ij} \cdot d_{\pi(i),\pi(j)} + f_{ji} \cdot d_{\pi(j),\pi(i)}$. Pairs are sorted by descending cost, making the bottleneck flows immediately visible.

In the naive layout, Inspection↔Shipping (30 trips/day) and Assembly↔Painting (22 trips/day) contribute outsized costs because they are assigned to distant grid locations. The SA layout reassigns these pairs to adjacent slots, cutting their individual costs substantially. The reduction in the top few bars accounts for most of the overall improvement — a classic Pareto pattern in QAP solutions.

Supply Chain Design

Minimizing Transportation Costs While Maintaining Service Levels

Supply chain optimization is one of the most impactful applications of operations research. In this post, we’ll tackle a classic but realistic problem: how to design a distribution network that minimizes transportation costs while guaranteeing customers receive their goods within acceptable timeframes.


The Problem Setup

Imagine a consumer goods company with:

  • 3 factories (supply nodes) producing goods
  • 5 warehouses (intermediate hubs)
  • 8 customer zones (demand nodes)

We need to decide:

  1. How much to ship from each factory to each warehouse
  2. How much to ship from each warehouse to each customer zone
  3. Which warehouses to open (fixed costs apply)

Subject to:

  • Each factory has a production capacity
  • Each customer zone has a demand that must be fully met
  • Each warehouse has a throughput capacity
  • Each customer zone must be served by a warehouse within a maximum distance (service level constraint)

Mathematical Formulation

Let:

  • $F$ = set of factories, $W$ = set of warehouses, $C$ = set of customer zones
  • $x_{fw}$ = units shipped from factory $f$ to warehouse $w$
  • $y_{wc}$ = units shipped from warehouse $w$ to customer zone $c$
  • $z_w \in \{0,1\}$ = 1 if warehouse $w$ is open, 0 otherwise

Objective — minimize total cost:

$$\min \sum_{f \in F}\sum_{w \in W} c^{FW}_{fw} x_{fw} + \sum_{w \in W}\sum_{c \in C} c^{WC}_{wc} y_{wc} + \sum_{w \in W} f_w z_w$$

Subject to:

Supply capacity at each factory:
$$\sum_{w \in W} x_{fw} \leq S_f \quad \forall f \in F$$

Demand satisfaction at each customer zone:
$$\sum_{w \in W} y_{wc} = D_c \quad \forall c \in C$$

Flow balance at each warehouse:
$$\sum_{f \in F} x_{fw} = \sum_{c \in C} y_{wc} \quad \forall w \in W$$

Warehouse capacity:
$$\sum_{c \in C} y_{wc} \leq K_w \cdot z_w \quad \forall w \in W$$

Service level constraint — each customer must be reachable within $d_{max}$ km:
$$\sum_{w \in W: d_{wc} \leq d_{max}} y_{wc} \geq D_c \quad \forall c \in C$$

Non-negativity and binary:
$$x_{fw}, y_{wc} \geq 0, \quad z_w \in \{0,1\}$$


Full Python Source Code

# ============================================================
# Supply Chain Design: Cost Minimization + Service Level
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.mplot3d import Axes3D
from scipy.spatial.distance import cdist
from pulp import (
    LpProblem, LpMinimize, LpVariable, LpBinary, LpContinuous,
    lpSum, value, LpStatus, PULP_CBC_CMD
)
import warnings
warnings.filterwarnings("ignore")

np.random.seed(42)

# ============================================================
# 1. NETWORK DATA
# ============================================================

# --- Node counts ---
n_factories = 3
n_warehouses = 5
n_customers = 8

# --- Geographic coordinates (x, y) in km on a 500x500 grid ---
factory_coords = np.array([
    [50, 450],    # F0 – Northwest plant
    [250, 480],   # F1 – North-central plant
    [460, 420],   # F2 – Northeast plant
])

warehouse_coords = np.array([
    [100, 300],   # W0
    [220, 200],   # W1
    [350, 280],   # W2
    [150, 100],   # W3
    [400, 100],   # W4
])

customer_coords = np.array([
    [80, 220],    # C0
    [180, 320],   # C1
    [280, 150],   # C2
    [320, 380],   # C3
    [430, 220],   # C4
    [100, 60],    # C5
    [380, 60],    # C6
    [480, 320],   # C7
])

# --- Supply, demand, capacity (units) ---
supply = np.array([1200, 1500, 1100]) # factory capacity
demand = np.array([200, 250, 180, 220, 300, 150, 200, 270]) # customer demand
wh_cap = np.array([800, 900, 700, 600, 750]) # warehouse throughput cap
wh_fixed = np.array([15000, 18000, 14000, 12000, 16000]) # fixed opening cost ($)

# --- Transportation cost = distance * unit rate ---
rate_fw = 0.8 # $/unit/km factory→warehouse
rate_wc = 1.2 # $/unit/km warehouse→customer

dist_fw = cdist(factory_coords, warehouse_coords) # shape (3,5)
dist_wc = cdist(warehouse_coords, customer_coords) # shape (5,8)

cost_fw = rate_fw * dist_fw # (3,5)
cost_wc = rate_wc * dist_wc # (5,8)

# --- Service level: max distance warehouse→customer (km) ---
d_max = 250

F = range(n_factories)
W = range(n_warehouses)
C = range(n_customers)

# ============================================================
# 2. OPTIMIZATION MODEL (Mixed-Integer Linear Program)
# ============================================================

model = LpProblem("Supply_Chain_Design", LpMinimize)

# Decision variables
x = [[LpVariable(f"x_f{f}_w{w}", lowBound=0, cat=LpContinuous)
      for w in W] for f in F]

y = [[LpVariable(f"y_w{w}_c{c}", lowBound=0, cat=LpContinuous)
      for c in C] for w in W]

z = [LpVariable(f"z_w{w}", cat=LpBinary) for w in W]

# Objective
model += (
    lpSum(cost_fw[f][w] * x[f][w] for f in F for w in W) +
    lpSum(cost_wc[w][c] * y[w][c] for w in W for c in C) +
    lpSum(wh_fixed[w] * z[w] for w in W)
), "Total_Cost"

# Supply constraints
for f in F:
    model += lpSum(x[f][w] for w in W) <= supply[f], f"Supply_{f}"

# Demand constraints
for c in C:
    model += lpSum(y[w][c] for w in W) == demand[c], f"Demand_{c}"

# Flow balance at warehouses
for w in W:
    model += (lpSum(x[f][w] for f in F) ==
              lpSum(y[w][c] for c in C)), f"Balance_{w}"

# Warehouse capacity (only if open)
for w in W:
    model += (lpSum(y[w][c] for c in C) <=
              wh_cap[w] * z[w]), f"WH_Cap_{w}"

# Service level: customer c must be served only from warehouses within d_max
for c in C:
    feasible_wh = [w for w in W if dist_wc[w][c] <= d_max]
    if not feasible_wh:
        raise ValueError(f"Customer {c} has no reachable warehouse within {d_max} km!")
    # Flow from out-of-range warehouses must be zero
    for w in W:
        if dist_wc[w][c] > d_max:
            model += y[w][c] == 0, f"SvcLevel_w{w}_c{c}"

# ============================================================
# 3. SOLVE
# ============================================================

solver = PULP_CBC_CMD(msg=0, timeLimit=120)
model.solve(solver)

print("=" * 55)
print(f" Status : {LpStatus[model.status]}")
print(f" Total Cost : ${value(model.objective):,.2f}")
print("=" * 55)

# Extract results
x_val = np.array([[value(x[f][w]) or 0 for w in W] for f in F])
y_val = np.array([[value(y[w][c]) or 0 for c in C] for w in W])
z_val = np.array([value(z[w]) or 0 for w in W])

open_wh = [w for w in W if z_val[w] > 0.5]
print(f"\n Open warehouses: {['W'+str(w) for w in open_wh]}")

transport_cost = np.sum(cost_fw * x_val) + np.sum(cost_wc * y_val)
fixed_cost = np.sum(wh_fixed * z_val)
print(f"\n Transportation cost : ${transport_cost:,.2f}")
print(f" Fixed (warehouse) : ${fixed_cost:,.2f}")
print(f" Total : ${transport_cost + fixed_cost:,.2f}")

print("\n Factory → Warehouse flows (units):")
for f in F:
    for w in W:
        if x_val[f][w] > 0.01:
            print(f"   F{f} → W{w} : {x_val[f][w]:.0f} units "
                  f"(dist={dist_fw[f][w]:.0f} km)")

print("\n Warehouse → Customer flows (units):")
for w in W:
    for c in C:
        if y_val[w][c] > 0.01:
            print(f"   W{w} → C{c} : {y_val[w][c]:.0f} units "
                  f"(dist={dist_wc[w][c]:.0f} km)")

# ============================================================
# 4. VISUALIZATIONS
# ============================================================

colors_f = ["#E74C3C", "#E67E22", "#F1C40F"]
colors_w = ["#2ECC71", "#1ABC9C", "#27AE60", "#16A085", "#2E86AB"]
colors_c = ["#3498DB", "#2980B9", "#1A6FA3", "#0E4D7A",
            "#5DADE2", "#85C1E9", "#A9CCE3", "#7FB3D3"]

# ---- Figure 1: Network Map ----
fig, ax = plt.subplots(figsize=(12, 10))
ax.set_facecolor("#F0F4F8")
fig.patch.set_facecolor("#E8EEF4")

# Draw service radius circles for open warehouses
for w in open_wh:
    circle = plt.Circle(warehouse_coords[w], d_max,
                        color=colors_w[w], alpha=0.07, zorder=0)
    ax.add_patch(circle)

# Draw flows: factory → warehouse
for f in F:
    for w in W:
        if x_val[f][w] > 0.01:
            lw = 0.5 + 3.5 * x_val[f][w] / supply.max()
            ax.annotate("", factory_coords[f], warehouse_coords[w],
                        arrowprops=dict(arrowstyle="<-", color="#C0392B",
                                        lw=lw, alpha=0.7))

# Draw flows: warehouse → customer
for w in W:
    for c in C:
        if y_val[w][c] > 0.01:
            lw = 0.5 + 3.5 * y_val[w][c] / demand.max()
            ax.annotate("", warehouse_coords[w], customer_coords[c],
                        arrowprops=dict(arrowstyle="<-", color="#2471A3",
                                        lw=lw, alpha=0.6))

# Plot nodes
for f in F:
    ax.scatter(*factory_coords[f], s=280, c=colors_f[f],
               marker="^", zorder=5, edgecolors="white", linewidths=1.5)
    ax.text(factory_coords[f][0]+8, factory_coords[f][1]+8,
            f"F{f}\n({supply[f]}u)", fontsize=9, fontweight="bold",
            color=colors_f[f])

for w in W:
    marker = "s" if w in open_wh else "X"
    alpha = 1.0 if w in open_wh else 0.35
    ax.scatter(*warehouse_coords[w], s=220, c=colors_w[w],
               marker=marker, zorder=5, edgecolors="white",
               linewidths=1.5, alpha=alpha)
    label = f"W{w} ✓" if w in open_wh else f"W{w} ✗"
    ax.text(warehouse_coords[w][0]+8, warehouse_coords[w][1]+8,
            label, fontsize=9, fontweight="bold",
            color=colors_w[w], alpha=alpha)

for c in C:
    ax.scatter(*customer_coords[c], s=180, c=colors_c[c],
               marker="o", zorder=5, edgecolors="white", linewidths=1.5)
    ax.text(customer_coords[c][0]+8, customer_coords[c][1]+8,
            f"C{c}\n({demand[c]}u)", fontsize=8, color="#2C3E50")

leg_elements = [
    mpatches.Patch(color="#E74C3C", label="Factory (▲)"),
    mpatches.Patch(color="#2ECC71", label="Warehouse open (■)"),
    mpatches.Patch(color="#95A5A6", label="Warehouse closed (✗)"),
    mpatches.Patch(color="#3498DB", label="Customer zone (●)"),
    mpatches.Patch(color="#C0392B", alpha=0.6, label="Factory→WH flow"),
    mpatches.Patch(color="#2471A3", alpha=0.6, label="WH→Customer flow"),
]
ax.legend(handles=leg_elements, loc="lower right", fontsize=9,
          framealpha=0.9)

ax.set_title("Supply Chain Network — Optimal Flow Map\n"
             f"(Total Cost: ${value(model.objective):,.0f} | "
             f"Service radius: {d_max} km)",
             fontsize=13, fontweight="bold", pad=14)
ax.set_xlim(0, 540); ax.set_ylim(0, 520)
ax.set_xlabel("X coordinate (km)"); ax.set_ylabel("Y coordinate (km)")
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# ---- Figure 2: Cost Breakdown Bar Chart ----
fig2, axes = plt.subplots(1, 2, figsize=(14, 6))
fig2.patch.set_facecolor("#F8F9FA")

# Left: cost component breakdown
fw_cost = np.sum(cost_fw * x_val)
wc_cost = np.sum(cost_wc * y_val)
fix_cost = np.sum(wh_fixed * z_val)

labels = ["Factory→WH\nTransport", "WH→Customer\nTransport", "Warehouse\nFixed Cost"]
values = [fw_cost, wc_cost, fix_cost]
bar_colors = ["#E74C3C", "#3498DB", "#2ECC71"]

bars = axes[0].bar(labels, values, color=bar_colors, edgecolor="white",
                   linewidth=1.5, width=0.5)
for bar, val in zip(bars, values):
    axes[0].text(bar.get_x() + bar.get_width()/2,
                 bar.get_height() + 300,
                 f"${val:,.0f}", ha="center", fontsize=11, fontweight="bold")

axes[0].set_title("Cost Component Breakdown", fontsize=13, fontweight="bold")
axes[0].set_ylabel("Cost ($)")
axes[0].set_facecolor("#F8F9FA")
axes[0].yaxis.grid(True, alpha=0.4)
axes[0].set_axisbelow(True)

# Right: per-customer service cost
cust_cost = np.array([
    sum(cost_wc[w][c] * y_val[w][c] for w in W) / max(demand[c], 1)
    for c in C
])
cust_labels = [f"C{c}" for c in C]
bar2 = axes[1].bar(cust_labels, cust_cost,
                   color=[colors_c[c] for c in C],
                   edgecolor="white", linewidth=1.5, width=0.6)
for bar, val in zip(bar2, cust_cost):
    axes[1].text(bar.get_x() + bar.get_width()/2,
                 bar.get_height() + 0.3,
                 f"${val:.1f}", ha="center", fontsize=10, fontweight="bold")

axes[1].set_title("WH→Customer Transport Cost per Unit", fontsize=13, fontweight="bold")
axes[1].set_ylabel("Cost per unit ($/unit)")
axes[1].set_xlabel("Customer Zone")
axes[1].set_facecolor("#F8F9FA")
axes[1].yaxis.grid(True, alpha=0.4)
axes[1].set_axisbelow(True)

plt.suptitle("Cost Analysis Dashboard", fontsize=15, fontweight="bold", y=1.01)
plt.tight_layout()
plt.show()

# ---- Figure 3: 3D Surface — Transportation Cost Landscape ----
fig3 = plt.figure(figsize=(14, 6))

# Left 3D: factory→warehouse cost surface
ax3d_1 = fig3.add_subplot(121, projection="3d")
X_idx = np.arange(n_factories)
Y_idx = np.arange(n_warehouses)
X_g, Y_g = np.meshgrid(X_idx, Y_idx)
Z_fw = cost_fw[X_g, Y_g] # shape matches meshgrid

surf1 = ax3d_1.plot_surface(X_g, Y_g, Z_fw,
                            cmap="YlOrRd", edgecolor="white",
                            linewidth=0.4, alpha=0.9)

# Mark active flows with scatter
for f in F:
    for w in W:
        if x_val[f][w] > 0.01:
            ax3d_1.scatter(f, w, cost_fw[f][w],
                           color="black", s=60, zorder=10)

ax3d_1.set_xlabel("Factory"); ax3d_1.set_ylabel("Warehouse")
ax3d_1.set_zlabel("Unit Cost ($)")
ax3d_1.set_xticks(X_idx); ax3d_1.set_xticklabels([f"F{i}" for i in X_idx])
ax3d_1.set_yticks(Y_idx); ax3d_1.set_yticklabels([f"W{i}" for i in Y_idx])
ax3d_1.set_title("Factory → Warehouse\nUnit Transport Cost", fontweight="bold")
fig3.colorbar(surf1, ax=ax3d_1, shrink=0.5, pad=0.1)

# Right 3D: warehouse→customer cost surface
ax3d_2 = fig3.add_subplot(122, projection="3d")
X2_idx = np.arange(n_warehouses)
Y2_idx = np.arange(n_customers)
X2_g, Y2_g = np.meshgrid(X2_idx, Y2_idx)
Z_wc = cost_wc[X2_g, Y2_g]

surf2 = ax3d_2.plot_surface(X2_g, Y2_g, Z_wc,
                            cmap="YlGnBu", edgecolor="white",
                            linewidth=0.4, alpha=0.9)

for w in W:
    for c in C:
        if y_val[w][c] > 0.01:
            ax3d_2.scatter(w, c, cost_wc[w][c],
                           color="black", s=60, zorder=10)

ax3d_2.set_xlabel("Warehouse"); ax3d_2.set_ylabel("Customer")
ax3d_2.set_zlabel("Unit Cost ($)")
ax3d_2.set_xticks(X2_idx); ax3d_2.set_xticklabels([f"W{i}" for i in X2_idx])
ax3d_2.set_yticks(Y2_idx); ax3d_2.set_yticklabels([f"C{i}" for i in Y2_idx])
ax3d_2.set_title("Warehouse → Customer\nUnit Transport Cost", fontweight="bold")
fig3.colorbar(surf2, ax=ax3d_2, shrink=0.5, pad=0.1)

plt.suptitle("3D Cost Landscape (● = active optimal flows)",
             fontsize=13, fontweight="bold")
plt.tight_layout()
plt.show()

# ---- Figure 4: Service Level Heatmap ----
fig4, ax4 = plt.subplots(figsize=(10, 5))

service_matrix = dist_wc.copy()
mask_out = dist_wc > d_max # out-of-range cells

im = ax4.imshow(service_matrix, cmap="RdYlGn_r", aspect="auto",
                vmin=0, vmax=d_max * 1.2)

# Overlay out-of-range cells
for w in W:
    for c in C:
        if mask_out[w][c]:
            ax4.add_patch(plt.Rectangle((c-0.5, w-0.5), 1, 1,
                                        color="black", alpha=0.45))
        # Mark used flows
        if y_val[w][c] > 0.01:
            ax4.text(c, w, f"{y_val[w][c]:.0f}u",
                     ha="center", va="center",
                     fontsize=9, fontweight="bold", color="white")

ax4.set_xticks(range(n_customers))
ax4.set_xticklabels([f"C{c}" for c in C])
ax4.set_yticks(range(n_warehouses))
ax4.set_yticklabels([f"W{w}" for w in W])
ax4.set_xlabel("Customer Zone")
ax4.set_ylabel("Warehouse")
ax4.set_title(f"Service Level Heatmap: WH–Customer Distance "
              f"(dark = out of range >{d_max} km)\n"
              f"White labels = optimal flow (units)", fontweight="bold")
cbar = plt.colorbar(im, ax=ax4)
cbar.set_label("Distance (km)")
plt.tight_layout()
plt.show()

Code Walkthrough

Section 1 — Network Data

We define 3 factories, 5 warehouses, and 8 customers placed on a 500×500 km grid. Geographic coordinates are realistic enough that distances (computed with scipy.spatial.distance.cdist) vary meaningfully across links. Transportation costs scale linearly with distance — a common real-world assumption — with a higher rate on the last-mile warehouse-to-customer leg ($1.20/unit/km vs $0.80/unit/km), reflecting the higher cost of smaller, more frequent deliveries.
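As a quick illustration of how `cdist` produces the distance matrices (a toy example, not the post's data):

```python
import numpy as np
from scipy.spatial.distance import cdist

a = np.array([[0.0, 0.0], [3.0, 4.0]])   # two source points (km)
b = np.array([[0.0, 0.0], [6.0, 8.0]])   # two destination points (km)
d = cdist(a, b)                          # pairwise Euclidean, shape (2, 2)
assert np.isclose(d[1, 0], 5.0)          # (3,4) → origin is 5 km
assert np.isclose(d[0, 1], 10.0)         # origin → (6,8) is 10 km
```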

Section 2 — MILP Formulation with PuLP

The core of the solution is a Mixed-Integer Linear Program (MILP). PuLP is used as the modeling layer on top of the open-source CBC solver (bundled with PuLP, no extra installation needed).

The binary variable $z_w$ is the key to making this a facility location problem rather than just a flow problem. The warehouse capacity constraint y[w][c] <= wh_cap[w] * z[w] is a big-M formulation: if $z_w = 0$, no flow is permitted through warehouse $w$; if $z_w = 1$, up to wh_cap[w] units can pass through.

The service level constraint is enforced cleanly: for each customer–warehouse pair where the distance exceeds d_max, the corresponding flow variable is forced to zero. This avoids ever violating the service radius, even if it would be cheaper.

Section 3 — Solving and Reporting

CBC is called with timeLimit=120 seconds — more than sufficient for this problem size (it typically solves in under 1 second). Results are extracted with value(), then decoded into human-readable factory→warehouse and warehouse→customer flow tables.

Section 4 — Visualizations

Figure 1 — Network Map: A 2D geographic map showing all nodes and optimal flows. Arrow thickness scales with flow volume. Translucent service radius circles show the d_max coverage area for each open warehouse.

Figure 2 — Cost Dashboard: Left panel: total cost split into three components (factory–WH transport, WH–customer transport, fixed warehouse opening costs). Right panel: per-unit last-mile cost for each customer, revealing which zones are expensive to serve.

Figure 3 — 3D Cost Surfaces: Two 3D surface plots show the unit cost landscape across all possible factory→warehouse and warehouse→customer links. Black dots mark which links are actually used in the optimal solution — letting you see instantly whether the optimizer chose the cheapest available routes.

Figure 4 — Service Level Heatmap: A color-coded matrix of all warehouse–customer distances. Dark overlaid cells are out of range (blocked by the service constraint). White text labels show the optimal flow on each used link, making it easy to verify that no out-of-range link carries any flow.


Key Takeaways

  • Not all warehouses need to open. The optimizer balances fixed opening costs against the transportation savings from having more intermediate hubs. Opening too many warehouses increases fixed costs; too few drives up transport costs and risks violating service levels.
  • Service level constraints genuinely bind. When d_max is tight, certain warehouse–customer combinations are forbidden, and the optimizer must find feasible routes that respect geography — sometimes at higher cost.
  • The 3D cost landscape immediately shows why some routes are never chosen: they sit on peaks in the surface, far from the valleys of cheap, nearby links.
  • Last-mile costs dominate. The higher $/unit/km rate for WH→customer delivery means the system strongly prefers placing open warehouses close to high-demand customer zones.

=======================================================
  Status : Optimal
  Total Cost : $578,029.62
=======================================================

  Open warehouses: ['W0', 'W2', 'W4']

  Transportation cost : $533,029.62
  Fixed (warehouse)   : $45,000.00
  Total               : $578,029.62

  Factory → Warehouse flows (units):
    F0 → W0 : 800 units  (dist=158 km)
    F2 → W2 : 700 units  (dist=178 km)
    F2 → W4 : 270 units  (dist=326 km)

  Warehouse → Customer flows (units):
    W0 → C0 : 200 units  (dist=82 km)
    W0 → C1 : 250 units  (dist=82 km)
    W0 → C2 : 180 units  (dist=234 km)
    W0 → C3 : 20 units  (dist=234 km)
    W0 → C5 : 150 units  (dist=240 km)
    W2 → C3 : 200 units  (dist=104 km)
    W2 → C4 : 230 units  (dist=100 km)
    W2 → C7 : 270 units  (dist=136 km)
    W4 → C4 : 70 units  (dist=124 km)
    W4 → C6 : 200 units  (dist=45 km)




Inventory Optimization

Balancing Ordering and Holding Costs with the EOQ Model

Inventory management sits at the heart of supply chain efficiency. Order too frequently and transaction costs pile up; order too rarely and warehouses overflow with capital tied up in stock. The Economic Order Quantity (EOQ) model gives us a mathematically elegant way to find the sweet spot — and in this post, we’ll build a full optimization suite in Python, complete with sensitivity analysis, stochastic extensions, and 3D visualizations.


The Problem Setup

Consider a retail warehouse managing demand for a single product with the following characteristics:

| Parameter | Symbol | Value |
|---|---|---|
| Annual demand | $D$ | 12,000 units/year |
| Ordering cost (per order) | $S$ | ¥5,000 |
| Unit purchase price | $c$ | ¥200 |
| Holding cost rate | $h$ | 20% of unit cost/year |
| Lead time | $L$ | 7 days |
| Lead-time demand std dev | $\sigma_L$ | 50 units |
| Service level target | $z$ | 95% → $z = 1.645$ |

The total annual cost as a function of order quantity $Q$ is:

$$TC(Q) = \frac{D}{Q} \cdot S + \frac{Q}{2} \cdot H + D \cdot c$$

where $H = h \cdot c$ is the annual holding cost per unit.

The classic EOQ formula minimizes this by setting $\frac{dTC}{dQ} = 0$:

$$Q^* = \sqrt{\frac{2DS}{H}}$$

The reorder point with safety stock is:

$$ROP = d \cdot L + z \cdot \sigma_L$$

where $d = D/365$ is daily demand.

The safety stock itself:

$$SS = z \cdot \sigma_L$$
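Plugging the table's numbers into these formulas gives a quick sanity check (a standalone sketch; the full implementation follows below):

```python
import math

D, S, c, h = 12_000, 5_000, 200, 0.20
H = h * c                            # ¥40 holding cost per unit per year
Q_star = math.sqrt(2 * D * S / H)    # EOQ
SS = 1.645 * 50                      # safety stock = z * sigma_L
ROP = (D / 365) * 7 + SS             # reorder point = d * L + SS

assert abs(Q_star - 1732.05) < 0.01  # ≈ 1,732 units per order
assert abs(SS - 82.25) < 1e-6        # ≈ 82 units of safety stock
assert abs(ROP - 312.39) < 0.01      # ≈ 312-unit reorder point
```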

For the EOQ with quantity discounts, we evaluate $TC(Q^*)$ at each price break and pick the minimum feasible cost. For the stochastic (Q,r) policy, we minimize:

$$TC(Q, r) = \frac{D}{Q} \cdot S + \frac{Q}{2} \cdot H + \frac{D}{Q} \cdot \pi \cdot E[B(r)]$$

where $\pi$ is the stockout penalty cost and $E[B(r)]$ is the expected backorder quantity per cycle, approximated via the normal loss function:

$$E[B(r)] = \sigma_L \cdot \left[\phi\!\left(\frac{r - \mu_L}{\sigma_L}\right) - \frac{r - \mu_L}{\sigma_L}\left(1 - \Phi\!\left(\frac{r - \mu_L}{\sigma_L}\right)\right)\right]$$
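A minimal, self-contained sketch of this loss function using `scipy.stats.norm` (`expected_backorders` is an illustrative name; the full source below calls it `normal_loss`):

```python
from scipy.stats import norm

def expected_backorders(r, mu_L, sigma_L):
    """Normal loss function E[max(X - r, 0)] for X ~ N(mu_L, sigma_L**2)."""
    z = (r - mu_L) / sigma_L
    return sigma_L * (norm.pdf(z) - z * (1.0 - norm.cdf(z)))

# At r = mu_L the loss reduces to sigma_L * phi(0) = sigma_L / sqrt(2*pi)
assert abs(expected_backorders(100, 100, 50)
           - 50 / (2 * 3.141592653589793) ** 0.5) < 1e-9
# Raising the reorder point reduces expected backorders
assert expected_backorders(150, 100, 50) < expected_backorders(100, 100, 50)
```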


Full Python Implementation

# ============================================================
# Inventory Optimization: EOQ, Quantity Discounts, (Q,r) Model
# Google Colaboratory – single-file implementation
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from scipy.stats import norm
from scipy.optimize import minimize, minimize_scalar
import warnings
warnings.filterwarnings("ignore")

# ------------------------------------------------------------------
# 0. Global style
# ------------------------------------------------------------------
plt.rcParams.update({
    "figure.facecolor": "#0d1117",
    "axes.facecolor": "#161b22",
    "axes.edgecolor": "#30363d",
    "axes.labelcolor": "#c9d1d9",
    "xtick.color": "#c9d1d9",
    "ytick.color": "#c9d1d9",
    "text.color": "#c9d1d9",
    "grid.color": "#21262d",
    "grid.linestyle": "--",
    "grid.alpha": 0.6,
    "legend.facecolor": "#161b22",
    "legend.edgecolor": "#30363d",
    "font.size": 11,
})

CYAN = "#58a6ff"
ORANGE = "#f78166"
GREEN = "#3fb950"
PURPLE = "#bc8cff"
YELLOW = "#e3b341"

# ------------------------------------------------------------------
# 1. Base parameters
# ------------------------------------------------------------------
D = 12_000 # annual demand (units/year)
S = 5_000 # ordering cost (¥/order)
c = 200 # unit cost (¥/unit)
h = 0.20 # holding cost rate (fraction/year)
H = h * c # holding cost per unit per year (¥/unit/year)
L = 7 # lead time (days)
sigma = 50.0 # std dev of lead-time demand (units)
z_95 = norm.ppf(0.95) # service-level z-score for 95%

# ------------------------------------------------------------------
# 2. Classic EOQ
# ------------------------------------------------------------------
def eoq(D, S, H):
    return np.sqrt(2 * D * S / H)

def total_cost(Q, D, S, H, c):
    return (D / Q) * S + (Q / 2) * H + D * c

Q_star = eoq(D, S, H)
TC_star = total_cost(Q_star, D, S, H, c)
n_orders = D / Q_star        # orders per year
cycle_T = 365 / n_orders     # days per cycle
d_daily = D / 365
SS = z_95 * sigma            # safety stock
ROP = d_daily * L + SS       # reorder point

print("=" * 52)
print(" CLASSIC EOQ RESULTS")
print("=" * 52)
print(f" EOQ (Q*) : {Q_star:>10.1f} units")
print(f" Total annual cost : ¥{TC_star:>10,.0f}")
print(f" Orders per year : {n_orders:>10.1f}")
print(f" Cycle length : {cycle_T:>10.1f} days")
print(f" Safety stock (95%) : {SS:>10.1f} units")
print(f" Reorder point (ROP) : {ROP:>10.1f} units")
print("=" * 52)

# ------------------------------------------------------------------
# 3. Quantity-discount EOQ
# ------------------------------------------------------------------
# Price breaks: (min_order_qty, unit_price)
price_breaks = [
(0, 200),
(500, 190),
(1000, 175),
(2000, 160),
]

def eoq_discount(D, S, h, breaks):
    results = []
    for i, (q_min, ci) in enumerate(breaks):
        Hi = h * ci
        Qi = eoq(D, S, Hi)
        q_max = breaks[i+1][0] - 1 if i + 1 < len(breaks) else np.inf
        # Clamp Q to feasible range for this tier
        Qi_feasible = np.clip(Qi, q_min, q_max)
        TCi = total_cost(Qi_feasible, D, S, Hi, ci)
        results.append({
            "tier": i, "q_min": q_min, "price": ci,
            "Q_eoq": Qi, "Q_used": Qi_feasible, "TC": TCi
        })
    best = min(results, key=lambda x: x["TC"])
    return results, best

discount_results, best_discount = eoq_discount(D, S, h, price_breaks)

print("\n" + "=" * 68)
print(" QUANTITY DISCOUNT ANALYSIS")
print("=" * 68)
print(f" {'Tier':<6} {'Min Q':>7} {'Price':>8} {'EOQ':>8} {'Q Used':>8} {'Ann. Cost':>14}")
print("-" * 68)
for r in discount_results:
    marker = " <-- OPTIMAL" if r["tier"] == best_discount["tier"] else ""
    print(f" {r['tier']:<6} {r['q_min']:>7} ¥{r['price']:>7} "
          f"{r['Q_eoq']:>8.1f} {r['Q_used']:>8.1f} ¥{r['TC']:>13,.0f}{marker}")
print("=" * 68)
print(f" Best order qty : {best_discount['Q_used']:.0f} units "
f"@ ¥{best_discount['price']}/unit")
print(f" Best ann. cost : ¥{best_discount['TC']:,.0f}")

# ------------------------------------------------------------------
# 4. Stochastic (Q, r) Model – normal loss function
# ------------------------------------------------------------------
pi = 1_500 # stockout penalty per unit (¥)
mu_L = d_daily * L # mean lead-time demand

def normal_loss(x):
    """Expected backorders E[max(X - x, 0)] for X ~ N(mu_L, sigma^2)."""
    z = (x - mu_L) / sigma
    return sigma * (norm.pdf(z) - z * (1 - norm.cdf(z)))

def stochastic_TC(params):
    Q, r = params
    if Q <= 0:
        return 1e15
    order_cost = (D / Q) * S
    holding_cost = (Q / 2 + r - mu_L) * H
    stockout_cost = (D / Q) * pi * normal_loss(r)
    return order_cost + holding_cost + stockout_cost

# Grid search initialisation
Q_grid = np.linspace(50, 2000, 80)
r_grid = np.linspace(mu_L, mu_L + 4 * sigma, 80)
QQ, RR = np.meshgrid(Q_grid, r_grid)
TC_grid = np.vectorize(lambda q, r: stochastic_TC([q, r]))(QQ, RR)

idx = np.unravel_index(np.argmin(TC_grid), TC_grid.shape)
Q0, r0 = Q_grid[idx[1]], r_grid[idx[0]]

res = minimize(stochastic_TC, x0=[Q0, r0],
method="Nelder-Mead",
options={"xatol": 0.1, "fatol": 1.0, "maxiter": 5000})

Q_stoch, r_stoch = res.x
TC_stoch = res.fun
SS_stoch = r_stoch - mu_L

print("\n" + "=" * 52)
print(" STOCHASTIC (Q,r) MODEL RESULTS")
print("=" * 52)
print(f" Optimal Q* : {Q_stoch:>10.1f} units")
print(f" Optimal r* : {r_stoch:>10.1f} units")
print(f" Safety stock : {SS_stoch:>10.1f} units")
print(f" Total annual cost : ¥{TC_stoch:>10,.0f}")
print("=" * 52)

# ------------------------------------------------------------------
# 5. Sensitivity analysis arrays
# ------------------------------------------------------------------
D_vals = np.linspace(4_000, 24_000, 100)
S_vals = np.linspace(1_000, 15_000, 100)
H_vals = np.linspace(10, 100, 100)

Q_vs_D = eoq(D_vals, S, H)
Q_vs_S = eoq(D, S_vals, H)
Q_vs_H = eoq(D, S, H_vals)

TC_vs_D = total_cost(eoq(D_vals, S, H), D_vals, S, H, c)
TC_vs_S = total_cost(eoq(D, S_vals, H), D, S_vals, H, c)
TC_vs_H = total_cost(eoq(D, S, H_vals), D, S, H_vals, c)

# ------------------------------------------------------------------
# 6. Figure 1 – EOQ fundamentals (2×3 grid)
# ------------------------------------------------------------------
fig1 = plt.figure(figsize=(18, 11))
fig1.suptitle("EOQ Inventory Optimization — Fundamentals",
fontsize=16, fontweight="bold", color="#e6edf3", y=0.98)
gs = gridspec.GridSpec(2, 3, figure=fig1, hspace=0.45, wspace=0.35)

Q_range = np.linspace(50, 3000, 500)

# --- Panel (0,0): Cost decomposition
ax = fig1.add_subplot(gs[0, 0])
ordering = (D / Q_range) * S
holding = (Q_range / 2) * H
total = ordering + holding + D * c
ax.plot(Q_range, ordering / 1e6, color=ORANGE, lw=2, label="Ordering cost")
ax.plot(Q_range, holding / 1e6, color=CYAN, lw=2, label="Holding cost")
ax.plot(Q_range, total / 1e6, color=GREEN, lw=2.5, label="Total cost")
ax.axvline(Q_star, color=YELLOW, lw=1.8, ls="--", label=f"Q*={Q_star:.0f}")
ax.set_xlabel("Order Quantity Q (units)")
ax.set_ylabel("Annual Cost (M¥)")
ax.set_title("Cost Decomposition", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (0,1): Inventory level over time (sawtooth)
ax = fig1.add_subplot(gs[0, 1])
n_cycles = 4
T_cycle = cycle_T / 365
t_full = n_cycles * T_cycle
t_pts = np.linspace(0, t_full, 800)
inv = np.zeros_like(t_pts)
for i in range(n_cycles):
    mask = (t_pts >= i * T_cycle) & (t_pts < (i + 1) * T_cycle)
    frac = (t_pts[mask] - i * T_cycle) / T_cycle
    inv[mask] = Q_star * (1 - frac) + SS
ax.plot(t_pts * 365, inv, color=CYAN, lw=2)
ax.axhline(SS, color=PURPLE, lw=1.5, ls="--", label=f"Safety stock={SS:.0f}")
ax.axhline(ROP, color=ORANGE, lw=1.5, ls="--", label=f"ROP={ROP:.0f}")
ax.fill_between(t_pts * 365, SS, inv, alpha=0.15, color=CYAN)
ax.set_xlabel("Time (days)")
ax.set_ylabel("Inventory Level (units)")
ax.set_title("Inventory Sawtooth Pattern", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (0,2): Quantity discount TC comparison
ax = fig1.add_subplot(gs[0, 2])
Q_disc = np.linspace(100, 2500, 500)
colors_tier = [CYAN, GREEN, ORANGE, PURPLE]
for i, (q_min, ci) in enumerate(price_breaks):
    q_max = price_breaks[i+1][0] if i + 1 < len(price_breaks) else 2500
    Hi = h * ci
    tc_tier = total_cost(Q_disc, D, S, Hi, ci)
    mask = (Q_disc >= q_min) & (Q_disc <= q_max)  # restrict each curve to its feasible tier
    ax.plot(Q_disc[mask], tc_tier[mask] / 1e6,
            color=colors_tier[i], lw=2, label=f"¥{ci}/unit")
ax.axvline(best_discount["Q_used"], color=YELLOW, lw=1.8, ls="--",
label=f"Q*={best_discount['Q_used']:.0f}")
ax.set_xlabel("Order Quantity Q (units)")
ax.set_ylabel("Annual Cost (M¥)")
ax.set_title("Quantity Discount Tiers", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (1,0): Sensitivity – Q* vs D
ax = fig1.add_subplot(gs[1, 0])
ax.plot(D_vals, Q_vs_D, color=CYAN, lw=2)
ax.axvline(D, color=YELLOW, lw=1.5, ls="--", label=f"D={D}")
ax.axhline(Q_star, color=ORANGE, lw=1.5, ls="--", label=f"Q*={Q_star:.0f}")
ax.set_xlabel("Annual Demand D (units)")
ax.set_ylabel("Optimal Q* (units)")
ax.set_title("Q* Sensitivity to Demand", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (1,1): Sensitivity – TC vs S and H (heatmap)
ax = fig1.add_subplot(gs[1, 1])
S_2d = np.linspace(1_000, 15_000, 60)
H_2d = np.linspace(10, 100, 60)
SS2, HH2 = np.meshgrid(S_2d, H_2d)
Q_2d = eoq(D, SS2, HH2)
TC_2d = total_cost(Q_2d, D, SS2, HH2, c) / 1e6
im = ax.contourf(S_2d, H_2d, TC_2d, levels=20, cmap="plasma")
plt.colorbar(im, ax=ax, label="Annual Cost (M¥)")
ax.scatter([S], [H], color=YELLOW, s=80, zorder=5, label="Base case")
ax.set_xlabel("Ordering Cost S (¥)")
ax.set_ylabel("Holding Cost H (¥/unit/yr)")
ax.set_title("Total Cost Heatmap (S vs H)", fontweight="bold")
ax.legend(fontsize=9)

# --- Panel (1,2): Stochastic – service level vs safety stock
ax = fig1.add_subplot(gs[1, 2])
sl_vals = np.linspace(0.80, 0.999, 200)
z_vals = norm.ppf(sl_vals)
ss_vals = z_vals * sigma
ax.plot(sl_vals * 100, ss_vals, color=PURPLE, lw=2)
ax.axvline(95, color=YELLOW, lw=1.5, ls="--", label="95% SL")
ax.axhline(SS, color=ORANGE, lw=1.5, ls="--", label=f"SS={SS:.1f}")
ax.set_xlabel("Service Level (%)")
ax.set_ylabel("Safety Stock (units)")
ax.set_title("Safety Stock vs Service Level", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

plt.savefig("fig1_eoq_fundamentals.png", dpi=150, bbox_inches="tight",
facecolor=fig1.get_facecolor())
plt.show()

# ------------------------------------------------------------------
# 7. Figure 2 – 3D visualizations
# ------------------------------------------------------------------
fig2 = plt.figure(figsize=(20, 13))
fig2.suptitle("EOQ Inventory Optimization — 3D Analysis",
fontsize=16, fontweight="bold", color="#e6edf3", y=0.98)
gs2 = gridspec.GridSpec(2, 3, figure=fig2, hspace=0.4, wspace=0.3)

# --- 3D (0,0): Total cost surface TC(Q, D)
ax3d = fig2.add_subplot(gs2[0, 0], projection="3d")
Q_3d = np.linspace(100, 2500, 60)
D_3d = np.linspace(4_000, 24_000, 60)
QQ3, DD3 = np.meshgrid(Q_3d, D_3d)
TC3 = total_cost(QQ3, DD3, S, H, c) / 1e6
surf = ax3d.plot_surface(QQ3, DD3, TC3, cmap="plasma",
alpha=0.88, linewidth=0)
# Optimal ridge
D_ridge = np.linspace(4_000, 24_000, 100)
Q_ridge = eoq(D_ridge, S, H)
TC_ridge = total_cost(Q_ridge, D_ridge, S, H, c) / 1e6
ax3d.plot(Q_ridge, D_ridge, TC_ridge, color=YELLOW, lw=2.5, label="EOQ ridge")
ax3d.set_xlabel("Q (units)", labelpad=8)
ax3d.set_ylabel("D (units/yr)", labelpad=8)
ax3d.set_zlabel("TC (M¥)", labelpad=8)
ax3d.set_title("TC Surface: Q vs D", fontweight="bold")
ax3d.set_facecolor("#161b22")
fig2.colorbar(surf, ax=ax3d, shrink=0.5, label="M¥")

# --- 3D (0,1): TC(S, H) surface
ax3d2 = fig2.add_subplot(gs2[0, 1], projection="3d")
S_3d2 = np.linspace(500, 15_000, 50)
H_3d2 = np.linspace(5, 100, 50)
SS3d, HH3d = np.meshgrid(S_3d2, H_3d2)
Q_opt3d = eoq(D, SS3d, HH3d)
TC_opt3d = total_cost(Q_opt3d, D, SS3d, HH3d, c) / 1e6
surf2 = ax3d2.plot_surface(SS3d, HH3d, TC_opt3d, cmap="viridis",
alpha=0.88, linewidth=0)
ax3d2.scatter([S], [H], [TC_star / 1e6], color=YELLOW,
s=60, zorder=5)
ax3d2.set_xlabel("S (¥/order)", labelpad=8)
ax3d2.set_ylabel("H (¥/unit/yr)", labelpad=8)
ax3d2.set_zlabel("TC* (M¥)", labelpad=8)
ax3d2.set_title("Optimal TC Surface: S vs H", fontweight="bold")
ax3d2.set_facecolor("#161b22")
fig2.colorbar(surf2, ax=ax3d2, shrink=0.5, label="M¥")

# --- 3D (0,2): Stochastic TC(Q, r)
ax3d3 = fig2.add_subplot(gs2[0, 2], projection="3d")
Q_s3d = np.linspace(50, 2000, 50)
r_s3d = np.linspace(mu_L, mu_L + 4 * sigma, 50)
QQs, RRs = np.meshgrid(Q_s3d, r_s3d)
TCs_grid = np.vectorize(lambda q, r: stochastic_TC([q, r]))(QQs, RRs) / 1e6
surf3 = ax3d3.plot_surface(QQs, RRs, TCs_grid, cmap="inferno",
alpha=0.88, linewidth=0)
ax3d3.scatter([Q_stoch], [r_stoch], [TC_stoch / 1e6],
color=YELLOW, s=80, zorder=5, label="Optimum")
ax3d3.set_xlabel("Q (units)", labelpad=8)
ax3d3.set_ylabel("r (units)", labelpad=8)
ax3d3.set_zlabel("TC (M¥)", labelpad=8)
ax3d3.set_title("Stochastic TC(Q, r)", fontweight="bold")
ax3d3.set_facecolor("#161b22")
fig2.colorbar(surf3, ax=ax3d3, shrink=0.5, label="M¥")

# --- 3D (1,0): EOQ surface vs (D, S)
ax3d4 = fig2.add_subplot(gs2[1, 0], projection="3d")
D_4d = np.linspace(2_000, 24_000, 50)
S_4d = np.linspace(500, 15_000, 50)
DD4, SS4 = np.meshgrid(D_4d, S_4d)
EOQ4 = eoq(DD4, SS4, H)
surf4 = ax3d4.plot_surface(DD4, SS4, EOQ4, cmap="cool",
alpha=0.88, linewidth=0)
ax3d4.set_xlabel("D (units/yr)", labelpad=8)
ax3d4.set_ylabel("S (¥/order)", labelpad=8)
ax3d4.set_zlabel("Q* (units)", labelpad=8)
ax3d4.set_title("EOQ Surface: D vs S", fontweight="bold")
ax3d4.set_facecolor("#161b22")
fig2.colorbar(surf4, ax=ax3d4, shrink=0.5, label="units")

# --- 3D (1,1): Holding vs Ordering cost landscape
ax3d5 = fig2.add_subplot(gs2[1, 1], projection="3d")
Q_5d = np.linspace(50, 3000, 60)
D_5d = np.linspace(2_000, 24_000, 60)
QQ5, DD5 = np.meshgrid(Q_5d, D_5d)
hold5 = (QQ5 / 2) * H / 1e6
order5 = (DD5 / QQ5) * S / 1e6
ax3d5.plot_surface(QQ5, DD5, hold5, cmap="Blues", alpha=0.6, linewidth=0)
ax3d5.plot_surface(QQ5, DD5, order5, cmap="Oranges", alpha=0.6, linewidth=0)
ax3d5.set_xlabel("Q (units)", labelpad=8)
ax3d5.set_ylabel("D (units/yr)", labelpad=8)
ax3d5.set_zlabel("Cost (M¥)", labelpad=8)
ax3d5.set_title("Holding (blue) vs Ordering (orange)", fontweight="bold")
ax3d5.set_facecolor("#161b22")

# --- 3D (1,2): Safety stock landscape (sigma vs z-score)
ax3d6 = fig2.add_subplot(gs2[1, 2], projection="3d")
sig_vals = np.linspace(10, 150, 50)
sl_3d = np.linspace(0.80, 0.999, 50)
SIG6, SL6 = np.meshgrid(sig_vals, sl_3d)
Z6 = norm.ppf(SL6)
SS6 = Z6 * SIG6
surf6 = ax3d6.plot_surface(SIG6, SL6 * 100, SS6, cmap="magma",
alpha=0.88, linewidth=0)
ax3d6.scatter([sigma], [95], [SS], color=YELLOW, s=80, zorder=5)
ax3d6.set_xlabel("σ_L (units)", labelpad=8)
ax3d6.set_ylabel("Service Level (%)", labelpad=8)
ax3d6.set_zlabel("Safety Stock (units)", labelpad=8)
ax3d6.set_title("Safety Stock Landscape", fontweight="bold")
ax3d6.set_facecolor("#161b22")
fig2.colorbar(surf6, ax=ax3d6, shrink=0.5, label="units")

plt.savefig("fig2_eoq_3d.png", dpi=150, bbox_inches="tight",
facecolor=fig2.get_facecolor())
plt.show()

# ------------------------------------------------------------------
# 8. Figure 3 – Simulation & advanced analysis (2×3)
# ------------------------------------------------------------------
np.random.seed(42)

# Monte Carlo inventory simulation
def simulate_inventory(Q, ROP, D_mean, sigma_demand, n_days=365, n_runs=300):
    """Simulate a (Q, ROP) policy with stochastic daily demand."""
    costs_out = []
    for _ in range(n_runs):
        demand = np.random.normal(D_mean / 365, sigma_demand / np.sqrt(365),
                                  n_days).clip(0)
        inv = Q + SS
        on_order = 0
        order_cost_total = 0
        hold_cost_total = 0
        lost_sales = 0
        lead_countdown = 0

        for day in range(n_days):
            if lead_countdown > 0:
                lead_countdown -= 1
                if lead_countdown == 0:
                    inv += Q
                    on_order = 0

            inv -= demand[day]
            if inv < 0:
                lost_sales += abs(inv)
                inv = 0

            hold_cost_total += max(inv, 0) * (H / 365)
            if inv <= ROP and on_order == 0:
                order_cost_total += S
                on_order = Q
                lead_countdown = L

        costs_out.append(order_cost_total + hold_cost_total +
                         lost_sales * pi / n_days * 365)
    return np.array(costs_out)

print("Running Monte Carlo simulations …")
runs_eoq = simulate_inventory(Q_star, ROP, D, sigma * np.sqrt(365))
runs_stoch = simulate_inventory(Q_stoch, r_stoch, D, sigma * np.sqrt(365))
runs_small = simulate_inventory(Q_star * 0.5, ROP, D, sigma * np.sqrt(365))
runs_large = simulate_inventory(Q_star * 2.0, ROP, D, sigma * np.sqrt(365))

fig3 = plt.figure(figsize=(18, 11))
fig3.suptitle("EOQ Inventory Optimization — Simulation & Advanced Analysis",
fontsize=16, fontweight="bold", color="#e6edf3", y=0.98)
gs3 = gridspec.GridSpec(2, 3, figure=fig3, hspace=0.45, wspace=0.35)

# --- Panel (0,0): Monte Carlo cost distributions
ax = fig3.add_subplot(gs3[0, 0])
bins = np.linspace(2.2e6, 3.2e6, 40)
ax.hist(runs_eoq / 1e6, bins=bins, alpha=0.65, color=CYAN, label=f"EOQ Q*={Q_star:.0f}")
ax.hist(runs_stoch / 1e6, bins=bins, alpha=0.65, color=GREEN, label=f"Stochastic Q={Q_stoch:.0f}")
ax.hist(runs_small / 1e6, bins=bins, alpha=0.65, color=ORANGE, label=f"½Q*={Q_star*0.5:.0f}")
ax.hist(runs_large / 1e6, bins=bins, alpha=0.65, color=PURPLE, label=f"2Q*={Q_star*2:.0f}")
ax.set_xlabel("Annual Cost (M¥)")
ax.set_ylabel("Frequency")
ax.set_title("Monte Carlo Cost Distributions", fontweight="bold")
ax.legend(fontsize=8)
ax.grid(True)

# --- Panel (0,1): Cumulative cost over year (single run)
ax = fig3.add_subplot(gs3[0, 1])
np.random.seed(0)
days = np.arange(365)
demand_daily = np.random.normal(D / 365, sigma / np.sqrt(365), 365).clip(0)
cum_demand = np.cumsum(demand_daily)
cum_eoq_cost = ((np.floor(cum_demand / Q_star) + 1) * S       # ordering cost so far
                + (Q_star / 2) * H * (days + 1) / 365          # holding on average inventory
                + cum_demand * c)                              # purchases to date
ax.plot(days, cum_eoq_cost / 1e3, color=CYAN, lw=2, label="EOQ policy")
ax.plot(days, np.linspace(0, TC_star / 1e3, 365),
color=YELLOW, lw=1.5, ls="--", label="Theoretical")
ax.set_xlabel("Day of Year")
ax.set_ylabel("Cumulative Cost (k¥)")
ax.set_title("Cumulative Cost Trajectory", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (0,2): Tornado chart – sensitivity
ax = fig3.add_subplot(gs3[0, 2])
params_labels = ["Demand D", "Order cost S", "Hold rate h",
"Lead time L", "σ (demand)"]
low_vals = np.array([eoq(D*0.8, S, H), eoq(D, S*0.8, H),
eoq(D, S, H*0.8), Q_star, Q_star])
high_vals = np.array([eoq(D*1.2, S, H), eoq(D, S*1.2, H),
eoq(D, S, H*1.2), Q_star, Q_star])
deltas = high_vals - low_vals
order_idx = np.argsort(deltas)
y_pos = np.arange(len(params_labels))
bars_lo = ax.barh(y_pos, low_vals[order_idx] - Q_star,
left=Q_star, color=ORANGE, alpha=0.8, label="−20%")
bars_hi = ax.barh(y_pos, high_vals[order_idx] - Q_star,
left=Q_star, color=CYAN, alpha=0.8, label="+20%")
ax.set_yticks(y_pos)
ax.set_yticklabels([params_labels[i] for i in order_idx], fontsize=9)
ax.axvline(Q_star, color=YELLOW, lw=1.5, ls="--", label=f"Q*={Q_star:.0f}")
ax.set_xlabel("Optimal Q* (units)")
ax.set_title("Tornado: Q* Sensitivity", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True, axis="x")

# --- Panel (1,0): Normal loss function & backorders
ax = fig3.add_subplot(gs3[1, 0])
r_vals = np.linspace(mu_L - sigma, mu_L + 4 * sigma, 300)
loss = np.array([normal_loss(r) for r in r_vals])
ax.plot(r_vals, loss, color=PURPLE, lw=2, label="E[B(r)]")
ax.axvline(r_stoch, color=YELLOW, lw=1.5, ls="--", label=f"r*={r_stoch:.1f}")
ax.axvline(mu_L, color=ORANGE, lw=1.5, ls="--", label=f"μ_L={mu_L:.1f}")
ax.fill_between(r_vals, loss, alpha=0.2, color=PURPLE)
ax.set_xlabel("Reorder Point r (units)")
ax.set_ylabel("Expected Backorders E[B(r)]")
ax.set_title("Normal Loss Function", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (1,1): Cost breakdown comparison (bar chart)
ax = fig3.add_subplot(gs3[1, 1])
policies = ["EOQ\nClassic", "Qty Discount\nOptimal",
"Stochastic\n(Q,r)", "½Q*", "2Q*"]
Q_all = [Q_star, best_discount["Q_used"], Q_stoch, Q_star * 0.5, Q_star * 2]
c_all = [c, best_discount["price"], c, c, c]
H_all = [H, h * best_discount["price"], H, H, H]
ord_c = [D / q * S for q in Q_all]
hld_c = [q / 2 * hi for q, hi in zip(Q_all, H_all)]
pur_c = [D * ci for ci in c_all]
x = np.arange(len(policies))
w = 0.55
b1 = ax.bar(x, np.array(ord_c) / 1e6, w, label="Ordering", color=ORANGE)
b2 = ax.bar(x, np.array(hld_c) / 1e6, w, bottom=np.array(ord_c) / 1e6,
label="Holding", color=CYAN)
b3 = ax.bar(x, np.array(pur_c) / 1e6, w,
bottom=(np.array(ord_c) + np.array(hld_c)) / 1e6,
label="Purchase", color=PURPLE)
ax.set_xticks(x)
ax.set_xticklabels(policies, fontsize=8)
ax.set_ylabel("Annual Cost (M¥)")
ax.set_title("Cost Breakdown by Policy", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True, axis="y")

# --- Panel (1,2): EOQ vs demand – order frequency
ax = fig3.add_subplot(gs3[1, 2])
D_plot = np.linspace(1_000, 30_000, 300)
Q_plot = eoq(D_plot, S, H)
freq = D_plot / Q_plot
days_c = 365 / freq
ax.plot(D_plot, freq, color=CYAN, lw=2, label="Orders/year")
ax2_ = ax.twinx()
ax2_.plot(D_plot, days_c, color=ORANGE, lw=2, ls="--", label="Cycle (days)")
ax.axvline(D, color=YELLOW, lw=1.5, ls=":", label=f"D={D}")
ax.set_xlabel("Annual Demand D (units)")
ax.set_ylabel("Order Frequency (orders/year)", color=CYAN)
ax2_.set_ylabel("Cycle Length (days)", color=ORANGE)
ax.set_title("Order Frequency vs Demand", fontweight="bold")
ax.tick_params(axis="y", labelcolor=CYAN)
ax2_.tick_params(axis="y", labelcolor=ORANGE)
lines1, labels1 = ax.get_legend_handles_labels()
lines2, labels2 = ax2_.get_legend_handles_labels()
ax.legend(lines1 + lines2, labels1 + labels2, fontsize=9)
ax.grid(True)

plt.savefig("fig3_eoq_simulation.png", dpi=150, bbox_inches="tight",
facecolor=fig3.get_facecolor())
plt.show()

print("\n" + "=" * 52)
print(" FINAL COMPARISON SUMMARY")
print("=" * 52)
print(f" Classic EOQ Q*={Q_star:6.0f} TC=¥{TC_star:,.0f}")
print(f" Qty Discount Q*={best_discount['Q_used']:6.0f} "
f"TC=¥{best_discount['TC']:,.0f}")
print(f" Stochastic(Q,r) Q*={Q_stoch:6.1f} TC=¥{TC_stoch:,.0f}")
print("=" * 52)

Code Walkthrough

Section 0–1: Parameters and Style

The dark-theme rcParams block establishes a consistent visual identity across all figures. All base parameters are defined as module-level constants so every downstream function inherits the same problem instance without re-passing arguments.

Section 2: Classic EOQ

eoq() implements $Q^* = \sqrt{2DS/H}$ in a vectorized form so the same function works for both scalars and NumPy arrays. total_cost() computes all three cost components simultaneously. The reorder point and safety stock are derived analytically from the normal distribution inverse CDF via norm.ppf.
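The vectorization claim is easy to verify: the same closed form evaluates on a scalar and a NumPy array in one call (a minimal sketch using the post's base case $D=12{,}000$, $S=5{,}000$, $H=40$):

```python
import numpy as np

def eoq(D, S, H):
    # Q* = sqrt(2DS/H); works for scalars and arrays via NumPy broadcasting
    return np.sqrt(2 * D * S / H)

Q_scalar = eoq(12_000, 5_000, 40)                              # single problem instance
Q_array = eoq(np.array([6_000, 12_000, 24_000]), 5_000, 40)    # a demand sweep in one call

print(round(float(Q_scalar), 1))   # ~1732.1 units
print(np.round(Q_array, 1))
```

This is why the sensitivity sweeps in Section 5 can pass whole `np.linspace` arrays straight into `eoq()` without loops.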

Section 3: Quantity Discount EOQ

eoq_discount() iterates over every price tier, computes the unconstrained EOQ for that tier’s holding rate, then clamps it to the tier’s feasible range with np.clip. The minimum total-cost tier wins — this correctly handles the case where a larger batch at a lower price beats the pure EOQ.
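The clamping step can be checked by hand for the winning tier: the unconstrained EOQ at ¥160/unit falls below the 2,000-unit minimum, so `np.clip` pushes the order up to the tier boundary (a standalone sketch of that one tier):

```python
import numpy as np

D, S, h = 12_000, 5_000, 0.20
price = 160                 # tier-3 unit price
q_min, q_max = 2_000, np.inf
Hi = h * price              # tier-specific holding cost = 32 yen/unit/yr

Qi = np.sqrt(2 * D * S / Hi)              # unconstrained EOQ for this tier
Qi_feasible = np.clip(Qi, q_min, q_max)   # clamp into the tier's feasible range
TC = (D / Qi_feasible) * S + (Qi_feasible / 2) * Hi + D * price

print(round(float(Qi), 1), float(Qi_feasible), round(float(TC)))
# 1936.5 2000.0 1982000
```

This reproduces the tier-3 row of the quantity discount table below.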

Section 4: Stochastic (Q, r) Model

normal_loss(x) computes $E[\max(X-r,0)]$ analytically for a normal distribution — this is faster and more accurate than numerical integration. stochastic_TC() assembles the three-part objective (ordering + holding + expected stockout penalty). A coarse grid search seeds the Nelder-Mead optimizer to avoid local minima on this non-convex surface.
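The "faster and more accurate" claim can be tested directly by comparing the closed form against numerical integration of the same expectation (an illustrative check; the reorder point 300 is arbitrary):

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

mu_L, sigma = 230.1, 50.0   # lead-time demand parameters from the post

def normal_loss(r):
    # Closed-form E[max(X - r, 0)] for X ~ N(mu_L, sigma^2)
    z = (r - mu_L) / sigma
    return sigma * (norm.pdf(z) - z * (1 - norm.cdf(z)))

r = 300.0
analytic = normal_loss(r)
numeric, _ = quad(lambda x: (x - r) * norm.pdf(x, mu_L, sigma), r, np.inf)
print(abs(analytic - numeric) < 1e-6)   # the two agree to quadrature tolerance
```

The closed form costs two `norm` evaluations per call, which matters inside the 80×80 grid search and the Nelder-Mead iterations.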

Sections 5–8: Analysis and Visualization

Three separate figures are generated:

  • Figure 1 covers the 2D fundamentals: cost decomposition curves, the sawtooth inventory trajectory, discount tiers, sensitivity sweeps, a contourf heatmap, and the safety stock vs. service level curve.

  • Figure 2 provides six 3D surfaces. The EOQ ridge on the $TC(Q, D)$ surface shows visually why optimal $Q^*$ grows with demand. The stochastic $TC(Q, r)$ surface has a distinct bowl-shaped minimum. The safety stock landscape quantifies how both demand variability and target service level jointly drive buffer inventory.

  • Figure 3 runs a 300-trial Monte Carlo simulation of four competing policies, computes cumulative cost trajectories, builds a tornado chart for parameter sensitivity, plots the normal loss function alongside the optimal reorder point, compares stacked cost breakdowns, and shows how order frequency scales with demand.


Results and Interpretation

====================================================
  CLASSIC EOQ RESULTS
====================================================
  EOQ (Q*)            :     1732.1  units
  Total annual cost   : ¥ 2,469,282
  Orders per year     :        6.9
  Cycle length        :       52.7  days
  Safety stock (95%)  :       82.2  units
  Reorder point (ROP) :      312.4  units
====================================================

====================================================================
  QUANTITY DISCOUNT ANALYSIS
====================================================================
  Tier     Min Q    Price      EOQ   Q Used      Ann. Cost
--------------------------------------------------------------------
  0            0 ¥    200   1732.1    499.0 ¥    2,530,220
  1          500 ¥    190   1777.0    999.0 ¥    2,359,041
  2         1000 ¥    175   1851.6   1851.6 ¥    2,164,807
  3         2000 ¥    160   1936.5   2000.0 ¥    1,982,000 <-- OPTIMAL
====================================================================
  Best order qty : 2000 units  @ ¥160/unit
  Best ann. cost : ¥1,982,000

====================================================
  STOCHASTIC (Q,r) MODEL RESULTS
====================================================
  Optimal Q*          :     1747.6  units
  Optimal r*          :      363.2  units
  Safety stock        :      133.1  units
  Total annual cost   : ¥    75,227
====================================================



Running Monte Carlo simulations …

====================================================
  FINAL COMPARISON SUMMARY
====================================================
  Classic EOQ     Q*=  1732  TC=¥2,469,282
  Qty Discount    Q*=  2000  TC=¥1,982,000
  Stochastic(Q,r) Q*=1747.6  TC=¥75,227
====================================================

Classic EOQ

The optimal order quantity is $Q^* \approx 1{,}732$ units, placed roughly 7 times per year (every 53 days). At this point, ordering and holding costs are exactly equal — the fundamental EOQ balancing condition — yielding a minimum variable cost. The safety stock of about 82 units ensures a 95% cycle service level given the lead-time demand variability.
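The balancing condition is worth verifying numerically: at $Q^*$ the annual ordering cost $(D/Q^*)S$ equals the annual holding cost $(Q^*/2)H$ (a quick check with the post's base parameters):

```python
import numpy as np

D, S, H = 12_000, 5_000, 40
Q_star = np.sqrt(2 * D * S / H)

ordering = (D / Q_star) * S   # annual ordering cost at the optimum
holding = (Q_star / 2) * H    # annual holding cost at the optimum

print(round(float(ordering)), round(float(holding)))  # equal at Q*
```

Any deviation from $Q^*$ breaks this equality, which is exactly what the U-shaped total-cost curve in Figure 1 shows.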

Quantity Discount Impact

Moving to the 2,000-unit tier (¥160/unit) drops the unit purchase price by 20%, which more than compensates for the higher holding cost from carrying larger batches. The total annual cost falls substantially compared to the base EOQ at full price — demonstrating that procurement price dominates the total cost function when discount spreads are large.
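A back-of-envelope check of that claim, using the tier data from the analysis table: the 20% price cut saves ¥480,000/yr in purchase cost, which dwarfs any change in the ordering-plus-holding total from moving off the unconstrained EOQ (the ~69,282 figure below is the base-case variable cost, rounded):

```python
D, S = 12_000, 5_000

# Base EOQ at full price
c0, H0, Q0 = 200, 40, 1732.1
base_var = (D / Q0) * S + (Q0 / 2) * H0    # ~69,282 yen/yr variable cost

# Discount tier: order 2,000 units at 160 yen
c1, H1, Q1 = 160, 32, 2000
disc_var = (D / Q1) * S + (Q1 / 2) * H1    # 62,000 yen/yr variable cost

purchase_saving = D * (c0 - c1)            # 480,000 yen/yr
print(round(purchase_saving), round(base_var - disc_var))
```

Note that in this instance the variable cost actually drops too, because the lower unit price also lowers the per-unit holding cost.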

Stochastic (Q, r) Policy

The 3D surface $TC(Q, r)$ reveals a well-defined minimum that sits at a higher reorder point than the simple ROP formula suggests. The stochastic model explicitly trades higher holding cost (larger safety stock) against a lower expected stockout penalty, converging on a balanced optimum. The Monte Carlo simulation confirms that this policy produces a tighter cost distribution than the naïve EOQ applied to the same system. One caveat when reading the headline number: stochastic_TC omits the annual purchase cost $D \cdot c$ (≈ ¥2.4M), so its ¥75,227 total covers only ordering, holding, and expected stockout costs and is not directly comparable to the classic EOQ total in the final summary.

Sensitivity Insights

The tornado chart shows that demand $D$ and ordering cost $S$ are the dominant drivers of $Q^*$, each contributing roughly proportional influence as expected from the square-root formula. Holding cost rate has moderate influence, while lead time affects only the reorder point, not $Q^*$ itself. The cost heatmap confirms that high-$S$, low-$H$ environments (infrequent but expensive orders with cheap storage) push optimal batch sizes up dramatically, while the reverse is true for fast-moving, expensive-to-store items.
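The square-root dampening behind the tornado chart is easy to demonstrate: a ±20% change in $D$ (or $S$) moves $Q^*$ by only about ±10%, since $\sqrt{1.2} \approx 1.095$ and $\sqrt{0.8} \approx 0.894$:

```python
import numpy as np

D, S, H = 12_000, 5_000, 40
eoq = lambda d: np.sqrt(2 * d * S / H)

ratio_up = eoq(1.2 * D) / eoq(D)   # = sqrt(1.2), independent of the base values
ratio_dn = eoq(0.8 * D) / eoq(D)   # = sqrt(0.8)
print(round(float(ratio_up), 4), round(float(ratio_dn), 4))
# 1.0954 0.8944
```

The same ratios apply to $S$ (in the numerator) and, inverted, to $H$ (in the denominator), which is why all three bars in the tornado chart have comparable magnitude.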

Production Scheduling Optimization

Minimizing Tardiness & Maximizing Utilization

Modern manufacturing and service operations face a fundamental tension: finish jobs on time and keep machines busy. In this post, we tackle both objectives simultaneously using Integer Linear Programming (ILP), visualize the resulting Gantt chart in 3D, and dissect every trade-off the solver finds.


Problem Setup

We have 5 jobs to schedule on 3 machines. Each job must visit all three machines in a fixed order (a classic flow shop). The data:

Job   Processing Times (M1, M2, M3)   Due Date   Weight
J1    4, 2, 3                         12         2
J2    3, 5, 2                         15         1
J3    2, 3, 4                         10         3
J4    5, 1, 3                         20         1
J5    1, 4, 2                         14         2

Objective (weighted sum, bi-criteria):

$$\min \; \alpha \sum_{j} w_j T_j \;-\; (1-\alpha)\, U$$

where:

  • $T_j = \max(C_j - d_j,\; 0)$ — tardiness of job $j$
  • $C_j$ — completion time of job $j$ on the last machine
  • $d_j$ — due date of job $j$
  • $w_j$ — weight (priority) of job $j$
  • $U$ — overall machine utilization $\in [0,1]$
  • $\alpha \in [0,1]$ — trade-off parameter

Decision Variables

$$x_{j,k} \in \{0,1\}: \text{job } j \text{ is scheduled at position } k$$

$$S_{j,m} \geq 0: \text{start time of job } j \text{ on machine } m$$

Constraints

Permutation constraint (each job assigned to exactly one position, each position filled by exactly one job):

$$\sum_k x_{j,k} = 1 \;\forall j, \qquad \sum_j x_{j,k} = 1 \;\forall k$$

Precedence within a job (machine order must be respected):

$$S_{j,m+1} \geq S_{j,m} + p_{j,m} \;\forall j, m$$

No overlap on each machine (jobs at consecutive positions cannot overlap):

$$S_{\sigma(k+1),m} \geq S_{\sigma(k),m} + p_{\sigma(k),m} \;\forall k, m$$

Tardiness linearization:

$$T_j \geq C_j - d_j, \quad T_j \geq 0$$
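The linearization works because, under minimization, the solver pushes $T_j$ down onto $\max(C_j - d_j, 0)$. A tiny LP check with scipy (the values $C=16$, $d=12$ are illustrative, not from the job set) recovers exactly that:

```python
from scipy.optimize import linprog

C, d = 16.0, 12.0   # completion time and due date (illustrative values)

# minimize T  subject to  T >= C - d  and  T >= 0
res = linprog(c=[1.0],          # objective coefficient on T
              A_ub=[[-1.0]],    # -T <= -(C - d)  is equivalent to  T >= C - d
              b_ub=[-(C - d)],
              bounds=[(0, None)])
print(res.x[0])   # 4.0 == max(C - d, 0)
```

With an early job ($C < d$) the bound $T \geq C - d$ is slack and the lower bound $T \geq 0$ binds instead, so no reward is given for earliness — exactly the behavior tardiness requires.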

Utilization:

$$U = \frac{\sum_{j,m} p_{j,m}}{\text{makespan} \times M}$$
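For the job set above, the numerator is fixed: the five jobs carry 44 machine-time units of work in total, so utilization depends only on the makespan the schedule achieves. For example, a makespan of 20 would give $U \approx 0.733$ (the makespan here is illustrative, not the solver's optimum):

```python
import numpy as np

# Processing times from the problem table (rows = jobs, cols = machines)
proc = np.array([[4, 2, 3],
                 [3, 5, 2],
                 [2, 3, 4],
                 [5, 1, 3],
                 [1, 4, 2]])
M = 3

total_proc = int(proc.sum())    # total work across all jobs and machines
makespan = 20                   # illustrative value, not the ILP optimum
U = total_proc / (makespan * M)
print(total_proc, round(U, 3))  # 44 0.733
```

This is why maximizing $U$ is equivalent to minimizing the makespan: with $\sum_{j,m} p_{j,m}$ constant, the only lever is the denominator.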


Full Source Code

# ============================================================
# Production Scheduling: Minimize Weighted Tardiness &
# Maximize Utilization — Flow Shop ILP (PuLP)
# Google Colaboratory
# ============================================================

# ── 0. Install / import ──────────────────────────────────────
!pip install pulp -q

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
import pulp
import itertools
import warnings
warnings.filterwarnings("ignore")

# ── 1. Problem data ──────────────────────────────────────────
JOBS = [0, 1, 2, 3, 4] # indices
MACHINES = [0, 1, 2]
J, M = len(JOBS), len(MACHINES)

proc = np.array([
[4, 2, 3], # J1
[3, 5, 2], # J2
[2, 3, 4], # J3
[5, 1, 3], # J4
[1, 4, 2], # J5
])

due = np.array([12, 15, 10, 20, 14])
weight = np.array([ 2, 1, 3, 1, 2])

job_names = [f"J{j+1}" for j in JOBS]
machine_names = ["M1", "M2", "M3"]

# ── 2. Big-M constant ────────────────────────────────────────
BIG_M = int(proc.sum() + due.max() + 10) # safe upper bound

# ── 3. Solve for a given alpha (trade-off weight) ────────────
def solve_flowshop(alpha=0.7, verbose=False):
    """
    alpha=1.0 → pure tardiness minimization
    alpha=0.0 → pure utilization maximization
    """
    prob = pulp.LpProblem("FlowShop", pulp.LpMinimize)

    # ── Decision variables ───────────────────────────────────
    # x[j,k] = 1 if job j is placed at position k
    x = pulp.LpVariable.dicts("x",
                              [(j, k) for j in JOBS for k in JOBS],
                              cat="Binary")

    # Start time of job j on machine m
    S = pulp.LpVariable.dicts("S",
                              [(j, m) for j in JOBS for m in MACHINES],
                              lowBound=0, cat="Continuous")

    # Tardiness of each job
    T = pulp.LpVariable.dicts("T", JOBS, lowBound=0, cat="Continuous")

    # Makespan
    Cmax = pulp.LpVariable("Cmax", lowBound=0, cat="Continuous")

    # Utilization (auxiliary)
    U = pulp.LpVariable("U", lowBound=0, upBound=1, cat="Continuous")

    # ── Objective ─────────────────────────────────────────────
    total_proc = int(proc.sum())
    prob += (alpha * pulp.lpSum(weight[j] * T[j] for j in JOBS)
             - (1 - alpha) * U * total_proc)

    # ── Assignment constraints ────────────────────────────────
    for j in JOBS:
        prob += pulp.lpSum(x[j, k] for k in JOBS) == 1
    for k in JOBS:
        prob += pulp.lpSum(x[j, k] for j in JOBS) == 1

    # ── Machine precedence within each job ───────────────────
    for j in JOBS:
        for m in range(M - 1):
            prob += S[j, m+1] >= S[j, m] + proc[j, m]

    # ── No-overlap on each machine (position-based) ──────────
    # If job j is at position k and job i is at position k+1,
    # then job i starts after job j finishes on that machine.
    for k in range(J - 1):
        for j in JOBS:
            for i in JOBS:
                if i == j:
                    continue
                for m in MACHINES:
                    prob += (S[i, m]
                             >= S[j, m] + proc[j, m]
                             - BIG_M * (2 - x[j, k] - x[i, k+1]))

    # ── Tardiness ────────────────────────────────────────────
    for j in JOBS:
        C_j = S[j, M-1] + proc[j, M-1]   # completion on last machine
        prob += T[j] >= C_j - due[j]

    # ── Makespan ──────────────────────────────────────────────
    for j in JOBS:
        prob += Cmax >= S[j, M-1] + proc[j, M-1]

    # ── Utilization definition ────────────────────────────────
    # Exact utilization U = total_proc / (Cmax * M) contains the
    # product of two decision variables (U and Cmax), which is
    # nonlinear. We introduce w = Cmax * U, relax the product with
    # a McCormick envelope over the box [Cmax_lb, Cmax_ub] x [0, 1],
    # and enforce w * M == total_proc. Because the envelope is only
    # a relaxation, the model's U is an approximation; the exact
    # utilization is recomputed from Cmax after solving.
    Cmax_lb = float(proc[:, 0].max())   # a valid (loose) lower bound on the makespan
    Cmax_ub = float(BIG_M)
    U_lb, U_ub = 0.0, 1.0

    # McCormick envelope for w = Cmax * U
    w = pulp.LpVariable("w_CmaxU", lowBound=0, cat="Continuous")
    prob += w >= Cmax_lb * U + U_lb * Cmax - Cmax_lb * U_lb
    prob += w >= Cmax_ub * U + U_ub * Cmax - Cmax_ub * U_ub
    prob += w <= Cmax_ub * U + U_lb * Cmax - Cmax_ub * U_lb
    prob += w <= Cmax_lb * U + U_ub * Cmax - Cmax_lb * U_ub
    prob += w * M == total_proc          # (Cmax * U) * M = total_proc

    # ── Solve ─────────────────────────────────────────────────
    solver = pulp.PULP_CBC_CMD(msg=verbose)
    prob.solve(solver)

    # ── Extract results ───────────────────────────────────────
    status = pulp.LpStatus[prob.status]
    sequence = [None] * J
    for j in JOBS:
        for k in JOBS:
            if pulp.value(x[j, k]) is not None and pulp.value(x[j, k]) > 0.5:
                sequence[k] = j

    start = np.array([[pulp.value(S[j, m]) or 0.0
                       for m in MACHINES] for j in JOBS])

    tard = np.array([pulp.value(T[j]) or 0.0 for j in JOBS])
    cmax = pulp.value(Cmax) or 0.0
    util = total_proc / (cmax * M) if cmax > 0 else 0.0
    wtard = float(np.dot(weight, tard))

    return dict(status=status, sequence=sequence, start=start,
                tard=tard, cmax=cmax, util=util, wtard=wtard)

# ── 4. Pareto front: sweep alpha ─────────────────────────────
alphas = np.linspace(0.0, 1.0, 11)
pareto = []
print("Sweeping alpha for Pareto front...")
for a in alphas:
    r = solve_flowshop(alpha=a)
    pareto.append((a, r["wtard"], r["util"], r["cmax"], r["sequence"]))
    print(f"  alpha={a:.2f}  WTard={r['wtard']:.2f}  Util={r['util']:.3f}"
          f"  Cmax={r['cmax']:.1f}  Seq={[job_names[s] for s in r['sequence']]}")

# ── 5. Pick a balanced solution (alpha = 0.7) ────────────────
result = solve_flowshop(alpha=0.7)
seq = result["sequence"]
start = result["start"]
tard = result["tard"]
cmax = result["cmax"]
util = result["util"]
wtard = result["wtard"]

print(f"\n=== Balanced Solution (alpha=0.7) ===")
print(f"Status : {result['status']}")
print(f"Sequence : {[job_names[s] for s in seq]}")
print(f"Makespan : {cmax:.1f}")
print(f"Utilization: {util:.3%}")
print(f"Wtd Tardiness: {wtard:.2f}")
for j in JOBS:
    comp = start[j, M-1] + proc[j, M-1]
    print(f"  {job_names[j]}: start={start[j]}, C={comp:.1f}, "
          f"due={due[j]}, tard={tard[j]:.1f}")

# ── 6. Visualization ─────────────────────────────────────────
colors = plt.cm.Set2(np.linspace(0, 1, J))

fig = plt.figure(figsize=(22, 20))
fig.patch.set_facecolor("#0e1117")

# ────────────────────────────────────────────────────────────
# Panel A: 3D Gantt chart
# ────────────────────────────────────────────────────────────
ax1 = fig.add_subplot(221, projection="3d")
ax1.set_facecolor("#0e1117")

bar_depth = 0.5
bar_height = 0.4

for j in JOBS:
    for m in MACHINES:
        s = start[j, m]
        p = proc[j, m]
        col = colors[j]

        # Build a 3-D bar as a Poly3DCollection
        x0, x1 = s, s + p
        y0, y1 = m - bar_height/2, m + bar_height/2
        z0, z1 = 0, bar_depth

        verts = [
            [(x0,y0,z0),(x1,y0,z0),(x1,y1,z0),(x0,y1,z0)],  # bottom
            [(x0,y0,z1),(x1,y0,z1),(x1,y1,z1),(x0,y1,z1)],  # top
            [(x0,y0,z0),(x0,y0,z1),(x0,y1,z1),(x0,y1,z0)],  # left
            [(x1,y0,z0),(x1,y0,z1),(x1,y1,z1),(x1,y1,z0)],  # right
            [(x0,y0,z0),(x1,y0,z0),(x1,y0,z1),(x0,y0,z1)],  # front
            [(x0,y1,z0),(x1,y1,z0),(x1,y1,z1),(x0,y1,z1)],  # back
        ]
        poly = Poly3DCollection(verts, alpha=0.85, facecolor=col,
                                edgecolor="#ffffff", linewidth=0.3)
        ax1.add_collection3d(poly)

        # label
        ax1.text((x0+x1)/2, m, z1+0.05, job_names[j],
                 color="white", fontsize=7, ha="center", va="bottom",
                 fontweight="bold")

ax1.set_xlim(0, cmax + 1)
ax1.set_ylim(-0.6, M - 0.4)
ax1.set_zlim(0, 1)
ax1.set_xlabel("Time", color="white", labelpad=8)
ax1.set_ylabel("Machine", color="white", labelpad=8)
ax1.set_zlabel("", color="white")
ax1.set_yticks(MACHINES)
ax1.set_yticklabels(machine_names, color="white")
ax1.tick_params(colors="white")
ax1.set_title("3D Gantt Chart (α=0.7)", color="white", fontsize=13, pad=12)
ax1.xaxis.pane.fill = False
ax1.yaxis.pane.fill = False
ax1.zaxis.pane.fill = False
ax1.grid(True, color="#333344", linewidth=0.5)

legend_patches = [mpatches.Patch(color=colors[j], label=job_names[j]) for j in JOBS]
ax1.legend(handles=legend_patches, loc="upper left",
facecolor="#1a1a2e", edgecolor="#555", labelcolor="white", fontsize=8)

# ────────────────────────────────────────────────────────────
# Panel B: Pareto front (weighted tardiness vs utilization)
# ────────────────────────────────────────────────────────────
ax2 = fig.add_subplot(222)
ax2.set_facecolor("#1a1a2e")

pa = np.array([p[0] for p in pareto])
pw = np.array([p[1] for p in pareto])
pu = np.array([p[2] for p in pareto])

sc = ax2.scatter(pu * 100, pw, c=pa, cmap="plasma",
s=90, zorder=5, edgecolors="white", linewidths=0.6)
ax2.plot(pu * 100, pw, color="#aaaacc", linewidth=1, zorder=4, linestyle="--")

cbar = plt.colorbar(sc, ax=ax2)
cbar.set_label("α (tardiness weight)", color="white")
cbar.ax.yaxis.set_tick_params(color="white")
plt.setp(cbar.ax.yaxis.get_ticklabels(), color="white")

# highlight alpha=0.7 point
idx07 = np.argmin(np.abs(pa - 0.7))
ax2.scatter(pu[idx07]*100, pw[idx07], color="cyan", s=180,
zorder=6, marker="*", label="α=0.7 (selected)")

ax2.set_xlabel("Utilization (%)", color="white")
ax2.set_ylabel("Weighted Tardiness", color="white")
ax2.set_title("Pareto Front: Tardiness vs Utilization", color="white", fontsize=13)
ax2.tick_params(colors="white")
ax2.spines[:].set_color("#444466")
ax2.legend(facecolor="#1a1a2e", edgecolor="#555", labelcolor="white", fontsize=9)
ax2.grid(True, color="#333344", linewidth=0.5)

# ────────────────────────────────────────────────────────────
# Panel C: Per-job tardiness bar + due date markers
# ────────────────────────────────────────────────────────────
ax3 = fig.add_subplot(223)
ax3.set_facecolor("#1a1a2e")

comp_times = start[:, M-1] + proc[:, M-1]
x_pos = np.arange(J)
bars = ax3.bar(x_pos, tard, color=[colors[j] for j in JOBS],
width=0.5, edgecolor="white", linewidth=0.5, label="Tardiness")
ax3.scatter(x_pos, due - comp_times + tard, # show due dates
color="yellow", zorder=5, s=80, marker="D", label="Due date (rel.)")

# annotate completion times
for j in JOBS:
    ax3.text(j, tard[j] + 0.15, f"C={comp_times[j]:.0f}", color="white",
             ha="center", fontsize=9, fontweight="bold")

ax3.axhline(0, color="#888", linewidth=0.8)
ax3.set_xticks(x_pos)
ax3.set_xticklabels(job_names, color="white")
ax3.set_ylabel("Tardiness (time units)", color="white")
ax3.set_title("Per-Job Tardiness & Completion Times (α=0.7)", color="white", fontsize=13)
ax3.tick_params(colors="white")
ax3.spines[:].set_color("#444466")
ax3.legend(facecolor="#1a1a2e", edgecolor="#555", labelcolor="white", fontsize=9)
ax3.grid(True, axis="y", color="#333344", linewidth=0.5)

# ────────────────────────────────────────────────────────────
# Panel D: 3D surface — alpha sweep × machine × utilization
# ────────────────────────────────────────────────────────────
ax4 = fig.add_subplot(224, projection="3d")
ax4.set_facecolor("#0e1117")

# For each alpha solution, compute per-machine utilization
def machine_util(alpha_idx):
    """Fraction of the makespan each machine is busy in Pareto solution alpha_idx."""
    cmax_val = pareto[alpha_idx][3]
    # Total busy time per machine is sequence-independent in a flow shop;
    # only the makespan varies with alpha.
    return [sum(proc[j, m] for j in JOBS) / cmax_val if cmax_val > 0 else 0.0
            for m in MACHINES]

A_grid = np.array([p[0] for p in pareto])
machine_utils = np.array([machine_util(i) for i in range(len(pareto))])

# meshgrid: alpha x machine
A2, M2 = np.meshgrid(A_grid, np.arange(M), indexing="ij")
U2 = machine_utils # shape (len(alphas), M)

surf = ax4.plot_surface(A2, M2, U2, cmap="viridis",
edgecolor="none", alpha=0.85)
ax4.set_xlabel("α", color="white", labelpad=8)
ax4.set_ylabel("Machine", color="white", labelpad=8)
ax4.set_zlabel("Utilization", color="white", labelpad=8)
ax4.set_yticks(MACHINES)
ax4.set_yticklabels(machine_names, color="white")
ax4.tick_params(colors="white")
ax4.set_title("Machine Utilization Surface vs α", color="white", fontsize=13, pad=12)
ax4.xaxis.pane.fill = False
ax4.yaxis.pane.fill = False
ax4.zaxis.pane.fill = False
cbar4 = fig.colorbar(surf, ax=ax4, pad=0.12, shrink=0.6, label="Utilization")
cbar4.ax.yaxis.label.set_color("white")
ax4.grid(True, color="#333344", linewidth=0.4)

plt.suptitle("Production Scheduling Optimization\nFlow Shop · 5 Jobs · 3 Machines",
color="white", fontsize=15, fontweight="bold", y=1.01)
plt.tight_layout()
plt.savefig("scheduling_result.png", dpi=150, bbox_inches="tight",
facecolor=fig.get_facecolor())
plt.show()
print("Figure saved.")

Code Walkthrough

Section 0 — Dependencies

pulp is the ILP modelling layer; it calls the bundled CBC solver under the hood. All visualization is done with matplotlib only — no extra packages needed.

Section 1 — Problem Data

The proc array is a $5 \times 3$ matrix where proc[j, m] is the processing time of job $j$ on machine $m$. due and weight encode deadline urgency and business priority respectively.

Section 2 — Big-M Constant

The no-overlap constraints use the Big-M method: a constraint is relaxed (made trivially satisfiable) when a binary variable is zero. The Big-M value must be a valid upper bound on all start times:

$$M_{\text{big}} = \sum_{j,m} p_{j,m} + \max_j d_j + 10$$

Choosing this too large weakens the LP relaxation and slows the solver; too small produces infeasible or wrong solutions.
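The sizing rule is easy to compute directly. A minimal sketch, using illustrative stand-in data (these processing times and due dates are hypothetical, not the article's actual instance):

```python
import numpy as np

# Hypothetical 5-job x 3-machine data, for illustration only
proc = np.array([[3, 4, 5],
                 [5, 3, 2],
                 [2, 3, 4],
                 [4, 5, 3],
                 [3, 2, 4]])
due = np.array([12, 15, 10, 20, 14])

# No feasible schedule can start a job later than the total work content,
# so total work + latest due date + slack is a safe (if loose) Big-M.
BIG_M = int(proc.sum()) + int(due.max()) + 10
print(BIG_M)  # 52 + 20 + 10 = 82
```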

Section 3 — solve_flowshop(alpha)

Assignment block — x[j, k] = 1 places job $j$ at sequence position $k$; this is the classic position-indexed formulation. The doubly-stochastic constraints (each job in exactly one position, each position holding exactly one job) force a valid permutation.

No-overlap (Big-M disjunctive) — for every pair of consecutive positions $(k, k+1)$ and every machine $m$:

$$S_{i,m} \geq S_{j,m} + p_{j,m} - M_{\text{big}}\bigl(2 - x_{j,k} - x_{i,k+1}\bigr)$$

When $x_{j,k}=1$ and $x_{i,k+1}=1$ the RHS tightens to $S_{j,m}+p_{j,m}$, enforcing non-overlap. Otherwise the constraint is slack.
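The switching behaviour is easy to verify numerically. A tiny sketch with a hypothetical Big-M value:

```python
BIG_M = 82  # hypothetical value, sized as in Section 2

def rhs(S_j, p_j, x_jk, x_ik1):
    """Right-hand side of the disjunctive constraint for one (j, i, m) triple."""
    return S_j + p_j - BIG_M * (2 - x_jk - x_ik1)

# Both binaries active: the bound is tight, job i must wait for job j.
assert rhs(5, 3, 1, 1) == 8
# Any binary inactive: the bound drops far below zero, i.e. no restriction.
assert rhs(5, 3, 0, 1) == -74
```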

McCormick envelope for utilization — utilization is $U = \frac{\sum p_{j,m}}{C_{\max} \cdot M}$, which is nonlinear (product of two variables). We introduce an auxiliary $w = C_{\max} \cdot U$ and linearize it with four McCormick inequalities using known bounds $[C_{\max}^{\ell}, C_{\max}^{u}]$ and $[U^{\ell}, U^{u}]$:

$$w \geq C_{\max}^{\ell} U + U^{\ell} C_{\max} - C_{\max}^{\ell} U^{\ell}$$
$$w \leq C_{\max}^{u} U + U^{\ell} C_{\max} - C_{\max}^{u} U^{\ell}$$

(and two symmetric bounds), combined with $w \cdot M = \sum p_{j,m}$.
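A small self-contained check of these four inequalities (not tied to the model code): at any corner of the bounding box the envelope pinches to the exact product, while in the interior it only brackets it.

```python
def mccormick_bounds(x, y, xl, xu, yl, yu):
    """Lower/upper envelope values for w = x*y on the box [xl,xu] x [yl,yu]."""
    lo = max(xl*y + yl*x - xl*yl, xu*y + yu*x - xu*yu)
    hi = min(xu*y + yl*x - xu*yl, xl*y + yu*x - xl*yu)
    return lo, hi

# At a corner of the box the envelope is tight: lo == x*y == hi.
lo, hi = mccormick_bounds(5.0, 1.0, 5.0, 80.0, 0.0, 1.0)
assert lo == 5.0 and hi == 5.0

# In the interior it only brackets the true product (40 * 0.5 = 20).
lo, hi = mccormick_bounds(40.0, 0.5, 5.0, 80.0, 0.0, 1.0)
assert lo <= 20.0 <= hi
```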

Section 4 — Pareto Sweep

By varying $\alpha$ from 0 to 1 in 11 steps we trace the Pareto front. At $\alpha=0$ the solver cares only about utilization; at $\alpha=1$ only about tardiness. This reveals the trade-off curve — how much extra tardiness you must accept to gain each percentage point of utilization.
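Note that a scalarization sweep can also return dominated points. A small filter, shown here on the (weighted tardiness, utilization) pairs from the printed sweep, keeps only the non-dominated ones:

```python
def pareto_filter(points):
    """Keep (tardiness, utilization) pairs not dominated by any other point.
    Lower tardiness is better; higher utilization is better."""
    keep = []
    for i, (t_i, u_i) in enumerate(points):
        dominated = any((t_j <= t_i and u_j >= u_i) and (t_j < t_i or u_j > u_i)
                        for j, (t_j, u_j) in enumerate(points) if j != i)
        if not dominated:
            keep.append((t_i, u_i))
    return keep

# Distinct outcomes from the alpha sweep in the Execution Results section
pts = [(22.0, 0.733), (8.0, 0.698), (6.0, 0.667), (6.0, 0.198)]
print(pareto_filter(pts))  # [(22.0, 0.733), (8.0, 0.698), (6.0, 0.667)]
```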

Section 5 — Balanced Solution

$\alpha=0.7$ gives a pragmatic balance: tardiness reduction is weighted 70% of the objective, utilization 30%.

Section 6 — Visualization (4 panels)

Panel A — 3D Gantt: each bar is a job-machine interval extruded into the $z$-axis; colour encodes job identity.
Panel B — Pareto front: scatter of (utilization, weighted tardiness) coloured by $\alpha$; the cyan star marks the selected solution.
Panel C — Tardiness bars: per-job tardiness with completion-time annotations; diamond markers show due-date proximity.
Panel D — Utilization surface: 3D surface over the $(\alpha, \text{machine})$ grid showing how individual machine loads shift as the objective changes.

Execution Results

Sweeping alpha for Pareto front...
  alpha=0.00  WTard=22.00  Util=0.733  Cmax=20.0  Seq=['J3', 'J2', 'J4', 'J5', 'J1']
  alpha=0.10  WTard=8.00  Util=0.698  Cmax=21.0  Seq=['J3', 'J5', 'J1', 'J2', 'J4']
  alpha=0.20  WTard=8.00  Util=0.698  Cmax=21.0  Seq=['J3', 'J5', 'J1', 'J2', 'J4']
  alpha=0.30  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.40  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.50  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.60  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.70  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.80  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.90  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=1.00  WTard=6.00  Util=0.198  Cmax=74.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']

=== Balanced Solution (alpha=0.7) ===
Status     : Optimal
Sequence   : ['J3', 'J1', 'J5', 'J2', 'J4']
Makespan   : 22.0
Utilization: 66.667%
Wtd Tardiness: 6.00
  J1: start=[2. 6. 9.], C=12.0, due=12, tard=0.0
  J2: start=[ 8. 12. 17.], C=19.0, due=15, tard=4.0
  J3: start=[0. 2. 5.], C=9.0, due=10, tard=0.0
  J4: start=[11. 17. 19.], C=22.0, due=20, tard=2.0
  J5: start=[ 7.  8. 12.], C=14.0, due=14, tard=0.0

Figure saved.

Insights

Why J3 finishes on time — J3 has the tightest due date ($d=10$) and a high weight ($w=3$), so the solver sequences it first and it completes at $t=9$, just inside its deadline. The tardiness lands on J2 and J4 instead: their completions (19 and 22) overshoot their due dates (15 and 20) by 4 and 2 time units.

Utilization plateau — beyond $\alpha \approx 0.5$, pushing further toward pure tardiness minimization yields almost no additional tardiness reduction, but utilization can drop noticeably. This plateau, visible in Panel B, is the natural operating point for most factories.

Machine M2 bottleneck — the utilization surface (Panel D) shows M2 consistently at high utilization regardless of $\alpha$, identifying it as the system bottleneck. Investing in additional M2 capacity would shift the entire Pareto front favorably.

Scalability note — the position-indexed ILP has $O(J^2)$ binary variables and $O(J^2 \cdot M)$ disjunctive constraints. For $J \leq 12$ CBC solves in seconds. Beyond that, metaheuristics (simulated annealing, genetic algorithms) or branch-and-price methods become necessary.
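As a sketch of the metaheuristic route mentioned above, here is a minimal simulated annealing loop over job permutations. It minimizes makespan only, on hypothetical processing times; the weighted-tardiness and utilization terms are omitted for brevity:

```python
import math
import random
import numpy as np

# Hypothetical 5-job x 3-machine processing times (not the article's instance)
proc = np.array([[3, 4, 5],
                 [5, 3, 2],
                 [2, 3, 4],
                 [4, 5, 3],
                 [3, 2, 4]])
J, M = proc.shape

def makespan(seq):
    """Completion time of the last job on the last machine (flow-shop recurrence)."""
    C = np.zeros((J, M))
    for pos, j in enumerate(seq):
        for m in range(M):
            prev_job = C[pos-1, m] if pos > 0 else 0.0
            prev_mac = C[pos, m-1] if m > 0 else 0.0
            C[pos, m] = max(prev_job, prev_mac) + proc[j, m]
    return C[-1, -1]

def anneal(iters=5000, T0=10.0, cooling=0.999, seed=0):
    rng = random.Random(seed)
    cur = list(range(J))
    rng.shuffle(cur)
    cur_cost, T = makespan(cur), T0
    best, best_cost = cur[:], cur_cost
    for _ in range(iters):
        # Neighbour move: swap two random positions in the sequence
        a, b = rng.sample(range(J), 2)
        cand = cur[:]
        cand[a], cand[b] = cand[b], cand[a]
        cost = makespan(cand)
        # Accept improvements always; worse moves with Boltzmann probability
        if cost < cur_cost or rng.random() < math.exp((cur_cost - cost) / T):
            cur, cur_cost = cand, cost
        if cur_cost < best_cost:
            best, best_cost = cur[:], cur_cost
        T *= cooling
    return best, best_cost

seq, cost = anneal()
print(seq, cost)
```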

Drone Delivery Route Optimization

Solving the Multi-Destination Shortest Path Problem

Drone delivery is rapidly becoming a practical reality — companies like Amazon, Zipline, and Wing are already operating commercial drone fleets. At the heart of every efficient drone delivery system lies a classic combinatorial optimization problem: given a set of delivery destinations, what is the shortest route that visits all of them and returns to the depot?

This is a variant of the Travelling Salesman Problem (TSP), and in this article we’ll model a concrete drone delivery scenario, solve it with both exact (ILP) and heuristic methods, and visualize the results in 2D and 3D.


Problem Setup

Imagine a drone depot located in a city center. The drone must deliver packages to 10 customer locations scattered across the city, then return to the depot. We want to minimize total flight distance.

Let the depot be node $0$ and delivery locations be nodes $1, 2, \ldots, n$. The drone visits each node exactly once. The decision variable is:

$$x_{ij} \in \{0, 1\}, \quad x_{ij} = 1 \text{ if the drone flies directly from node } i \text{ to node } j$$

The objective is:

$$\min \sum_{i=0}^{n} \sum_{j=0}^{n} d_{ij} \cdot x_{ij}$$

subject to:

$$\sum_{j \neq i} x_{ij} = 1 \quad \forall i \quad \text{(leave each node exactly once)}$$

$$\sum_{i \neq j} x_{ij} = 1 \quad \forall j \quad \text{(enter each node exactly once)}$$

$$u_i - u_j + (n-1) \cdot x_{ij} \leq n - 2 \quad \forall i \neq j,\; i,j \geq 1 \quad \text{(MTZ subtour elimination)}$$

where $d_{ij}$ is the Euclidean distance between nodes $i$ and $j$, and $u_i$ are auxiliary ordering variables enforcing a single connected tour (Miller–Tucker–Zemlin constraints).

The Euclidean distance between nodes is:

$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$$

For the 3D visualization we also assign a flight altitude to each leg, reflecting real drone operations where altitude varies based on urban obstacle avoidance.


Full Source Code

# ============================================================
# Drone Delivery Route Optimization (TSP with ILP + Heuristics)
# Google Colaboratory
# ============================================================

# ── 0. Install & import ──────────────────────────────────────
!pip install pulp --quiet

import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib.gridspec import GridSpec
from mpl_toolkits.mplot3d import Axes3D
from mpl_toolkits.mplot3d.art3d import Line3DCollection
import itertools
import time
import pulp
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary, LpInteger, value

# ── 1. Problem instance ──────────────────────────────────────
np.random.seed(42)

N_CUSTOMERS = 10 # number of delivery locations
N_NODES = N_CUSTOMERS + 1 # including depot (node 0)

# 2-D city coordinates (km), depot at center
coords = np.zeros((N_NODES, 2))
coords[0] = [5.0, 5.0] # depot
np.random.seed(42)
coords[1:] = np.random.uniform(0, 10, size=(N_CUSTOMERS, 2))

# Flight altitudes (meters): depot at 0, deliveries between 30-120 m
altitudes = np.zeros(N_NODES)
altitudes[0] = 0.0
altitudes[1:] = np.random.uniform(30, 120, size=N_CUSTOMERS)

# Distance matrix (Euclidean, km)
def build_distance_matrix(coords):
    n = len(coords)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            D[i, j] = np.linalg.norm(coords[i] - coords[j])
    return D

D = build_distance_matrix(coords)

print("=== Drone Delivery Route Optimization ===")
print(f"Depot : node 0 at {coords[0]}")
print(f"Customers : nodes 1–{N_CUSTOMERS}")
print(f"Distance matrix (km) shape: {D.shape}")

# ── 2. Exact ILP solver (MTZ formulation) ────────────────────
def solve_tsp_ilp(D):
    n = len(D)
    prob = LpProblem("TSP_Drone", LpMinimize)

    # Binary routing variables
    x = [[LpVariable(f"x_{i}_{j}", cat=LpBinary) if i != j else None
          for j in range(n)] for i in range(n)]

    # MTZ auxiliary variables
    u = [LpVariable(f"u_{i}", lowBound=1, upBound=n-1, cat=LpInteger)
         for i in range(n)]

    # Objective
    prob += lpSum(D[i][j] * x[i][j]
                  for i in range(n) for j in range(n) if i != j)

    # Flow constraints
    for i in range(n):
        prob += lpSum(x[i][j] for j in range(n) if i != j) == 1  # leave
        prob += lpSum(x[j][i] for j in range(n) if i != j) == 1  # enter

    # MTZ subtour elimination (skip depot node 0)
    for i in range(1, n):
        for j in range(1, n):
            if i != j:
                prob += u[i] - u[j] + (n - 1) * x[i][j] <= n - 2

    # Solve (suppress solver output)
    solver = pulp.PULP_CBC_CMD(msg=0)
    t0 = time.time()
    prob.solve(solver)
    elapsed = time.time() - t0

    # Extract tour by following the chosen arcs from the depot
    route = [0]
    visited = {0}
    current = 0
    for _ in range(n - 1):
        for j in range(n):
            if j != current and j not in visited and x[current][j] is not None:
                if value(x[current][j]) is not None and value(x[current][j]) > 0.5:
                    route.append(j)
                    visited.add(j)
                    current = j
                    break
    route.append(0)  # return to depot

    total_dist = sum(D[route[k]][route[k+1]] for k in range(len(route) - 1))
    return route, total_dist, elapsed

# ── 3. Nearest-Neighbour heuristic (fast baseline) ───────────
def solve_tsp_nn(D):
    n = len(D)
    visited = [False] * n
    route = [0]
    visited[0] = True
    for _ in range(n - 1):
        last = route[-1]
        nearest = min((j for j in range(n) if not visited[j]),
                      key=lambda j: D[last][j])
        route.append(nearest)
        visited[nearest] = True
    route.append(0)
    total_dist = sum(D[route[k]][route[k+1]] for k in range(len(route) - 1))
    return route, total_dist

# ── 4. 2-opt improvement ─────────────────────────────────────
def two_opt(route, D):
    best = route[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 2):
            for j in range(i + 1, len(best) - 1):
                # Reversing best[i..j] only changes the two boundary edges
                d_old = D[best[i-1]][best[i]] + D[best[j]][best[j+1]]
                d_new = D[best[i-1]][best[j]] + D[best[i]][best[j+1]]
                if d_new < d_old - 1e-10:
                    best = best[:i] + best[i:j+1][::-1] + best[j+1:]
                    improved = True
    total = sum(D[best[k]][best[k+1]] for k in range(len(best) - 1))
    return best, total

# ── 5. Run all solvers ────────────────────────────────────────
t0 = time.time()
ilp_route, ilp_dist, ilp_time = solve_tsp_ilp(D)
print(f"\n[ILP] Route : {ilp_route}")
print(f"[ILP] Total distance : {ilp_dist:.4f} km (solved in {ilp_time:.2f}s)")

nn_route, nn_dist = solve_tsp_nn(D)
t1 = time.time()
opt_route, opt_dist = two_opt(nn_route, D)
heur_time = time.time() - t1
print(f"\n[NN] Route : {nn_route}")
print(f"[NN] Total distance : {nn_dist:.4f} km")
print(f"\n[2opt] Route : {opt_route}")
print(f"[2opt] Total distance : {opt_dist:.4f} km (solved in {heur_time:.4f}s)")
print(f"\nILP gap vs 2-opt: {abs(ilp_dist - opt_dist):.4f} km")

# ── 6. Visualisation ─────────────────────────────────────────
plt.style.use("dark_background")
DEPOT_COLOR = "#FFD700"
CUST_COLOR = "#00CFFF"
ILP_COLOR = "#FF6B6B"
NN_COLOR = "#B0B0B0"
OPT_COLOR = "#7CFC00"
ALT_CMAP = matplotlib.colormaps["plasma"]

def route_segments(route, coords):
    segs = []
    for k in range(len(route) - 1):
        segs.append((coords[route[k]], coords[route[k+1]]))
    return segs

def draw_route_2d(ax, route, coords, color, lw, label, zorder=2):
    for k in range(len(route) - 1):
        a, b = coords[route[k]], coords[route[k+1]]
        ax.annotate("", xy=b, xytext=a,
                    arrowprops=dict(arrowstyle="->", color=color,
                                    lw=lw, mutation_scale=12),
                    zorder=zorder)
    # invisible line for legend
    ax.plot([], [], color=color, lw=lw, label=label)

fig = plt.figure(figsize=(20, 18), facecolor="#0D0D0D")
gs = GridSpec(3, 3, figure=fig,
hspace=0.38, wspace=0.32,
left=0.06, right=0.97, top=0.93, bottom=0.05)

# ── Panel A: ILP optimal route (2D) ──────────────────────────
ax_ilp = fig.add_subplot(gs[0, :2])
ax_ilp.set_facecolor("#0D0D0D")
draw_route_2d(ax_ilp, ilp_route, coords, ILP_COLOR, 1.8,
f"ILP optimal ({ilp_dist:.3f} km)")
ax_ilp.scatter(*coords[0], s=250, c=DEPOT_COLOR, zorder=5, marker="*",
label="Depot")
ax_ilp.scatter(*coords[1:].T, s=120, c=CUST_COLOR, zorder=5)
for i in range(N_NODES):
    ax_ilp.text(coords[i,0] + 0.15, coords[i,1] + 0.15,
                f"{'D' if i==0 else str(i)}",
                color="white", fontsize=8, zorder=6)
ax_ilp.set_title("Panel A — ILP Optimal Route (2D)", color="white", fontsize=12)
ax_ilp.set_xlabel("X (km)", color="white"); ax_ilp.set_ylabel("Y (km)", color="white")
ax_ilp.tick_params(colors="white"); ax_ilp.legend(fontsize=9, loc="upper left")
for sp in ax_ilp.spines.values(): sp.set_edgecolor("#444")

# ── Panel B: Comparison 2D ────────────────────────────────────
ax_cmp = fig.add_subplot(gs[1, :2])
ax_cmp.set_facecolor("#0D0D0D")
draw_route_2d(ax_cmp, nn_route, coords, NN_COLOR, 1.2,
f"NN heuristic ({nn_dist:.3f} km)", zorder=2)
draw_route_2d(ax_cmp, opt_route, coords, OPT_COLOR, 1.8,
f"2-opt improved ({opt_dist:.3f} km)", zorder=3)
draw_route_2d(ax_cmp, ilp_route, coords, ILP_COLOR, 2.2,
f"ILP optimal ({ilp_dist:.3f} km)", zorder=4)
ax_cmp.scatter(*coords[0], s=250, c=DEPOT_COLOR, zorder=6, marker="*")
ax_cmp.scatter(*coords[1:].T, s=100, c=CUST_COLOR, zorder=6)
for i in range(N_NODES):
    ax_cmp.text(coords[i,0] + 0.15, coords[i,1] + 0.15,
                f"{'D' if i==0 else str(i)}",
                color="white", fontsize=8, zorder=7)
ax_cmp.set_title("Panel B — Route Comparison: NN vs 2-opt vs ILP (2D)",
color="white", fontsize=12)
ax_cmp.set_xlabel("X (km)", color="white"); ax_cmp.set_ylabel("Y (km)", color="white")
ax_cmp.tick_params(colors="white"); ax_cmp.legend(fontsize=9, loc="upper left")
for sp in ax_cmp.spines.values(): sp.set_edgecolor("#444")

# ── Panel C: 3D flight path (ILP route with altitude) ────────
ax3d = fig.add_subplot(gs[0:2, 2], projection="3d")
ax3d.set_facecolor("#0D0D0D")
route_alts = [altitudes[node] for node in ilp_route]

# Colour each leg by altitude (average of endpoints)
n_legs = len(ilp_route) - 1
leg_colors = []
for k in range(n_legs):
    avg_alt = (route_alts[k] + route_alts[k+1]) / 2.0
    norm_alt = (avg_alt - altitudes.min()) / (altitudes.max() - altitudes.min() + 1e-9)
    leg_colors.append(ALT_CMAP(norm_alt))

for k in range(n_legs):
    xs = [coords[ilp_route[k],0], coords[ilp_route[k+1],0]]
    ys = [coords[ilp_route[k],1], coords[ilp_route[k+1],1]]
    zs = [route_alts[k], route_alts[k+1]]
    ax3d.plot(xs, ys, zs, color=leg_colors[k], lw=2.0, zorder=3)
    # Arrow approximation with quiver
    dx = xs[1]-xs[0]; dy = ys[1]-ys[0]; dz = zs[1]-zs[0]
    ax3d.quiver(xs[0], ys[0], zs[0], dx*0.85, dy*0.85, dz*0.85,
                color=leg_colors[k], arrow_length_ratio=0.25,
                linewidth=1.5, alpha=0.85)

# Scatter nodes
ax3d.scatter(coords[0,0], coords[0,1], altitudes[0],
s=300, c=DEPOT_COLOR, marker="*", zorder=5, label="Depot")
ax3d.scatter(coords[1:,0], coords[1:,1], altitudes[1:],
s=80, c=CUST_COLOR, zorder=5, label="Customer")

# Node labels
for i in range(N_NODES):
    ax3d.text(coords[i,0], coords[i,1], altitudes[i]+4,
              f"{'D' if i==0 else str(i)}",
              color="white", fontsize=7, zorder=6)

# Vertical dashed lines (altitude poles)
for i in range(1, N_NODES):
    ax3d.plot([coords[i,0]]*2, [coords[i,1]]*2, [0, altitudes[i]],
              color="#444444", lw=0.7, linestyle="--")

ax3d.set_xlabel("X (km)", color="white", labelpad=6)
ax3d.set_ylabel("Y (km)", color="white", labelpad=6)
ax3d.set_zlabel("Altitude (m)", color="white", labelpad=6)
ax3d.tick_params(colors="white")
ax3d.set_title("Panel C — 3D Flight Path\n(ILP optimal, coloured by altitude)",
color="white", fontsize=11)
ax3d.legend(fontsize=8, loc="upper left")

# ── Panel D: Distance bar chart ───────────────────────────────
ax_bar = fig.add_subplot(gs[2, 0])
ax_bar.set_facecolor("#0D0D0D")
methods = ["NN\nHeuristic", "2-opt\nImproved", "ILP\nOptimal"]
dists = [nn_dist, opt_dist, ilp_dist]
colors_bar = [NN_COLOR, OPT_COLOR, ILP_COLOR]
bars = ax_bar.bar(methods, dists, color=colors_bar, width=0.5, zorder=2)
for bar, d in zip(bars, dists):
    ax_bar.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.05,
                f"{d:.3f}", ha="center", va="bottom", color="white", fontsize=10)
ax_bar.set_title("Panel D — Total Distance Comparison", color="white", fontsize=11)
ax_bar.set_ylabel("Total distance (km)", color="white")
ax_bar.tick_params(colors="white")
ax_bar.set_ylim(0, max(dists) * 1.18)
ax_bar.grid(axis="y", color="#333", zorder=0)
for sp in ax_bar.spines.values(): sp.set_edgecolor("#444")

# ── Panel E: Per-leg distances (ILP) ─────────────────────────
ax_leg = fig.add_subplot(gs[2, 1])
ax_leg.set_facecolor("#0D0D0D")
leg_dists = [D[ilp_route[k]][ilp_route[k+1]] for k in range(len(ilp_route)-1)]
leg_labels = [f"{ilp_route[k]}→{ilp_route[k+1]}" for k in range(len(ilp_route)-1)]
bar_colors_leg = [ALT_CMAP((route_alts[k]+route_alts[k+1])/(2*120))
                  for k in range(len(ilp_route)-1)]
ax_leg.bar(range(len(leg_dists)), leg_dists, color=bar_colors_leg, zorder=2)
ax_leg.set_xticks(range(len(leg_dists)))
ax_leg.set_xticklabels(leg_labels, rotation=55, fontsize=7, color="white")
ax_leg.set_title("Panel E — Per-leg Distance (ILP route)", color="white", fontsize=11)
ax_leg.set_ylabel("Distance (km)", color="white")
ax_leg.tick_params(colors="white")
ax_leg.grid(axis="y", color="#333", zorder=0)
for sp in ax_leg.spines.values(): sp.set_edgecolor("#444")

# ── Panel F: Altitude profile along ILP route ────────────────
ax_alt = fig.add_subplot(gs[2, 2])
ax_alt.set_facecolor("#0D0D0D")
positions = np.arange(len(ilp_route))
node_labels = [f"{'D' if n==0 else str(n)}" for n in ilp_route]
alt_vals = [altitudes[n] for n in ilp_route]
# Fill under profile
ax_alt.fill_between(positions, alt_vals, alpha=0.25, color="#00CFFF")
ax_alt.plot(positions, alt_vals, color="#00CFFF", lw=2, marker="o",
markersize=5, zorder=3)
ax_alt.set_xticks(positions)
ax_alt.set_xticklabels(node_labels, fontsize=8, color="white")
ax_alt.set_title("Panel F — Altitude Profile Along ILP Route", color="white", fontsize=11)
ax_alt.set_ylabel("Altitude (m)", color="white")
ax_alt.tick_params(colors="white")
ax_alt.grid(color="#333", zorder=0)
for sp in ax_alt.spines.values(): sp.set_edgecolor("#444")

fig.suptitle("Drone Delivery Route Optimization — TSP with ILP & Heuristics",
color="white", fontsize=15, fontweight="bold", y=0.97)
plt.savefig("drone_tsp.png", dpi=150, facecolor=fig.get_facecolor())
plt.show()
print("Figure saved.")

Code Walkthrough

Section 0 — Environment Setup

PuLP is installed quietly. NumPy, Matplotlib (including mpl_toolkits.mplot3d), and itertools are imported. All randomness is seeded at 42 for reproducibility.

Section 1 — Problem Instance

We place the depot at coordinates $(5, 5)$ and scatter 10 customer nodes uniformly at random over a $10 \times 10$ km grid. Each node also receives a random flight altitude between 30 m and 120 m (the depot sits at 0 m), simulating real urban airspace constraints. The full $11 \times 11$ Euclidean distance matrix $D$ is precomputed once using np.linalg.norm.
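Incidentally, the double loop in build_distance_matrix can be collapsed into a single NumPy broadcast. A quick equivalent sketch:

```python
import numpy as np

# Same result as the nested-loop version, with no Python-level loops.
rng = np.random.default_rng(42)
coords = rng.uniform(0, 10, size=(11, 2))   # illustrative coordinates

diff = coords[:, None, :] - coords[None, :, :]   # shape (11, 11, 2)
D = np.sqrt((diff ** 2).sum(axis=-1))            # shape (11, 11)

assert D.shape == (11, 11)
assert np.allclose(D, D.T) and np.allclose(np.diag(D), 0.0)
```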

Section 2 — Exact ILP Solver (MTZ)

The TSP is cast as an Integer Linear Program:

  • Decision variables $x_{ij} \in \{0,1\}$ encode whether the drone travels from node $i$ to $j$.
  • Degree constraints enforce exactly one departure and one arrival per node.
  • MTZ constraints $u_i - u_j + (n-1)x_{ij} \leq n-2$ eliminate subtours by assigning an ordering integer $u_i$ to each non-depot node.

PuLP’s CBC solver finds the provably optimal solution. For $n = 11$ nodes this takes well under a second on Colab.
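To see what MTZ is protecting against, here is a small helper (not part of the solver above) that decomposes a successor map into its cycles. The two-cycle solution below satisfies the degree constraints but would be cut off by MTZ:

```python
def extract_subtours(succ):
    """Decompose a successor map {i: j} into its cycles.
    A feasible MTZ solution yields exactly one cycle through all nodes."""
    seen, cycles = set(), []
    for start in succ:
        if start in seen:
            continue
        cycle, node = [], start
        while node not in seen:
            seen.add(node)
            cycle.append(node)
            node = succ[node]
        cycles.append(cycle)
    return cycles

# Degree-feasible but disconnected: two subtours (0-1-2 and 3-4)
print(extract_subtours({0: 1, 1: 2, 2: 0, 3: 4, 4: 3}))  # [[0, 1, 2], [3, 4]]
```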

Section 3 — Nearest-Neighbour Heuristic

A greedy $O(n^2)$ baseline: always fly to the closest unvisited customer. Fast but suboptimal — on random instances it typically produces tours on the order of 10–25% longer than the optimum (about 10% on this instance).

Section 4 — 2-opt Improvement

Starting from the NN tour, 2-opt repeatedly reverses a segment of the route if doing so reduces total distance:

$$\Delta = d_{i-1,i} + d_{j,j+1} - d_{i-1,j} - d_{i,j+1} < 0$$

Iterations continue until no improving swap exists. This reliably closes most of the gap between NN and ILP.
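Because a reversal only swaps two boundary edges, the $\Delta$ above can be evaluated in constant time without rebuilding the route. A small sketch of that delta computation on a toy instance:

```python
def two_opt_delta(route, D, i, j):
    """Length change from reversing route[i..j]; only two boundary edges move."""
    a, b = route[i - 1], route[i]
    c, d = route[j], route[j + 1]
    return (D[a][c] + D[b][d]) - (D[a][b] + D[c][d])

# Toy symmetric 4-node instance; tour 0-1-2-3-0 has length 6.
D = [[0, 1, 4, 3],
     [1, 0, 1, 5],
     [4, 1, 0, 1],
     [3, 5, 1, 0]]
tour = [0, 1, 2, 3, 0]
print(two_opt_delta(tour, D, 1, 2))  # 7: this particular reversal is worse
```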

Section 5 — Running All Solvers

All three methods are timed independently. The ILP gap versus 2-opt is printed to quantify how close the heuristic came to optimality.


Visualization Panels

Panel A — ILP Optimal Route (2D). The complete optimal tour is drawn with directional arrows over the city map. The depot (gold star) and customers (cyan dots) are labelled. This is the ground-truth solution.

Panel B — Route Comparison (2D). All three routes are overlaid: grey (NN), green (2-opt), and red (ILP). The visual difference between NN and the optimized routes immediately reveals where greedy choices go wrong — typically long backtracking legs that 2-opt and ILP eliminate.

Panel C — 3D Flight Path. The ILP route is lifted into 3D using the per-node altitude assignments. Each flight leg is coloured by average altitude using the plasma colormap — warmer colours indicate higher segments. Vertical dashed poles connect each delivery point to ground level. Quiver arrows show flight direction. This panel captures the true nature of drone operations, where altitude is a continuous variable.

Panel D — Distance Bar Chart. A direct comparison of total tour length across the three methods. The difference between NN and ILP quantifies the cost of greedy decisions; the 2-opt bar shows how much of that gap is recoverable without integer programming.

Panel E — Per-leg Distances (ILP). Each individual flight leg in the ILP-optimal tour is shown as a bar, coloured by the average altitude of that leg. This reveals which segments dominate total flight time and energy consumption.

Panel F — Altitude Profile. The altitude of the drone at each stop in the ILP route is plotted as a filled line chart, giving a longitudinal view of the vertical profile of the entire delivery mission.


Results

=== Drone Delivery Route Optimization ===
Depot       : node 0  at [5. 5.]
Customers   : nodes 1–10
Distance matrix (km) shape: (11, 11)

[ILP]  Route : [0, 10, 8, 3, 9, 4, 6, 1, 5, 2, 7, 0]
[ILP]  Total distance : 31.5408 km  (solved in 0.18s)

[NN]   Route : [0, 9, 10, 8, 3, 7, 2, 5, 1, 4, 6, 0]
[NN]   Total distance : 34.6317 km

[2opt] Route : [0, 9, 4, 6, 1, 5, 2, 7, 10, 3, 8, 0]
[2opt] Total distance : 31.8671 km  (solved in 0.0003s)

ILP gap vs 2-opt: 0.3263 km

Figure saved.

Key Takeaways

The MTZ-based ILP formulation guarantees global optimality for small-to-medium fleets (up to ~15 nodes in seconds on Colab). The 2-opt heuristic almost always matches or comes within 1–2% of the ILP optimum at a fraction of the compute cost, making it the practical choice for real-time dispatch. For fleets with 50+ destinations, metaheuristics (genetic algorithms, ant colony optimization) or dedicated routing solvers such as Google OR-Tools become necessary.

The 3D altitude model is a reminder that real drone routing is not flat: energy consumption scales with $\sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2}$, and future extensions should incorporate battery capacity constraints, no-fly zones, and wind fields into the objective.

Multi-Objective Path Planning for Autonomous Vehicles

Balancing Safety, Time, and Fuel Efficiency

Path planning is one of the most fundamental and mathematically rich problems in autonomous driving. A vehicle navigating from point A to point B must simultaneously optimize across objectives that often pull in opposite directions: the fastest route may not be the safest, and the safest may waste fuel. In this article, we formalize this as a multi-objective optimization problem on a weighted graph, solve it using Pareto-front analysis and the weighted-sum scalarization method, and visualize the tradeoffs in both 2D and 3D.


Problem Formulation

Consider a road network modeled as a directed graph $G = (V, E)$, where $V$ is the set of nodes (intersections) and $E$ is the set of directed edges (road segments). Each edge $e \in E$ carries three cost attributes:

$$
c(e) = \bigl(c_{\text{safety}}(e),\; c_{\text{time}}(e),\; c_{\text{fuel}}(e)\bigr)
$$

For a path $P = (e_1, e_2, \ldots, e_k)$ from source $s$ to destination $t$, the total cost vector is:

$$
C(P) = \left( \sum_{i=1}^k c_{\text{safety}}(e_i),\; \sum_{i=1}^k c_{\text{time}}(e_i),\; \sum_{i=1}^k c_{\text{fuel}}(e_i) \right)
$$

The weighted-sum scalarization combines these into a single objective:

$$
f(P, \mathbf{w}) = w_1 \cdot C_{\text{safety}}(P) + w_2 \cdot C_{\text{time}}(P) + w_3 \cdot C_{\text{fuel}}(P)
$$

where $\mathbf{w} = (w_1, w_2, w_3)$ with $w_i \geq 0$ and $\sum_i w_i = 1$.
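A quick numeric sanity check of the scalarization, using a hypothetical cost vector and weight triple (not one of the article's paths):

```python
import numpy as np

# hypothetical cost vector C(P) = (safety, time, fuel) and weights w
C = np.array([6.0, 12.0, 8.0])
w = np.array([0.5, 0.3, 0.2])   # w >= 0 and sums to 1
score = float(C @ w)            # 0.5*6 + 0.3*12 + 0.2*8
print(round(score, 4))          # 8.2
```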

A path $P^*$ Pareto-dominates $P$ if it is no worse on all objectives and strictly better on at least one:

$$
C(P^*) \preceq C(P) \;\Leftrightarrow\; \forall i:\, C_i(P^*) \leq C_i(P) \;\land\; \exists j:\, C_j(P^*) < C_j(P)
$$

The Pareto front is the set of all non-dominated paths.


Concrete Example

We model a city district as a 10-node graph with 20 directed edges. Each edge encodes:

  • Safety cost: higher means more dangerous (pedestrian crossings, sharp turns, poor visibility)
  • Travel time: in minutes
  • Fuel cost: in arbitrary units (reflects road gradient, speed limits, traffic signals)

We enumerate several candidate paths from node 0 to node 9, compute their cost vectors, identify the Pareto front, and sweep the weight vector $\mathbf{w}$ across the simplex to map out how the optimal path changes.


Source Code

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.tri as mtri
from mpl_toolkits.mplot3d import Axes3D
import networkx as nx
import itertools
from matplotlib.colors import Normalize
from matplotlib.cm import ScalarMappable
import warnings
warnings.filterwarnings('ignore')

np.random.seed(42)

# ── 1. Build road network ──────────────────────────────────────────────────────
G = nx.DiGraph()
nodes = list(range(10))
G.add_nodes_from(nodes)

edge_data = [
    # (u, v, safety, time, fuel)
    (0, 1, 2.0, 3.0, 1.5),
    (0, 2, 1.0, 5.0, 2.0),
    (0, 3, 3.5, 2.0, 1.0),
    (1, 4, 1.5, 4.0, 2.5),
    (1, 5, 4.0, 2.5, 1.2),
    (2, 4, 2.0, 3.5, 1.8),
    (2, 6, 1.2, 6.0, 3.0),
    (3, 5, 2.5, 3.0, 1.4),
    (3, 6, 1.8, 4.5, 2.2),
    (4, 7, 3.0, 2.0, 1.0),
    (4, 8, 1.0, 5.0, 2.8),
    (5, 7, 2.0, 3.0, 1.5),
    (5, 8, 1.5, 4.0, 2.0),
    (6, 8, 2.8, 2.5, 1.2),
    (6, 9, 1.5, 6.0, 3.5),
    (7, 9, 2.5, 2.0, 1.0),
    (8, 9, 1.8, 3.0, 1.5),
    (1, 3, 1.0, 2.0, 0.8),
    (4, 6, 2.2, 3.0, 1.6),
    (5, 6, 1.0, 2.5, 1.1),
]

for u, v, s, t_cost, f in edge_data:
    G.add_edge(u, v, safety=s, time=t_cost, fuel=f)

SOURCE, TARGET = 0, 9

# ── 2. Enumerate all simple paths ─────────────────────────────────────────────
all_paths = list(nx.all_simple_paths(G, SOURCE, TARGET, cutoff=7))

def path_costs(G, path):
    s = t = f = 0.0
    for u, v in zip(path[:-1], path[1:]):
        d = G[u][v]
        s += d['safety']
        t += d['time']
        f += d['fuel']
    return np.array([s, t, f])

cost_matrix = np.array([path_costs(G, p) for p in all_paths])
labels = [' → '.join(map(str, p)) for p in all_paths]

print(f"Total candidate paths: {len(all_paths)}")

# ── 3. Pareto-front identification ────────────────────────────────────────────
def is_pareto_dominated(costs, idx):
    c = costs[idx]
    for j, other in enumerate(costs):
        if j == idx:
            continue
        if np.all(other <= c) and np.any(other < c):
            return True
    return False

pareto_mask = np.array([not is_pareto_dominated(cost_matrix, i)
                        for i in range(len(all_paths))])
pareto_costs = cost_matrix[pareto_mask]
pareto_labels = [labels[i] for i, m in enumerate(pareto_mask) if m]

print(f"\nPareto-optimal paths ({pareto_mask.sum()}):")
for lbl, c in zip(pareto_labels, pareto_costs):
    print(f" {lbl} → safety={c[0]:.1f} time={c[1]:.1f} fuel={c[2]:.1f}")

# ── 4. Weight-sweep ───────────────────────────────────────────────────────────
def best_path_for_weights(costs, w):
    scores = costs @ w
    return np.argmin(scores)

n_grid = 30
alpha = np.linspace(0, 1, n_grid)
results = []

for w1, w2 in itertools.product(alpha, repeat=2):
    w3 = 1.0 - w1 - w2
    if w3 < -1e-9:
        continue
    w3 = max(w3, 0.0)
    w = np.array([w1, w2, w3])
    w /= w.sum()
    idx = best_path_for_weights(pareto_costs, w)
    total = pareto_costs[idx] @ w
    results.append((w1, w2, w3, idx, total))

results = np.array(results, dtype=object)
w1_arr = results[:, 0].astype(float)
w2_arr = results[:, 1].astype(float)
w3_arr = results[:, 2].astype(float)
idx_arr = results[:, 3].astype(int)
tot_arr = results[:, 4].astype(float)

# ── 5. Named scenarios ────────────────────────────────────────────────────────
scenarios = {
    'Safety-first': np.array([0.70, 0.20, 0.10]),
    'Time-first':   np.array([0.10, 0.70, 0.20]),
    'Eco-drive':    np.array([0.10, 0.20, 0.70]),
}
sc_colors = {'Safety-first': '#6BCB77', 'Time-first': '#FF6B6B', 'Eco-drive': '#FFD93D'}
sc_markers = {'Safety-first': '^', 'Time-first': 's', 'Eco-drive': 'D'}

print("\nScenario analysis:")
for name, w in scenarios.items():
    i = best_path_for_weights(pareto_costs, w)
    c = pareto_costs[i]
    sc = c @ w
    print(f" [{name}] path: {pareto_labels[i]}")
    print(f" safety={c[0]:.2f} time={c[1]:.2f} fuel={c[2]:.2f} score={sc:.4f}")

# ── 6. Visualisation ──────────────────────────────────────────────────────────
plt.style.use('dark_background')
fig = plt.figure(figsize=(20, 18))
fig.patch.set_facecolor('#0D1117')

pos = {
    0: (0.0, 0.5),
    1: (0.2, 0.8), 2: (0.2, 0.5), 3: (0.2, 0.2),
    4: (0.4, 0.9), 5: (0.4, 0.5), 6: (0.4, 0.1),
    7: (0.7, 0.7), 8: (0.7, 0.3),
    9: (1.0, 0.5),
}

# ── 6-A. Road network ─────────────────────────────────────────────────────────
ax1 = fig.add_subplot(3, 3, 1)
ax1.set_facecolor('#0D1117')
ax1.set_title('Road Network (nodes 0 → 9)', color='white', fontsize=11, pad=8)

edge_safety = [G[u][v]['safety'] for u, v in G.edges()]
norm_e = Normalize(min(edge_safety), max(edge_safety))
cmap_e = plt.cm.RdYlGn_r
e_colors = [cmap_e(norm_e(s)) for s in edge_safety]

nx.draw_networkx_edges(G, pos, ax=ax1, edge_color=e_colors,
                       width=2, arrows=True,
                       arrowstyle='-|>', arrowsize=15,
                       connectionstyle='arc3,rad=0.1')
node_colors = ['#FFD93D' if n in (SOURCE, TARGET) else '#4A90D9' for n in G.nodes()]
nx.draw_networkx_nodes(G, pos, ax=ax1,
                       node_color=node_colors,
                       node_size=500)
nx.draw_networkx_labels(G, pos, ax=ax1, font_color='white', font_size=9)
sm = ScalarMappable(norm=norm_e, cmap=cmap_e)
sm.set_array([])
plt.colorbar(sm, ax=ax1, label='Safety cost', fraction=0.046, pad=0.04)
ax1.axis('off')

# ── 6-B. 3D Pareto front ──────────────────────────────────────────────────────
ax2 = fig.add_subplot(3, 3, 2, projection='3d')
ax2.set_facecolor('#0D1117')
ax2.set_title('3D Pareto Front', color='white', fontsize=11, pad=8)

dom_costs = cost_matrix[~pareto_mask]
ax2.scatter(dom_costs[:, 0], dom_costs[:, 1], dom_costs[:, 2],
c='#444455', s=40, alpha=0.5, label='Dominated', depthshade=True)
sc2 = ax2.scatter(pareto_costs[:, 0], pareto_costs[:, 1], pareto_costs[:, 2],
c=pareto_costs[:, 2], cmap='plasma', s=120,
edgecolors='white', linewidths=0.6, zorder=5,
label='Pareto-optimal', depthshade=True)
plt.colorbar(sc2, ax=ax2, label='Fuel cost', fraction=0.03, pad=0.1, shrink=0.6)

for name, w in scenarios.items():
    i = best_path_for_weights(pareto_costs, w)
    c = pareto_costs[i]
    ax2.scatter(*c, marker=sc_markers[name], color=sc_colors[name],
                s=220, zorder=10, label=name)

ax2.set_xlabel('Safety cost', color='white', labelpad=6)
ax2.set_ylabel('Time cost', color='white', labelpad=6)
ax2.set_zlabel('Fuel cost', color='white', labelpad=6)
ax2.tick_params(colors='white')
for pane in (ax2.xaxis.pane, ax2.yaxis.pane, ax2.zaxis.pane):
    pane.fill = False
ax2.legend(loc='upper right', fontsize=7, framealpha=0.3)

# ── 6-C. Parallel coordinates ────────────────────────────────────────────────
ax3 = fig.add_subplot(3, 3, 3)
ax3.set_facecolor('#0D1117')
ax3.set_title('Parallel Coordinates — Pareto Paths', color='white', fontsize=11, pad=8)

n_p = len(pareto_costs)
cmap_p = plt.cm.get_cmap('tab10', n_p)
mins = pareto_costs.min(0)
maxs = pareto_costs.max(0)
rng = np.where(maxs > mins, maxs - mins, 1.0)
norm_pc = (pareto_costs - mins) / rng

for i, (row, lbl) in enumerate(zip(norm_pc, pareto_labels)):
    ax3.plot([0, 1, 2], row, color=cmap_p(i), lw=1.8, alpha=0.85,
             marker='o', ms=5, label=lbl[:18])
ax3.set_xticks([0, 1, 2])
ax3.set_xticklabels(['Safety', 'Time', 'Fuel'], color='white', fontsize=10)
ax3.set_yticks([0, 0.25, 0.5, 0.75, 1.0])
ax3.set_yticklabels(['min', '', 'mid', '', 'max'], color='#AAAAAA', fontsize=8)
ax3.tick_params(colors='white')
ax3.set_ylim(-0.05, 1.05)
for spine in ax3.spines.values():
    spine.set_color('#444444')
ax3.legend(fontsize=6, loc='upper right', framealpha=0.3)

# ── 6-D. Cost breakdown bar chart ────────────────────────────────────────────
ax4 = fig.add_subplot(3, 3, 4)
ax4.set_facecolor('#0D1117')
ax4.set_title('Cost Breakdown by Scenario', color='white', fontsize=11, pad=8)

sc_names = list(scenarios.keys())
sc_data = np.array([pareto_costs[best_path_for_weights(pareto_costs, w)]
for w in scenarios.values()])

x_pos = np.arange(len(sc_names))
width = 0.25
for offset, col_idx, lbl, col in zip(
        [-width, 0, width], [0, 1, 2],
        ['Safety', 'Time', 'Fuel'],
        ['#FF6B6B', '#00D4FF', '#FFD93D']):
    bars = ax4.bar(x_pos + offset, sc_data[:, col_idx], width,
                   label=lbl, color=col, alpha=0.85)
    for bar in bars:
        h = bar.get_height()
        ax4.text(bar.get_x() + bar.get_width() / 2, h + 0.05,
                 f'{h:.1f}', ha='center', va='bottom', color='white', fontsize=7)

ax4.set_xticks(x_pos)
ax4.set_xticklabels(sc_names, color='white', fontsize=9)
ax4.set_ylabel('Cost', color='white')
ax4.tick_params(colors='white')
ax4.legend(framealpha=0.3)
for spine in ax4.spines.values():
    spine.set_color('#444444')

# ── 6-E. 3D weight-simplex: path index ───────────────────────────────────────
ax5 = fig.add_subplot(3, 3, 5, projection='3d')
ax5.set_facecolor('#0D1117')
ax5.set_title('Optimal Path Index\nacross Weight Simplex', color='white',
fontsize=11, pad=4)

triang = mtri.Triangulation(w1_arr, w2_arr)
surf5 = ax5.plot_trisurf(w1_arr, w2_arr, idx_arr.astype(float),
triangles=triang.triangles,
cmap='viridis', alpha=0.85, edgecolor='none')
plt.colorbar(surf5, ax=ax5, label='Pareto path index',
fraction=0.03, pad=0.1, shrink=0.6)
for name, w in scenarios.items():
    i = best_path_for_weights(pareto_costs, w)
    ax5.scatter(w[0], w[1], float(i),
                color=sc_colors[name], s=150, zorder=10,
                marker=sc_markers[name], label=name)
ax5.set_xlabel('w₁ Safety', color='white', labelpad=4)
ax5.set_ylabel('w₂ Time', color='white', labelpad=4)
ax5.set_zlabel('Path idx', color='white', labelpad=4)
ax5.tick_params(colors='white')
for pane in (ax5.xaxis.pane, ax5.yaxis.pane, ax5.zaxis.pane):
    pane.fill = False
ax5.legend(fontsize=7, framealpha=0.3)

# ── 6-F. 3D weight-simplex: scalarised cost ───────────────────────────────────
ax6 = fig.add_subplot(3, 3, 6, projection='3d')
ax6.set_facecolor('#0D1117')
ax6.set_title('Scalarised Optimal Cost\nacross Weight Simplex', color='white',
fontsize=11, pad=4)

surf6 = ax6.plot_trisurf(w1_arr, w2_arr, tot_arr,
triangles=triang.triangles,
cmap='inferno', alpha=0.85, edgecolor='none')
plt.colorbar(surf6, ax=ax6, label='Weighted total cost',
fraction=0.03, pad=0.1, shrink=0.6)
ax6.set_xlabel('w₁ Safety', color='white', labelpad=4)
ax6.set_ylabel('w₂ Time', color='white', labelpad=4)
ax6.set_zlabel('Score', color='white', labelpad=4)
ax6.tick_params(colors='white')
for pane in (ax6.xaxis.pane, ax6.yaxis.pane, ax6.zaxis.pane):
    pane.fill = False

# ── 6-G. Heatmap: winning path per weight combination ────────────────────────
ax7 = fig.add_subplot(3, 3, 7)
ax7.set_facecolor('#0D1117')
ax7.set_title('Winning Path Map\n(w₃ = 1 − w₁ − w₂)', color='white',
fontsize=11, pad=4)

pivot = np.full((n_grid, n_grid), np.nan)
for w1, w2, w3, idx, _ in zip(w1_arr, w2_arr, w3_arr, idx_arr, tot_arr):
    i = int(round(w1 * (n_grid - 1)))
    j = int(round(w2 * (n_grid - 1)))
    if 0 <= i < n_grid and 0 <= j < n_grid:
        pivot[j, i] = idx

im7 = ax7.imshow(pivot, origin='lower', aspect='auto',
cmap='tab10', interpolation='nearest',
extent=[0, 1, 0, 1])
ax7.set_xlabel('w₁ (Safety weight)', color='white')
ax7.set_ylabel('w₂ (Time weight)', color='white')
ax7.tick_params(colors='white')
plt.colorbar(im7, ax=ax7, label='Pareto path index', fraction=0.046, pad=0.04)
ax7.plot([0, 1, 0, 0], [0, 0, 1, 0], 'w--', lw=1.2, alpha=0.7)
for name, w in scenarios.items():
    ax7.scatter(w[0], w[1], color=sc_colors[name], s=120,
                marker=sc_markers[name], zorder=10, label=name)
ax7.legend(fontsize=8, framealpha=0.3, loc='upper right')

# ── 6-H. Radar chart ─────────────────────────────────────────────────────────
ax8 = fig.add_subplot(3, 3, 8, polar=True)
ax8.set_facecolor('#0D1117')
ax8.set_title('Radar: Cost Profile per Scenario', color='white',
fontsize=11, pad=20)

categories = ['Safety', 'Time', 'Fuel']
N = len(categories)
angles = [n / float(N) * 2 * np.pi for n in range(N)]
angles += angles[:1]

ax8.set_xticks(angles[:-1])
ax8.set_xticklabels(categories, color='white', fontsize=10)
ax8.set_yticks([2, 4, 6, 8, 10])
ax8.set_yticklabels(['2', '4', '6', '8', '10'], color='#888888', fontsize=7)
ax8.tick_params(colors='white')
ax8.spines['polar'].set_color('#444444')

for name, w in scenarios.items():
    i = best_path_for_weights(pareto_costs, w)
    vals = pareto_costs[i].tolist() + [pareto_costs[i][0]]
    ax8.plot(angles, vals, color=sc_colors[name], lw=2, label=name)
    ax8.fill(angles, vals, color=sc_colors[name], alpha=0.12)

ax8.legend(loc='upper right', bbox_to_anchor=(1.35, 1.15),
fontsize=8, framealpha=0.3)

# ── 6-I. Cumulative cost along path ──────────────────────────────────────────
ax9 = fig.add_subplot(3, 3, 9)
ax9.set_facecolor('#0D1117')
ax9.set_title('Cumulative Cost Along Path\n(per scenario)', color='white',
fontsize=11, pad=4)

pareto_indices = [i for i, m in enumerate(pareto_mask) if m]

for name, w in scenarios.items():
    i_p = best_path_for_weights(pareto_costs, w)
    path = all_paths[pareto_indices[i_p]]
    cum_s = [0.0]; cum_t = [0.0]; cum_f = [0.0]
    for u, v in zip(path[:-1], path[1:]):
        d = G[u][v]
        cum_s.append(cum_s[-1] + d['safety'])
        cum_t.append(cum_t[-1] + d['time'])
        cum_f.append(cum_f[-1] + d['fuel'])
    xs = list(range(len(path)))
    col = sc_colors[name]
    ax9.plot(xs, cum_s, color=col, lw=2, ls='-', label=f'{name} safety')
    ax9.plot(xs, cum_t, color=col, lw=2, ls='--', label=f'{name} time')
    ax9.plot(xs, cum_f, color=col, lw=1.5, ls=':', label=f'{name} fuel')

# tick labels follow the last scenario's path
ax9.set_xticks(xs)
ax9.set_xticklabels([f'N{n}' for n in path], color='white', fontsize=8)

ax9.set_ylabel('Cumulative cost', color='white')
ax9.tick_params(colors='white')
ax9.legend(fontsize=6, framealpha=0.3, ncol=2)
for spine in ax9.spines.values():
    spine.set_color('#444444')

# ── Final layout ──────────────────────────────────────────────────────────────
plt.suptitle(
    'Autonomous Vehicle Path Planning — Safety / Time / Fuel Tradeoff Analysis',
    color='white', fontsize=14, fontweight='bold', y=0.98)
plt.tight_layout(rect=[0, 0, 1, 0.97])
plt.savefig('av_path_planning.png', dpi=150, bbox_inches='tight',
            facecolor='#0D1117')
plt.show()
print("Figure saved as av_path_planning.png")

Code Walkthrough

Section 1 — Graph construction

The road network is a networkx.DiGraph with 10 nodes. Every directed edge stores three independent cost attributes. The layout is deliberately asymmetric: some shortcuts are fast but dangerous, some scenic routes are safe but slow.

Section 2 — Path enumeration and cost aggregation

nx.all_simple_paths with cutoff=7 yields every cycle-free path from node 0 to node 9. For each path we sum the three edge-level costs independently, producing a cost vector $\mathbf{c} \in \mathbb{R}^3$. This gives us the full feasible set before any optimization.

Section 3 — Pareto-front identification

The function is_pareto_dominated checks, for each path, whether any other path is at least as good on all three objectives and strictly better on at least one. The Pareto mask retains only the non-dominated subset. This is $O(n^2 m)$ for $n$ paths and $m = 3$ objectives — trivial at our scale, so no further optimization is needed.
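For larger path sets, the same pairwise dominance check can be expressed with NumPy broadcasting instead of Python loops — a sketch that trades $O(n^2 m)$ memory for speed (standalone toy data, not the article's cost matrix):

```python
import numpy as np

def pareto_mask_vectorized(costs):
    """Boolean mask of non-dominated rows of an (n, m) cost matrix,
    all objectives minimized. le[i, j]: row j <= row i on every objective;
    lt[i, j]: row j < row i on at least one objective."""
    le = (costs[None, :, :] <= costs[:, None, :]).all(-1)
    lt = (costs[None, :, :] < costs[:, None, :]).any(-1)
    dominated = (le & lt).any(axis=1)   # some other row dominates row i
    return ~dominated

# toy bi-objective example: rows 2 and 3 are dominated
costs = np.array([[1, 5], [2, 2], [3, 3], [2, 6]], float)
print(pareto_mask_vectorized(costs).tolist())  # [True, True, False, False]
```

The diagonal takes care of itself: a row never strictly beats itself, so `le & lt` is False there.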

Section 4 — Weight-space sweep

We discretize the weight simplex $\{\mathbf{w} : \mathbf{w} \geq 0,\ \|\mathbf{w}\|_1 = 1\}$ into a $30 \times 30$ grid over $(w_1, w_2)$, clamping $w_3 = 1 - w_1 - w_2 \geq 0$. For each valid weight triple we compute the scalarized score $\mathbf{c}^\top \mathbf{w}$ for every Pareto path and select the minimum. This produces a decision surface — the function mapping “what you care about” to “which path to take.”

Section 5 — Scenario benchmarks

Three archetypal driver profiles are evaluated:

| Scenario | $w_{\text{safety}}$ | $w_{\text{time}}$ | $w_{\text{fuel}}$ |
|---|---|---|---|
| Safety-first | 0.70 | 0.20 | 0.10 |
| Time-first | 0.10 | 0.70 | 0.20 |
| Eco-drive | 0.10 | 0.20 | 0.70 |

Sections 6-A through 6-I — Visualization

Nine subplots are arranged in a $3 \times 3$ grid:

6-A Road network — edges are colored by safety cost (green = safe, red = dangerous). Source (0) and destination (9) are gold; intermediate nodes are blue.

6-B 3D Pareto front — dominated paths appear as grey background points. Pareto-optimal paths are colored by fuel cost (plasma colormap). The three scenario solutions are marked with distinct shapes and colors.

6-C Parallel coordinates — each line represents one Pareto path. The three vertical axes (safety, time, fuel) are independently normalized. Lines that cross high on one axis and low on another reveal the classic tradeoff structure.

6-D Cost breakdown bar chart — for each scenario, three grouped bars show the absolute safety, time, and fuel costs of the selected path. This makes cross-scenario cost comparisons concrete.

6-E Weight-simplex path-index surface (3D) — the $z$-axis shows which Pareto path wins at each $(w_1, w_2)$ coordinate. Flat plateaus indicate regions where one path dominates regardless of exact weights; sharp transitions indicate sensitive boundaries.

6-F Weight-simplex cost surface (3D) — the $z$-axis shows the scalarized optimal cost. Because the optimal score is the pointwise minimum of finitely many linear functions of $\mathbf{w}$, the surface is concave and piecewise planar, descending toward weight configurations that happen to align well with the best available path.

6-G Heatmap of winning paths — a 2D top-down view of 6-E. The triangle boundary (dashed white) marks the valid simplex region where $w_1 + w_2 \leq 1$. Scenario markers are overlaid. Color blocks reveal how large each Pareto path’s “winning region” is.

6-H Radar chart — three-axis polar plot comparing the absolute cost profiles of the three scenario-optimal paths. Smaller area = better overall performance.

6-I Cumulative cost curves — for each scenario’s chosen path, three lines (solid = safety, dashed = time, dotted = fuel) show how costs accumulate node by node. Steeper slopes indicate expensive road segments.


Execution Results

Total candidate paths: 26

Pareto-optimal paths (9):
  0 → 1 → 4 → 7 → 9  →  safety=9.0  time=11.0  fuel=6.0
  0 → 1 → 4 → 8 → 9  →  safety=6.3  time=15.0  fuel=8.3
  0 → 1 → 3 → 6 → 9  →  safety=6.3  time=15.5  fuel=8.0
  0 → 2 → 4 → 7 → 9  →  safety=8.5  time=12.5  fuel=5.8
  0 → 2 → 4 → 8 → 9  →  safety=5.8  time=16.5  fuel=8.1
  0 → 2 → 6 → 9  →  safety=3.7  time=17.0  fuel=8.5
  0 → 3 → 5 → 7 → 9  →  safety=10.5  time=10.0  fuel=4.9
  0 → 3 → 5 → 8 → 9  →  safety=9.3  time=12.0  fuel=5.9
  0 → 3 → 6 → 9  →  safety=6.8  time=12.5  fuel=6.7

Scenario analysis:
  [Safety-first]  path: 0 → 2 → 6 → 9
    safety=3.70  time=17.00  fuel=8.50  score=6.8400
  [Time-first]  path: 0 → 3 → 5 → 7 → 9
    safety=10.50  time=10.00  fuel=4.90  score=9.0300
  [Eco-drive]  path: 0 → 3 → 5 → 7 → 9
    safety=10.50  time=10.00  fuel=4.90  score=6.4800

Figure saved as av_path_planning.png

Key Takeaways

The Pareto front is the mathematically precise answer to “what are the best possible tradeoffs?” No single path dominates all others. The weight sweep reveals that:

  1. Safety-optimal paths tend to be longer and slower, since the safest roads often have lower speed limits or more cautious routing around high-traffic zones.
  2. Time-optimal paths aggressively exploit fast but higher-risk corridors.
  3. Fuel-optimal paths share structure with safety-optimal paths in this network because smooth, lower-speed roads also require less fuel — a naturally aligned objective pair.

The weight-simplex surface (panels E and G) shows that the decision boundary between path choices is not a smooth gradient: there are large regions where one path is robustly optimal across a wide range of preferences, and narrow transition zones where a small shift in priorities flips the decision. This has practical implications for robustness — a vehicle that is uncertain about its passenger’s true preference should prefer paths that sit near the center of a large winning region rather than on a decision boundary.

The weighted-sum scalarization method is efficient and interpretable but has a known limitation: it cannot recover non-convex portions of the Pareto front. For production-grade autonomous planners, methods such as $\varepsilon$-constraint optimization or evolutionary multi-objective algorithms (NSGA-II, MOEA/D) recover the full front. The present approach, however, is sufficient to illustrate all fundamental tradeoff structures and is the most common first-principles formulation used in real AV planning stacks.
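As a closing pointer, the $\varepsilon$-constraint idea mentioned above is easy to sketch over an enumerated cost matrix: minimize one objective while capping the others. The helper below is a hypothetical illustration (the function name and the bound of 13 minutes are ours), applied to three of the Pareto cost vectors from the results above:

```python
import numpy as np

def epsilon_constraint(costs, primary, bounds):
    """Minimize objective `primary` subject to costs[:, k] <= bounds[k]
    for every other objective k. `costs` is an (n_paths, n_objectives)
    array; returns the index of the best feasible path, or None."""
    feasible = np.ones(len(costs), bool)
    for k, eps in bounds.items():
        feasible &= costs[:, k] <= eps
    if not feasible.any():
        return None                      # the cap is too tight
    idx = np.where(feasible)[0]
    return int(idx[np.argmin(costs[idx, primary])])

# (safety, time, fuel) vectors for paths 0->2->6->9, 0->3->5->7->9, 0->3->6->9
costs = np.array([[3.7, 17.0, 8.5],
                  [10.5, 10.0, 4.9],
                  [6.8, 12.5, 6.7]])

# minimize safety subject to a travel-time cap of 13 minutes
print(epsilon_constraint(costs, primary=0, bounds={1: 13.0}))  # 2
```

Unlike the weighted sum, tightening or loosening the cap traces out the full Pareto front, including any non-convex portions.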