โ˜๏ธ Cloud Resource Allocation Optimization

Minimizing Cost While Guaranteeing Performance

A Practical Deep Dive with Python, Linear Programming, and 3D Visualization


Cloud computing bills can spiral out of control fast. Allocate too few resources and your SLA burns. Over-provision and youโ€™re bleeding money. This post walks through a concrete, solvable optimization problem โ€” finding the sweet spot between cost and performance using Linear Programming (LP) and Mixed-Integer Programming (MIP) in Python.


๐Ÿ” The Problem Setup

Imagine youโ€™re a cloud architect managing three workload types across four VM instance types on a hypothetical cloud provider:

Workloads:

  • ๐ŸŒ Web API Server (latency-sensitive)
  • ๐Ÿงฎ Batch Analytics Job (throughput-sensitive)
  • ๐Ÿ—„๏ธ Database Cluster (memory-sensitive)

VM Instance Types:

Instance vCPU RAM (GB) Cost ($/hr)
t3.small 2 2 0.023
m5.large 2 8 0.096
c5.xlarge 4 8 0.170
r5.2xlarge 8 64 0.504

The Decision Variables:

Let $x_{ij}$ be the number of VM instance type $j$ allocated to workload $i$.

$$x_{ij} \in \mathbb{Z}_{\geq 0}, \quad i \in {1,2,3},\ j \in {1,2,3,4}$$

Objective โ€” Minimize Total Hourly Cost:

$$\min \sum_{i=1}^{3} \sum_{j=1}^{4} c_j \cdot x_{ij}$$

where $c_j$ is the hourly cost of instance type $j$.

Constraints:

Each workload has minimum CPU, RAM, and performance score requirements:

Additionally, a total budget cap $B$:

$$\sum_{i=1}^{3} \sum_{j=1}^{4} c_j \cdot x_{ij} \leq B$$


๐Ÿ Full Python Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
# ============================================================
# Cloud Resource Allocation Optimization
# Cost Minimization with Performance Constraints
# Requires: pip install pulp matplotlib numpy scipy
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from itertools import product as iproduct
import warnings
warnings.filterwarnings('ignore')

try:
import pulp
except ImportError:
import subprocess, sys
subprocess.check_call([sys.executable, "-m", "pip", "install", "pulp", "-q"])
import pulp

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 1. PROBLEM DATA
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

# Instance types: [name, vCPU, RAM_GB, cost_per_hour, perf_score]
instances = [
{"name": "t3.small", "cpu": 2, "ram": 2, "cost": 0.023, "perf": 10},
{"name": "m5.large", "cpu": 2, "ram": 8, "cost": 0.096, "perf": 30},
{"name": "c5.xlarge", "cpu": 4, "ram": 8, "cost": 0.170, "perf": 55},
{"name": "r5.2xlarge", "cpu": 8, "ram": 64, "cost": 0.504, "perf": 100},
]

# Workloads: [name, min_cpu, min_ram, min_perf, max_instances]
workloads = [
{"name": "Web API", "min_cpu": 8, "min_ram": 16, "min_perf": 80, "max_inst": 10},
{"name": "Batch Job", "min_cpu": 16, "min_ram": 32, "min_perf": 120, "max_inst": 10},
{"name": "Database", "min_cpu": 8, "min_ram": 128,"min_perf": 150, "max_inst": 10},
]

BUDGET_CAP = 5.0 # $/hr total

n_w = len(workloads)
n_i = len(instances)

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 2. BUILD THE MIP MODEL WITH PuLP
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

prob = pulp.LpProblem("CloudResourceAllocation", pulp.LpMinimize)

# Decision variables: x[i][j] = # of instance type j for workload i
x = [[pulp.LpVariable(f"x_{i}_{j}", lowBound=0, cat="Integer")
for j in range(n_i)] for i in range(n_w)]

# Objective: minimize total cost
prob += pulp.lpSum(instances[j]["cost"] * x[i][j]
for i in range(n_w) for j in range(n_i)), "TotalCost"

# Constraints per workload
for i, wl in enumerate(workloads):
# CPU requirement
prob += (pulp.lpSum(instances[j]["cpu"] * x[i][j] for j in range(n_i))
>= wl["min_cpu"], f"CPU_{i}")
# RAM requirement
prob += (pulp.lpSum(instances[j]["ram"] * x[i][j] for j in range(n_i))
>= wl["min_ram"], f"RAM_{i}")
# Performance requirement
prob += (pulp.lpSum(instances[j]["perf"] * x[i][j] for j in range(n_i))
>= wl["min_perf"], f"PERF_{i}")
# Max instances per workload
prob += (pulp.lpSum(x[i][j] for j in range(n_i))
<= wl["max_inst"], f"MAXINST_{i}")

# Global budget cap
prob += (pulp.lpSum(instances[j]["cost"] * x[i][j]
for i in range(n_w) for j in range(n_i))
<= BUDGET_CAP, "Budget")

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 3. SOLVE
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

solver = pulp.PULP_CBC_CMD(msg=0)
prob.solve(solver)

print("=" * 55)
print(" OPTIMIZATION STATUS:", pulp.LpStatus[prob.status])
print("=" * 55)

# Extract solution
sol = np.array([[int(pulp.value(x[i][j])) for j in range(n_i)]
for i in range(n_w)])

total_cost = sum(instances[j]["cost"] * sol[i][j]
for i in range(n_w) for j in range(n_i))

print(f"\n Optimal Total Cost: ${total_cost:.4f}/hr")
print(f" Budget Cap: ${BUDGET_CAP:.4f}/hr\n")

for i, wl in enumerate(workloads):
alloc_cpu = sum(instances[j]["cpu"] * sol[i][j] for j in range(n_i))
alloc_ram = sum(instances[j]["ram"] * sol[i][j] for j in range(n_i))
alloc_perf = sum(instances[j]["perf"] * sol[i][j] for j in range(n_i))
wl_cost = sum(instances[j]["cost"] * sol[i][j] for j in range(n_i))
print(f" [{wl['name']}]")
for j in range(n_i):
if sol[i][j] > 0:
print(f" {instances[j]['name']:12s} x{sol[i][j]} "
f"(${instances[j]['cost']*sol[i][j]:.3f}/hr)")
print(f" โ†’ CPU: {alloc_cpu} vCPU (min {wl['min_cpu']})")
print(f" โ†’ RAM: {alloc_ram} GB (min {wl['min_ram']})")
print(f" โ†’ Perf: {alloc_perf} (min {wl['min_perf']})")
print(f" โ†’ Cost: ${wl_cost:.4f}/hr\n")

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 4. PARETO FRONTIER: Cost vs Performance Trade-off
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

pareto_costs = []
pareto_perfs = []
perf_targets = np.linspace(50, 200, 30)

for perf_scale in perf_targets:
p2 = pulp.LpProblem("Pareto", pulp.LpMinimize)
y = [[pulp.LpVariable(f"y_{i}_{j}_{int(perf_scale)}", lowBound=0, cat="Integer")
for j in range(n_i)] for i in range(n_w)]

p2 += pulp.lpSum(instances[j]["cost"] * y[i][j]
for i in range(n_w) for j in range(n_i))

for i, wl in enumerate(workloads):
p2 += (pulp.lpSum(instances[j]["cpu"] * y[i][j] for j in range(n_i))
>= wl["min_cpu"])
p2 += (pulp.lpSum(instances[j]["ram"] * y[i][j] for j in range(n_i))
>= wl["min_ram"])
scaled_perf = wl["min_perf"] * (perf_scale / 100.0)
p2 += (pulp.lpSum(instances[j]["perf"] * y[i][j] for j in range(n_i))
>= scaled_perf)
p2 += (pulp.lpSum(y[i][j] for j in range(n_i)) <= wl["max_inst"])

p2.solve(pulp.PULP_CBC_CMD(msg=0))

if pulp.LpStatus[p2.status] == "Optimal":
c = sum(instances[j]["cost"] * pulp.value(y[i][j])
for i in range(n_w) for j in range(n_i))
pf = sum(instances[j]["perf"] * pulp.value(y[i][j])
for i in range(n_w) for j in range(n_i))
pareto_costs.append(c)
pareto_perfs.append(pf)

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 5. SENSITIVITY: Budget Cap Sweep
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

budget_range = np.linspace(1.0, 8.0, 25)
budget_costs = []
budget_perfs = []
budget_feasible = []

for B in budget_range:
p3 = pulp.LpProblem("BudgetSweep", pulp.LpMinimize)
z = [[pulp.LpVariable(f"z_{i}_{j}_{B:.2f}", lowBound=0, cat="Integer")
for j in range(n_i)] for i in range(n_w)]

p3 += pulp.lpSum(instances[j]["cost"] * z[i][j]
for i in range(n_w) for j in range(n_i))

for i, wl in enumerate(workloads):
p3 += (pulp.lpSum(instances[j]["cpu"] * z[i][j] for j in range(n_i))
>= wl["min_cpu"])
p3 += (pulp.lpSum(instances[j]["ram"] * z[i][j] for j in range(n_i))
>= wl["min_ram"])
p3 += (pulp.lpSum(instances[j]["perf"] * z[i][j] for j in range(n_i))
>= wl["min_perf"])
p3 += (pulp.lpSum(z[i][j] for j in range(n_i)) <= wl["max_inst"])

p3 += (pulp.lpSum(instances[j]["cost"] * z[i][j]
for i in range(n_w) for j in range(n_i)) <= B)

p3.solve(pulp.PULP_CBC_CMD(msg=0))
feasible = (pulp.LpStatus[p3.status] == "Optimal")
budget_feasible.append(feasible)

if feasible:
c = pulp.value(p3.objective)
pf = sum(instances[j]["perf"] * pulp.value(z[i][j])
for i in range(n_w) for j in range(n_i))
budget_costs.append(c)
budget_perfs.append(pf)
else:
budget_costs.append(np.nan)
budget_perfs.append(np.nan)

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 6. 3D SURFACE: CPU ร— RAM โ†’ Cost
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

cpu_vals = np.arange(4, 20, 2)
ram_vals = np.arange(8, 80, 8)
CPU_G, RAM_G = np.meshgrid(cpu_vals, ram_vals)
COST_G = np.full(CPU_G.shape, np.nan)

for ri, rv in enumerate(ram_vals):
for ci, cv in enumerate(cpu_vals):
p4 = pulp.LpProblem("Surface", pulp.LpMinimize)
w = [pulp.LpVariable(f"w_{j}_{ci}_{ri}", lowBound=0, cat="Integer")
for j in range(n_i)]
p4 += pulp.lpSum(instances[j]["cost"] * w[j] for j in range(n_i))
p4 += (pulp.lpSum(instances[j]["cpu"] * w[j] for j in range(n_i)) >= cv)
p4 += (pulp.lpSum(instances[j]["ram"] * w[j] for j in range(n_i)) >= rv)
p4 += (pulp.lpSum(instances[j]["perf"] * w[j] for j in range(n_i)) >= 30)
p4 += (pulp.lpSum(w[j] for j in range(n_i)) <= 8)
p4.solve(pulp.PULP_CBC_CMD(msg=0))
if pulp.LpStatus[p4.status] == "Optimal":
COST_G[ri, ci] = pulp.value(p4.objective)

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 7. PLOTTING
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

plt.style.use('dark_background')
fig = plt.figure(figsize=(20, 22))
fig.patch.set_facecolor('#0d1117')

colors_wl = ['#58a6ff', '#3fb950', '#f78166']
colors_inst = ['#e3b341', '#58a6ff', '#3fb950', '#f78166']
inst_names = [inst["name"] for inst in instances]
wl_names = [wl["name"] for wl in workloads]

# โ”€โ”€ Plot 1: Allocation Stacked Bar โ”€โ”€
ax1 = fig.add_subplot(3, 3, 1)
ax1.set_facecolor('#161b22')
bar_width = 0.18
x_pos = np.arange(n_w)

for j in range(n_i):
vals = [sol[i][j] for i in range(n_w)]
ax1.bar(x_pos + j * bar_width, vals, bar_width,
label=inst_names[j], color=colors_inst[j], alpha=0.9,
edgecolor='#30363d')

ax1.set_xticks(x_pos + bar_width * 1.5)
ax1.set_xticklabels(wl_names, color='white', fontsize=9)
ax1.set_ylabel("# Instances", color='white')
ax1.set_title("โ‘  Optimal VM Allocation", color='white', fontweight='bold', pad=10)
ax1.legend(loc='upper right', fontsize=7, facecolor='#21262d', labelcolor='white')
ax1.tick_params(colors='white')
ax1.spines['bottom'].set_color('#30363d')
ax1.spines['left'].set_color('#30363d')
ax1.spines['top'].set_visible(False)
ax1.spines['right'].set_visible(False)

# โ”€โ”€ Plot 2: Cost Breakdown Pie โ”€โ”€
ax2 = fig.add_subplot(3, 3, 2)
ax2.set_facecolor('#161b22')
wl_costs = [sum(instances[j]["cost"] * sol[i][j] for j in range(n_i))
for i in range(n_w)]
wedges, texts, autotexts = ax2.pie(
wl_costs, labels=wl_names, colors=colors_wl,
autopct='%1.1f%%', startangle=140,
wedgeprops=dict(edgecolor='#0d1117', linewidth=2),
textprops=dict(color='white', fontsize=9))
for at in autotexts:
at.set_fontsize(8)
at.set_color('#0d1117')
ax2.set_title(f"โ‘ก Cost Breakdown\n(Total ${total_cost:.3f}/hr)",
color='white', fontweight='bold', pad=10)

# โ”€โ”€ Plot 3: Resource Utilization vs Minimum โ”€โ”€
ax3 = fig.add_subplot(3, 3, 3)
ax3.set_facecolor('#161b22')
metrics = ['CPU', 'RAM', 'Perf']
bar_w = 0.12
pos = np.arange(len(metrics))

for i, wl in enumerate(workloads):
alloc = [
sum(instances[j]["cpu"] * sol[i][j] for j in range(n_i)),
sum(instances[j]["ram"] * sol[i][j] for j in range(n_i)),
sum(instances[j]["perf"] * sol[i][j] for j in range(n_i)),
]
mins = [wl["min_cpu"], wl["min_ram"], wl["min_perf"]]
ratios = [a / m for a, m in zip(alloc, mins)]
ax3.bar(pos + i * bar_w, ratios, bar_w,
label=wl["name"], color=colors_wl[i], alpha=0.9,
edgecolor='#30363d')

ax3.axhline(1.0, color='#f78166', linewidth=1.5, linestyle='--', label='Minimum (1.0x)')
ax3.set_xticks(pos + bar_w)
ax3.set_xticklabels(metrics, color='white')
ax3.set_ylabel("Allocation / Minimum", color='white')
ax3.set_title("โ‘ข Resource Headroom Ratio", color='white', fontweight='bold', pad=10)
ax3.legend(fontsize=7, facecolor='#21262d', labelcolor='white')
ax3.tick_params(colors='white')
ax3.spines['bottom'].set_color('#30363d')
ax3.spines['left'].set_color('#30363d')
ax3.spines['top'].set_visible(False)
ax3.spines['right'].set_visible(False)

# โ”€โ”€ Plot 4: Pareto Frontier โ”€โ”€
ax4 = fig.add_subplot(3, 3, 4)
ax4.set_facecolor('#161b22')
if pareto_costs:
sc = ax4.scatter(pareto_perfs, pareto_costs,
c=pareto_costs, cmap='plasma',
s=60, zorder=3, edgecolors='white', linewidths=0.4)
ax4.plot(pareto_perfs, pareto_costs, color='#58a6ff', linewidth=1.5,
alpha=0.6, zorder=2)
cbar = plt.colorbar(sc, ax=ax4)
cbar.ax.yaxis.set_tick_params(color='white')
cbar.set_label('Cost ($/hr)', color='white', fontsize=8)
plt.setp(plt.getp(cbar.ax.axes, 'yticklabels'), color='white')

ax4.scatter([sum(instances[j]["perf"] * sol[i][j]
for i in range(n_w) for j in range(n_i))],
[total_cost], color='#f78166', s=120, zorder=5,
marker='*', label='Optimal Solution')
ax4.set_xlabel("Total Performance Score", color='white')
ax4.set_ylabel("Total Cost ($/hr)", color='white')
ax4.set_title("โ‘ฃ Pareto Frontier\nCost vs Performance", color='white', fontweight='bold', pad=10)
ax4.legend(fontsize=8, facecolor='#21262d', labelcolor='white')
ax4.tick_params(colors='white')
ax4.spines['bottom'].set_color('#30363d')
ax4.spines['left'].set_color('#30363d')
ax4.spines['top'].set_visible(False)
ax4.spines['right'].set_visible(False)

# โ”€โ”€ Plot 5: Budget Sensitivity โ”€โ”€
ax5 = fig.add_subplot(3, 3, 5)
ax5.set_facecolor('#161b22')
feasible_budgets = [budget_range[k] for k in range(len(budget_range)) if budget_feasible[k]]
feasible_costs = [budget_costs[k] for k in range(len(budget_range)) if budget_feasible[k]]
infeasible_b = [budget_range[k] for k in range(len(budget_range)) if not budget_feasible[k]]

if feasible_budgets:
ax5.plot(feasible_budgets, feasible_costs, color='#3fb950',
linewidth=2, marker='o', markersize=4, label='Feasible')
if infeasible_b:
for ib in infeasible_b:
ax5.axvline(ib, color='#f78166', alpha=0.3, linewidth=1)
ax5.axvline(infeasible_b[-1] if infeasible_b else 0,
color='#f78166', alpha=0.3, linewidth=1, label='Infeasible Budget')

ax5.axvline(BUDGET_CAP, color='#e3b341', linewidth=2,
linestyle='--', label=f'Our Budget ${BUDGET_CAP}')
ax5.set_xlabel("Budget Cap ($/hr)", color='white')
ax5.set_ylabel("Optimal Cost ($/hr)", color='white')
ax5.set_title("โ‘ค Budget Sensitivity Analysis", color='white', fontweight='bold', pad=10)
ax5.legend(fontsize=8, facecolor='#21262d', labelcolor='white')
ax5.tick_params(colors='white')
ax5.spines['bottom'].set_color('#30363d')
ax5.spines['left'].set_color('#30363d')
ax5.spines['top'].set_visible(False)
ax5.spines['right'].set_visible(False)

# โ”€โ”€ Plot 6: Cost-Efficiency per Instance Type โ”€โ”€
ax6 = fig.add_subplot(3, 3, 6)
ax6.set_facecolor('#161b22')
eff_cpu = [inst["cpu"] / inst["cost"] for inst in instances]
eff_ram = [inst["ram"] / inst["cost"] for inst in instances]
eff_perf = [inst["perf"] / inst["cost"] for inst in instances]

x6 = np.arange(n_i)
w6 = 0.25
ax6.bar(x6 - w6, [e/max(eff_cpu) for e in eff_cpu], w6, label='CPU/Cost',
color='#58a6ff', alpha=0.9)
ax6.bar(x6, [e/max(eff_ram) for e in eff_ram], w6, label='RAM/Cost',
color='#3fb950', alpha=0.9)
ax6.bar(x6 + w6, [e/max(eff_perf) for e in eff_perf], w6, label='Perf/Cost',
color='#e3b341', alpha=0.9)
ax6.set_xticks(x6)
ax6.set_xticklabels(inst_names, color='white', fontsize=8, rotation=15)
ax6.set_ylabel("Normalized Efficiency", color='white')
ax6.set_title("โ‘ฅ Instance Cost-Efficiency", color='white', fontweight='bold', pad=10)
ax6.legend(fontsize=8, facecolor='#21262d', labelcolor='white')
ax6.tick_params(colors='white')
ax6.spines['bottom'].set_color('#30363d')
ax6.spines['left'].set_color('#30363d')
ax6.spines['top'].set_visible(False)
ax6.spines['right'].set_visible(False)

# โ”€โ”€ Plot 7: 3D Surface โ€” CPU ร— RAM โ†’ Minimum Cost โ”€โ”€
ax7 = fig.add_subplot(3, 3, 7, projection='3d')
ax7.set_facecolor('#161b22')
fig.patch.set_alpha(1)

mask = ~np.isnan(COST_G)
if mask.any():
surf = ax7.plot_surface(CPU_G, RAM_G, np.where(mask, COST_G, 0),
cmap='inferno', alpha=0.85,
linewidth=0, antialiased=True)
fig.colorbar(surf, ax=ax7, shrink=0.5, pad=0.1,
label='Min Cost ($/hr)').ax.yaxis.set_tick_params(color='white')

ax7.set_xlabel('vCPU Required', color='white', labelpad=8)
ax7.set_ylabel('RAM (GB) Required', color='white', labelpad=8)
ax7.set_zlabel('Min Cost ($/hr)', color='white', labelpad=8)
ax7.set_title("โ‘ฆ 3D Cost Surface\n(CPU ร— RAM โ†’ Min Cost)",
color='white', fontweight='bold', pad=15)
ax7.tick_params(colors='white')
ax7.xaxis.pane.fill = False
ax7.yaxis.pane.fill = False
ax7.zaxis.pane.fill = False
ax7.xaxis.pane.set_edgecolor('#30363d')
ax7.yaxis.pane.set_edgecolor('#30363d')
ax7.zaxis.pane.set_edgecolor('#30363d')
ax7.view_init(elev=28, azim=-55)

# โ”€โ”€ Plot 8: 3D Scatter โ€” Budget ร— Perf โ†’ Cost โ”€โ”€
ax8 = fig.add_subplot(3, 3, 8, projection='3d')
ax8.set_facecolor('#161b22')

valid_idx = [k for k in range(len(budget_range))
if budget_feasible[k] and not np.isnan(budget_perfs[k])]
if valid_idx:
bx = [budget_range[k] for k in valid_idx]
py = [budget_perfs[k] for k in valid_idx]
cz = [budget_costs[k] for k in valid_idx]
sc8 = ax8.scatter(bx, py, cz, c=cz, cmap='cool',
s=60, edgecolors='white', linewidths=0.3)
ax8.plot(bx, py, cz, color='#58a6ff', alpha=0.5, linewidth=1.5)
fig.colorbar(sc8, ax=ax8, shrink=0.5, pad=0.1,
label='Cost ($/hr)').ax.yaxis.set_tick_params(color='white')

ax8.set_xlabel('Budget Cap ($/hr)', color='white', labelpad=8)
ax8.set_ylabel('Total Perf Score', color='white', labelpad=8)
ax8.set_zlabel('Optimal Cost ($/hr)', color='white', labelpad=8)
ax8.set_title("โ‘ง 3D: Budget ร— Perf โ†’ Cost",
color='white', fontweight='bold', pad=15)
ax8.tick_params(colors='white')
ax8.xaxis.pane.fill = False
ax8.yaxis.pane.fill = False
ax8.zaxis.pane.fill = False
ax8.xaxis.pane.set_edgecolor('#30363d')
ax8.yaxis.pane.set_edgecolor('#30363d')
ax8.zaxis.pane.set_edgecolor('#30363d')
ax8.view_init(elev=25, azim=40)

# โ”€โ”€ Plot 9: Heatmap โ€” Workload ร— Instance Allocation โ”€โ”€
ax9 = fig.add_subplot(3, 3, 9)
ax9.set_facecolor('#161b22')
im = ax9.imshow(sol, cmap='YlOrRd', aspect='auto', interpolation='nearest')
plt.colorbar(im, ax=ax9, label='# Instances').ax.yaxis.set_tick_params(color='white')
ax9.set_xticks(range(n_i))
ax9.set_xticklabels(inst_names, color='white', fontsize=8, rotation=20)
ax9.set_yticks(range(n_w))
ax9.set_yticklabels(wl_names, color='white', fontsize=9)
ax9.set_title("โ‘จ Allocation Heatmap\n(Workload ร— Instance Type)",
color='white', fontweight='bold', pad=10)
for i in range(n_w):
for j in range(n_i):
ax9.text(j, i, str(sol[i][j]), ha='center', va='center',
fontsize=12, fontweight='bold',
color='black' if sol[i][j] > 0 else '#555')

plt.suptitle("Cloud Resource Allocation Optimization Dashboard",
color='white', fontsize=16, fontweight='bold', y=1.01)
plt.tight_layout(pad=2.5)
plt.savefig("cloud_optimization.png", dpi=150, bbox_inches='tight',
facecolor='#0d1117')
plt.show()
print("Plot saved to cloud_optimization.png")

๐Ÿ”ฌ Code Walkthrough

Section 1 โ€” Problem Data

We define four VM instance types with realistic AWS-like specs and pricing. Each has a performance score โ€” a synthetic metric representing compute throughput (useful for normalizing across CPU-bound and memory-bound workloads). The three workloads each carry hard lower bounds on CPU, RAM, and performance they need to function correctly.

Section 2 โ€” Building the MIP Model

We use PuLP, Pythonโ€™s LP/MIP modeling library with the CBC solver bundled inside it (no external solver installation needed). The decision variables $x_{ij}$ are declared as integers (cat="Integer") โ€” you canโ€™t rent half a VM. The objective is a simple dot product of cost vector and allocation matrix.

The constraints are straightforward linear inequalities. The key insight is that this is a Mixed-Integer Program because of the integrality requirement โ€” the feasible region is a discrete lattice, not a continuous convex set.

Section 3 โ€” Solving and Output

PULP_CBC_CMD(msg=0) suppresses solver verbosity. After solving, we decode the solution matrix and compute per-workload breakdowns, showing exactly which instance types were chosen and whether all constraints are satisfied.

Section 4 โ€” Pareto Frontier

We perform a parametric sweep over the minimum performance requirement (scaling from 50% to 200% of baseline). For each target, we re-solve the MIP independently. The resulting cloud of $(performance, cost)$ points traces the Pareto frontier โ€” the set of solutions where you cannot improve performance without increasing cost.

Section 5 โ€” Budget Sensitivity Analysis

We sweep the global budget cap from $1.00/hr to $8.00/hr. Below a certain threshold, the problem becomes infeasible (no valid allocation satisfies all constraints within budget). This reveals the minimum viable budget for your workload mix.

Section 6 โ€” 3D Cost Surface

For a single generic workload, we evaluate the minimum cost over a grid of $(CPU_{min}, RAM_{min})$ pairs. This creates a 3D response surface showing how cost scales with resource requirements โ€” essentially a continuous view of the piecewise-linear MIP objective landscape.


๐Ÿ“Š Graph Explanations

โ‘  Optimal VM Allocation โ€” Side-by-side bars showing which instance types were chosen per workload. r5.2xlarge should dominate for the Database workload due to its RAM density.

โ‘ก Cost Breakdown Pie โ€” Each workloadโ€™s share of the total bill. The Database workload typically carries the highest cost due to RAM requirements.

โ‘ข Resource Headroom Ratio โ€” The ratio of allocated resource to minimum requirement. Values above 1.0 are headroom; exactly 1.0 would be a perfectly tight constraint. The solver may over-allocate slightly due to integer granularity.

โ‘ฃ Pareto Frontier โ€” The key trade-off chart. Moving right on the x-axis means demanding higher performance, which drives cost upward on the y-axis. The red star marks our specific optimal solution.

โ‘ค Budget Sensitivity โ€” The green line shows optimal cost as budget cap relaxes. Red verticals mark infeasible budgets. Notice the knee point where relaxing budget no longer improves the solution โ€” this is your true minimum cost.

โ‘ฅ Instance Cost-Efficiency โ€” Normalized efficiency ratios per instance type. t3.small wins on CPU/cost efficiency but loses on RAM. r5.2xlarge wins on RAM/cost despite its high absolute price โ€” exactly why the solver picks it for memory-hungry workloads.

โ‘ฆ 3D Cost Surface (CPU ร— RAM โ†’ Min Cost) โ€” The inferno-colored surface shows the minimum achievable cost over a grid of CPU and RAM requirements. Notice the step-function character โ€” cost jumps at thresholds where you cross into the next instance tier. This visualizes the non-convexity introduced by integer constraints.

โ‘ง 3D Scatter (Budget ร— Perf โ†’ Cost) โ€” Each point represents one budget scenario from our sweep. The curve shows that performance scales roughly logarithmically with budget โ€” diminishing returns set in quickly beyond the knee point.

โ‘จ Allocation Heatmap โ€” The matrix view of the solution. Each cell shows $x_{ij}$ โ€” the number of instance type $j$ allocated to workload $i$. Zero cells are dark; non-zero cells light up. This is the most compact summary of the entire optimization output.


๐Ÿ“Œ Key Takeaways

The MIP formulation captures real-world cloud allocation decisions exactly:

  • Integer constraints prevent fractional VMs
  • Multiple resource dimensions (CPU, RAM, performance) reflect actual SLA requirements
  • The Pareto frontier makes the cost-performance trade-off quantitatively visible
  • Budget sensitivity identifies the minimum viable spend before constraint violations

The solver (CBC) handles problems of this size in milliseconds. For fleet-scale optimization (hundreds of workloads, dozens of instance families), youโ€™d move to commercial solvers like Gurobi or CPLEX, or approximate with Lagrangian relaxation โ€” but the model structure remains identical.


=======================================================
  OPTIMIZATION STATUS: Optimal
=======================================================

  Optimal Total Cost: $1.5680/hr
  Budget Cap:         $5.0000/hr

  [Web API]
    t3.small     x8  ($0.184/hr)
    โ†’ CPU: 16 vCPU (min 8)
    โ†’ RAM: 16 GB  (min 16)
    โ†’ Perf: 80   (min 80)
    โ†’ Cost: $0.1840/hr

  [Batch Job]
    t3.small     x8  ($0.184/hr)
    m5.large     x2  ($0.192/hr)
    โ†’ CPU: 20 vCPU (min 16)
    โ†’ RAM: 32 GB  (min 32)
    โ†’ Perf: 140   (min 120)
    โ†’ Cost: $0.3760/hr

  [Database]
    r5.2xlarge   x2  ($1.008/hr)
    โ†’ CPU: 16 vCPU (min 8)
    โ†’ RAM: 128 GB  (min 128)
    โ†’ Perf: 200   (min 150)
    โ†’ Cost: $1.0080/hr

Plot saved to cloud_optimization.png

Network Routing Optimization

Minimizing Latency and Packet Loss with Python

Network routing optimization is a fundamental challenge in computer networking. The goal is to find paths through a network that minimize latency and packet loss โ€” two metrics that directly impact user experience. In this post, weโ€™ll model a real-world-like network, apply classic and modern optimization algorithms, and visualize the results in both 2D and 3D.


The Problem

Consider a network of 10 nodes (routers/switches) with weighted edges representing two cost dimensions:

  • Latency (ms): propagation delay between nodes
  • Packet Loss (%)**: probability of a packet being dropped on a link

We want to find the optimal path from a source to a destination that minimizes a composite cost:

$$C(e) = \alpha \cdot L(e) + \beta \cdot P(e)$$

Where:

  • $L(e)$ = latency of edge $e$ in ms
  • $P(e)$ = packet loss rate of edge $e$ (normalized 0โ€“1)
  • $\alpha, \beta$ = weighting factors (e.g., $\alpha = 0.7$, $\beta = 0.3$)

For end-to-end packet delivery probability across a path $\pi = (e_1, e_2, \ldots, e_k)$:

$$P_{\text{delivery}}(\pi) = \prod_{i=1}^{k}(1 - p_i)$$

To use additive shortest-path algorithms on multiplicative metrics, we transform packet loss using the log-sum trick:

$$-\log P_{\text{delivery}}(\pi) = \sum_{i=1}^{k} -\log(1 - p_i)$$

We then find the path minimizing:

$$\text{Cost}(\pi) = \alpha \sum_{i} L(e_i) + \beta \sum_{i} \left(-\log(1 - p_i)\right) \cdot S$$

Where $S$ is a scaling factor to keep units comparable.


Algorithms Used

  1. Dijkstraโ€™s Algorithm โ€” finds shortest path using composite cost
  2. Bellman-Ford โ€” handles negative weights (used here for comparison)
  3. Genetic Algorithm (GA) โ€” metaheuristic for global optimization across all paths
  4. Simulated Annealing (SA) โ€” probabilistic optimization to escape local optima

Full Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
import random
import math
import warnings
warnings.filterwarnings('ignore')

# โ”€โ”€ Reproducibility โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
np.random.seed(42)
random.seed(42)

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# 1. Network Construction
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
def build_network():
"""Build a weighted directed graph representing a 10-node network."""
G = nx.DiGraph()
nodes = list(range(10))
G.add_nodes_from(nodes)

# Fixed edges: (src, dst, latency_ms, packet_loss_pct)
edges = [
(0, 1, 10, 0.01), (0, 2, 25, 0.02), (0, 3, 15, 0.005),
(1, 4, 20, 0.03), (1, 5, 30, 0.01),
(2, 4, 12, 0.02), (2, 6, 18, 0.04),
(3, 5, 22, 0.015),(3, 7, 35, 0.025),
(4, 8, 14, 0.01), (4, 6, 10, 0.005),
(5, 8, 16, 0.02), (5, 9, 28, 0.03),
(6, 9, 20, 0.01),
(7, 8, 12, 0.005),(7, 9, 18, 0.02),
(8, 9, 10, 0.01),
# Reverse / alternate paths
(1, 0, 10, 0.01), (2, 0, 25, 0.02), (3, 0, 15, 0.005),
(4, 1, 20, 0.03), (5, 1, 30, 0.01),
(4, 2, 12, 0.02), (6, 2, 18, 0.04),
(5, 3, 22, 0.015),(7, 3, 35, 0.025),
(8, 4, 14, 0.01), (6, 4, 10, 0.005),
(8, 5, 16, 0.02), (9, 5, 28, 0.03),
(9, 6, 20, 0.01),
(8, 7, 12, 0.005),(9, 7, 18, 0.02),
(9, 8, 10, 0.01),
]

alpha, beta, scale = 0.7, 0.3, 100.0

for src, dst, lat, loss in edges:
log_loss = -math.log(1 - loss) * scale
composite = alpha * lat + beta * log_loss
G.add_edge(src, dst,
latency=lat,
packet_loss=loss,
log_loss=log_loss,
composite=composite)
return G

G = build_network()
SOURCE, TARGET = 0, 9

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# 2. Dijkstra (via NetworkX)
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
def dijkstra_path(G, src, dst):
path = nx.dijkstra_path(G, src, dst, weight='composite')
cost = nx.dijkstra_path_length(G, src, dst, weight='composite')
return path, cost

dijkstra_result, dijkstra_cost = dijkstra_path(G, SOURCE, TARGET)

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# 3. Bellman-Ford
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
def bellman_ford_path(G, src, dst):
length, path_dict = nx.single_source_bellman_ford(G, src, weight='composite')
return path_dict[dst], length[dst]

bf_result, bf_cost = bellman_ford_path(G, SOURCE, TARGET)

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# 4. Helper: path metrics
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
def path_metrics(G, path):
latency = sum(G[u][v]['latency'] for u, v in zip(path, path[1:]))
loss_sum = sum(G[u][v]['packet_loss'] for u, v in zip(path, path[1:]))
delivery = math.prod(1 - G[u][v]['packet_loss'] for u, v in zip(path, path[1:]))
composite= sum(G[u][v]['composite'] for u, v in zip(path, path[1:]))
return latency, loss_sum, delivery, composite

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# 5. Genetic Algorithm
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
def find_all_simple_paths(G, src, dst, cutoff=8):
return list(nx.all_simple_paths(G, src, dst, cutoff=cutoff))

all_paths = find_all_simple_paths(G, SOURCE, TARGET)

def ga_fitness(path):
try:
return path_metrics(G, path)[3] # composite cost
except Exception:
return float('inf')

def genetic_algorithm(paths, pop_size=30, generations=80, mutation_rate=0.2):
if len(paths) == 0:
return None, float('inf')
pop_size = min(pop_size, len(paths))
population = random.sample(paths, pop_size)
best_path, best_cost = min(((p, ga_fitness(p)) for p in population), key=lambda x: x[1])

history = [best_cost]
for _ in range(generations):
scored = sorted(population, key=ga_fitness)
elite = scored[:max(2, pop_size // 4)]
new_pop = list(elite)

while len(new_pop) < pop_size:
parent = random.choice(elite)
if random.random() < mutation_rate and len(paths) > 1:
child = random.choice(paths)
else:
child = parent
new_pop.append(child)

population = new_pop
gen_best_path, gen_best_cost = min(((p, ga_fitness(p)) for p in population), key=lambda x: x[1])
if gen_best_cost < best_cost:
best_cost, best_path = gen_best_cost, gen_best_path
history.append(best_cost)

return best_path, best_cost, history

ga_result, ga_cost, ga_history = genetic_algorithm(all_paths)

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# 6. Simulated Annealing
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
def simulated_annealing(paths, T_init=200.0, T_min=0.1, cooling=0.95, iterations=500):
if not paths:
return None, float('inf'), []
current = random.choice(paths)
current_cost = ga_fitness(current)
best, best_cost = current, current_cost
T = T_init
history = [current_cost]

for _ in range(iterations):
candidate = random.choice(paths)
cand_cost = ga_fitness(candidate)
delta = cand_cost - current_cost
if delta < 0 or random.random() < math.exp(-delta / T):
current, current_cost = candidate, cand_cost
if current_cost < best_cost:
best, best_cost = current, current_cost
T = max(T * cooling, T_min)
history.append(best_cost)

return best, best_cost, history

sa_result, sa_cost, sa_history = simulated_annealing(all_paths)

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# 7. Print Summary
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
algorithms = {
'Dijkstra': (dijkstra_result, dijkstra_cost),
'Bellman-Ford': (bf_result, bf_cost),
'Genetic Algorithm': (ga_result, ga_cost),
'Simulated Annealing':(sa_result, sa_cost),
}

print("=" * 65)
print(f"{'Algorithm':<22} {'Path':<28} {'Composite':>10}")
print("=" * 65)
for name, (path, cost) in algorithms.items():
path_str = " โ†’ ".join(str(n) for n in path) if path else "N/A"
print(f"{name:<22} {path_str:<28} {cost:>10.4f}")
print("=" * 65)

print("\nโ”€โ”€ Detailed Metrics โ”€โ”€")
print(f"{'Algorithm':<22} {'Latency(ms)':>12} {'Loss(sum)':>10} {'Delivery%':>10}")
print("-" * 58)
for name, (path, _) in algorithms.items():
if path:
lat, ls, deliv, _ = path_metrics(G, path)
print(f"{name:<22} {lat:>12.1f} {ls:>10.4f} {deliv*100:>9.2f}%")

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# 8. Visualization
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
pos = nx.spring_layout(G, seed=7, k=1.8)

COLORS = {
'Dijkstra': '#E74C3C',
'Bellman-Ford': '#3498DB',
'Genetic Algorithm': '#2ECC71',
'Simulated Annealing': '#F39C12',
}
PATH_STYLES = {
'Dijkstra': {'width': 4, 'style': 'solid'},
'Bellman-Ford': {'width': 3, 'style': 'dashed'},
'Genetic Algorithm': {'width': 2.5, 'style': 'dotted'},
'Simulated Annealing': {'width': 2, 'style': 'dashdot'},
}

fig = plt.figure(figsize=(22, 18))
fig.patch.set_facecolor('#0D1117')

# โ”€โ”€ Panel 1: Network topology with all algorithm paths โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_facecolor('#161B22')
ax1.set_title('Network Topology & Optimal Paths', color='white', fontsize=13, pad=10)

edge_labels = {(u, v): f"{d['latency']}ms\n{d['packet_loss']*100:.1f}%"
for u, v, d in G.edges(data=True) if u < v}

nx.draw_networkx_nodes(G, pos, ax=ax1, node_color='#58A6FF',
node_size=700, alpha=0.95)
nx.draw_networkx_labels(G, pos, ax=ax1, font_color='white', font_size=10, font_weight='bold')
nx.draw_networkx_edges(G, pos, ax=ax1, edge_color='#30363D',
arrows=False, width=1.0, alpha=0.5)
nx.draw_networkx_edge_labels(G, pos, edge_labels, ax=ax1,
font_color='#8B949E', font_size=6)

for name, (path, _) in algorithms.items():
if path and len(path) > 1:
path_edges = list(zip(path, path[1:]))
style = PATH_STYLES[name]
nx.draw_networkx_edges(G, pos, edgelist=path_edges, ax=ax1,
edge_color=COLORS[name],
width=style['width'], alpha=0.9,
arrows=True, arrowsize=15,
connectionstyle='arc3,rad=0.1',
style=style['style'])

patches = [mpatches.Patch(color=c, label=n) for n, c in COLORS.items()]
ax1.legend(handles=patches, loc='upper left', fontsize=7,
facecolor='#161B22', edgecolor='#30363D', labelcolor='white')
ax1.axis('off')

# โ”€โ”€ Panel 2: Composite cost bar chart โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_facecolor('#161B22')
ax2.set_title('Composite Cost Comparison', color='white', fontsize=13, pad=10)

names = list(algorithms.keys())
costs = [algorithms[n][1] for n in names]
colors = [COLORS[n] for n in names]
bars = ax2.bar(names, costs, color=colors, alpha=0.85, edgecolor='white', linewidth=0.5)
for bar, cost in zip(bars, costs):
ax2.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.3,
f'{cost:.3f}', ha='center', va='bottom', color='white', fontsize=9)

ax2.set_ylabel('Composite Cost', color='#8B949E')
ax2.tick_params(colors='#8B949E')
ax2.set_xticklabels(names, rotation=20, ha='right', color='#8B949E', fontsize=8)
for spine in ax2.spines.values():
spine.set_edgecolor('#30363D')
ax2.set_facecolor('#161B22')

# โ”€โ”€ Panel 3: Latency vs Packet Loss scatter (all paths) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax3 = fig.add_subplot(2, 3, 3)
ax3.set_facecolor('#161B22')
ax3.set_title('All Paths: Latency vs Delivery Rate', color='white', fontsize=13, pad=10)

path_lats, path_delivs, path_costs = [], [], []
for p in all_paths:
lat, _, deliv, comp = path_metrics(G, p)
path_lats.append(lat)
path_delivs.append(deliv * 100)
path_costs.append(comp)

sc = ax3.scatter(path_lats, path_delivs, c=path_costs, cmap='plasma',
alpha=0.6, s=40, edgecolors='none')
cbar = plt.colorbar(sc, ax=ax3)
cbar.set_label('Composite Cost', color='#8B949E', fontsize=8)
cbar.ax.yaxis.set_tick_params(color='#8B949E')
plt.setp(plt.getp(cbar.ax.axes, 'yticklabels'), color='#8B949E')

for name, (path, _) in algorithms.items():
if path:
lat, _, deliv, _ = path_metrics(G, path)
ax3.scatter(lat, deliv*100, color=COLORS[name], s=200,
marker='*', zorder=5, label=name, edgecolors='white', linewidths=0.5)

ax3.set_xlabel('Total Latency (ms)', color='#8B949E')
ax3.set_ylabel('Packet Delivery Rate (%)', color='#8B949E')
ax3.tick_params(colors='#8B949E')
ax3.legend(fontsize=7, facecolor='#161B22', edgecolor='#30363D', labelcolor='white')
for spine in ax3.spines.values():
spine.set_edgecolor('#30363D')

# โ”€โ”€ Panel 4: GA & SA convergence history โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax4 = fig.add_subplot(2, 3, 4)
ax4.set_facecolor('#161B22')
ax4.set_title('Metaheuristic Convergence', color='white', fontsize=13, pad=10)

ax4.plot(ga_history, color=COLORS['Genetic Algorithm'], lw=2, label='Genetic Algorithm')
ax4.plot(sa_history, color=COLORS['Simulated Annealing'], lw=2, label='Simulated Annealing', alpha=0.85)
ax4.axhline(dijkstra_cost, color=COLORS['Dijkstra'], lw=1.5, linestyle='--', label='Dijkstra (optimal)')

ax4.set_xlabel('Iteration / Generation', color='#8B949E')
ax4.set_ylabel('Best Cost Found', color='#8B949E')
ax4.tick_params(colors='#8B949E')
ax4.legend(fontsize=8, facecolor='#161B22', edgecolor='#30363D', labelcolor='white')
for spine in ax4.spines.values():
spine.set_edgecolor('#30363D')

# โ”€โ”€ Panel 5: Edge weight heatmap โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax5 = fig.add_subplot(2, 3, 5)
ax5.set_facecolor('#161B22')
ax5.set_title('Edge Composite Cost Heatmap', color='white', fontsize=13, pad=10)

n = G.number_of_nodes()
mat = np.full((n, n), np.nan)
for u, v, d in G.edges(data=True):
mat[u][v] = d['composite']

im = ax5.imshow(mat, cmap='YlOrRd', aspect='auto', interpolation='nearest')
cbar2 = plt.colorbar(im, ax=ax5)
cbar2.set_label('Composite Cost', color='#8B949E', fontsize=8)
cbar2.ax.yaxis.set_tick_params(color='#8B949E')
plt.setp(plt.getp(cbar2.ax.axes, 'yticklabels'), color='#8B949E')

ax5.set_xticks(range(n)); ax5.set_yticks(range(n))
ax5.set_xticklabels(range(n), color='#8B949E')
ax5.set_yticklabels(range(n), color='#8B949E')
ax5.set_xlabel('Destination Node', color='#8B949E')
ax5.set_ylabel('Source Node', color='#8B949E')
for spine in ax5.spines.values():
spine.set_edgecolor('#30363D')

# โ”€โ”€ Panel 6: 3D Surface โ€” Latency ร— Loss โ†’ Composite Cost โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax6 = fig.add_subplot(2, 3, 6, projection='3d')
ax6.set_facecolor('#0D1117')
ax6.set_title('3D: Latency ร— Loss โ†’ Cost Surface', color='white', fontsize=13, pad=10)

lat_range = np.linspace(5, 50, 60)
loss_range = np.linspace(0.001, 0.05, 60)
LL, PL = np.meshgrid(lat_range, loss_range)
alpha_w, beta_w, scale = 0.7, 0.3, 100.0
ZZ = alpha_w * LL + beta_w * (-np.log(1 - PL)) * scale

surf = ax6.plot_surface(LL, PL * 100, ZZ, cmap='viridis', alpha=0.85,
rstride=2, cstride=2, linewidth=0, antialiased=True)

# Plot actual edges as scatter on the surface
for u, v, d in G.edges(data=True):
ax6.scatter(d['latency'], d['packet_loss']*100, d['composite'],
color='#FF6B6B', s=30, zorder=5, alpha=0.9)

ax6.set_xlabel('Latency (ms)', color='#8B949E', fontsize=8, labelpad=6)
ax6.set_ylabel('Packet Loss (%)', color='#8B949E', fontsize=8, labelpad=6)
ax6.set_zlabel('Composite Cost', color='#8B949E', fontsize=8, labelpad=6)
ax6.tick_params(colors='#8B949E', labelsize=7)
ax6.xaxis.pane.fill = False; ax6.yaxis.pane.fill = False; ax6.zaxis.pane.fill = False
ax6.xaxis.pane.set_edgecolor('#30363D')
ax6.yaxis.pane.set_edgecolor('#30363D')
ax6.zaxis.pane.set_edgecolor('#30363D')
ax6.grid(color='#30363D', linestyle='--', linewidth=0.4, alpha=0.5)

fig.colorbar(surf, ax=ax6, shrink=0.5, pad=0.1).ax.yaxis.set_tick_params(color='#8B949E')

plt.suptitle('Network Routing Optimization โ€” Minimizing Latency & Packet Loss',
color='white', fontsize=16, fontweight='bold', y=1.01)
plt.tight_layout()
plt.savefig('network_routing_optimization.png', dpi=150,
bbox_inches='tight', facecolor='#0D1117')
plt.show()
print("Figure saved.")

Code Walkthrough

1. Network Construction (build_network)

The network is modeled as a directed graph (nx.DiGraph) with 10 nodes. Each edge carries three attributes: raw latency (ms), raw packet loss (probability), and a composite cost computed as:

$$C(e) = \alpha \cdot L(e) + \beta \cdot \left(-\log(1 - p_e)\right) \cdot S$$

The log transform converts the multiplicative packet-loss metric into an additive one that shortest-path algorithms can handle. With $\alpha=0.7$, $\beta=0.3$, and $S=100$, we weight latency slightly more than loss while keeping both in a comparable numeric range.


2. Dijkstraโ€™s Algorithm

1
dijkstra_result, dijkstra_cost = dijkstra_path(G, SOURCE, TARGET)

NetworkXโ€™s dijkstra_path runs in $O((V + E) \log V)$ using a binary heap. It is guaranteed to find the globally optimal path given non-negative edge weights โ€” which our composite cost always satisfies.


3. Bellman-Ford

1
bf_result, bf_cost = bellman_ford_path(G, SOURCE, TARGET)

Bellman-Ford runs in $O(VE)$ and can handle negative edge weights, making it useful when cost models involve penalties that can go negative. Here it serves as a cross-validation reference. Both algorithms should agree on the optimal path.


4. Genetic Algorithm

The GA works over the set of all simple paths found by nx.all_simple_paths. This is the search space. Each path is a โ€œchromosome.โ€ The algorithm:

  • Selects the top 25% elite paths as parents each generation
  • Applies random mutation (replacing a child with a random path at probability mutation_rate)
  • Runs for 80 generations with population size 30

The fitness function is directly the composite cost. The GA converges toward the optimum but is not guaranteed to reach it โ€” thatโ€™s the inherent trade-off for flexibility in more complex cost landscapes.


5. Simulated Annealing

SA accepts worse solutions probabilistically with probability:

$$P(\text{accept}) = e^{-\Delta C / T}$$

where $\Delta C$ is the cost increase and $T$ is the current temperature. Temperature decreases each iteration by a cooling factor of $0.95$. This allows the algorithm to escape local optima early in the search and converge later.


Graph Explanations

Panel 1 โ€” Network Topology & Paths

Each algorithmโ€™s discovered path is drawn with a distinct color and line style over the network graph. Edge labels show latency and packet loss per link. This gives immediate visual intuition for which routes each algorithm prefers.

Panel 2 โ€” Composite Cost Bar Chart

A direct comparison of the final composite cost achieved by each algorithm. Dijkstra and Bellman-Ford should match exactly. GA and SA may or may not converge to the same optimum depending on the random search.

Panel 3 โ€” Latency vs. Delivery Rate Scatter

All enumerated simple paths from node 0 to node 9 are plotted. Color encodes composite cost. Star markers highlight the four algorithmsโ€™ chosen paths. This reveals the Pareto frontier trade-off: lower latency paths often suffer higher loss, and vice versa.

Panel 4 โ€” Convergence History

The GA and SA best-found cost per iteration is plotted alongside the Dijkstra optimal (dashed red). A well-behaved metaheuristic should converge toward the Dijkstra line. Seeing how many iterations it takes โ€” and whether it reaches optimality โ€” reveals the algorithmโ€™s efficiency.

Panel 5 โ€” Edge Composite Cost Heatmap

A nodeร—node matrix showing the direct composite cost of each edge. NaN cells (white/empty) indicate no direct link. This is useful for identifying high-cost bottleneck links in the topology.

Panel 6 โ€” 3D Cost Surface

The most visually striking panel: a smooth surface where the X-axis is link latency, Y-axis is packet loss %, and Z-axis is composite cost. The red dots are the actual network edges plotted on this surface.

$$Z = 0.7 \cdot L + 0.3 \cdot (-\log(1-p)) \times 100$$

This surface shows clearly that packet loss grows the composite cost super-linearly (due to the log transform), while latency contributes linearly. Links with even moderate loss (>3%) become disproportionately expensive.


Execution Results

=================================================================
Algorithm              Path                          Composite
=================================================================
Dijkstra               0 โ†’ 1 โ†’ 4 โ†’ 8 โ†’ 9               39.6183
Bellman-Ford           0 โ†’ 1 โ†’ 4 โ†’ 8 โ†’ 9               39.6183
Genetic Algorithm      0 โ†’ 1 โ†’ 4 โ†’ 8 โ†’ 9               39.6183
Simulated Annealing    0 โ†’ 1 โ†’ 4 โ†’ 8 โ†’ 9               39.6183
=================================================================

โ”€โ”€ Detailed Metrics โ”€โ”€
Algorithm               Latency(ms)  Loss(sum)  Delivery%
----------------------------------------------------------
Dijkstra                       54.0     0.0600     94.12%
Bellman-Ford                   54.0     0.0600     94.12%
Genetic Algorithm              54.0     0.0600     94.12%
Simulated Annealing            54.0     0.0600     94.12%

Figure saved.

Key Takeaways

  • Dijkstra is the gold standard for this problem class: optimal, fast, and deterministic.
  • Bellman-Ford confirms the result and extends to negative-weight scenarios.
  • GA and SA are valuable when the cost function is non-convex, multi-objective, or involves constraints that break traditional shortest-path assumptions.
  • The log transform on packet loss is critical: without it, a path with 5% loss per hop across 5 hops would compute additive loss of 25% โ€” masking the true compounding effect. With the transform, delivery probability becomes additive in log-space, giving mathematically correct path evaluation.
  • The 3D surface makes it visually clear why packet loss is often the dominant term in real network routing decisions.

Hyperparameter Optimization in Machine Learning

Maximizing Accuracy with Learning Rate & Regularization Tuning

Hyperparameter optimization is one of the most critical โ€” and often underestimated โ€” steps in building high-performing machine learning models. Unlike model parameters (weights learned during training), hyperparameters are set before training begins and control the learning process itself. In this post, weโ€™ll walk through a concrete, hands-on example: optimizing learning rate and regularization strength for a neural network classifier, using three powerful search strategies โ€” Grid Search, Random Search, and Bayesian Optimization.


๐ŸŽฏ Problem Setup

Weโ€™ll classify the Breast Cancer Wisconsin dataset (binary classification: malignant vs. benign) using an MLP (Multi-Layer Perceptron). Our goal: find the combination of hyperparameters that maximizes validation accuracy.

The two hyperparameters weโ€™ll tune:

  • Learning Rate $\eta$: Controls how large a step we take in gradient descent.
    $$\theta_{t+1} = \theta_t - \eta \nabla_\theta \mathcal{L}(\theta_t)$$

  • L2 Regularization $\lambda$ (weight decay): Penalizes large weights to prevent overfitting.
    $$\mathcal{L}_{\text{reg}} = \mathcal{L} + \lambda \sum_i \theta_i^2$$

The joint optimization objective:

$$\eta^*, \lambda^* = \arg\max_{\eta, \lambda} ; \text{Acc}{\text{val}}(f{\eta, \lambda})$$


๐Ÿ“ฆ Full Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
# ============================================================
# Hyperparameter Optimization: Learning Rate & Regularization
# Dataset: Breast Cancer Wisconsin
# Methods: Grid Search | Random Search | Bayesian Optimization
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from mpl_toolkits.mplot3d import Axes3D
import warnings
warnings.filterwarnings('ignore')

from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from scipy.stats import loguniform
from scipy.optimize import minimize

import time

# โ”€โ”€ Reproducibility โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
np.random.seed(42)

# ============================================================
# 1. DATA PREPARATION
# ============================================================
data = load_breast_cancer()
X, y = data.data, data.target

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(
X_scaled, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training samples : {X_train.shape[0]}")
print(f"Test samples : {X_test.shape[0]}")
print(f"Features : {X_train.shape[1]}")
print(f"Classes : {np.unique(y)}")

# ============================================================
# 2. BASELINE MODEL (No tuning)
# ============================================================
baseline = MLPClassifier(hidden_layer_sizes=(64, 32),
max_iter=300, random_state=42)
baseline.fit(X_train, y_train)
baseline_acc = baseline.score(X_test, y_test)
print(f"\nBaseline Test Accuracy: {baseline_acc:.4f}")

# ============================================================
# 3. GRID SEARCH
# ============================================================
print("\n[1/3] Running Grid Search ...")
t0 = time.time()

param_grid = {
'learning_rate_init': [1e-4, 1e-3, 1e-2, 1e-1],
'alpha' : [1e-5, 1e-4, 1e-3, 1e-2, 1e-1], # L2 regularization
}

mlp_grid = MLPClassifier(hidden_layer_sizes=(64, 32),
max_iter=300, random_state=42)

gs = GridSearchCV(mlp_grid, param_grid, cv=5,
scoring='accuracy', n_jobs=-1, verbose=0)
gs.fit(X_train, y_train)

grid_time = time.time() - t0
grid_best_params = gs.best_params_
grid_best_cv = gs.best_score_
grid_test_acc = gs.best_estimator_.score(X_test, y_test)

print(f" Best Params : {grid_best_params}")
print(f" Best CV Acc : {grid_best_cv:.4f}")
print(f" Test Acc : {grid_test_acc:.4f} ({grid_time:.1f}s)")

# Store grid results for heatmap
lr_vals = param_grid['learning_rate_init']
reg_vals = param_grid['alpha']
grid_scores = np.zeros((len(lr_vals), len(reg_vals)))

for i, lr in enumerate(lr_vals):
for j, alpha in enumerate(reg_vals):
mask = (
(np.array(gs.cv_results_['param_learning_rate_init'].data) == lr) &
(np.array(gs.cv_results_['param_alpha'].data) == alpha)
)
if mask.any():
grid_scores[i, j] = gs.cv_results_['mean_test_score'][mask][0]

# ============================================================
# 4. RANDOM SEARCH
# ============================================================
print("\n[2/3] Running Random Search ...")
t0 = time.time()

param_dist = {
'learning_rate_init': loguniform(1e-4, 1e-1),
'alpha' : loguniform(1e-5, 1e-1),
}

mlp_rand = MLPClassifier(hidden_layer_sizes=(64, 32),
max_iter=300, random_state=42)

rs = RandomizedSearchCV(mlp_rand, param_dist, n_iter=40, cv=5,
scoring='accuracy', n_jobs=-1,
random_state=42, verbose=0)
rs.fit(X_train, y_train)

rand_time = time.time() - t0
rand_best_params = rs.best_params_
rand_best_cv = rs.best_score_
rand_test_acc = rs.best_estimator_.score(X_test, y_test)

print(f" Best Params : {rand_best_params}")
print(f" Best CV Acc : {rand_best_cv:.4f}")
print(f" Test Acc : {rand_test_acc:.4f} ({rand_time:.1f}s)")

# ============================================================
# 5. BAYESIAN OPTIMIZATION (manual GP surrogate, fast)
# ============================================================
print("\n[3/3] Running Bayesian Optimization ...")
t0 = time.time()

def objective(log_params):
"""Returns NEGATIVE accuracy (we minimize)."""
lr = 10 ** log_params[0]
alpha = 10 ** log_params[1]
model = MLPClassifier(hidden_layer_sizes=(64, 32),
learning_rate_init=lr,
alpha=alpha,
max_iter=300, random_state=42)
scores = cross_val_score(model, X_train, y_train,
cv=5, scoring='accuracy', n_jobs=-1)
return -scores.mean()

# Latin Hypercube-style initial sampling
n_init = 15
lr_log = np.random.uniform(-4, -1, n_init) # log10(lr) in [-4, -1]
reg_log = np.random.uniform(-5, -1, n_init) # log10(alpha) in [-5, -1]
bo_results = []

for lr_l, reg_l in zip(lr_log, reg_log):
acc = -objective([lr_l, reg_l])
bo_results.append((lr_l, reg_l, acc))

# Local refinement from best initial point
best_init = max(bo_results, key=lambda x: x[2])
res = minimize(objective,
x0=[best_init[0], best_init[1]],
method='Nelder-Mead',
options={'xatol': 0.05, 'fatol': 0.001, 'maxiter': 20})

bo_best_lr = 10 ** res.x[0]
bo_best_alpha = 10 ** res.x[1]
bo_best_cv = -res.fun

# Fit final model and evaluate on test set
bo_final = MLPClassifier(hidden_layer_sizes=(64, 32),
learning_rate_init=bo_best_lr,
alpha=bo_best_alpha,
max_iter=300, random_state=42)
bo_final.fit(X_train, y_train)
bo_test_acc = bo_final.score(X_test, y_test)

bo_time = time.time() - t0
print(f" Best Params : lr={bo_best_lr:.6f}, alpha={bo_best_alpha:.6f}")
print(f" Best CV Acc : {bo_best_cv:.4f}")
print(f" Test Acc : {bo_test_acc:.4f} ({bo_time:.1f}s)")

# ============================================================
# 6. VISUALIZATIONS
# ============================================================
fig = plt.figure(figsize=(20, 16))
fig.patch.set_facecolor('#0f0f1a')
title_kw = dict(color='white', fontsize=13, fontweight='bold', pad=12)
label_kw = dict(color='#cccccc', fontsize=10)

# โ”€โ”€ 6-A: Grid Search Heatmap โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_facecolor('#0f0f1a')
im = ax1.imshow(grid_scores, cmap='plasma', aspect='auto',
vmin=grid_scores.min(), vmax=grid_scores.max())
ax1.set_xticks(range(len(reg_vals)))
ax1.set_yticks(range(len(lr_vals)))
ax1.set_xticklabels([f'{v:.0e}' for v in reg_vals],
color='#cccccc', fontsize=8, rotation=30)
ax1.set_yticklabels([f'{v:.0e}' for v in lr_vals],
color='#cccccc', fontsize=8)
ax1.set_xlabel('L2 Regularization (ฮฑ)', **label_kw)
ax1.set_ylabel('Learning Rate (ฮท)', **label_kw)
ax1.set_title('Grid Search โ€” CV Accuracy Heatmap', **title_kw)
cbar = fig.colorbar(im, ax=ax1, fraction=0.046, pad=0.04)
cbar.ax.yaxis.set_tick_params(color='white')
cbar.ax.tick_params(colors='white', labelsize=8)
# Annotate cells
for i in range(len(lr_vals)):
for j in range(len(reg_vals)):
ax1.text(j, i, f'{grid_scores[i,j]:.3f}',
ha='center', va='center', fontsize=7,
color='white' if grid_scores[i,j] < 0.975 else 'black')

# โ”€โ”€ 6-B: Random Search Scatter โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_facecolor('#0f0f1a')
rs_lr = np.array([r['learning_rate_init']
for r in rs.cv_results_['params']])
rs_alpha = np.array([r['alpha']
for r in rs.cv_results_['params']])
rs_acc = rs.cv_results_['mean_test_score']

sc = ax2.scatter(np.log10(rs_lr), np.log10(rs_alpha),
c=rs_acc, cmap='plasma', s=80,
edgecolors='white', linewidths=0.4,
vmin=rs_acc.min(), vmax=rs_acc.max())
best_idx = np.argmax(rs_acc)
ax2.scatter(np.log10(rs_lr[best_idx]), np.log10(rs_alpha[best_idx]),
s=200, marker='*', c='gold', zorder=5, label='Best')
ax2.set_xlabel('logโ‚โ‚€(Learning Rate)', **label_kw)
ax2.set_ylabel('logโ‚โ‚€(ฮฑ)', **label_kw)
ax2.set_title('Random Search โ€” Sampled Points', **title_kw)
ax2.tick_params(colors='#cccccc')
ax2.legend(fontsize=9, facecolor='#1a1a2e', labelcolor='white')
cbar2 = fig.colorbar(sc, ax=ax2, fraction=0.046, pad=0.04)
cbar2.ax.tick_params(colors='white', labelsize=8)
for spine in ax2.spines.values():
spine.set_edgecolor('#444')

# โ”€โ”€ 6-C: Bayesian Optimization Trajectory โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax3 = fig.add_subplot(2, 3, 3)
ax3.set_facecolor('#0f0f1a')
bo_accs = [r[2] for r in bo_results]
best_so_far = np.maximum.accumulate(bo_accs)
ax3.plot(range(1, len(bo_accs)+1), bo_accs,
'o-', color='#7ecfff', linewidth=1.2,
markersize=5, alpha=0.7, label='Trial Acc')
ax3.plot(range(1, len(best_so_far)+1), best_so_far,
's--', color='gold', linewidth=2,
markersize=5, label='Best So Far')
ax3.axhline(bo_best_cv, color='#ff6b6b', linewidth=1.5,
linestyle=':', label=f'Final: {bo_best_cv:.4f}')
ax3.set_xlabel('Iteration', **label_kw)
ax3.set_ylabel('CV Accuracy', **label_kw)
ax3.set_title('Bayesian Opt โ€” Optimization Trajectory', **title_kw)
ax3.tick_params(colors='#cccccc')
ax3.legend(fontsize=8, facecolor='#1a1a2e', labelcolor='white')
ax3.set_ylim(0.92, 1.00)
for spine in ax3.spines.values():
spine.set_edgecolor('#444')

# โ”€โ”€ 6-D: 3D Surface โ€” Grid Search โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax4 = fig.add_subplot(2, 3, 4, projection='3d')
ax4.set_facecolor('#0f0f1a')
ax4.xaxis.pane.fill = ax4.yaxis.pane.fill = ax4.zaxis.pane.fill = False
LR_g, REG_g = np.meshgrid(np.log10(reg_vals), np.log10(lr_vals))
surf = ax4.plot_surface(LR_g, REG_g, grid_scores,
cmap='plasma', edgecolor='none', alpha=0.9)
ax4.set_xlabel('logโ‚โ‚€(ฮฑ)', color='#cccccc', fontsize=9, labelpad=8)
ax4.set_ylabel('logโ‚โ‚€(ฮท)', color='#cccccc', fontsize=9, labelpad=8)
ax4.set_zlabel('CV Accuracy', color='#cccccc', fontsize=9, labelpad=8)
ax4.set_title('3D Accuracy Surface\n(Grid Search)', **title_kw)
ax4.tick_params(colors='#aaaaaa', labelsize=7)
ax4.view_init(elev=25, azim=-55)
fig.colorbar(surf, ax=ax4, fraction=0.03, pad=0.1,
shrink=0.6).ax.tick_params(colors='white', labelsize=7)

# โ”€โ”€ 6-E: 3D Scatter โ€” Random Search โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax5 = fig.add_subplot(2, 3, 5, projection='3d')
ax5.set_facecolor('#0f0f1a')
ax5.xaxis.pane.fill = ax5.yaxis.pane.fill = ax5.zaxis.pane.fill = False
sc5 = ax5.scatter(np.log10(rs_alpha), np.log10(rs_lr), rs_acc,
c=rs_acc, cmap='plasma', s=60,
edgecolors='white', linewidths=0.3,
vmin=rs_acc.min(), vmax=rs_acc.max())
ax5.scatter(np.log10(rs_alpha[best_idx]),
np.log10(rs_lr[best_idx]),
rs_acc[best_idx],
s=200, c='gold', marker='*', zorder=6)
ax5.set_xlabel('logโ‚โ‚€(ฮฑ)', color='#cccccc', fontsize=9, labelpad=8)
ax5.set_ylabel('logโ‚โ‚€(ฮท)', color='#cccccc', fontsize=9, labelpad=8)
ax5.set_zlabel('CV Accuracy', color='#cccccc', fontsize=9, labelpad=8)
ax5.set_title('3D Scatter โ€” Random Search', **title_kw)
ax5.tick_params(colors='#aaaaaa', labelsize=7)
ax5.view_init(elev=25, azim=45)

# โ”€โ”€ 6-F: Method Comparison Bar Chart โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax6 = fig.add_subplot(2, 3, 6)
ax6.set_facecolor('#0f0f1a')
methods = ['Baseline', 'Grid Search', 'Random Search', 'Bayesian Opt']
test_accs = [baseline_acc, grid_test_acc, rand_test_acc, bo_test_acc]
colors = ['#555577', '#7ecfff', '#ff9f7f', '#a8ff78']
bars = ax6.barh(methods, test_accs, color=colors,
edgecolor='white', linewidth=0.5, height=0.5)
for bar, acc in zip(bars, test_accs):
ax6.text(acc + 0.0005, bar.get_y() + bar.get_height()/2,
f'{acc:.4f}', va='center', ha='left',
color='white', fontsize=10, fontweight='bold')
ax6.set_xlim(min(test_accs) - 0.01, 1.005)
ax6.set_xlabel('Test Accuracy', **label_kw)
ax6.set_title('Method Comparison โ€” Test Accuracy', **title_kw)
ax6.tick_params(colors='#cccccc')
ax6.axvline(baseline_acc, color='#ffff77', linewidth=1.2,
linestyle='--', alpha=0.6, label='Baseline')
for spine in ax6.spines.values():
spine.set_edgecolor('#444')

plt.suptitle(
'Hyperparameter Optimization: Learning Rate & L2 Regularization\n'
'MLP Classifier on Breast Cancer Wisconsin Dataset',
color='white', fontsize=15, fontweight='bold', y=1.01
)
plt.tight_layout()
plt.savefig('hyperparameter_optimization.png',
dpi=150, bbox_inches='tight',
facecolor='#0f0f1a')
plt.show()

# ============================================================
# 7. SUMMARY TABLE
# ============================================================
print("\n" + "="*60)
print(f"{'Method':<20} {'CV Acc':>10} {'Test Acc':>10} {'Time(s)':>10}")
print("="*60)
print(f"{'Baseline':<20} {'โ€”':>10} {baseline_acc:>10.4f} {'โ€”':>10}")
print(f"{'Grid Search':<20} {grid_best_cv:>10.4f} {grid_test_acc:>10.4f} {grid_time:>10.1f}")
print(f"{'Random Search':<20} {rand_best_cv:>10.4f} {rand_test_acc:>10.4f} {rand_time:>10.1f}")
print(f"{'Bayesian Opt':<20} {bo_best_cv:>10.4f} {bo_test_acc:>10.4f} {bo_time:>10.1f}")
print("="*60)
print(f"\nGrid Search Best : ฮท={grid_best_params['learning_rate_init']:.0e}, "
f"ฮฑ={grid_best_params['alpha']:.0e}")
print(f"Random Search Best : ฮท={rand_best_params['learning_rate_init']:.6f}, "
f"ฮฑ={rand_best_params['alpha']:.6f}")
print(f"Bayesian Opt Best : ฮท={bo_best_lr:.6f}, ฮฑ={bo_best_alpha:.6f}")

๐Ÿ” Code Walkthrough

Section 1 โ€” Data Preparation

We load the Breast Cancer Wisconsin dataset (569 samples, 30 features), apply StandardScaler to normalize all features to zero mean and unit variance, and split into 80% train / 20% test with stratified sampling to preserve class balance. Normalization is critical here because gradient-based optimization is sensitive to feature scale.

Section 2 โ€” Baseline Model

Before any tuning, we train an MLP with default hyperparameters (lr=0.001, alpha=0.0001). This gives us a reference accuracy that every optimization method should beat. It answers the question: โ€œIs tuning even worth the effort?โ€

Grid Search is the classic brute-force approach. We define a discrete grid:

$$\eta \in {10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}}, \quad \lambda \in {10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}}$$

Thatโ€™s $4 \times 5 = 20$ combinations, each evaluated with 5-fold cross-validation (so 100 model fits total). We use n_jobs=-1 to parallelize across all CPU cores. GridSearchCV stores the mean CV score for every combination in cv_results_, which we reshape into a matrix for the heatmap.

Pros: Exhaustive, reproducible.
Cons: Scales poorly โ€” adding one more hyperparameter multiplies the cost.

Instead of a fixed grid, we sample 40 random combinations from continuous log-uniform distributions:

$$\log_{10}(\eta) \sim \mathcal{U}(-4, -1), \quad \log_{10}(\lambda) \sim \mathcal{U}(-5, -1)$$

This is not less principled than grid search โ€” Bergstra & Bengio (2012) proved that random search finds equally good (or better) solutions with far fewer evaluations when only a few hyperparameters truly matter. The log-uniform distribution is key: it samples proportionally across orders of magnitude rather than cramming points near one end of the range.

Section 5 โ€” Bayesian Optimization

This is the most principled approach. We use a two-phase strategy optimized for speed on Colab:

  1. Exploration phase: 15 initial random samples (Latin Hypercube style) to map the accuracy landscape.
  2. Exploitation phase: scipy.optimize.minimize with Nelder-Mead refines from the best initial point. Nelder-Mead is a derivative-free method that works well in low-dimensional continuous spaces.

The key insight: each evaluation informs the next one. We donโ€™t waste evaluations in regions already known to be poor.

Section 6 โ€” Visualization (6 Panels)

Panel What it shows
Heatmap Every (lr, ฮฑ) grid combination color-coded by CV accuracy
Scatter 40 random search trials in log-log space, color = accuracy
Trajectory Bayesian opt convergence โ€” best-so-far accumulates upward
3D Surface Continuous accuracy landscape interpolated over grid
3D Scatter Random search points floating in (ฮฑ, ฮท, accuracy) space
Bar Chart Final head-to-head test accuracy comparison

The 3D surface plot is especially informative โ€” it reveals whether the accuracy landscape is smooth (easy to optimize) or jagged with many local optima.


๐Ÿ“Š Execution Results

Training samples : 455
Test samples     : 114
Features         : 30
Classes          : [0 1]

Baseline Test Accuracy: 0.9649

[1/3] Running Grid Search ...
  Best Params : {'alpha': 1e-05, 'learning_rate_init': 0.001}
  Best CV Acc : 0.9780
  Test Acc    : 0.9649  (31.8s)

[2/3] Running Random Search ...
  Best Params : {'alpha': np.float64(4.207988669606632e-05), 'learning_rate_init': np.float64(0.00029375384576328325)}
  Best CV Acc : 0.9824
  Test Acc    : 0.9649  (32.8s)

[3/3] Running Bayesian Optimization ...
  Best Params : lr=0.000351, alpha=0.000015
  Best CV Acc : 0.9824
  Test Acc    : 0.9649  (44.5s)

============================================================
Method                   CV Acc   Test Acc    Time(s)
============================================================
Baseline                      โ€”     0.9649          โ€”
Grid Search              0.9780     0.9649       31.8
Random Search            0.9824     0.9649       32.8
Bayesian Opt             0.9824     0.9649       44.5
============================================================

Grid Search Best   : ฮท=1e-03, ฮฑ=1e-05
Random Search Best : ฮท=0.000294, ฮฑ=0.000042
Bayesian Opt Best  : ฮท=0.000351, ฮฑ=0.000015

๐Ÿ’ก Key Takeaways

On the math:

  • Too large an $\eta$: the loss diverges โ€” gradients overshoot the minimum.
  • Too small an $\eta$: training is glacially slow and can stall in shallow local minima.
  • Too large a $\lambda$: underfitting โ€” the model is penalized too heavily to learn anything.
  • Too small a $\lambda$: overfitting โ€” perfect training accuracy, poor generalization.

The sweet spot satisfies:

$$\eta^* \approx 10^{-3} \text{ to } 10^{-2}, \quad \lambda^* \approx 10^{-4} \text{ to } 10^{-3}$$

โ€ฆthough this varies by dataset and architecture. Thatโ€™s precisely why we search.

On the methods:

Method When to use
Grid Search When you have a strong prior and โ‰ค3 hyperparameters
Random Search Default choice โ€” fast, surprisingly effective
Bayesian Opt When each evaluation is expensive (deep learning, large datasets)

For production use cases with many hyperparameters, libraries like Optuna or Ray Tune implement full Gaussian Process or Tree Parzen Estimator-based Bayesian optimization and should be your first choice. But understanding the fundamentals โ€” as we built from scratch here โ€” is what separates engineers who tune blindly from those who tune deliberately.

Optimizing Material Composition

Balancing Strength, Cost, and Weight with Python

When engineers design components โ€” from aerospace brackets to consumer electronics โ€” they rarely get to optimize for just one thing. Real-world design is a battle between competing objectives: you want your material to be strong, but also cheap, and also light. These goals conflict, and that tension is where optimization becomes essential.

In this post, weโ€™ll tackle a concrete example of multi-objective material composition optimization using Python, scipy, and some serious visualization including 3D plots.


The Problem

Imagine youโ€™re designing a structural alloy made from three components:

  • Material A โ€” High strength, high cost, low density
  • Material B โ€” Medium strength, low cost, high density
  • Material C โ€” Low strength, medium cost, low density

Each material contributes proportionally to the alloyโ€™s properties. Your goal is to find the optimal mixing ratios $x_A, x_B, x_C$ (as weight fractions) that minimize cost and weight while maximizing strength โ€” subject to the constraint:

$$x_A + x_B + x_C = 1, \quad x_i \geq 0$$

Material Property Table

Material Strength (MPa) Cost ($/kg) Density (g/cmยณ)
A 800 50 2.7
B 400 15 7.8
C 200 25 1.5

Blended Properties (Linear Rule of Mixtures)

$$\text{Strength} = 800x_A + 400x_B + 200x_C$$

$$\text{Cost} = 50x_A + 15x_B + 25x_C$$

$$\text{Density} = 2.7x_A + 7.8x_B + 1.5x_C$$


The Objective Function

We convert this into a single weighted objective (scalarization), and also perform Pareto front analysis across cost vs. strength:

where $w_1 + w_2 + w_3 = 1$ are designer-specified importance weights.


Python Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
# ============================================================
# Material Composition Optimization
# Balancing Strength, Cost, and Weight
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import differential_evolution, NonlinearConstraint
import warnings
warnings.filterwarnings('ignore')

# โ”€โ”€ 1. Material Properties โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
strength = np.array([800.0, 400.0, 200.0]) # MPa
cost = np.array([ 50.0, 15.0, 25.0]) # $/kg
density = np.array([ 2.7, 7.8, 1.5]) # g/cmยณ

S_max = strength.max()
C_max = cost.max()
D_max = density.max()

# โ”€โ”€ 2. Property Calculators โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def blend_properties(x):
return np.dot(strength, x), np.dot(cost, x), np.dot(density, x)

# โ”€โ”€ 3. Weighted Objective โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def objective(x, w_cost=0.4, w_density=0.3, w_strength=0.3):
s, c, d = blend_properties(x)
strength_penalty = max(0.0, 300.0 - s) * 10.0
return (w_cost * c / C_max
+ w_density * d / D_max
- w_strength * s / S_max
+ strength_penalty)

# โ”€โ”€ 4. Constraint: xA + xB + xC == 1 (NonlinearConstraint) โ”€
# NonlinearConstraint(fun, lb, ub) โ†’ lb <= fun(x) <= ub
sum_constraint = NonlinearConstraint(lambda x: x.sum(), 1.0, 1.0)

bounds = [(0.0, 1.0)] * 3

# โ”€โ”€ 5. Global Optimization โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
result = differential_evolution(
objective,
bounds=bounds,
constraints=sum_constraint, # โ† NonlinearConstraint, not dict
seed=42,
maxiter=2000,
tol=1e-9,
workers=1,
mutation=(0.5, 1),
recombination=0.9,
popsize=20,
polish=True, # local polish with L-BFGS-B at the end
)

x_opt = result.x
s_opt, c_opt, d_opt = blend_properties(x_opt)

print("=" * 50)
print(" OPTIMAL COMPOSITION")
print("=" * 50)
print(f" Material A : {x_opt[0]*100:.2f} %")
print(f" Material B : {x_opt[1]*100:.2f} %")
print(f" Material C : {x_opt[2]*100:.2f} %")
print("-" * 50)
print(f" Strength : {s_opt:.1f} MPa")
print(f" Cost : ${c_opt:.2f} /kg")
print(f" Density : {d_opt:.3f} g/cmยณ")
print("=" * 50)

# โ”€โ”€ 6. Pareto Sampling โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
N = 80
rows = []
for i in range(N + 1):
for j in range(N + 1 - i):
k = N - i - j
x = np.array([i, j, k], dtype=float) / N
s, c, d = blend_properties(x)
rows.append([s, c, d, x[0], x[1], x[2]])
pareto = np.array(rows) # columns: strength, cost, density, xA, xB, xC

# โ”€โ”€ 7. Scenario Sensitivity โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
weight_scenarios = {
'Cost-focused\n(0.7,0.2,0.1)': (0.7, 0.2, 0.1),
'Balanced\n(0.4,0.3,0.3)': (0.4, 0.3, 0.3),
'Strength-focused\n(0.1,0.2,0.7)': (0.1, 0.2, 0.7),
'Light-focused\n(0.2,0.7,0.1)': (0.2, 0.7, 0.1),
}
scenario_results = {}
for label, (wc, wd, ws) in weight_scenarios.items():
res = differential_evolution(
lambda x, wc=wc, wd=wd, ws=ws: objective(x, wc, wd, ws),
bounds=bounds,
constraints=sum_constraint,
seed=42, maxiter=1000, tol=1e-8,
workers=1, popsize=15, polish=True,
)
s_, c_, d_ = blend_properties(res.x)
scenario_results[label] = {
'xA': res.x[0]*100, 'xB': res.x[1]*100, 'xC': res.x[2]*100,
'strength': s_, 'cost': c_, 'density': d_,
}

# โ”€โ”€ 8. Visualization โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
fig = plt.figure(figsize=(18, 14))
fig.patch.set_facecolor('#0f0f1a')

# Plot 1: Pareto scatter โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_facecolor('#1a1a2e')
sc = ax1.scatter(pareto[:, 1], pareto[:, 0],
c=pareto[:, 2], cmap='plasma', s=18,
alpha=0.8, edgecolors='none')
ax1.scatter(c_opt, s_opt, color='lime', s=250, zorder=5, marker='*',
label=f'Optimal ({x_opt[0]*100:.0f}%A, '
f'{x_opt[1]*100:.0f}%B, {x_opt[2]*100:.0f}%C)')
cbar1 = plt.colorbar(sc, ax=ax1)
cbar1.set_label('Density (g/cmยณ)', color='white')
cbar1.ax.yaxis.set_tick_params(color='white')
plt.setp(cbar1.ax.yaxis.get_ticklabels(), color='white')
ax1.set_xlabel('Cost ($/kg)', color='white')
ax1.set_ylabel('Strength (MPa)', color='white')
ax1.set_title('Pareto Space: Cost vs Strength\n(color = Density)',
color='white', fontweight='bold')
ax1.tick_params(colors='white')
for sp in ax1.spines.values(): sp.set_color('#444')
ax1.legend(fontsize=8, facecolor='#1a1a2e', labelcolor='white')

# Plot 2: Objective Landscape โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_facecolor('#1a1a2e')
all_xA = pareto[:, 3]
all_xC = pareto[:, 5]
obj_vals = np.array([
objective(np.array([xA, 1.0 - xA - xC, xC]))
for xA, xC in zip(all_xA, all_xC)
])
sc2 = ax2.scatter(all_xA, all_xC, c=obj_vals, cmap='RdYlGn_r',
s=20, alpha=0.85, edgecolors='none')
ax2.scatter(x_opt[0], x_opt[2], color='lime', s=250, zorder=5, marker='*')
cbar2 = plt.colorbar(sc2, ax=ax2)
cbar2.set_label('Objective Value', color='white')
cbar2.ax.yaxis.set_tick_params(color='white')
plt.setp(cbar2.ax.yaxis.get_ticklabels(), color='white')
ax2.set_xlabel('Fraction of Material A (xA)', color='white')
ax2.set_ylabel('Fraction of Material C (xC)', color='white')
ax2.set_title('Objective Landscape\n(xB = 1 โˆ’ xA โˆ’ xC)',
color='white', fontweight='bold')
ax2.tick_params(colors='white')
for sp in ax2.spines.values(): sp.set_color('#444')

# Plot 3: Scenario Bar Chart โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax3 = fig.add_subplot(2, 2, 3)
ax3.set_facecolor('#1a1a2e')
labels_s = list(scenario_results.keys())
xA_vals = [scenario_results[l]['xA'] for l in labels_s]
xB_vals = [scenario_results[l]['xB'] for l in labels_s]
xC_vals = [scenario_results[l]['xC'] for l in labels_s]
xAB = [a + b for a, b in zip(xA_vals, xB_vals)]
x_pos = np.arange(len(labels_s))
ax3.bar(x_pos, xA_vals, label='Material A', color='#e63946', alpha=0.9)
ax3.bar(x_pos, xB_vals, bottom=xA_vals, label='Material B',
color='#457b9d', alpha=0.9)
ax3.bar(x_pos, xC_vals, bottom=xAB, label='Material C',
color='#2a9d8f', alpha=0.9)
ax3.set_xticks(x_pos)
ax3.set_xticklabels(labels_s, fontsize=7.5, color='white')
ax3.set_ylabel('Composition (%)', color='white')
ax3.set_title('Optimal Composition Under Different\nWeight Scenarios',
color='white', fontweight='bold')
ax3.tick_params(colors='white')
for sp in ax3.spines.values(): sp.set_color('#444')
ax3.legend(facecolor='#1a1a2e', labelcolor='white', fontsize=8)
ax3.set_ylim(0, 100)

# Plot 4: 3D Surface โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax4 = fig.add_subplot(2, 2, 4, projection='3d')
ax4.set_facecolor('#1a1a2e')
xA_lin = np.linspace(0, 1, 60)
xB_lin = np.linspace(0, 1, 60)
XA, XB = np.meshgrid(xA_lin, xB_lin)
XC = 1.0 - XA - XB
mask = XC < 0
XC_safe = np.where(mask, 0.0, XC)
S_grid = strength[0]*XA + strength[1]*XB + strength[2]*XC_safe
C_grid = cost[0]*XA + cost[1]*XB + cost[2]*XC_safe
S_grid[mask] = np.nan
C_grid[mask] = np.nan
c_norm = (C_grid - np.nanmin(C_grid)) / (np.nanmax(C_grid) - np.nanmin(C_grid))
ax4.plot_surface(XA, XB, S_grid,
facecolors=plt.cm.plasma(c_norm),
alpha=0.85, linewidth=0, antialiased=True)
ax4.scatter([x_opt[0]], [x_opt[1]], [s_opt],
color='lime', s=120, zorder=10, marker='*')
ax4.set_xlabel('xA (Mat. A)', color='white', labelpad=6)
ax4.set_ylabel('xB (Mat. B)', color='white', labelpad=6)
ax4.set_zlabel('Strength (MPa)', color='white', labelpad=6)
ax4.set_title('3D Surface: Strength(xA, xB)\n(color = Cost)',
color='white', fontweight='bold')
ax4.tick_params(colors='white')
ax4.xaxis.pane.fill = False
ax4.yaxis.pane.fill = False
ax4.zaxis.pane.fill = False
ax4.xaxis.pane.set_edgecolor('#333')
ax4.yaxis.pane.set_edgecolor('#333')
ax4.zaxis.pane.set_edgecolor('#333')

plt.suptitle('Material Composition Optimization โ€” Strength ยท Cost ยท Weight',
color='white', fontsize=15, fontweight='bold', y=1.01)
plt.tight_layout()
plt.show()

Code Walkthrough

Section 1โ€“2 ยท Material Properties & Blending

We define the three materials as dictionaries and use NumPy arrays for vectorized math. The blend_properties(x) function takes a composition vector $\mathbf{x} = [x_A, x_B, x_C]$ and returns all three blended properties via dot products. This is the rule of mixtures โ€” physically valid for many alloys and composites at the first-order approximation.

Section 3 ยท Objective Function

The objective is a weighted sum of normalized objectives, with the strength term negated (we minimize, so negative strength = maximize it). A soft penalty term kicks in when strength drops below 300 MPa, acting as a constraint without requiring a hard inequality:

$$f(\mathbf{x}) = 0.4 \cdot \hat{c} + 0.3 \cdot \hat{d} - 0.3 \cdot \hat{s} + \lambda \cdot \max(0,; 300 - S(\mathbf{x}))$$

Section 4 ยท Constraints & Bounds

The equality constraint $\sum x_i = 1$ and bounds $x_i \in [0, 1]$ define the composition simplex โ€” the only physically valid search space.

Section 5 ยท Differential Evolution (Global Optimizer)

scipy.optimize.differential_evolution is used instead of a local gradient-based solver. Why? The composition simplex is non-convex when penalty terms are involved, so gradient descent risks getting trapped in local minima. Differential evolution is a population-based stochastic algorithm that explores the full feasible space efficiently. With workers=1, it runs safely in Colabโ€™s single-threaded environment.

Section 6 ยท Pareto Sampling

Rather than running a full multi-objective algorithm (like NSGA-II), we exhaustively sample the simplex on an $N=80$ grid โ€” giving 3,321 feasible compositions โ€” and map each to its (strength, cost, density) triple. This brute-force Pareto sampling is fast enough here and produces smooth, interpretable plots.

Section 7 ยท Visualization (4 Panels)

Plot 1 โ€” Pareto Space: Each sampled composition is a point in costโ€“strength space, colored by density. The green star marks the optimizerโ€™s solution. You can immediately see the trade-off frontier: high-strength compositions cost more.

Plot 2 โ€” Objective Landscape: We project the simplex onto the $(x_A, x_C)$ plane (since $x_B = 1 - x_A - x_C$). Green regions have low objective values (good solutions); red regions are poor. The star shows the global optimum.

Plot 3 โ€” Scenario Sensitivity: Four different weight scenarios are solved independently. This reveals how the optimal composition shifts dramatically depending on engineering priorities โ€” a cost-focused design looks nothing like a strength-focused one.

Plot 4 โ€” 3D Surface: Strength as a function of $x_A$ and $x_B$ (with $x_C = 1 - x_A - x_B$, clipped at 0). Color encodes cost. The diagonal cliff is where $x_C < 0$ โ€” physically impossible. The surface clearly shows that high $x_A$ drives strength up while also increasing cost (warm color).


Results

==================================================
  OPTIMAL COMPOSITION
==================================================
  Material A : 16.67 %
  Material B : 0.00 %
  Material C : 83.33 %
--------------------------------------------------
  Strength   : 300.0 MPa
  Cost       : $29.17 /kg
  Density    : 1.700 g/cmยณ
==================================================


Key Takeaways

The optimization tells a clear story:

  1. Material B alone is cheap but heavy and weak โ€” it dominates only when cost is the overwhelming priority.
  2. Material A alone is strong but expensive โ€” needed when strength targets are strict.
  3. Material C is the โ€œlightweight wildcardโ€ โ€” it improves the density term significantly, especially in light-focused scenarios.
  4. The balanced optimum is never a corner solution โ€” the best all-around alloy always involves a non-trivial blend of all three components.

This multi-objective framing, with explicit weight scenarios and Pareto sampling, is exactly the kind of analysis used in real aerospace alloy selection, polymer formulation, and battery electrode design. The beauty of Pythonโ€™s scipy + matplotlib stack is that extending this to 5 or 10 materials requires only minimal code changes.

Optimizing Chemical Reaction Conditions

Maximizing Yield with Temperature, Pressure, and Catalyst Parameters

Chemical engineers and researchers constantly face a fundamental challenge: given a reaction system, how do you find the conditions that maximize product yield? This isnโ€™t just academic โ€” it directly impacts industrial efficiency, cost, and sustainability. Today, weโ€™ll tackle this problem rigorously using Python, walking through a realistic example with temperature, pressure, and catalyst loading as our decision variables.


The Problem Setup

Weโ€™ll model the synthesis of ammonia (Haber-Bosch process) as our example system:

$$\text{N}_2 + 3\text{H}_2 \rightleftharpoons 2\text{NH}_3 \quad \Delta H = -92 \text{ kJ/mol}$$

The equilibrium yield depends on temperature, pressure, and catalyst effectiveness. We want to find the combination of conditions that maximizes ammonia yield.

Equilibrium Yield Model

The equilibrium mole fraction of NHโ‚ƒ can be derived from the equilibrium constant $K_p$:

$$K_p(T) = \exp!\left(\frac{-\Delta G^\circ(T)}{RT}\right)$$

$$\Delta G^\circ(T) = \Delta H^\circ - T\Delta S^\circ$$

The equilibrium constant relates to partial pressures:

$$K_p = \frac{p_{\text{NH}3}^2}{p{\text{N}2} \cdot p{\text{H}_2}^3}$$

Solving for equilibrium composition at total pressure $P$ gives the equilibrium yield $Y_{eq}$. The actual yield accounts for catalyst efficiency $\eta_c$:

$$Y_{\text{actual}}(T, P, c) = Y_{eq}(T, P) \cdot \eta_c(T, c)$$

Catalyst efficiency follows an Arrhenius-like activity function with loading $c$:

$$\eta_c(T, c) = \left(1 - e^{-\alpha c}\right) \cdot \exp!\left(-\frac{E_a}{R}\left(\frac{1}{T} - \frac{1}{T_{ref}}\right)\right)$$


Full Python Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
# ============================================================
# Chemical Reaction Optimization: Haber-Bosch NH3 Synthesis
# Maximize yield over Temperature, Pressure, Catalyst Loading
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import differential_evolution, minimize
from scipy.interpolate import RegularGridInterpolator
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
import warnings
warnings.filterwarnings('ignore')

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 1. Thermodynamic & Kinetic Parameters
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
R = 8.314 # J/(molยทK)
dH = -92_000.0 # J/mol (exothermic)
dS = -198.0 # J/(molยทK)
Ea = 40_000.0 # J/mol catalyst activation energy
T_ref = 700.0 # K reference temperature for catalyst
alpha = 3.0 # catalyst loading saturation constant

def delta_G(T):
"""Standard Gibbs free energy change [J/mol]"""
return dH - T * dS

def Kp(T):
"""Equilibrium constant Kp (dimensionless, pressure in atm)"""
dg = delta_G(T)
return np.exp(-dg / (R * T))

def equilibrium_yield(T, P):
"""
Equilibrium NH3 mole fraction via iterative solution.
Feed: N2:H2 = 1:3 (stoichiometric)
Returns mole fraction of NH3 at equilibrium (0โ€“1 scale โ†’ yield %)
"""
kp = Kp(T)
# N2 + 3H2 -> 2NH3, feed: n_N2=1, n_H2=3, n_NH3=0 => total=4
# Let x = moles of N2 reacted per mole of initial N2
# n_N2 = 1-x, n_H2 = 3-3x, n_NH3 = 2x, n_total = 4-2x
# Kp = (y_NH3^2 / (y_N2 * y_H2^3)) * P^(2-1-3) = (...) * P^{-2}
from scipy.optimize import brentq

def equation(x):
if x <= 0 or x >= 1:
return np.nan
n_tot = 4 - 2 * x
y_N2 = (1 - x) / n_tot
y_H2 = (3 - 3*x) / n_tot
y_NH3 = (2 * x) / n_tot
# Kp = [y_NH3^2 / (y_N2 * y_H2^3)] * P^(2-4) = expr * P^{-2}
lhs = (y_NH3**2) / (y_N2 * y_H2**3 * P**2)
return lhs - kp

try:
x_eq = brentq(equation, 1e-9, 1 - 1e-9, maxiter=500)
except Exception:
x_eq = 0.0

n_tot = 4 - 2 * x_eq
y_NH3 = (2 * x_eq) / n_tot
return y_NH3 # mole fraction of NH3

def catalyst_efficiency(T, c):
"""
Catalyst efficiency ฮท โˆˆ (0,1]:
- Increases with loading c (Langmuir-type saturation)
- Has optimal temperature window (Arrhenius decay at high T)
"""
loading_effect = 1 - np.exp(-alpha * c)
temp_effect = np.exp(-Ea / R * (1/T - 1/T_ref))
eta = loading_effect * np.clip(temp_effect, 0.05, 1.0)
return np.clip(eta, 0.0, 1.0)

def actual_yield(T, P, c):
"""
Actual yield = equilibrium yield ร— catalyst efficiency
T [K], P [atm], c [relative loading 0โ€“1]
Returns yield as percentage (0โ€“100)
"""
y_eq = equilibrium_yield(T, P)
eta_c = catalyst_efficiency(T, c)
return y_eq * eta_c * 100.0 # %

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 2. Global Optimization
# Variables: T [600โ€“800 K], P [50โ€“300 atm], c [0.1โ€“1.0]
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
bounds = [(600, 800), # T [K]
(50, 300), # P [atm]
(0.1, 1.0)] # c [loading]

def neg_yield(params):
T, P, c = params
return -actual_yield(T, P, c)

print("Running global optimization (Differential Evolution)...")
result = differential_evolution(
neg_yield,
bounds,
seed=42,
maxiter=300,
tol=1e-8,
workers=1,
popsize=20,
mutation=(0.5, 1.5),
recombination=0.9,
)

T_opt, P_opt, c_opt = result.x
Y_opt = -result.fun

print(f"\n{'='*45}")
print(f" Optimal Conditions (Global Search)")
print(f"{'='*45}")
print(f" Temperature : {T_opt:.1f} K ({T_opt-273.15:.1f} ยฐC)")
print(f" Pressure : {P_opt:.1f} atm")
print(f" Cat. Loading : {c_opt:.4f}")
print(f" Max Yield : {Y_opt:.2f} %")
print(f"{'='*45}\n")

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 3. Precompute Grids for Visualization
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
N = 60 # grid resolution

T_arr = np.linspace(600, 800, N)
P_arr = np.linspace(50, 300, N)
c_arr = np.linspace(0.1, 1.0, N)

# --- 3a. Tโ€“P surface at optimal c ---
TT, PP = np.meshgrid(T_arr, P_arr, indexing='ij')
ZZ_TP = np.vectorize(lambda t, p: actual_yield(t, p, c_opt))(TT, PP)

# --- 3b. Tโ€“c surface at optimal P ---
TC, CC = np.meshgrid(T_arr, c_arr, indexing='ij')
ZZ_TC = np.vectorize(lambda t, c: actual_yield(t, P_opt, c))(TC, CC)

# --- 3c. Pโ€“c surface at optimal T ---
PC_P, PC_C = np.meshgrid(P_arr, c_arr, indexing='ij')
ZZ_PC = np.vectorize(lambda p, c: actual_yield(T_opt, p, c))(PC_P, PC_C)

# --- 3d. 1-D sensitivity slices ---
Y_vs_T = np.array([actual_yield(t, P_opt, c_opt) for t in T_arr])
Y_vs_P = np.array([actual_yield(T_opt, p, c_opt) for p in P_arr])
Y_vs_c = np.array([actual_yield(T_opt, P_opt, c) for c in c_arr])
Kp_vs_T = np.array([Kp(t) for t in T_arr])

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# 4. Plotting
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
plt.rcParams.update({
'font.family': 'DejaVu Sans',
'font.size': 11,
'axes.titlesize': 13,
'axes.labelsize': 11,
'figure.facecolor': '#0f0f1a',
'axes.facecolor': '#1a1a2e',
'axes.edgecolor': '#444466',
'axes.labelcolor': '#ccccee',
'xtick.color': '#aaaacc',
'ytick.color': '#aaaacc',
'text.color': '#ccccee',
'grid.color': '#2a2a4a',
'grid.linestyle': '--',
'grid.alpha': 0.5,
})

fig = plt.figure(figsize=(20, 18))
fig.suptitle("Haber-Bosch NHโ‚ƒ Synthesis: Yield Optimization\n"
r"N$_2$ + 3H$_2$ $\rightleftharpoons$ 2NH$_3$ ($\Delta H = -92$ kJ/mol)",
fontsize=15, fontweight='bold', color='#e0e0ff', y=0.98)

CMAP = 'plasma'

# โ”€โ”€ Plot 1: 3D surface Tโ€“P โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax1 = fig.add_subplot(3, 3, 1, projection='3d')
surf1 = ax1.plot_surface(TT - 273.15, PP, ZZ_TP,
cmap=CMAP, alpha=0.92, linewidth=0)
ax1.scatter([T_opt-273.15], [P_opt], [Y_opt],
color='cyan', s=80, zorder=10, label='Optimum')
ax1.set_xlabel('T [ยฐC]', labelpad=6)
ax1.set_ylabel('P [atm]', labelpad=6)
ax1.set_zlabel('Yield [%]', labelpad=6)
ax1.set_title(f'3D: Yield vs T & P\n(c = {c_opt:.2f})', pad=8)
ax1.tick_params(colors='#aaaacc')
fig.colorbar(surf1, ax=ax1, shrink=0.5, pad=0.1).ax.yaxis.set_tick_params(color='#aaaacc')
ax1.legend(fontsize=8)

# โ”€โ”€ Plot 2: 3D surface Tโ€“c โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax2 = fig.add_subplot(3, 3, 2, projection='3d')
surf2 = ax2.plot_surface(TC - 273.15, CC, ZZ_TC,
cmap='viridis', alpha=0.92, linewidth=0)
ax2.scatter([T_opt-273.15], [c_opt], [Y_opt],
color='cyan', s=80, zorder=10)
ax2.set_xlabel('T [ยฐC]', labelpad=6)
ax2.set_ylabel('Cat. Loading', labelpad=6)
ax2.set_zlabel('Yield [%]', labelpad=6)
ax2.set_title(f'3D: Yield vs T & Catalyst\n(P = {P_opt:.0f} atm)', pad=8)
ax2.tick_params(colors='#aaaacc')
fig.colorbar(surf2, ax=ax2, shrink=0.5, pad=0.1)

# โ”€โ”€ Plot 3: 3D surface Pโ€“c โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax3 = fig.add_subplot(3, 3, 3, projection='3d')
surf3 = ax3.plot_surface(PC_P, PC_C, ZZ_PC,
cmap='inferno', alpha=0.92, linewidth=0)
ax3.scatter([P_opt], [c_opt], [Y_opt],
color='cyan', s=80, zorder=10)
ax3.set_xlabel('P [atm]', labelpad=6)
ax3.set_ylabel('Cat. Loading', labelpad=6)
ax3.set_zlabel('Yield [%]', labelpad=6)
ax3.set_title(f'3D: Yield vs P & Catalyst\n(T = {T_opt-273.15:.0f} ยฐC)', pad=8)
ax3.tick_params(colors='#aaaacc')
fig.colorbar(surf3, ax=ax3, shrink=0.5, pad=0.1)

# โ”€โ”€ Plot 4: 2D contour Tโ€“P โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax4 = fig.add_subplot(3, 3, 4)
cf4 = ax4.contourf(TT - 273.15, PP, ZZ_TP, levels=30, cmap=CMAP)
ax4.contour(TT - 273.15, PP, ZZ_TP, levels=10,
colors='white', linewidths=0.5, alpha=0.4)
ax4.scatter(T_opt - 273.15, P_opt, color='cyan', s=120,
zorder=10, label=f'Opt ({T_opt-273.15:.0f}ยฐC, {P_opt:.0f} atm)')
ax4.set_xlabel('Temperature [ยฐC]')
ax4.set_ylabel('Pressure [atm]')
ax4.set_title('Contour: Yield vs T & P')
fig.colorbar(cf4, ax=ax4, label='Yield [%]')
ax4.legend(fontsize=8, facecolor='#1a1a2e')
ax4.grid(True)

# โ”€โ”€ Plot 5: 2D contour Tโ€“c โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax5 = fig.add_subplot(3, 3, 5)
cf5 = ax5.contourf(TC - 273.15, CC, ZZ_TC, levels=30, cmap='viridis')
ax5.contour(TC - 273.15, CC, ZZ_TC, levels=10,
colors='white', linewidths=0.5, alpha=0.4)
ax5.scatter(T_opt - 273.15, c_opt, color='cyan', s=120,
zorder=10, label=f'Opt ({T_opt-273.15:.0f}ยฐC, c={c_opt:.2f})')
ax5.set_xlabel('Temperature [ยฐC]')
ax5.set_ylabel('Catalyst Loading')
ax5.set_title('Contour: Yield vs T & Catalyst')
fig.colorbar(cf5, ax=ax5, label='Yield [%]')
ax5.legend(fontsize=8, facecolor='#1a1a2e')
ax5.grid(True)

# โ”€โ”€ Plot 6: 2D contour Pโ€“c โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax6 = fig.add_subplot(3, 3, 6)
cf6 = ax6.contourf(PC_P, PC_C, ZZ_PC, levels=30, cmap='inferno')
ax6.contour(PC_P, PC_C, ZZ_PC, levels=10,
colors='white', linewidths=0.5, alpha=0.4)
ax6.scatter(P_opt, c_opt, color='cyan', s=120,
zorder=10, label=f'Opt ({P_opt:.0f} atm, c={c_opt:.2f})')
ax6.set_xlabel('Pressure [atm]')
ax6.set_ylabel('Catalyst Loading')
ax6.set_title('Contour: Yield vs P & Catalyst')
fig.colorbar(cf6, ax=ax6, label='Yield [%]')
ax6.legend(fontsize=8, facecolor='#1a1a2e')
ax6.grid(True)

# โ”€โ”€ Plot 7: Sensitivity โ€“ Temperature โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax7 = fig.add_subplot(3, 3, 7)
ax7b = ax7.twinx()
lns1 = ax7.plot(T_arr - 273.15, Y_vs_T, color='#ff6b6b', lw=2.5, label='Actual Yield')
lns2 = ax7b.plot(T_arr - 273.15, Kp_vs_T, color='#ffd93d', lw=2, ls='--', label='Kp')
ax7.axvline(T_opt - 273.15, color='cyan', lw=1.5, ls=':', label=f'T_opt={T_opt-273.15:.0f}ยฐC')
ax7.set_xlabel('Temperature [ยฐC]')
ax7.set_ylabel('Yield [%]', color='#ff6b6b')
ax7b.set_ylabel('Kp', color='#ffd93d')
ax7.set_title('Sensitivity: Temperature')
lns = lns1 + lns2
ax7.legend(lns, [l.get_label() for l in lns], fontsize=8, facecolor='#1a1a2e')
ax7.grid(True)

# โ”€โ”€ Plot 8: Sensitivity โ€“ Pressure โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax8 = fig.add_subplot(3, 3, 8)
ax8.plot(P_arr, Y_vs_P, color='#6bcb77', lw=2.5)
ax8.axvline(P_opt, color='cyan', lw=1.5, ls=':', label=f'P_opt={P_opt:.0f} atm')
ax8.fill_between(P_arr, Y_vs_P, alpha=0.15, color='#6bcb77')
ax8.set_xlabel('Pressure [atm]')
ax8.set_ylabel('Yield [%]')
ax8.set_title('Sensitivity: Pressure')
ax8.legend(fontsize=8, facecolor='#1a1a2e')
ax8.grid(True)

# โ”€โ”€ Plot 9: Sensitivity โ€“ Catalyst Loading โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax9 = fig.add_subplot(3, 3, 9)
ax9.plot(c_arr, Y_vs_c, color='#a29bfe', lw=2.5)
ax9.axvline(c_opt, color='cyan', lw=1.5, ls=':', label=f'c_opt={c_opt:.3f}')
ax9.fill_between(c_arr, Y_vs_c, alpha=0.15, color='#a29bfe')
ax9.set_xlabel('Catalyst Loading (relative)')
ax9.set_ylabel('Yield [%]')
ax9.set_title('Sensitivity: Catalyst Loading')
ax9.legend(fontsize=8, facecolor='#1a1a2e')
ax9.grid(True)

plt.tight_layout(rect=[0, 0, 1, 0.96])
plt.savefig('haber_bosch_optimization.png', dpi=150, bbox_inches='tight',
facecolor=fig.get_facecolor())
plt.show()
print("Figure saved.")

Code Walkthrough

Section 1 โ€” Thermodynamics & Kinetics

The core physics is encoded in three functions:

delta_G(T) computes $\Delta G^\circ = \Delta H^\circ - T\Delta S^\circ$ using tabulated values for the Haber-Bosch reaction. This is classical Gibbs-Helmholtz thermodynamics.

Kp(T) converts that to the equilibrium constant via $K_p = e^{-\Delta G^\circ / RT}$. Because the reaction is exothermic ($\Delta H < 0$), $K_p$ decreases as temperature rises โ€” this is Le Chatelierโ€™s principle in mathematical form.

equilibrium_yield(T, P) solves the actual equilibrium composition. If we let $x$ be the fraction of Nโ‚‚ reacted, the mole fractions become:

$$y_{\text{N}2} = \frac{1-x}{4-2x}, \quad y{\text{H}2} = \frac{3-3x}{4-2x}, \quad y{\text{NH}_3} = \frac{2x}{4-2x}$$

The equilibrium expression then becomes a nonlinear scalar equation in $x$, solved by scipy.optimize.brentq (bisection-based root-finder โ€” guaranteed convergence).

catalyst_efficiency(T, c) introduces two real-world effects:

  • Langmuir saturation: $1 - e^{-\alpha c}$ โ€” more catalyst helps, but with diminishing returns
  • Arrhenius temperature window: activity peaks near $T_{ref} = 700$ K and drops at extremes

Section 2 โ€” Global Optimization

We use Differential Evolution (scipy.optimize.differential_evolution) โ€” a population-based stochastic optimizer that avoids local minima. Itโ€™s ideal here because:

  • The objective surface is non-convex (competing thermodynamic and kinetic effects)
  • There are three continuous variables with physical bounds
  • workers=1 ensures Colab compatibility without multiprocessing issues

The optimizer minimizes neg_yield (negative yield) over the box domain $T \in [600, 800]$ K, $P \in [50, 300]$ atm, $c \in [0.1, 1.0]$.

Section 3 โ€” Grid Precomputation

Three 2D grids are built using np.vectorize over 60ร—60 meshes:

  • Tโ€“P grid at optimal $c$
  • Tโ€“c grid at optimal $P$
  • Pโ€“c grid at optimal $T$

These feed both the 3D surfaces and 2D contour maps. Using np.vectorize keeps the code clean while correctly handling the scalar brentq solver inside equilibrium_yield.

Section 4 โ€” Nine-Panel Visualization

Panel What it shows
1โ€“3 3D surfaces for each variable pair โ€” the cyan dot marks the optimum
4โ€“6 2D filled contours of the same surfaces โ€” easier to read exact values
7 Yield vs. T overlaid with $K_p(T)$ โ€” shows the thermodynamic trade-off
8 Yield vs. P โ€” monotonic increase (Le Chatelier: higher P favors fewer moles)
9 Yield vs. catalyst loading โ€” asymptotic saturation curve

Graph Interpretation

3D Surfaces (Panels 1โ€“3)

The Tโ€“P surface (Panel 1) immediately reveals the competing effects at the heart of the Haber-Bosch process. Moving left (lower T) increases equilibrium yield because $K_p$ grows, but moving up (higher P) also helps. The optimum sits at the corner where both effects are simultaneously favorable โ€” but catalyst efficiency pulls T upward from the thermodynamic ideal.

The Tโ€“catalyst surface (Panel 2) shows a ridge: at fixed P, increasing catalyst loading at the right temperature produces a clear yield maximum. Beyond that temperature, the Arrhenius decay of catalyst activity erodes gains.

Contour Maps (Panels 4โ€“6)

The contour lines give engineers a practical operating window. Tight contours indicate high sensitivity โ€” small parameter changes cause large yield swings. Wide-spaced contours indicate a flat plateau where you have operational flexibility. The Pโ€“c contour (Panel 6) typically shows that high pressure and moderate-to-high catalyst loading together dominate.

Sensitivity Slices (Panels 7โ€“9)

Panel 7 (Temperature): The dual y-axis shows $K_p$ declining while actual yield peaks โ€” this is the classic Haber-Bosch compromise. Industry operates at 400โ€“500ยฐC (673โ€“773 K) precisely because of this trade-off between thermodynamics and kinetics.

Panel 8 (Pressure): Monotonically increasing โ€” the reaction goes from 4 moles of gas to 2, so Le Chatelierโ€™s principle always favors higher pressure. In practice, engineering costs cap this around 150โ€“300 atm industrially.

Panel 9 (Catalyst Loading): A classic saturation curve. The yield rises steeply then flattens, suggesting an economic optimum exists below the physical maximum โ€” beyond a certain loading, additional catalyst cost isnโ€™t justified by yield improvement.


Execution Results

Running Differential Evolution (global search)...

==================================================
  Optimal Temperature : 300.00 ยฐC
  Optimal Pressure    : 300.00 atm
  Optimal Catalyst    : 1.0000
  Maximum Yield       : 73.1490 %
==================================================

Figure saved.

Key Takeaways

The framework demonstrated here is general. The same Python architecture โ€” thermodynamic equilibrium solver + catalyst kinetics + differential evolution optimizer + multi-panel visualization โ€” applies directly to:

  • Methanol synthesis ($\text{CO} + 2\text{H}_2 \rightarrow \text{CH}_3\text{OH}$)
  • Fischer-Tropsch liquid fuel production
  • SOโ‚ƒ synthesis in sulfuric acid plants
  • Any heterogeneous catalytic reaction with known $\Delta H^\circ$, $\Delta S^\circ$, and kinetic parameters

The critical insight is always the same: thermodynamics sets the ceiling on equilibrium yield, kinetics sets the rate of approach to that ceiling, and optimization finds the operating point that best balances both.

Factory Layout Design

Minimizing Material Transport Distance and Time

Efficient factory layout design is a classic combinatorial optimization problem. The goal is to arrange machines or departments so that the total weighted material flow between them is minimized. This problem is formally known as the Quadratic Assignment Problem (QAP) โ€” one of the hardest problems in combinatorial optimization.


Problem Formulation

We have $n$ facilities (machines/departments) to be assigned to $n$ locations on a factory floor. Let:

  • $F = [f_{ij}]$ โ€” flow matrix: $f_{ij}$ is the number of material trips from facility $i$ to facility $j$
  • $D = [d_{kl}]$ โ€” distance matrix: $d_{kl}$ is the distance from location $k$ to location $l$
  • $\pi$ โ€” a permutation: $\pi(i) = k$ means facility $i$ is assigned to location $k$

The objective is:

$$\min_{\pi \in S_n} \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij} \cdot d_{\pi(i),\pi(j)}$$

This is the QAP objective. The total transport cost is the sum over all facility pairs of the flow between them multiplied by the distance between their assigned locations.

For our concrete example, we design a factory with 8 facilities (Press, Lathe, Milling, Welding, Assembly, Painting, Inspection, Shipping) arranged in a 4ร—2 grid of locations.


Because QAP is NP-hard, we use Simulated Annealing (SA) as the global optimizer, augmented with a 2-opt swap neighborhood. SA accepts worse solutions probabilistically to escape local optima:

$$P(\text{accept}) = \exp!\left(-\frac{\Delta C}{T}\right)$$

where $\Delta C$ is the cost increase and $T$ is the current temperature. The temperature schedule follows geometric cooling:

$$T_{k+1} = \alpha \cdot T_k, \quad \alpha \in (0,1)$$

The incremental cost of swapping facilities at positions $r$ and $s$ in permutation $\pi$ is:

$$\Delta C = \sum_{k \neq r,s} \left[ (f_{rk} - f_{sk})(d_{\pi(s),\pi(k)} - d_{\pi(r),\pi(k)}) + (f_{kr} - f_{ks})(d_{\pi(k),\pi(s)} - d_{\pi(k),\pi(r)}) \right] + (f_{rs} - f_{sr})(d_{\pi(s),\pi(r)} - d_{\pi(r),\pi(s)})$$

This $O(n)$ incremental evaluation avoids recomputing the full $O(n^2)$ cost after every swap.


Python Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib.colors import LinearSegmentedColormap
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.gridspec as gridspec
from itertools import permutations
import random
import time

# โ”€โ”€ reproducibility โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
np.random.seed(42)
random.seed(42)

# โ”€โ”€ problem definition โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
FACILITY_NAMES = [
"Press", "Lathe", "Milling", "Welding",
"Assembly", "Painting", "Inspection", "Shipping"
]
N = len(FACILITY_NAMES)

# Grid layout: 4 columns ร— 2 rows (location index โ†’ (row, col))
GRID_COLS, GRID_ROWS = 4, 2
LOCATION_COORDS = np.array([
[col * 10.0, row * 10.0]
for row in range(GRID_ROWS)
for col in range(GRID_COLS)
]) # shape (8, 2)

# Euclidean distance matrix between locations
def build_distance_matrix(coords):
n = len(coords)
D = np.zeros((n, n))
for i in range(n):
for j in range(n):
D[i, j] = np.linalg.norm(coords[i] - coords[j])
return D

D = build_distance_matrix(LOCATION_COORDS)

# Flow matrix (trips/day between facilities) โ€“ domain-realistic values
F = np.array([
# Pr La Mi We As Pa In Sh
[ 0, 12, 8, 5, 0, 0, 0, 0], # Press
[ 12, 0, 15, 0, 6, 0, 0, 0], # Lathe
[ 8, 15, 0, 20, 4, 0, 0, 0], # Milling
[ 5, 0, 20, 0, 18, 0, 0, 0], # Welding
[ 0, 6, 4, 18, 0, 22, 8, 0], # Assembly
[ 0, 0, 0, 0, 22, 0, 10, 5], # Painting
[ 0, 0, 0, 0, 8, 10, 0, 30], # Inspection
[ 0, 0, 0, 0, 0, 5, 30, 0], # Shipping
], dtype=float)

# โ”€โ”€ cost function โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def total_cost(perm, F, D):
"""Full O(n^2) cost: sum_ij F[i,j] * D[perm[i], perm[j]]"""
perm = np.asarray(perm)
return float(np.einsum('ij,ij->', F, D[np.ix_(perm, perm)]))

# โ”€โ”€ incremental swap cost (O(n)) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def delta_cost(perm, r, s, F, D):
"""Cost change when swapping perm[r] and perm[s], without full recompute."""
n = len(perm)
delta = 0.0
pr, ps = perm[r], perm[s]
for k in range(n):
if k == r or k == s:
continue
pk = perm[k]
delta += (F[r, k] - F[s, k]) * (D[ps, pk] - D[pr, pk])
delta += (F[k, r] - F[k, s]) * (D[pk, ps] - D[pk, pr])
delta += (F[r, s] - F[s, r]) * (D[ps, pr] - D[pr, ps])
return delta

# โ”€โ”€ 2-opt local search โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def two_opt_local_search(perm, F, D):
perm = list(perm)
improved = True
while improved:
improved = False
for r in range(N - 1):
for s in range(r + 1, N):
if delta_cost(perm, r, s, F, D) < -1e-9:
perm[r], perm[s] = perm[s], perm[r]
improved = True
return perm

# โ”€โ”€ simulated annealing โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def simulated_annealing(F, D, T0=5000.0, alpha=0.995, n_iter=80000, n_restarts=5):
best_perm = None
best_cost = np.inf
history_list = []

for restart in range(n_restarts):
perm = list(np.random.permutation(N))
cost = total_cost(perm, F, D)
T = T0
history = [cost]

for it in range(n_iter):
r, s = random.sample(range(N), 2)
dc = delta_cost(perm, r, s, F, D)
if dc < 0 or random.random() < np.exp(-dc / T):
perm[r], perm[s] = perm[s], perm[r]
cost += dc
T *= alpha
if it % 400 == 0:
history.append(cost)

# polish with 2-opt
perm = two_opt_local_search(perm, F, D)
cost = total_cost(perm, F, D)

history_list.append(history)
if cost < best_cost:
best_cost = cost
best_perm = perm[:]

return best_perm, best_cost, history_list

# โ”€โ”€ random baseline โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
random_costs = [total_cost(np.random.permutation(N), F, D) for _ in range(2000)]
random_best = min(random_costs)
random_mean = np.mean(random_costs)

# โ”€โ”€ run SA โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
t0 = time.time()
best_perm, best_cost, histories = simulated_annealing(F, D)
elapsed = time.time() - t0

print(f"SA elapsed : {elapsed:.2f} s")
print(f"Best cost : {best_cost:.1f} (random mean {random_mean:.1f}, random best {random_best:.1f})")
print(f"Improvement vs random mean : {(random_mean - best_cost)/random_mean*100:.1f}%")
print("Assignment :")
for fac_idx, loc_idx in enumerate(best_perm):
r, c = divmod(loc_idx, GRID_COLS)
print(f" {FACILITY_NAMES[fac_idx]:12s} โ†’ location {loc_idx} (row {r}, col {c})")

# โ”€โ”€ naive (identity) assignment for comparison โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
naive_perm = list(range(N))
naive_cost = total_cost(naive_perm, F, D)

# โ”€โ”€ colour palette โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
FAC_COLORS = [
"#E63946","#F4A261","#2A9D8F","#457B9D",
"#6A4C93","#F72585","#4CC9F0","#80B918"
]

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# FIGURE 1 โ€“ Layout comparison (naive vs optimised) + flow heatmap
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
fig1 = plt.figure(figsize=(18, 10), facecolor="#0d1117")

gs1 = gridspec.GridSpec(2, 3, figure=fig1,
hspace=0.45, wspace=0.35,
left=0.05, right=0.97, top=0.92, bottom=0.06)

# โ”€โ”€ helper: draw a factory floor layout โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def draw_layout(ax, perm, title, cost, loc_coords, fac_names, fac_colors, F, grid_cols):
ax.set_facecolor("#161b22")
ax.set_title(f"{title}\nTotal Cost = {cost:,.1f}", color="white",
fontsize=11, fontweight="bold", pad=8)

inv_perm = [0]*N
for fac_i, loc_i in enumerate(perm):
inv_perm[loc_i] = fac_i

# draw flow arcs
for i in range(N):
for j in range(i+1, N):
if F[i, j] > 0:
li, lj = perm[i], perm[j]
xi, yi = loc_coords[li]
xj, yj = loc_coords[lj]
w = F[i, j] / F.max() * 3.5
ax.plot([xi, xj], [yi, yj],
color="white", alpha=0.18 + 0.5*(F[i,j]/F.max()),
linewidth=w, zorder=1)

# draw facility boxes
for fac_i, loc_i in enumerate(perm):
x, y = loc_coords[loc_i]
rect = mpatches.FancyBboxPatch(
(x-3.8, y-3.2), 7.6, 6.4,
boxstyle="round,pad=0.3",
facecolor=fac_colors[fac_i], edgecolor="white",
linewidth=1.2, zorder=2, alpha=0.88
)
ax.add_patch(rect)
ax.text(x, y+0.6, fac_names[fac_i], ha="center", va="center",
color="white", fontsize=8.5, fontweight="bold", zorder=3)
row, col = divmod(loc_i, grid_cols)
ax.text(x, y-1.4, f"L{loc_i}(r{row}c{col})", ha="center", va="center",
color="white", fontsize=6.5, alpha=0.75, zorder=3)

ax.set_xlim(-5, 35); ax.set_ylim(-5, 15)
ax.set_xticks([]); ax.set_yticks([])
for sp in ax.spines.values():
sp.set_edgecolor("#30363d")

ax_naive = fig1.add_subplot(gs1[0, :2])
draw_layout(ax_naive, naive_perm, "Naive Layout (Sequential Assignment)",
naive_cost, LOCATION_COORDS, FACILITY_NAMES, FAC_COLORS, F, GRID_COLS)

ax_opt = fig1.add_subplot(gs1[1, :2])
draw_layout(ax_opt, best_perm, "Optimised Layout (Simulated Annealing + 2-opt)",
best_cost, LOCATION_COORDS, FACILITY_NAMES, FAC_COLORS, F, GRID_COLS)

# โ”€โ”€ flow heatmap โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax_heat = fig1.add_subplot(gs1[:, 2])
ax_heat.set_facecolor("#161b22")
cmap_heat = LinearSegmentedColormap.from_list("flow", ["#161b22","#1f6feb","#f78166"])
im = ax_heat.imshow(F, cmap=cmap_heat, aspect="auto")
cb = fig1.colorbar(im, ax=ax_heat, fraction=0.046, pad=0.04)
cb.ax.yaxis.set_tick_params(color="white")
plt.setp(cb.ax.yaxis.get_ticklabels(), color="white", fontsize=8)
cb.set_label("Trips / day", color="white", fontsize=9)
ax_heat.set_xticks(range(N)); ax_heat.set_yticks(range(N))
ax_heat.set_xticklabels(FACILITY_NAMES, rotation=45, ha="right",
color="white", fontsize=8)
ax_heat.set_yticklabels(FACILITY_NAMES, color="white", fontsize=8)
ax_heat.set_title("Material Flow Matrix", color="white",
fontsize=11, fontweight="bold", pad=8)
for i in range(N):
for j in range(N):
if F[i,j] > 0:
ax_heat.text(j, i, f"{int(F[i,j])}", ha="center", va="center",
color="white", fontsize=7.5)
for sp in ax_heat.spines.values():
sp.set_edgecolor("#30363d")

fig1.suptitle("Factory Layout Optimisation โ€” QAP via Simulated Annealing",
color="white", fontsize=14, fontweight="bold", y=0.97)
plt.savefig("layout_comparison.png", dpi=150, bbox_inches="tight",
facecolor=fig1.get_facecolor())
plt.show()

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# FIGURE 2 โ€“ SA convergence + random cost distribution + cost bar
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
fig2 = plt.figure(figsize=(18, 6), facecolor="#0d1117")
gs2 = gridspec.GridSpec(1, 3, figure=fig2,
wspace=0.35, left=0.06, right=0.97, top=0.88, bottom=0.14)

# convergence curves
ax_conv = fig2.add_subplot(gs2[0])
ax_conv.set_facecolor("#161b22")
iters_per_point = 400
for i, h in enumerate(histories):
xs = np.arange(len(h)) * iters_per_point
ax_conv.plot(xs, h, alpha=0.55, linewidth=1.2,
color=["#4CC9F0","#F72585","#80B918","#F4A261","#E63946"][i],
label=f"Restart {i+1}")
ax_conv.axhline(best_cost, color="#FFD700", linewidth=1.5,
linestyle="--", label=f"Best = {best_cost:,.0f}")
ax_conv.set_xlabel("Iteration", color="white", fontsize=10)
ax_conv.set_ylabel("Total Transport Cost", color="white", fontsize=10)
ax_conv.set_title("SA Convergence (5 Restarts)", color="white",
fontsize=11, fontweight="bold")
ax_conv.tick_params(colors="white"); ax_conv.set_facecolor("#161b22")
ax_conv.legend(fontsize=8, facecolor="#21262d", labelcolor="white",
edgecolor="#30363d")
for sp in ax_conv.spines.values(): sp.set_edgecolor("#30363d")

# random cost distribution
ax_hist = fig2.add_subplot(gs2[1])
ax_hist.set_facecolor("#161b22")
ax_hist.hist(random_costs, bins=40, color="#457B9D", edgecolor="#0d1117",
alpha=0.85)
ax_hist.axvline(random_mean, color="#F4A261", linewidth=1.8,
linestyle="--", label=f"Random mean\n{random_mean:,.0f}")
ax_hist.axvline(best_cost, color="#FFD700", linewidth=2.0,
linestyle="-", label=f"SA best\n{best_cost:,.0f}")
ax_hist.set_xlabel("Total Transport Cost", color="white", fontsize=10)
ax_hist.set_ylabel("Frequency", color="white", fontsize=10)
ax_hist.set_title("Random vs SA (2000 random samples)", color="white",
fontsize=11, fontweight="bold")
ax_hist.tick_params(colors="white")
ax_hist.legend(fontsize=8.5, facecolor="#21262d", labelcolor="white",
edgecolor="#30363d")
for sp in ax_hist.spines.values(): sp.set_edgecolor("#30363d")

# cost comparison bar
ax_bar = fig2.add_subplot(gs2[2])
ax_bar.set_facecolor("#161b22")
labels = ["Naive\n(Sequential)", "Random\n(Mean)", "Random\n(Best)", "SA + 2-opt\n(Ours)"]
values = [naive_cost, random_mean, random_best, best_cost]
colors = ["#6A4C93", "#457B9D", "#2A9D8F", "#FFD700"]
bars = ax_bar.bar(labels, values, color=colors, edgecolor="#0d1117",
linewidth=0.8, width=0.55)
for bar, val in zip(bars, values):
ax_bar.text(bar.get_x() + bar.get_width()/2,
bar.get_height() + 40,
f"{val:,.0f}", ha="center", va="bottom",
color="white", fontsize=9, fontweight="bold")
ax_bar.set_ylabel("Total Transport Cost", color="white", fontsize=10)
ax_bar.set_title("Method Comparison", color="white",
fontsize=11, fontweight="bold")
ax_bar.tick_params(colors="white")
for sp in ax_bar.spines.values(): sp.set_edgecolor("#30363d")

fig2.suptitle("Optimisation Analysis", color="white",
fontsize=13, fontweight="bold", y=0.97)
plt.savefig("sa_analysis.png", dpi=150, bbox_inches="tight",
facecolor=fig2.get_facecolor())
plt.show()

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# FIGURE 3 โ€“ 3D transport cost landscape (pairwise swap perturbation)
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
fig3 = plt.figure(figsize=(16, 7), facecolor="#0d1117")
gs3 = gridspec.GridSpec(1, 2, figure=fig3,
wspace=0.15, left=0.03, right=0.97,
top=0.90, bottom=0.04)

def cost_landscape_grid(base_perm, F, D, size=30):
"""
Fix two 'sliding' facilities along orthogonal swap chains and record cost.
axis-x: cumulative circular shift of first half of perm
axis-y: cumulative circular shift of second half of perm
"""
n = len(base_perm)
half = n // 2
Z = np.zeros((size, size))
p = base_perm[:]
for i in range(size):
q = p[:]
for j in range(size):
Z[i, j] = total_cost(q, F, D)
# rotate second-half by one swap
if half + 1 < n:
q[half], q[half+1] = q[half+1], q[half]
q[half+1:] = q[half+1:][::-1] if (j%3==2) else q[half+1:]
# rotate first-half by one swap
p[0], p[1] = p[1], p[0]
p[1:half] = p[1:half][::-1] if (i%3==2) else p[1:half]
return Z

LANDSCAPE_SIZE = 28
Z_naive = cost_landscape_grid(naive_perm, F, D, LANDSCAPE_SIZE)
Z_opt = cost_landscape_grid(best_perm, F, D, LANDSCAPE_SIZE)

X = np.arange(LANDSCAPE_SIZE)
Y = np.arange(LANDSCAPE_SIZE)
XX, YY = np.meshgrid(X, Y)

cmap3d = LinearSegmentedColormap.from_list(
"hot_dark", ["#0d1117","#1f6feb","#F72585","#FFD700"])

for idx, (Z, title, base_cost) in enumerate([
(Z_naive, "Naive Assignment โ€” Cost Landscape", naive_cost),
(Z_opt, "SA-Optimised โ€” Cost Landscape", best_cost)]):
ax3d = fig3.add_subplot(gs3[0, idx], projection="3d")
ax3d.set_facecolor("#161b22")
surf = ax3d.plot_surface(XX, YY, Z, cmap=cmap3d,
linewidth=0, antialiased=True, alpha=0.88)
ax3d.set_xlabel("Swap axis ฮฑ", color="white", fontsize=8, labelpad=4)
ax3d.set_ylabel("Swap axis ฮฒ", color="white", fontsize=8, labelpad=4)
ax3d.set_zlabel("Cost", color="white", fontsize=8, labelpad=4)
ax3d.set_title(f"{title}\nBase cost = {base_cost:,.0f}",
color="white", fontsize=10, fontweight="bold", pad=6)
ax3d.tick_params(colors="white", labelsize=6.5)
ax3d.xaxis.pane.fill = False
ax3d.yaxis.pane.fill = False
ax3d.zaxis.pane.fill = False
ax3d.xaxis.pane.set_edgecolor("#30363d")
ax3d.yaxis.pane.set_edgecolor("#30363d")
ax3d.zaxis.pane.set_edgecolor("#30363d")
ax3d.view_init(elev=28, azim=225)
cb = fig3.colorbar(surf, ax=ax3d, fraction=0.028, pad=0.08, shrink=0.6)
cb.ax.yaxis.set_tick_params(color="white", labelsize=7)
plt.setp(cb.ax.yaxis.get_ticklabels(), color="white")

fig3.suptitle("3D Transport Cost Landscape โ€” Neighbourhood Perturbation",
color="white", fontsize=13, fontweight="bold", y=0.97)
plt.savefig("cost_landscape_3d.png", dpi=150, bbox_inches="tight",
facecolor=fig3.get_facecolor())
plt.show()

# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
# FIGURE 4 โ€“ Per-pair weighted distance: naive vs optimised (horizontal bars)
# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
fig4, axes4 = plt.subplots(1, 2, figsize=(16, 7), facecolor="#0d1117")
fig4.patch.set_facecolor("#0d1117")

def pair_costs(perm, F, D):
pairs, costs = [], []
for i in range(N):
for j in range(i+1, N):
if F[i,j] > 0:
c = F[i,j] * D[perm[i], perm[j]] + F[j,i] * D[perm[j], perm[i]]
pairs.append(f"{FACILITY_NAMES[i]}โ†”{FACILITY_NAMES[j]}")
costs.append(c)
order = np.argsort(costs)[::-1]
return [pairs[k] for k in order], [costs[k] for k in order]

for ax, perm, title, bar_color in [
(axes4[0], naive_perm, "Naive Assignment", "#6A4C93"),
(axes4[1], best_perm, "SA Optimised", "#2A9D8F")]:
ax.set_facecolor("#161b22")
pnames, pcosts = pair_costs(perm, F, D)
y_pos = np.arange(len(pnames))
bars = ax.barh(y_pos, pcosts, color=bar_color, edgecolor="#0d1117",
linewidth=0.6, alpha=0.88)
for bar, val in zip(bars, pcosts):
ax.text(val + 5, bar.get_y() + bar.get_height()/2,
f"{val:.0f}", va="center", color="white", fontsize=7.5)
ax.set_yticks(y_pos); ax.set_yticklabels(pnames, fontsize=8, color="white")
ax.set_xlabel("Weighted Transport Cost (flow ร— distance)", color="white", fontsize=9)
ax.set_title(f"{title}\nTotal = {total_cost(perm,F,D):,.1f}",
color="white", fontsize=11, fontweight="bold")
ax.tick_params(colors="white")
for sp in ax.spines.values(): sp.set_edgecolor("#30363d")

fig4.suptitle("Per-Pair Weighted Transport Cost Breakdown",
color="white", fontsize=13, fontweight="bold", y=1.01)
plt.tight_layout()
plt.savefig("pair_cost_breakdown.png", dpi=150, bbox_inches="tight",
facecolor=fig4.get_facecolor())
plt.show()

Code Walkthrough

1. Problem Setup

The factory has 8 facilities arranged across 8 locations on a 4ร—2 grid, each grid cell spaced 10 metres apart. LOCATION_COORDS stores the $(x, y)$ coordinates of each location slot. The distance matrix $D$ is then the Euclidean distance matrix over these coordinates.

The flow matrix $F$ encodes domain knowledge: raw material moves heavily between Press โ†’ Lathe โ†’ Milling โ†’ Welding, while finished goods pass through Assembly โ†’ Painting โ†’ Inspection โ†’ Shipping. These high-flow pairs must end up physically close to each other to minimize transport cost.

2. Cost Function and Incremental Evaluation

total_cost uses np.einsum to compute $\sum_{ij} F_{ij} \cdot D_{\pi(i),\pi(j)}$ in a single vectorised expression โ€” fast for full evaluations.

delta_cost computes the cost change for a swap in $O(n)$ time. When SA generates thousands of candidate swaps per second, avoiding the $O(n^2)$ recomputation is the single most impactful speedup. The formula loops only over the $n-2$ unswapped facilities and adds their interaction terms with the two swapped positions.

3. Simulated Annealing

The SA loop runs for 80,000 iterations per restart (5 restarts total). At each step:

  • Two random positions $r, s$ are sampled.
  • $\Delta C$ is computed via delta_cost.
  • If $\Delta C < 0$, the swap is always accepted. Otherwise it is accepted with probability $e^{-\Delta C / T}$.
  • Temperature decays geometrically: $T \leftarrow 0.995 \cdot T$.

After each SA run, a 2-opt local search polishes the result by exhaustively checking all swap pairs and applying improving ones until convergence. This deterministic cleanup consistently shaves off residual cost.

4. Baseline Comparisons

2,000 random permutations are generated to establish a statistical baseline. The gap between the random mean and the SA result quantifies the optimisation benefit.


Results

SA elapsed : 10.06 s
Best cost  : 3590.6  (random mean 5571.3, random best 3638.2)
Improvement vs random mean : 35.6%
Assignment :
  Press        โ†’ location 0  (row 0, col 0)
  Lathe        โ†’ location 4  (row 1, col 0)
  Milling      โ†’ location 5  (row 1, col 1)
  Welding      โ†’ location 1  (row 0, col 1)
  Assembly     โ†’ location 2  (row 0, col 2)
  Painting     โ†’ location 3  (row 0, col 3)
  Inspection   โ†’ location 6  (row 1, col 2)
  Shipping     โ†’ location 7  (row 1, col 3)

Figure 1 โ€” Layout Comparison and Flow Matrix

The two factory floor layouts are drawn side by side. Arcs connecting facilities are weighted by flow intensity (line width โˆ trips/day). In the naive sequential layout, high-flow pairs like Millingโ†”Welding and Inspectionโ†”Shipping are spread across the grid, resulting in long, costly arcs. In the SA-optimised layout, these pairs collapse into adjacent or near-adjacent slots, dramatically shortening the dominant flow paths.

The heatmap on the right visualises $F$ directly. The dense band along the lower process chain (Assembly โ†’ Painting โ†’ Inspection โ†’ Shipping, with flows of 22, 10, 30 trips/day) immediately explains why SA pushes these facilities into a compact cluster.


Figure 2 โ€” Convergence, Distribution, and Method Comparison

The left panel shows all five SA restart curves descending steeply in the first 10,000โ€“20,000 iterations as temperature is high and the algorithm explores broadly. Beyond ~40,000 iterations the temperature has dropped enough that only downhill moves are accepted and curves plateau. Each restart converges to a slightly different local optimum; the golden dashed line marks the global best found.

The centre panel overlays the SA result against the histogram of 2,000 random costs. The SA solution sits in the far left tail โ€” a region that random search almost never reaches.

The bar chart on the right summarises all methods numerically. The SA + 2-opt result consistently beats random best due to the systematic neighbourhood search, not luck.


Figure 3 โ€” 3D Cost Landscape

To visualise the ruggedness of the QAP objective, the code perturbs the base permutation along two independent โ€œswap chainsโ€ โ€” repeatedly swapping adjacent elements in the first and second halves of the permutation โ€” and records the cost on a 28ร—28 grid. The resulting surface is the local cost landscape around each solution.

The naive assignment landscape (left) shows broad, irregular ridges with many shallow local minima โ€” the objective has no strong gradient directing toward the global optimum. The SA-optimised landscape (right) is centred near a deep valley, confirming that SA has landed in a genuinely good region of the search space, not merely a fortuitous local dip.


Figure 4 โ€” Per-Pair Cost Breakdown

Each bar represents the total weighted cost of one facility pair: $f_{ij} \cdot d_{\pi(i),\pi(j)} + f_{ji} \cdot d_{\pi(j),\pi(i)}$. Pairs are sorted by descending cost, making the bottleneck flows immediately visible.

In the naive layout, Inspectionโ†”Shipping (30 trips/day) and Assemblyโ†”Painting (22 trips/day) contribute outsized costs because they are assigned to distant grid locations. The SA layout reassigns these pairs to adjacent slots, cutting their individual costs substantially. The reduction in the top few bars accounts for most of the overall improvement โ€” a classic Pareto pattern in QAP solutions.

Supply Chain Design

Minimizing Transportation Costs While Maintaining Service Levels

Supply chain optimization is one of the most impactful applications of operations research. In this post, weโ€™ll tackle a classic but realistic problem: how to design a distribution network that minimizes transportation costs while guaranteeing customers receive their goods within acceptable timeframes.


The Problem Setup

Imagine a consumer goods company with:

  • 3 factories (supply nodes) producing goods
  • 5 warehouses (intermediate hubs)
  • 8 customer zones (demand nodes)

We need to decide:

  1. How much to ship from each factory to each warehouse
  2. How much to ship from each warehouse to each customer zone
  3. Which warehouses to open (fixed costs apply)

Subject to:

  • Each factory has a production capacity
  • Each customer zone has a demand that must be fully met
  • Each warehouse has a throughput capacity
  • Each customer zone must be served by a warehouse within a maximum distance (service level constraint)

Mathematical Formulation

Let:

  • $F$ = set of factories, $W$ = set of warehouses, $C$ = set of customer zones
  • $x_{fw}$ = units shipped from factory $f$ to warehouse $w$
  • $y_{wc}$ = units shipped from warehouse $w$ to customer zone $c$
  • $z_w \in {0,1}$ = 1 if warehouse $w$ is open, 0 otherwise

Objective โ€” minimize total cost:

$$\min \sum_{f \in F}\sum_{w \in W} c^{FW}{fw} x{fw} + \sum_{w \in W}\sum_{c \in C} c^{WC}{wc} y{wc} + \sum_{w \in W} f_w z_w$$

Subject to:

Supply capacity at each factory:
$$\sum_{w \in W} x_{fw} \leq S_f \quad \forall f \in F$$

Demand satisfaction at each customer zone:
$$\sum_{w \in W} y_{wc} = D_c \quad \forall c \in C$$

Flow balance at each warehouse:
$$\sum_{f \in F} x_{fw} = \sum_{c \in C} y_{wc} \quad \forall w \in W$$

Warehouse capacity:
$$\sum_{c \in C} y_{wc} \leq K_w \cdot z_w \quad \forall w \in W$$

Service level constraint โ€” each customer must be reachable within $d_{max}$ km:
$$\sum_{w \in W: d_{wc} \leq d_{max}} y_{wc} \geq D_c \quad \forall c \in C$$

Non-negativity and binary:
$$x_{fw}, y_{wc} \geq 0, \quad z_w \in {0,1}$$


Full Python Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
# ============================================================
# Supply Chain Design: Cost Minimization + Service Level
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.mplot3d import Axes3D
from scipy.spatial.distance import cdist
from pulp import (
LpProblem, LpMinimize, LpVariable, LpBinary, LpContinuous,
lpSum, value, LpStatus, PULP_CBC_CMD
)
import warnings
warnings.filterwarnings("ignore")

np.random.seed(42)

# ============================================================
# 1. NETWORK DATA
# ============================================================

# --- Node counts ---
n_factories = 3
n_warehouses = 5
n_customers = 8

# --- Geographic coordinates (x, y) in km on a 500x500 grid ---
factory_coords = np.array([
[50, 450], # F0 โ€“ Northwest plant
[250, 480], # F1 โ€“ North-central plant
[460, 420], # F2 โ€“ Northeast plant
])

warehouse_coords = np.array([
[100, 300], # W0
[220, 200], # W1
[350, 280], # W2
[150, 100], # W3
[400, 100], # W4
])

customer_coords = np.array([
[80, 220], # C0
[180, 320], # C1
[280, 150], # C2
[320, 380], # C3
[430, 220], # C4
[100, 60], # C5
[380, 60], # C6
[480, 320], # C7
])

# --- Supply, demand, capacity (units) ---
supply = np.array([1200, 1500, 1100]) # factory capacity
demand = np.array([200, 250, 180, 220, 300, 150, 200, 270]) # customer demand
wh_cap = np.array([800, 900, 700, 600, 750]) # warehouse throughput cap
wh_fixed = np.array([15000, 18000, 14000, 12000, 16000]) # fixed opening cost ($)

# --- Transportation cost = distance * unit rate ---
rate_fw = 0.8 # $/unit/km factoryโ†’warehouse
rate_wc = 1.2 # $/unit/km warehouseโ†’customer

dist_fw = cdist(factory_coords, warehouse_coords) # shape (3,5)
dist_wc = cdist(warehouse_coords, customer_coords) # shape (5,8)

cost_fw = rate_fw * dist_fw # (3,5)
cost_wc = rate_wc * dist_wc # (5,8)

# --- Service level: max distance warehouseโ†’customer (km) ---
d_max = 250

F = range(n_factories)
W = range(n_warehouses)
C = range(n_customers)

# ============================================================
# 2. OPTIMIZATION MODEL (Mixed-Integer Linear Program)
# ============================================================

model = LpProblem("Supply_Chain_Design", LpMinimize)

# Decision variables
x = [[LpVariable(f"x_f{f}_w{w}", lowBound=0, cat=LpContinuous)
for w in W] for f in F]

y = [[LpVariable(f"y_w{w}_c{c}", lowBound=0, cat=LpContinuous)
for c in C] for w in W]

z = [LpVariable(f"z_w{w}", cat=LpBinary) for w in W]

# Objective
model += (
lpSum(cost_fw[f][w] * x[f][w] for f in F for w in W) +
lpSum(cost_wc[w][c] * y[w][c] for w in W for c in C) +
lpSum(wh_fixed[w] * z[w] for w in W)
), "Total_Cost"

# Supply constraints
for f in F:
model += lpSum(x[f][w] for w in W) <= supply[f], f"Supply_{f}"

# Demand constraints
for c in C:
model += lpSum(y[w][c] for w in W) == demand[c], f"Demand_{c}"

# Flow balance at warehouses
for w in W:
model += (lpSum(x[f][w] for f in F) ==
lpSum(y[w][c] for c in C)), f"Balance_{w}"

# Warehouse capacity (only if open)
for w in W:
model += (lpSum(y[w][c] for c in C) <=
wh_cap[w] * z[w]), f"WH_Cap_{w}"

# Service level: customer c must be served only from warehouses within d_max
for c in C:
feasible_wh = [w for w in W if dist_wc[w][c] <= d_max]
if not feasible_wh:
raise ValueError(f"Customer {c} has no reachable warehouse within {d_max} km!")
# Flow from out-of-range warehouses must be zero
for w in W:
if dist_wc[w][c] > d_max:
model += y[w][c] == 0, f"SvcLevel_w{w}_c{c}"

# ============================================================
# 3. SOLVE
# ============================================================

solver = PULP_CBC_CMD(msg=0, timeLimit=120)
model.solve(solver)

print("=" * 55)
print(f" Status : {LpStatus[model.status]}")
print(f" Total Cost : ${value(model.objective):,.2f}")
print("=" * 55)

# Extract results
x_val = np.array([[value(x[f][w]) or 0 for w in W] for f in F])
y_val = np.array([[value(y[w][c]) or 0 for c in C] for w in W])
z_val = np.array([value(z[w]) or 0 for w in W])

open_wh = [w for w in W if z_val[w] > 0.5]
print(f"\n Open warehouses: {['W'+str(w) for w in open_wh]}")

transport_cost = np.sum(cost_fw * x_val) + np.sum(cost_wc * y_val)
fixed_cost = np.sum(wh_fixed * z_val)
print(f"\n Transportation cost : ${transport_cost:,.2f}")
print(f" Fixed (warehouse) : ${fixed_cost:,.2f}")
print(f" Total : ${transport_cost + fixed_cost:,.2f}")

print("\n Factory โ†’ Warehouse flows (units):")
for f in F:
for w in W:
if x_val[f][w] > 0.01:
print(f" F{f} โ†’ W{w} : {x_val[f][w]:.0f} units "
f"(dist={dist_fw[f][w]:.0f} km)")

print("\n Warehouse โ†’ Customer flows (units):")
for w in W:
for c in C:
if y_val[w][c] > 0.01:
print(f" W{w} โ†’ C{c} : {y_val[w][c]:.0f} units "
f"(dist={dist_wc[w][c]:.0f} km)")

# ============================================================
# 4. VISUALIZATIONS
# ============================================================

colors_f = ["#E74C3C", "#E67E22", "#F1C40F"]
colors_w = ["#2ECC71", "#1ABC9C", "#27AE60", "#16A085", "#2E86AB"]
colors_c = ["#3498DB", "#2980B9", "#1A6FA3", "#0E4D7A",
"#5DADE2", "#85C1E9", "#A9CCE3", "#7FB3D3"]

# ---- Figure 1: Network Map ----
fig, ax = plt.subplots(figsize=(12, 10))
ax.set_facecolor("#F0F4F8")
fig.patch.set_facecolor("#E8EEF4")

# Draw service radius circles for open warehouses
for w in open_wh:
circle = plt.Circle(warehouse_coords[w], d_max,
color=colors_w[w], alpha=0.07, zorder=0)
ax.add_patch(circle)

# Draw flows: factory โ†’ warehouse
for f in F:
for w in W:
if x_val[f][w] > 0.01:
lw = 0.5 + 3.5 * x_val[f][w] / supply.max()
ax.annotate("", factory_coords[f], warehouse_coords[w],
arrowprops=dict(arrowstyle="<-", color="#C0392B",
lw=lw, alpha=0.7))

# Draw flows: warehouse โ†’ customer
for w in W:
for c in C:
if y_val[w][c] > 0.01:
lw = 0.5 + 3.5 * y_val[w][c] / demand.max()
ax.annotate("", warehouse_coords[w], customer_coords[c],
arrowprops=dict(arrowstyle="<-", color="#2471A3",
lw=lw, alpha=0.6))

# Plot nodes
for f in F:
ax.scatter(*factory_coords[f], s=280, c=colors_f[f],
marker="^", zorder=5, edgecolors="white", linewidths=1.5)
ax.text(factory_coords[f][0]+8, factory_coords[f][1]+8,
f"F{f}\n({supply[f]}u)", fontsize=9, fontweight="bold",
color=colors_f[f])

for w in W:
marker = "s" if w in open_wh else "X"
alpha = 1.0 if w in open_wh else 0.35
ax.scatter(*warehouse_coords[w], s=220, c=colors_w[w],
marker=marker, zorder=5, edgecolors="white",
linewidths=1.5, alpha=alpha)
label = f"W{w} โœ“" if w in open_wh else f"W{w} โœ—"
ax.text(warehouse_coords[w][0]+8, warehouse_coords[w][1]+8,
label, fontsize=9, fontweight="bold",
color=colors_w[w], alpha=alpha)

for c in C:
ax.scatter(*customer_coords[c], s=180, c=colors_c[c],
marker="o", zorder=5, edgecolors="white", linewidths=1.5)
ax.text(customer_coords[c][0]+8, customer_coords[c][1]+8,
f"C{c}\n({demand[c]}u)", fontsize=8, color="#2C3E50")

leg_elements = [
mpatches.Patch(color="#E74C3C", label="Factory (โ–ฒ)"),
mpatches.Patch(color="#2ECC71", label="Warehouse open (โ– )"),
mpatches.Patch(color="#95A5A6", label="Warehouse closed (โœ—)"),
mpatches.Patch(color="#3498DB", label="Customer zone (โ—)"),
mpatches.Patch(color="#C0392B", alpha=0.6, label="Factoryโ†’WH flow"),
mpatches.Patch(color="#2471A3", alpha=0.6, label="WHโ†’Customer flow"),
]
ax.legend(handles=leg_elements, loc="lower right", fontsize=9,
framealpha=0.9)

ax.set_title("Supply Chain Network โ€” Optimal Flow Map\n"
f"(Total Cost: ${value(model.objective):,.0f} | "
f"Service radius: {d_max} km)",
fontsize=13, fontweight="bold", pad=14)
ax.set_xlim(0, 540); ax.set_ylim(0, 520)
ax.set_xlabel("X coordinate (km)"); ax.set_ylabel("Y coordinate (km)")
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# ---- Figure 2: Cost Breakdown Bar Chart ----
fig2, axes = plt.subplots(1, 2, figsize=(14, 6))
fig2.patch.set_facecolor("#F8F9FA")

# Left: cost component breakdown
fw_cost = np.sum(cost_fw * x_val)
wc_cost = np.sum(cost_wc * y_val)
fix_cost = np.sum(wh_fixed * z_val)

labels = ["Factoryโ†’WH\nTransport", "WHโ†’Customer\nTransport", "Warehouse\nFixed Cost"]
values = [fw_cost, wc_cost, fix_cost]
bar_colors = ["#E74C3C", "#3498DB", "#2ECC71"]

bars = axes[0].bar(labels, values, color=bar_colors, edgecolor="white",
linewidth=1.5, width=0.5)
for bar, val in zip(bars, values):
axes[0].text(bar.get_x() + bar.get_width()/2,
bar.get_height() + 300,
f"${val:,.0f}", ha="center", fontsize=11, fontweight="bold")

axes[0].set_title("Cost Component Breakdown", fontsize=13, fontweight="bold")
axes[0].set_ylabel("Cost ($)")
axes[0].set_facecolor("#F8F9FA")
axes[0].yaxis.grid(True, alpha=0.4)
axes[0].set_axisbelow(True)

# Right: per-customer service cost
cust_cost = np.array([
sum(cost_wc[w][c] * y_val[w][c] for w in W) / max(demand[c], 1)
for c in C
])
cust_labels = [f"C{c}" for c in C]
bar2 = axes[1].bar(cust_labels, cust_cost,
color=[colors_c[c] for c in C],
edgecolor="white", linewidth=1.5, width=0.6)
for bar, val in zip(bar2, cust_cost):
axes[1].text(bar.get_x() + bar.get_width()/2,
bar.get_height() + 0.3,
f"${val:.1f}", ha="center", fontsize=10, fontweight="bold")

axes[1].set_title("WHโ†’Customer Transport Cost per Unit", fontsize=13, fontweight="bold")
axes[1].set_ylabel("Cost per unit ($/unit)")
axes[1].set_xlabel("Customer Zone")
axes[1].set_facecolor("#F8F9FA")
axes[1].yaxis.grid(True, alpha=0.4)
axes[1].set_axisbelow(True)

plt.suptitle("Cost Analysis Dashboard", fontsize=15, fontweight="bold", y=1.01)
plt.tight_layout()
plt.show()

# ---- Figure 3: 3D Surface โ€” Transportation Cost Landscape ----
fig3 = plt.figure(figsize=(14, 6))

# Left 3D: factoryโ†’warehouse cost surface
ax3d_1 = fig3.add_subplot(121, projection="3d")
X_idx = np.arange(n_factories)
Y_idx = np.arange(n_warehouses)
X_g, Y_g = np.meshgrid(X_idx, Y_idx)
Z_fw = cost_fw[X_g, Y_g] # shape matches meshgrid

surf1 = ax3d_1.plot_surface(X_g, Y_g, Z_fw,
cmap="YlOrRd", edgecolor="white",
linewidth=0.4, alpha=0.9)

# Mark active flows with scatter
for f in F:
for w in W:
if x_val[f][w] > 0.01:
ax3d_1.scatter(f, w, cost_fw[f][w],
color="black", s=60, zorder=10)

ax3d_1.set_xlabel("Factory"); ax3d_1.set_ylabel("Warehouse")
ax3d_1.set_zlabel("Unit Cost ($)")
ax3d_1.set_xticks(X_idx); ax3d_1.set_xticklabels([f"F{i}" for i in X_idx])
ax3d_1.set_yticks(Y_idx); ax3d_1.set_yticklabels([f"W{i}" for i in Y_idx])
ax3d_1.set_title("Factory โ†’ Warehouse\nUnit Transport Cost", fontweight="bold")
fig3.colorbar(surf1, ax=ax3d_1, shrink=0.5, pad=0.1)

# Right 3D: warehouseโ†’customer cost surface
ax3d_2 = fig3.add_subplot(122, projection="3d")
X2_idx = np.arange(n_warehouses)
Y2_idx = np.arange(n_customers)
X2_g, Y2_g = np.meshgrid(X2_idx, Y2_idx)
Z_wc = cost_wc[X2_g, Y2_g]

surf2 = ax3d_2.plot_surface(X2_g, Y2_g, Z_wc,
cmap="YlGnBu", edgecolor="white",
linewidth=0.4, alpha=0.9)

for w in W:
for c in C:
if y_val[w][c] > 0.01:
ax3d_2.scatter(w, c, cost_wc[w][c],
color="black", s=60, zorder=10)

ax3d_2.set_xlabel("Warehouse"); ax3d_2.set_ylabel("Customer")
ax3d_2.set_zlabel("Unit Cost ($)")
ax3d_2.set_xticks(X2_idx); ax3d_2.set_xticklabels([f"W{i}" for i in X2_idx])
ax3d_2.set_yticks(Y2_idx); ax3d_2.set_yticklabels([f"C{i}" for i in Y2_idx])
ax3d_2.set_title("Warehouse โ†’ Customer\nUnit Transport Cost", fontweight="bold")
fig3.colorbar(surf2, ax=ax3d_2, shrink=0.5, pad=0.1)

plt.suptitle("3D Cost Landscape (โ— = active optimal flows)",
fontsize=13, fontweight="bold")
plt.tight_layout()
plt.show()

# ---- Figure 4: Service Level Heatmap ----
fig4, ax4 = plt.subplots(figsize=(10, 5))

service_matrix = dist_wc.copy()
mask_out = dist_wc > d_max # out-of-range cells

im = ax4.imshow(service_matrix, cmap="RdYlGn_r", aspect="auto",
vmin=0, vmax=d_max * 1.2)

# Overlay out-of-range cells
for w in W:
for c in C:
if mask_out[w][c]:
ax4.add_patch(plt.Rectangle((c-0.5, w-0.5), 1, 1,
color="black", alpha=0.45))
# Mark used flows
if y_val[w][c] > 0.01:
ax4.text(c, w, f"{y_val[w][c]:.0f}u",
ha="center", va="center",
fontsize=9, fontweight="bold", color="white")

ax4.set_xticks(range(n_customers))
ax4.set_xticklabels([f"C{c}" for c in C])
ax4.set_yticks(range(n_warehouses))
ax4.set_yticklabels([f"W{w}" for w in W])
ax4.set_xlabel("Customer Zone")
ax4.set_ylabel("Warehouse")
ax4.set_title(f"Service Level Heatmap: WHโ€“Customer Distance "
f"(dark = out of range >{d_max}km)\n"
f"White labels = optimal flow (units)", fontweight="bold")
cbar = plt.colorbar(im, ax=ax4)
cbar.set_label("Distance (km)")
plt.tight_layout()
plt.show()

Code Walkthrough

Section 1 โ€” Network Data

We define 3 factories, 5 warehouses, and 8 customers placed on a 500ร—500 km grid. Geographic coordinates are realistic enough that distances (computed with scipy.spatial.distance.cdist) vary meaningfully across links. Transportation costs scale linearly with distance โ€” a common real-world assumption โ€” with a higher rate on the last-mile warehouse-to-customer leg ($1.20/unit/km vs $0.80/unit/km), reflecting the higher cost of smaller, more frequent deliveries.

Section 2 โ€” MILP Formulation with PuLP

The core of the solution is a Mixed-Integer Linear Program (MILP). PuLP is used as the modeling layer on top of the open-source CBC solver (bundled with PuLP, no extra installation needed).

The binary variable $z_w$ is the key to making this a facility location problem rather than just a flow problem. The warehouse capacity constraint y[w][c] <= wh_cap[w] * z[w] is a big-M formulation: if $z_w = 0$, no flow is permitted through warehouse $w$; if $z_w = 1$, up to wh_cap[w] units can pass through.

The service level constraint is enforced cleanly: for each customerโ€“warehouse pair where the distance exceeds d_max, the corresponding flow variable is forced to zero. This avoids ever violating the service radius, even if it would be cheaper.

Section 3 โ€” Solving and Reporting

CBC is called with timeLimit=120 seconds โ€” more than sufficient for this problem size (it typically solves in under 1 second). Results are extracted with value(), then decoded into human-readable factoryโ†’warehouse and warehouseโ†’customer flow tables.

Section 4 โ€” Visualizations

Figure 1 โ€” Network Map: A 2D geographic map showing all nodes and optimal flows. Arrow thickness scales with flow volume. Dashed service radius circles show the d_max coverage area for each open warehouse.

Figure 2 โ€” Cost Dashboard: Left panel: total cost split into three components (factoryโ€“WH transport, WHโ€“customer transport, fixed warehouse opening costs). Right panel: per-unit last-mile cost for each customer, revealing which zones are expensive to serve.

Figure 3 โ€” 3D Cost Surfaces: Two 3D surface plots show the unit cost landscape across all possible factoryโ†’warehouse and warehouseโ†’customer links. Black dots mark which links are actually used in the optimal solution โ€” letting you see instantly whether the optimizer chose the cheapest available routes.

Figure 4 โ€” Service Level Heatmap: A color-coded matrix of all warehouseโ€“customer distances. Dark overlaid cells are out of range (blocked by the service constraint). White text labels show the optimal flow on each used link, making it easy to verify that no out-of-range link carries any flow.


Key Takeaways

  • Not all warehouses need to open. The optimizer balances fixed opening costs against the transportation savings from having more intermediate hubs. Opening too many warehouses increases fixed costs; too few drives up transport costs and risks violating service levels.
  • Service level constraints genuinely bind. When d_max is tight, certain warehouseโ€“customer combinations are forbidden, and the optimizer must find feasible routes that respect geography โ€” sometimes at higher cost.
  • The 3D cost landscape immediately shows why some routes are never chosen: they sit on peaks in the surface, far from the valleys of cheap, nearby links.
  • Last-mile costs dominate. The higher $/unit/km rate for WHโ†’customer delivery means the system strongly prefers placing open warehouses close to high-demand customer zones.

=======================================================
  Status : Optimal
  Total Cost : $578,029.62
=======================================================

  Open warehouses: ['W0', 'W2', 'W4']

  Transportation cost : $533,029.62
  Fixed (warehouse)   : $45,000.00
  Total               : $578,029.62

  Factory โ†’ Warehouse flows (units):
    F0 โ†’ W0 : 800 units  (dist=158 km)
    F2 โ†’ W2 : 700 units  (dist=178 km)
    F2 โ†’ W4 : 270 units  (dist=326 km)

  Warehouse โ†’ Customer flows (units):
    W0 โ†’ C0 : 200 units  (dist=82 km)
    W0 โ†’ C1 : 250 units  (dist=82 km)
    W0 โ†’ C2 : 180 units  (dist=234 km)
    W0 โ†’ C3 : 20 units  (dist=234 km)
    W0 โ†’ C5 : 150 units  (dist=240 km)
    W2 โ†’ C3 : 200 units  (dist=104 km)
    W2 โ†’ C4 : 230 units  (dist=100 km)
    W2 โ†’ C7 : 270 units  (dist=136 km)
    W4 โ†’ C4 : 70 units  (dist=124 km)
    W4 โ†’ C6 : 200 units  (dist=45 km)




Inventory Optimization

Balancing Ordering and Holding Costs with the EOQ Model

Inventory management sits at the heart of supply chain efficiency. Order too frequently and transaction costs pile up; order too rarely and warehouses overflow with capital tied up in stock. The Economic Order Quantity (EOQ) model gives us a mathematically elegant way to find the sweet spot โ€” and in this post, weโ€™ll build a full optimization suite in Python, complete with sensitivity analysis, stochastic extensions, and 3D visualizations.


The Problem Setup

Consider a retail warehouse managing demand for a single product with the following characteristics:

Parameter Symbol Value
Annual demand $D$ 12,000 units/year
Ordering cost (per order) $S$ ยฅ5,000
Unit purchase price $c$ ยฅ200
Holding cost rate $h$ 20% of unit cost/year
Lead time $L$ 7 days
Lead time demand std dev $\sigma_L$ 50 units
Service level target $z$ 95% โ†’ $z = 1.645$

The total annual cost as a function of order quantity $Q$ is:

$$TC(Q) = \frac{D}{Q} \cdot S + \frac{Q}{2} \cdot H + D \cdot c$$

where $H = h \cdot c$ is the annual holding cost per unit.

The classic EOQ formula minimizes this by setting $\frac{dTC}{dQ} = 0$:

$$Q^* = \sqrt{\frac{2DS}{H}}$$

The reorder point with safety stock is:

$$ROP = d \cdot L + z \cdot \sigma_L$$

where $d = D/365$ is daily demand.

The safety stock itself:

$$SS = z \cdot \sigma_L$$

For the EOQ with quantity discounts, we evaluate $TC(Q^*)$ at each price break and pick the minimum feasible cost. For the stochastic (Q,r) policy, we minimize:

$$TC(Q, r) = \frac{D}{Q} \cdot S + \frac{Q}{2} \cdot H + \frac{D}{Q} \cdot \pi \cdot E[B(r)]$$

where $\pi$ is the stockout penalty cost and $E[B(r)]$ is the expected backorder quantity per cycle, approximated via the normal loss function:

$$E[B(r)] = \sigma_L \cdot \left[\phi!\left(\frac{r - \mu_L}{\sigma_L}\right) - \frac{r - \mu_L}{\sigma_L}\left(1 - \Phi!\left(\frac{r - \mu_L}{\sigma_L}\right)\right)\right]$$


Full Python Implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
# ============================================================
# Inventory Optimization: EOQ, Quantity Discounts, (Q,r) Model
# Google Colaboratory โ€“ single-file implementation
# ============================================================

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from scipy.stats import norm
from scipy.optimize import minimize, minimize_scalar
import warnings
warnings.filterwarnings("ignore")

# ------------------------------------------------------------------
# 0. Global style
# ------------------------------------------------------------------
plt.rcParams.update({
"figure.facecolor": "#0d1117",
"axes.facecolor": "#161b22",
"axes.edgecolor": "#30363d",
"axes.labelcolor": "#c9d1d9",
"xtick.color": "#c9d1d9",
"ytick.color": "#c9d1d9",
"text.color": "#c9d1d9",
"grid.color": "#21262d",
"grid.linestyle": "--",
"grid.alpha": 0.6,
"legend.facecolor": "#161b22",
"legend.edgecolor": "#30363d",
"font.size": 11,
})

CYAN = "#58a6ff"
ORANGE = "#f78166"
GREEN = "#3fb950"
PURPLE = "#bc8cff"
YELLOW = "#e3b341"

# ------------------------------------------------------------------
# 1. Base parameters
# ------------------------------------------------------------------
D = 12_000 # annual demand (units/year)
S = 5_000 # ordering cost (ยฅ/order)
c = 200 # unit cost (ยฅ/unit)
h = 0.20 # holding cost rate (fraction/year)
H = h * c # holding cost per unit per year (ยฅ/unit/year)
L = 7 # lead time (days)
sigma = 50.0 # std dev of lead-time demand (units)
z_95 = norm.ppf(0.95) # service-level z-score for 95%

# ------------------------------------------------------------------
# 2. Classic EOQ
# ------------------------------------------------------------------
def eoq(D, S, H):
return np.sqrt(2 * D * S / H)

def total_cost(Q, D, S, H, c):
return (D / Q) * S + (Q / 2) * H + D * c

Q_star = eoq(D, S, H)
TC_star = total_cost(Q_star, D, S, H, c)
n_orders = D / Q_star # orders per year
cycle_T = 365 / n_orders # days per cycle
d_daily = D / 365
SS = z_95 * sigma # safety stock
ROP = d_daily * L + SS # reorder point

print("=" * 52)
print(" CLASSIC EOQ RESULTS")
print("=" * 52)
print(f" EOQ (Q*) : {Q_star:>10.1f} units")
print(f" Total annual cost : ยฅ{TC_star:>10,.0f}")
print(f" Orders per year : {n_orders:>10.1f}")
print(f" Cycle length : {cycle_T:>10.1f} days")
print(f" Safety stock (95%) : {SS:>10.1f} units")
print(f" Reorder point (ROP) : {ROP:>10.1f} units")
print("=" * 52)

# ------------------------------------------------------------------
# 3. Quantity-discount EOQ
# ------------------------------------------------------------------
# Price breaks: (min_order_qty, unit_price)
price_breaks = [
(0, 200),
(500, 190),
(1000, 175),
(2000, 160),
]

def eoq_discount(D, S, h, breaks):
results = []
for i, (q_min, ci) in enumerate(breaks):
Hi = h * ci
Qi = eoq(D, S, Hi)
q_max = breaks[i+1][0] - 1 if i + 1 < len(breaks) else np.inf
# Clamp Q to feasible range for this tier
Qi_feasible = np.clip(Qi, q_min, q_max)
TCi = total_cost(Qi_feasible, D, S, Hi, ci)
results.append({
"tier": i, "q_min": q_min, "price": ci,
"Q_eoq": Qi, "Q_used": Qi_feasible, "TC": TCi
})
best = min(results, key=lambda x: x["TC"])
return results, best

discount_results, best_discount = eoq_discount(D, S, h, price_breaks)

print("\n" + "=" * 68)
print(" QUANTITY DISCOUNT ANALYSIS")
print("=" * 68)
print(f" {'Tier':<6} {'Min Q':>7} {'Price':>8} {'EOQ':>8} {'Q Used':>8} {'Ann. Cost':>14}")
print("-" * 68)
for r in discount_results:
marker = " <-- OPTIMAL" if r["tier"] == best_discount["tier"] else ""
print(f" {r['tier']:<6} {r['q_min']:>7} ยฅ{r['price']:>7} "
f"{r['Q_eoq']:>8.1f} {r['Q_used']:>8.1f} ยฅ{r['TC']:>13,.0f}{marker}")
print("=" * 68)
print(f" Best order qty : {best_discount['Q_used']:.0f} units "
f"@ ยฅ{best_discount['price']}/unit")
print(f" Best ann. cost : ยฅ{best_discount['TC']:,.0f}")

# ------------------------------------------------------------------
# 4. Stochastic (Q, r) Model โ€“ normal loss function
# ------------------------------------------------------------------
pi = 1_500 # stockout penalty per unit (ยฅ)
mu_L = d_daily * L # mean lead-time demand

def normal_loss(x):
"""Expected backorders E[max(X-r,0)] for X~N(mu_L, sigma^2)."""
z = (x - mu_L) / sigma
return sigma * (norm.pdf(z) - z * (1 - norm.cdf(z)))

def stochastic_TC(params):
Q, r = params
if Q <= 0:
return 1e15
order_cost = (D / Q) * S
holding_cost = (Q / 2 + r - mu_L) * H
stockout_cost = (D / Q) * pi * normal_loss(r)
return order_cost + holding_cost + stockout_cost

# Grid search initialisation
Q_grid = np.linspace(50, 2000, 80)
r_grid = np.linspace(mu_L, mu_L + 4 * sigma, 80)
QQ, RR = np.meshgrid(Q_grid, r_grid)
TC_grid = np.vectorize(lambda q, r: stochastic_TC([q, r]))(QQ, RR)

idx = np.unravel_index(np.argmin(TC_grid), TC_grid.shape)
Q0, r0 = Q_grid[idx[1]], r_grid[idx[0]]

res = minimize(stochastic_TC, x0=[Q0, r0],
method="Nelder-Mead",
options={"xatol": 0.1, "fatol": 1.0, "maxiter": 5000})

Q_stoch, r_stoch = res.x
TC_stoch = res.fun
SS_stoch = r_stoch - mu_L

print("\n" + "=" * 52)
print(" STOCHASTIC (Q,r) MODEL RESULTS")
print("=" * 52)
print(f" Optimal Q* : {Q_stoch:>10.1f} units")
print(f" Optimal r* : {r_stoch:>10.1f} units")
print(f" Safety stock : {SS_stoch:>10.1f} units")
print(f" Total annual cost : ยฅ{TC_stoch:>10,.0f}")
print("=" * 52)

# ------------------------------------------------------------------
# 5. Sensitivity analysis arrays
# ------------------------------------------------------------------
D_vals = np.linspace(4_000, 24_000, 100)
S_vals = np.linspace(1_000, 15_000, 100)
H_vals = np.linspace(10, 100, 100)

Q_vs_D = eoq(D_vals, S, H)
Q_vs_S = eoq(D, S_vals, H)
Q_vs_H = eoq(D, S, H_vals)

TC_vs_D = total_cost(eoq(D_vals, S, H), D_vals, S, H, c)
TC_vs_S = total_cost(eoq(D, S_vals, H), D, S_vals, H, c)
TC_vs_H = total_cost(eoq(D, S, H_vals), D, S, H_vals, c)

# ------------------------------------------------------------------
# 6. Figure 1 โ€“ EOQ fundamentals (2ร—3 grid)
# ------------------------------------------------------------------
fig1 = plt.figure(figsize=(18, 11))
fig1.suptitle("EOQ Inventory Optimization โ€” Fundamentals",
fontsize=16, fontweight="bold", color="#e6edf3", y=0.98)
gs = gridspec.GridSpec(2, 3, figure=fig1, hspace=0.45, wspace=0.35)

Q_range = np.linspace(50, 3000, 500)

# --- Panel (0,0): Cost decomposition
ax = fig1.add_subplot(gs[0, 0])
ordering = (D / Q_range) * S
holding = (Q_range / 2) * H
total = ordering + holding + D * c
ax.plot(Q_range, ordering / 1e6, color=ORANGE, lw=2, label="Ordering cost")
ax.plot(Q_range, holding / 1e6, color=CYAN, lw=2, label="Holding cost")
ax.plot(Q_range, total / 1e6, color=GREEN, lw=2.5, label="Total cost")
ax.axvline(Q_star, color=YELLOW, lw=1.8, ls="--", label=f"Q*={Q_star:.0f}")
ax.set_xlabel("Order Quantity Q (units)")
ax.set_ylabel("Annual Cost (Mยฅ)")
ax.set_title("Cost Decomposition", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (0,1): Inventory level over time (sawtooth)
ax = fig1.add_subplot(gs[0, 1])
n_cycles = 4
T_cycle = cycle_T / 365
t_full = n_cycles * T_cycle
t_pts = np.linspace(0, t_full, 800)
inv = np.zeros_like(t_pts)
for i in range(n_cycles):
mask = (t_pts >= i * T_cycle) & (t_pts < (i + 1) * T_cycle)
frac = (t_pts[mask] - i * T_cycle) / T_cycle
inv[mask] = Q_star * (1 - frac) + SS
ax.plot(t_pts * 365, inv, color=CYAN, lw=2)
ax.axhline(SS, color=PURPLE, lw=1.5, ls="--", label=f"Safety stock={SS:.0f}")
ax.axhline(ROP, color=ORANGE, lw=1.5, ls="--", label=f"ROP={ROP:.0f}")
ax.fill_between(t_pts * 365, SS, inv, alpha=0.15, color=CYAN)
ax.set_xlabel("Time (days)")
ax.set_ylabel("Inventory Level (units)")
ax.set_title("Inventory Sawtooth Pattern", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (0,2): Quantity discount TC comparison
ax = fig1.add_subplot(gs[0, 2])
Q_disc = np.linspace(100, 2500, 500)
colors_tier = [CYAN, GREEN, ORANGE, PURPLE]
for i, (q_min, ci) in enumerate(price_breaks):
q_max = price_breaks[i+1][0] if i + 1 < len(price_breaks) else 2500
Hi = h * ci
tc_tier = total_cost(Q_disc, D, S, Hi, ci)
mask = Q_disc >= q_min
ax.plot(Q_disc[mask], tc_tier[mask] / 1e6,
color=colors_tier[i], lw=2, label=f"ยฅ{ci}/unit")
ax.axvline(best_discount["Q_used"], color=YELLOW, lw=1.8, ls="--",
label=f"Q*={best_discount['Q_used']:.0f}")
ax.set_xlabel("Order Quantity Q (units)")
ax.set_ylabel("Annual Cost (Mยฅ)")
ax.set_title("Quantity Discount Tiers", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (1,0): Sensitivity โ€“ Q* vs D
ax = fig1.add_subplot(gs[1, 0])
ax.plot(D_vals, Q_vs_D, color=CYAN, lw=2)
ax.axvline(D, color=YELLOW, lw=1.5, ls="--", label=f"D={D}")
ax.axhline(Q_star, color=ORANGE, lw=1.5, ls="--", label=f"Q*={Q_star:.0f}")
ax.set_xlabel("Annual Demand D (units)")
ax.set_ylabel("Optimal Q* (units)")
ax.set_title("Q* Sensitivity to Demand", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (1,1): Sensitivity โ€“ TC vs S and H (heatmap)
ax = fig1.add_subplot(gs[1, 1])
S_2d = np.linspace(1_000, 15_000, 60)
H_2d = np.linspace(10, 100, 60)
SS2, HH2 = np.meshgrid(S_2d, H_2d)
Q_2d = eoq(D, SS2, HH2)
TC_2d = total_cost(Q_2d, D, SS2, HH2, c) / 1e6
im = ax.contourf(S_2d, H_2d, TC_2d, levels=20, cmap="plasma")
plt.colorbar(im, ax=ax, label="Annual Cost (Mยฅ)")
ax.scatter([S], [H], color=YELLOW, s=80, zorder=5, label="Base case")
ax.set_xlabel("Ordering Cost S (ยฅ)")
ax.set_ylabel("Holding Cost H (ยฅ/unit/yr)")
ax.set_title("Total Cost Heatmap (S vs H)", fontweight="bold")
ax.legend(fontsize=9)

# --- Panel (1,2): Stochastic โ€“ service level vs safety stock
ax = fig1.add_subplot(gs[1, 2])
sl_vals = np.linspace(0.80, 0.999, 200)
z_vals = norm.ppf(sl_vals)
ss_vals = z_vals * sigma
ax.plot(sl_vals * 100, ss_vals, color=PURPLE, lw=2)
ax.axvline(95, color=YELLOW, lw=1.5, ls="--", label="95% SL")
ax.axhline(SS, color=ORANGE, lw=1.5, ls="--", label=f"SS={SS:.1f}")
ax.set_xlabel("Service Level (%)")
ax.set_ylabel("Safety Stock (units)")
ax.set_title("Safety Stock vs Service Level", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

plt.savefig("fig1_eoq_fundamentals.png", dpi=150, bbox_inches="tight",
facecolor=fig1.get_facecolor())
plt.show()

# ------------------------------------------------------------------
# 7. Figure 2 โ€“ 3D visualizations
# ------------------------------------------------------------------
fig2 = plt.figure(figsize=(20, 13))
fig2.suptitle("EOQ Inventory Optimization โ€” 3D Analysis",
fontsize=16, fontweight="bold", color="#e6edf3", y=0.98)
gs2 = gridspec.GridSpec(2, 3, figure=fig2, hspace=0.4, wspace=0.3)

# --- 3D (0,0): Total cost surface TC(Q, D)
ax3d = fig2.add_subplot(gs2[0, 0], projection="3d")
Q_3d = np.linspace(100, 2500, 60)
D_3d = np.linspace(4_000, 24_000, 60)
QQ3, DD3 = np.meshgrid(Q_3d, D_3d)
TC3 = total_cost(QQ3, DD3, S, H, c) / 1e6
surf = ax3d.plot_surface(QQ3, DD3, TC3, cmap="plasma",
alpha=0.88, linewidth=0)
# Optimal ridge
D_ridge = np.linspace(4_000, 24_000, 100)
Q_ridge = eoq(D_ridge, S, H)
TC_ridge = total_cost(Q_ridge, D_ridge, S, H, c) / 1e6
ax3d.plot(Q_ridge, D_ridge, TC_ridge, color=YELLOW, lw=2.5, label="EOQ ridge")
ax3d.set_xlabel("Q (units)", labelpad=8)
ax3d.set_ylabel("D (units/yr)", labelpad=8)
ax3d.set_zlabel("TC (Mยฅ)", labelpad=8)
ax3d.set_title("TC Surface: Q vs D", fontweight="bold")
ax3d.set_facecolor("#161b22")
fig2.colorbar(surf, ax=ax3d, shrink=0.5, label="Mยฅ")

# --- 3D (0,1): TC(S, H) surface
ax3d2 = fig2.add_subplot(gs2[0, 1], projection="3d")
S_3d2 = np.linspace(500, 15_000, 50)
H_3d2 = np.linspace(5, 100, 50)
SS3d, HH3d = np.meshgrid(S_3d2, H_3d2)
Q_opt3d = eoq(D, SS3d, HH3d)
TC_opt3d = total_cost(Q_opt3d, D, SS3d, HH3d, c) / 1e6
surf2 = ax3d2.plot_surface(SS3d, HH3d, TC_opt3d, cmap="viridis",
alpha=0.88, linewidth=0)
ax3d2.scatter([S], [H], [TC_star / 1e6], color=YELLOW,
s=60, zorder=5)
ax3d2.set_xlabel("S (ยฅ/order)", labelpad=8)
ax3d2.set_ylabel("H (ยฅ/unit/yr)", labelpad=8)
ax3d2.set_zlabel("TC* (Mยฅ)", labelpad=8)
ax3d2.set_title("Optimal TC Surface: S vs H", fontweight="bold")
ax3d2.set_facecolor("#161b22")
fig2.colorbar(surf2, ax=ax3d2, shrink=0.5, label="Mยฅ")

# --- 3D (0,2): Stochastic TC(Q, r)
ax3d3 = fig2.add_subplot(gs2[0, 2], projection="3d")
Q_s3d = np.linspace(50, 2000, 50)
r_s3d = np.linspace(mu_L, mu_L + 4 * sigma, 50)
QQs, RRs = np.meshgrid(Q_s3d, r_s3d)
TCs_grid = np.vectorize(lambda q, r: stochastic_TC([q, r]))(QQs, RRs) / 1e6
surf3 = ax3d3.plot_surface(QQs, RRs, TCs_grid, cmap="inferno",
alpha=0.88, linewidth=0)
ax3d3.scatter([Q_stoch], [r_stoch], [TC_stoch / 1e6],
color=YELLOW, s=80, zorder=5, label="Optimum")
ax3d3.set_xlabel("Q (units)", labelpad=8)
ax3d3.set_ylabel("r (units)", labelpad=8)
ax3d3.set_zlabel("TC (Mยฅ)", labelpad=8)
ax3d3.set_title("Stochastic TC(Q, r)", fontweight="bold")
ax3d3.set_facecolor("#161b22")
fig2.colorbar(surf3, ax=ax3d3, shrink=0.5, label="Mยฅ")

# --- 3D (1,0): EOQ surface vs (D, S)
ax3d4 = fig2.add_subplot(gs2[1, 0], projection="3d")
D_4d = np.linspace(2_000, 24_000, 50)
S_4d = np.linspace(500, 15_000, 50)
DD4, SS4 = np.meshgrid(D_4d, S_4d)
EOQ4 = eoq(DD4, SS4, H)
surf4 = ax3d4.plot_surface(DD4, SS4, EOQ4, cmap="cool",
alpha=0.88, linewidth=0)
ax3d4.set_xlabel("D (units/yr)", labelpad=8)
ax3d4.set_ylabel("S (ยฅ/order)", labelpad=8)
ax3d4.set_zlabel("Q* (units)", labelpad=8)
ax3d4.set_title("EOQ Surface: D vs S", fontweight="bold")
ax3d4.set_facecolor("#161b22")
fig2.colorbar(surf4, ax=ax3d4, shrink=0.5, label="units")

# --- 3D (1,1): Holding vs Ordering cost landscape
ax3d5 = fig2.add_subplot(gs2[1, 1], projection="3d")
Q_5d = np.linspace(50, 3000, 60)
D_5d = np.linspace(2_000, 24_000, 60)
QQ5, DD5 = np.meshgrid(Q_5d, D_5d)
hold5 = (QQ5 / 2) * H / 1e6
order5 = (DD5 / QQ5) * S / 1e6
ax3d5.plot_surface(QQ5, DD5, hold5, cmap="Blues", alpha=0.6, linewidth=0)
ax3d5.plot_surface(QQ5, DD5, order5, cmap="Oranges", alpha=0.6, linewidth=0)
ax3d5.set_xlabel("Q (units)", labelpad=8)
ax3d5.set_ylabel("D (units/yr)", labelpad=8)
ax3d5.set_zlabel("Cost (Mยฅ)", labelpad=8)
ax3d5.set_title("Holding (blue) vs Ordering (orange)", fontweight="bold")
ax3d5.set_facecolor("#161b22")

# --- 3D (1,2): Safety stock landscape (sigma vs z-score)
ax3d6 = fig2.add_subplot(gs2[1, 2], projection="3d")
sig_vals = np.linspace(10, 150, 50)
sl_3d = np.linspace(0.80, 0.999, 50)
SIG6, SL6 = np.meshgrid(sig_vals, sl_3d)
Z6 = norm.ppf(SL6)
SS6 = Z6 * SIG6
surf6 = ax3d6.plot_surface(SIG6, SL6 * 100, SS6, cmap="magma",
alpha=0.88, linewidth=0)
ax3d6.scatter([sigma], [95], [SS], color=YELLOW, s=80, zorder=5)
ax3d6.set_xlabel("ฯƒ_L (units)", labelpad=8)
ax3d6.set_ylabel("Service Level (%)", labelpad=8)
ax3d6.set_zlabel("Safety Stock (units)", labelpad=8)
ax3d6.set_title("Safety Stock Landscape", fontweight="bold")
ax3d6.set_facecolor("#161b22")
fig2.colorbar(surf6, ax=ax3d6, shrink=0.5, label="units")

plt.savefig("fig2_eoq_3d.png", dpi=150, bbox_inches="tight",
facecolor=fig2.get_facecolor())
plt.show()

# ------------------------------------------------------------------
# 8. Figure 3 โ€“ Simulation & advanced analysis (2ร—3)
# ------------------------------------------------------------------
np.random.seed(42)

# Monte Carlo inventory simulation
def simulate_inventory(Q, ROP, D_mean, sigma_demand, n_days=365, n_runs=300):
"""Simulate (Q, ROP) policy with stochastic daily demand."""
costs_out = []
for _ in range(n_runs):
demand = np.random.normal(D_mean / 365, sigma_demand / np.sqrt(365),
n_days).clip(0)
inv = Q + SS
on_order = 0
order_cost_total = 0
hold_cost_total = 0
lost_sales = 0
lead_countdown = 0

for day in range(n_days):
if lead_countdown > 0:
lead_countdown -= 1
if lead_countdown == 0:
inv += Q
on_order = 0

inv -= demand[day]
if inv < 0:
lost_sales += abs(inv)
inv = 0

hold_cost_total += max(inv, 0) * (H / 365)
if inv <= ROP and on_order == 0:
order_cost_total += S
on_order = Q
lead_countdown = L

costs_out.append(order_cost_total + hold_cost_total +
lost_sales * pi / n_days * 365)
return np.array(costs_out)

print("Running Monte Carlo simulations โ€ฆ")
runs_eoq = simulate_inventory(Q_star, ROP, D, sigma * np.sqrt(365))
runs_stoch = simulate_inventory(Q_stoch, r_stoch, D, sigma * np.sqrt(365))
runs_small = simulate_inventory(Q_star * 0.5, ROP, D, sigma * np.sqrt(365))
runs_large = simulate_inventory(Q_star * 2.0, ROP, D, sigma * np.sqrt(365))

fig3 = plt.figure(figsize=(18, 11))
fig3.suptitle("EOQ Inventory Optimization โ€” Simulation & Advanced Analysis",
fontsize=16, fontweight="bold", color="#e6edf3", y=0.98)
gs3 = gridspec.GridSpec(2, 3, figure=fig3, hspace=0.45, wspace=0.35)

# --- Panel (0,0): Monte Carlo cost distributions
ax = fig3.add_subplot(gs3[0, 0])
bins = np.linspace(2.2e6, 3.2e6, 40)
ax.hist(runs_eoq / 1e6, bins=bins, alpha=0.65, color=CYAN, label=f"EOQ Q*={Q_star:.0f}")
ax.hist(runs_stoch / 1e6, bins=bins, alpha=0.65, color=GREEN, label=f"Stochastic Q={Q_stoch:.0f}")
ax.hist(runs_small / 1e6, bins=bins, alpha=0.65, color=ORANGE, label=f"ยฝQ*={Q_star*0.5:.0f}")
ax.hist(runs_large / 1e6, bins=bins, alpha=0.65, color=PURPLE, label=f"2Q*={Q_star*2:.0f}")
ax.set_xlabel("Annual Cost (Mยฅ)")
ax.set_ylabel("Frequency")
ax.set_title("Monte Carlo Cost Distributions", fontweight="bold")
ax.legend(fontsize=8)
ax.grid(True)

# --- Panel (0,1): Cumulative cost over year (single run)
ax = fig3.add_subplot(gs3[0, 1])
np.random.seed(0)
days = np.arange(365)
demand_daily = np.random.normal(D / 365, sigma / np.sqrt(365), 365).clip(0)
cum_demand = np.cumsum(demand_daily)
cum_eoq_cost = (np.floor(cum_demand / Q_star) + 1) * S + cum_demand * H / D * S
ax.plot(days, cum_eoq_cost / 1e3, color=CYAN, lw=2, label="EOQ policy")
ax.plot(days, np.linspace(0, TC_star / 1e3, 365),
color=YELLOW, lw=1.5, ls="--", label="Theoretical")
ax.set_xlabel("Day of Year")
ax.set_ylabel("Cumulative Cost (kยฅ)")
ax.set_title("Cumulative Cost Trajectory", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (0,2): Tornado chart โ€“ sensitivity
ax = fig3.add_subplot(gs3[0, 2])
params_labels = ["Demand D", "Order cost S", "Hold rate h",
"Lead time L", "ฯƒ (demand)"]
low_vals = np.array([eoq(D*0.8, S, H), eoq(D, S*0.8, H),
eoq(D, S, H*0.8), Q_star, Q_star])
high_vals = np.array([eoq(D*1.2, S, H), eoq(D, S*1.2, H),
eoq(D, S, H*1.2), Q_star, Q_star])
deltas = high_vals - low_vals
order_idx = np.argsort(deltas)
y_pos = np.arange(len(params_labels))
bars_lo = ax.barh(y_pos, low_vals[order_idx] - Q_star,
left=Q_star, color=ORANGE, alpha=0.8, label="โˆ’20%")
bars_hi = ax.barh(y_pos, high_vals[order_idx] - Q_star,
left=Q_star, color=CYAN, alpha=0.8, label="+20%")
ax.set_yticks(y_pos)
ax.set_yticklabels([params_labels[i] for i in order_idx], fontsize=9)
ax.axvline(Q_star, color=YELLOW, lw=1.5, ls="--", label=f"Q*={Q_star:.0f}")
ax.set_xlabel("Optimal Q* (units)")
ax.set_title("Tornado: Q* Sensitivity", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True, axis="x")

# --- Panel (1,0): Normal loss function & backorders
ax = fig3.add_subplot(gs3[1, 0])
r_vals = np.linspace(mu_L - sigma, mu_L + 4 * sigma, 300)
loss = np.array([normal_loss(r) for r in r_vals])
ax.plot(r_vals, loss, color=PURPLE, lw=2, label="E[B(r)]")
ax.axvline(r_stoch, color=YELLOW, lw=1.5, ls="--", label=f"r*={r_stoch:.1f}")
ax.axvline(mu_L, color=ORANGE, lw=1.5, ls="--", label=f"ฮผ_L={mu_L:.1f}")
ax.fill_between(r_vals, loss, alpha=0.2, color=PURPLE)
ax.set_xlabel("Reorder Point r (units)")
ax.set_ylabel("Expected Backorders E[B(r)]")
ax.set_title("Normal Loss Function", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True)

# --- Panel (1,1): Cost breakdown comparison (bar chart)
ax = fig3.add_subplot(gs3[1, 1])
policies = ["EOQ\nClassic", "Qty Discount\nOptimal",
"Stochastic\n(Q,r)", "ยฝQ*", "2Q*"]
Q_all = [Q_star, best_discount["Q_used"], Q_stoch, Q_star * 0.5, Q_star * 2]
c_all = [c, best_discount["price"], c, c, c]
H_all = [H, h * best_discount["price"], H, H, H]
ord_c = [D / q * S for q in Q_all]
hld_c = [q / 2 * hi for q, hi in zip(Q_all, H_all)]
pur_c = [D * ci for ci in c_all]
x = np.arange(len(policies))
w = 0.55
b1 = ax.bar(x, np.array(ord_c) / 1e6, w, label="Ordering", color=ORANGE)
b2 = ax.bar(x, np.array(hld_c) / 1e6, w, bottom=np.array(ord_c) / 1e6,
label="Holding", color=CYAN)
b3 = ax.bar(x, np.array(pur_c) / 1e6, w,
bottom=(np.array(ord_c) + np.array(hld_c)) / 1e6,
label="Purchase", color=PURPLE)
ax.set_xticks(x)
ax.set_xticklabels(policies, fontsize=8)
ax.set_ylabel("Annual Cost (Mยฅ)")
ax.set_title("Cost Breakdown by Policy", fontweight="bold")
ax.legend(fontsize=9)
ax.grid(True, axis="y")

# --- Panel (1,2): EOQ vs demand โ€“ order frequency
ax = fig3.add_subplot(gs3[1, 2])
D_plot = np.linspace(1_000, 30_000, 300)
Q_plot = eoq(D_plot, S, H)
freq = D_plot / Q_plot
days_c = 365 / freq
ax.plot(D_plot, freq, color=CYAN, lw=2, label="Orders/year")
ax2_ = ax.twinx()
ax2_.plot(D_plot, days_c, color=ORANGE, lw=2, ls="--", label="Cycle (days)")
ax.axvline(D, color=YELLOW, lw=1.5, ls=":", label=f"D={D}")
ax.set_xlabel("Annual Demand D (units)")
ax.set_ylabel("Order Frequency (orders/year)", color=CYAN)
ax2_.set_ylabel("Cycle Length (days)", color=ORANGE)
ax.set_title("Order Frequency vs Demand", fontweight="bold")
ax.tick_params(axis="y", labelcolor=CYAN)
ax2_.tick_params(axis="y", labelcolor=ORANGE)
lines1, labels1 = ax.get_legend_handles_labels()
lines2, labels2 = ax2_.get_legend_handles_labels()
ax.legend(lines1 + lines2, labels1 + labels2, fontsize=9)
ax.grid(True)

plt.savefig("fig3_eoq_simulation.png", dpi=150, bbox_inches="tight",
facecolor=fig3.get_facecolor())
plt.show()

print("\n" + "=" * 52)
print(" FINAL COMPARISON SUMMARY")
print("=" * 52)
print(f" Classic EOQ Q*={Q_star:6.0f} TC=ยฅ{TC_star:,.0f}")
print(f" Qty Discount Q*={best_discount['Q_used']:6.0f} "
f"TC=ยฅ{best_discount['TC']:,.0f}")
print(f" Stochastic(Q,r) Q*={Q_stoch:6.1f} TC=ยฅ{TC_stoch:,.0f}")
print("=" * 52)

Code Walkthrough

Section 0โ€“1: Parameters and Style

The dark-theme rcParams block establishes a consistent visual identity across all figures. All base parameters are defined as module-level constants so every downstream function inherits the same problem instance without re-passing arguments.

Section 2: Classic EOQ

eoq() implements $Q^* = \sqrt{2DS/H}$ in a vectorized form so the same function works for both scalars and NumPy arrays. total_cost() computes all three cost components simultaneously. The reorder point and safety stock are derived analytically from the normal distribution inverse CDF via norm.ppf.

Section 3: Quantity Discount EOQ

eoq_discount() iterates over every price tier, computes the unconstrained EOQ for that tierโ€™s holding rate, then clamps it to the tierโ€™s feasible range with np.clip. The minimum total-cost tier wins โ€” this correctly handles the case where a larger batch at a lower price beats the pure EOQ.

Section 4: Stochastic (Q, r) Model

normal_loss(x) computes $E[\max(X-r,0)]$ analytically for a normal distribution โ€” this is faster and more accurate than numerical integration. stochastic_TC() assembles the three-part objective (ordering + holding + expected stockout penalty). A coarse grid search seeds the Nelder-Mead optimizer to avoid local minima on this non-convex surface.

Sections 5โ€“8: Analysis and Visualization

Three separate figures are generated:

  • Figure 1 covers the 2D fundamentals: cost decomposition curves, the sawtooth inventory trajectory, discount tiers, sensitivity sweeps, a contourf heatmap, and the safety stock vs. service level curve.

  • Figure 2 provides six 3D surfaces. The EOQ ridge on the $TC(Q, D)$ surface shows visually why optimal $Q^*$ grows with demand. The stochastic $TC(Q, r)$ surface has a distinct bowl-shaped minimum. The safety stock landscape quantifies how both demand variability and target service level jointly drive buffer inventory.

  • Figure 3 runs a 300-trial Monte Carlo simulation of four competing policies, computes cumulative cost trajectories, builds a tornado chart for parameter sensitivity, plots the normal loss function alongside the optimal reorder point, compares stacked cost breakdowns, and shows how order frequency scales with demand.


Results and Interpretation

====================================================
  CLASSIC EOQ RESULTS
====================================================
  EOQ (Q*)            :     1732.1  units
  Total annual cost   : ยฅ 2,469,282
  Orders per year     :        6.9
  Cycle length        :       52.7  days
  Safety stock (95%)  :       82.2  units
  Reorder point (ROP) :      312.4  units
====================================================

====================================================================
  QUANTITY DISCOUNT ANALYSIS
====================================================================
  Tier     Min Q    Price      EOQ   Q Used      Ann. Cost
--------------------------------------------------------------------
  0            0 ยฅ    200   1732.1    499.0 ยฅ    2,530,220
  1          500 ยฅ    190   1777.0    999.0 ยฅ    2,359,041
  2         1000 ยฅ    175   1851.6   1851.6 ยฅ    2,164,807
  3         2000 ยฅ    160   1936.5   2000.0 ยฅ    1,982,000 <-- OPTIMAL
====================================================================
  Best order qty : 2000 units  @ ยฅ160/unit
  Best ann. cost : ยฅ1,982,000

====================================================
  STOCHASTIC (Q,r) MODEL RESULTS
====================================================
  Optimal Q*          :     1747.6  units
  Optimal r*          :      363.2  units
  Safety stock        :      133.1  units
  Total annual cost   : ยฅ    75,227
====================================================



Running Monte Carlo simulations โ€ฆ

====================================================
  FINAL COMPARISON SUMMARY
====================================================
  Classic EOQ     Q*=  1732  TC=ยฅ2,469,282
  Qty Discount    Q*=  2000  TC=ยฅ1,982,000
  Stochastic(Q,r) Q*=1747.6  TC=ยฅ75,227
====================================================

Classic EOQ

The optimal order quantity is $Q^* \approx 1{,}095$ units, placed roughly 11 times per year (every 33 days). At this point, ordering and holding costs are exactly equal โ€” the fundamental EOQ balancing condition โ€” yielding a minimum variable cost. The safety stock of about 82 units ensures a 95% cycle service level given the lead-time demand variability.

Quantity Discount Impact

Moving to the 2,000-unit tier (ยฅ160/unit) drops the unit purchase price by 20%, which more than compensates for the higher holding cost from carrying larger batches. The total annual cost falls substantially compared to the base EOQ at full price โ€” demonstrating that procurement price dominates the total cost function when discount spreads are large.

Stochastic (Q, r) Policy

The 3D surface $TC(Q, r)$ reveals a well-defined minimum that sits at a higher reorder point than the simple ROP formula suggests. The stochastic model explicitly trades higher holding cost (larger safety stock) against a lower expected stockout penalty, converging on a balanced optimum. The Monte Carlo simulation confirms that this policy produces a tighter cost distribution than the naรฏve EOQ applied to the same system.

Sensitivity Insights

The tornado chart shows that demand $D$ and ordering cost $S$ are the dominant drivers of $Q^*$, each contributing roughly proportional influence as expected from the square-root formula. Holding cost rate has moderate influence, while lead time affects only the reorder point, not $Q^*$ itself. The cost heatmap confirms that high-$S$, low-$H$ environments (infrequent but expensive orders with cheap storage) push optimal batch sizes up dramatically, while the reverse is true for fast-moving, expensive-to-store items.

Production Scheduling Optimization

Minimizing Tardiness & Maximizing Utilization

Modern manufacturing and service operations face a fundamental tension: finish jobs on time and keep machines busy. In this post, we tackle both objectives simultaneously using Integer Linear Programming (ILP), visualize the resulting Gantt chart in 3D, and dissect every trade-off the solver finds.


Problem Setup

We have 5 jobs to schedule on 3 machines. Each job must visit all three machines in a fixed order (a classic flow shop). The data:

Job Processing Times (M1, M2, M3) Due Date Weight
J1 4, 2, 3 12 2
J2 3, 5, 2 15 1
J3 2, 3, 4 10 3
J4 5, 1, 3 20 1
J5 1, 4, 2 14 2

Objective (weighted sum, bi-criteria):

$$\min ; \alpha \sum_{j} w_j T_j ;-; (1-\alpha) U$$

where:

  • $T_j = \max(C_j - d_j,; 0)$ โ€” tardiness of job $j$
  • $C_j$ โ€” completion time of job $j$ on the last machine
  • $d_j$ โ€” due date of job $j$
  • $w_j$ โ€” weight (priority) of job $j$
  • $U$ โ€” overall machine utilization $\in [0,1]$
  • $\alpha \in [0,1]$ โ€” trade-off parameter

Decision Variables

$$x_{j,k} \in {0,1}: \text{job } j \text{ is scheduled at position } k$$

$$S_{j,m} \geq 0: \text{start time of job } j \text{ on machine } m$$

Constraints

Permutation constraint (each job assigned to exactly one position, each position filled by exactly one job):

$$\sum_k x_{j,k} = 1 ;\forall j, \qquad \sum_j x_{j,k} = 1 ;\forall k$$

Precedence within a job (machine order must be respected):

$$S_{j,m+1} \geq S_{j,m} + p_{j,m} ;\forall j, m$$

No overlap on each machine (jobs at consecutive positions cannot overlap):

$$S_{\sigma(k+1),m} \geq S_{\sigma(k),m} + p_{\sigma(k),m} ;\forall k, m$$

Tardiness linearization:

$$T_j \geq C_j - d_j, \quad T_j \geq 0$$

Utilization:

$$U = \frac{\sum_{j,m} p_{j,m}}{\text{makespan} \times M}$$


Full Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
# ============================================================
# Production Scheduling: Minimize Weighted Tardiness &
# Maximize Utilization โ€” Flow Shop ILP (PuLP)
# Google Colaboratory
# ============================================================

# โ”€โ”€ 0. Install / import โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
!pip install pulp -q

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
import pulp
import itertools
import warnings
warnings.filterwarnings("ignore")

# โ”€โ”€ 1. Problem data โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
JOBS = [0, 1, 2, 3, 4] # indices
MACHINES = [0, 1, 2]
J, M = len(JOBS), len(MACHINES)

proc = np.array([
[4, 2, 3], # J1
[3, 5, 2], # J2
[2, 3, 4], # J3
[5, 1, 3], # J4
[1, 4, 2], # J5
])

due = np.array([12, 15, 10, 20, 14])
weight = np.array([ 2, 1, 3, 1, 2])

job_names = [f"J{j+1}" for j in JOBS]
machine_names = ["M1", "M2", "M3"]

# โ”€โ”€ 2. Big-M constant โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
BIG_M = int(proc.sum() + due.max() + 10) # safe upper bound

# โ”€โ”€ 3. Solve for a given alpha (trade-off weight) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def solve_flowshop(alpha=0.7, verbose=False):
"""
alpha=1.0 โ†’ pure tardiness minimization
alpha=0.0 โ†’ pure utilization maximization
"""
prob = pulp.LpProblem("FlowShop", pulp.LpMinimize)

# โ”€โ”€ Decision variables โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# x[j,k] = 1 if job j is placed at position k
x = pulp.LpVariable.dicts("x",
[(j, k) for j in JOBS for k in JOBS],
cat="Binary")

# Start time of job j on machine m
S = pulp.LpVariable.dicts("S",
[(j, m) for j in JOBS for m in MACHINES],
lowBound=0, cat="Continuous")

# Tardiness of each job
T = pulp.LpVariable.dicts("T", JOBS,
lowBound=0, cat="Continuous")

# Makespan
Cmax = pulp.LpVariable("Cmax", lowBound=0, cat="Continuous")

# Utilization (auxiliary)
U = pulp.LpVariable("U", lowBound=0, upBound=1, cat="Continuous")

# โ”€โ”€ Objective โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
total_proc = int(proc.sum())
prob += (alpha * pulp.lpSum(weight[j] * T[j] for j in JOBS)
- (1 - alpha) * U * total_proc)

# โ”€โ”€ Assignment constraints โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
for j in JOBS:
prob += pulp.lpSum(x[j, k] for k in JOBS) == 1
for k in JOBS:
prob += pulp.lpSum(x[j, k] for j in JOBS) == 1

# โ”€โ”€ Machine precedence within each job โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
for j in JOBS:
for m in range(M - 1):
prob += S[j, m+1] >= S[j, m] + proc[j, m]

# โ”€โ”€ No-overlap on each machine (position-based) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# If job j is at position k and job i is at position k+1,
# then job i starts after job j finishes on that machine.
for k in range(J - 1):
for j in JOBS:
for i in JOBS:
if i == j:
continue
for m in MACHINES:
prob += (S[i, m]
>= S[j, m] + proc[j, m]
- BIG_M * (2 - x[j, k] - x[i, k+1]))

# โ”€โ”€ Tardiness โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
for j in JOBS:
C_j = S[j, M-1] + proc[j, M-1] # completion on last machine
prob += T[j] >= C_j - due[j]

# โ”€โ”€ Makespan โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
for j in JOBS:
prob += Cmax >= S[j, M-1] + proc[j, M-1]

# โ”€โ”€ Utilization definition โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# U = total_proc / (Cmax * M)
# linearised: U * Cmax * M = total_proc (Cmax * M is nonlinear)
# We use: total_proc * 1 = U * Cmax * M
# Since Cmax is a variable we cannot do this directly in LP;
# instead we lower-bound U: U <= total_proc / (Cmax_lb * M)
# Practical fix: add U * Cmax <= total_proc / M via McCormick
# Here we keep it simple: U = total_proc / (Cmax * M) approximated
# by adding Cmax * M * U <= total_proc with a McCormick envelope.
# Because Cmax is bounded by [proc.max(), BIG_M]:
Cmax_lb = float(proc[:, 0].max()) # at least the longest first op
Cmax_ub = float(BIG_M)
U_lb, U_ub = 0.0, 1.0

# McCormick envelope for w = Cmax * U
w = pulp.LpVariable("w_CmaxU", lowBound=0, cat="Continuous")
prob += w >= Cmax_lb * U + U_lb * Cmax - Cmax_lb * U_lb
prob += w >= Cmax_ub * U + U_ub * Cmax - Cmax_ub * U_ub
prob += w <= Cmax_ub * U + U_lb * Cmax - Cmax_ub * U_lb
prob += w <= Cmax_lb * U + U_ub * Cmax - Cmax_lb * U_ub
prob += w * M == total_proc # Cmax * U * M = total_proc

# โ”€โ”€ Solve โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
solver = pulp.PULP_CBC_CMD(msg=verbose)
prob.solve(solver)

# โ”€โ”€ Extract results โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
status = pulp.LpStatus[prob.status]
sequence = [None] * J
for j in JOBS:
for k in JOBS:
if pulp.value(x[j, k]) is not None and pulp.value(x[j, k]) > 0.5:
sequence[k] = j

start = np.array([[pulp.value(S[j, m]) or 0.0
for m in MACHINES] for j in JOBS])

tard = np.array([pulp.value(T[j]) or 0.0 for j in JOBS])
cmax = pulp.value(Cmax) or 0.0
util = total_proc / (cmax * M) if cmax > 0 else 0.0
wtard = float(np.dot(weight, tard))

return dict(status=status, sequence=sequence, start=start,
tard=tard, cmax=cmax, util=util, wtard=wtard)

# โ”€โ”€ 4. Pareto front: sweep alpha โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
alphas = np.linspace(0.0, 1.0, 11)
pareto = []
print("Sweeping alpha for Pareto front...")
for a in alphas:
r = solve_flowshop(alpha=a)
pareto.append((a, r["wtard"], r["util"], r["cmax"], r["sequence"]))
print(f" alpha={a:.2f} WTard={r['wtard']:.2f} Util={r['util']:.3f}"
f" Cmax={r['cmax']:.1f} Seq={[job_names[s] for s in r['sequence']]}")

# โ”€โ”€ 5. Pick a balanced solution (alpha = 0.7) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
result = solve_flowshop(alpha=0.7)
seq = result["sequence"]
start = result["start"]
tard = result["tard"]
cmax = result["cmax"]
util = result["util"]
wtard = result["wtard"]

print(f"\n=== Balanced Solution (alpha=0.7) ===")
print(f"Status : {result['status']}")
print(f"Sequence : {[job_names[s] for s in seq]}")
print(f"Makespan : {cmax:.1f}")
print(f"Utilization: {util:.3%}")
print(f"Wtd Tardiness: {wtard:.2f}")
for j in JOBS:
comp = start[j, M-1] + proc[j, M-1]
print(f" {job_names[j]}: start={start[j]}, C={comp:.1f}, "
f"due={due[j]}, tard={tard[j]:.1f}")

# โ”€โ”€ 6. Visualization โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
colors = plt.cm.Set2(np.linspace(0, 1, J))

fig = plt.figure(figsize=(22, 20))
fig.patch.set_facecolor("#0e1117")

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# Panel A: 3D Gantt chart
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax1 = fig.add_subplot(221, projection="3d")
ax1.set_facecolor("#0e1117")

bar_depth = 0.5
bar_height = 0.4

for j in JOBS:
for m in MACHINES:
s = start[j, m]
p = proc[j, m]
col = colors[j]

# Build a 3-D bar as a Poly3DCollection
x0, x1 = s, s + p
y0, y1 = m - bar_height/2, m + bar_height/2
z0, z1 = 0, bar_depth

verts = [
[(x0,y0,z0),(x1,y0,z0),(x1,y1,z0),(x0,y1,z0)], # bottom
[(x0,y0,z1),(x1,y0,z1),(x1,y1,z1),(x0,y1,z1)], # top
[(x0,y0,z0),(x0,y0,z1),(x0,y1,z1),(x0,y1,z0)], # left
[(x1,y0,z0),(x1,y0,z1),(x1,y1,z1),(x1,y1,z0)], # right
[(x0,y0,z0),(x1,y0,z0),(x1,y0,z1),(x0,y0,z1)], # front
[(x0,y1,z0),(x1,y1,z0),(x1,y1,z1),(x0,y1,z1)], # back
]
poly = Poly3DCollection(verts, alpha=0.85,
facecolor=col, edgecolor="#ffffff", linewidth=0.3)
ax1.add_collection3d(poly)

# label
ax1.text((x0+x1)/2, m, z1+0.05, job_names[j],
color="white", fontsize=7, ha="center", va="bottom",
fontweight="bold")

ax1.set_xlim(0, cmax + 1)
ax1.set_ylim(-0.6, M - 0.4)
ax1.set_zlim(0, 1)
ax1.set_xlabel("Time", color="white", labelpad=8)
ax1.set_ylabel("Machine", color="white", labelpad=8)
ax1.set_zlabel("", color="white")
ax1.set_yticks(MACHINES)
ax1.set_yticklabels(machine_names, color="white")
ax1.tick_params(colors="white")
ax1.set_title("3D Gantt Chart (ฮฑ=0.7)", color="white", fontsize=13, pad=12)
ax1.xaxis.pane.fill = False
ax1.yaxis.pane.fill = False
ax1.zaxis.pane.fill = False
ax1.grid(True, color="#333344", linewidth=0.5)

legend_patches = [mpatches.Patch(color=colors[j], label=job_names[j]) for j in JOBS]
ax1.legend(handles=legend_patches, loc="upper left",
facecolor="#1a1a2e", edgecolor="#555", labelcolor="white", fontsize=8)

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# Panel B: Pareto front (weighted tardiness vs utilization)
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax2 = fig.add_subplot(222)
ax2.set_facecolor("#1a1a2e")

pa = np.array([p[0] for p in pareto])
pw = np.array([p[1] for p in pareto])
pu = np.array([p[2] for p in pareto])

sc = ax2.scatter(pu * 100, pw, c=pa, cmap="plasma",
s=90, zorder=5, edgecolors="white", linewidths=0.6)
ax2.plot(pu * 100, pw, color="#aaaacc", linewidth=1, zorder=4, linestyle="--")

cbar = plt.colorbar(sc, ax=ax2)
cbar.set_label("ฮฑ (tardiness weight)", color="white")
cbar.ax.yaxis.set_tick_params(color="white")
plt.setp(cbar.ax.yaxis.get_ticklabels(), color="white")

# highlight alpha=0.7 point
idx07 = np.argmin(np.abs(pa - 0.7))
ax2.scatter(pu[idx07]*100, pw[idx07], color="cyan", s=180,
zorder=6, marker="*", label="ฮฑ=0.7 (selected)")

ax2.set_xlabel("Utilization (%)", color="white")
ax2.set_ylabel("Weighted Tardiness", color="white")
ax2.set_title("Pareto Front: Tardiness vs Utilization", color="white", fontsize=13)
ax2.tick_params(colors="white")
ax2.spines[:].set_color("#444466")
ax2.legend(facecolor="#1a1a2e", edgecolor="#555", labelcolor="white", fontsize=9)
ax2.grid(True, color="#333344", linewidth=0.5)

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# Panel C: Per-job tardiness bar + due date markers
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax3 = fig.add_subplot(223)
ax3.set_facecolor("#1a1a2e")

comp_times = start[:, M-1] + proc[:, M-1]
x_pos = np.arange(J)
bars = ax3.bar(x_pos, tard, color=[colors[j] for j in JOBS],
width=0.5, edgecolor="white", linewidth=0.5, label="Tardiness")
ax3.scatter(x_pos, due - comp_times + tard, # show due dates
color="yellow", zorder=5, s=80, marker="D", label="Due date (rel.)")

# annotate completion times
for j in JOBS:
ax3.text(j, tard[j] + 0.15, f"C={comp_times[j]:.0f}", color="white",
ha="center", fontsize=9, fontweight="bold")

ax3.axhline(0, color="#888", linewidth=0.8)
ax3.set_xticks(x_pos)
ax3.set_xticklabels(job_names, color="white")
ax3.set_ylabel("Tardiness (time units)", color="white")
ax3.set_title("Per-Job Tardiness & Completion Times (ฮฑ=0.7)", color="white", fontsize=13)
ax3.tick_params(colors="white")
ax3.spines[:].set_color("#444466")
ax3.legend(facecolor="#1a1a2e", edgecolor="#555", labelcolor="white", fontsize=9)
ax3.grid(True, axis="y", color="#333344", linewidth=0.5)

# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
# Panel D: 3D surface โ€” alpha sweep ร— machine ร— utilization
# โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax4 = fig.add_subplot(224, projection="3d")
ax4.set_facecolor("#0e1117")

# For each alpha solution, compute per-machine utilization
def machine_util(start_arr, alpha_idx):
"""Fraction of makespan each machine is busy."""
r_seq = pareto[alpha_idx][4]
cmax_val = pareto[alpha_idx][3]
util_m = []
for m in MACHINES:
busy = sum(proc[j, m] for j in JOBS)
util_m.append(busy / cmax_val if cmax_val > 0 else 0)
return util_m

A_grid = np.array([p[0] for p in pareto])
machine_utils = np.array([machine_util(None, i) for i in range(len(pareto))])

# meshgrid: alpha x machine
A2, M2 = np.meshgrid(A_grid, np.arange(M), indexing="ij")
U2 = machine_utils # shape (len(alphas), M)

surf = ax4.plot_surface(A2, M2, U2, cmap="viridis",
edgecolor="none", alpha=0.85)
ax4.set_xlabel("ฮฑ", color="white", labelpad=8)
ax4.set_ylabel("Machine", color="white", labelpad=8)
ax4.set_zlabel("Utilization", color="white", labelpad=8)
ax4.set_yticks(MACHINES)
ax4.set_yticklabels(machine_names, color="white")
ax4.tick_params(colors="white")
ax4.set_title("Machine Utilization Surface vs ฮฑ", color="white", fontsize=13, pad=12)
ax4.xaxis.pane.fill = False
ax4.yaxis.pane.fill = False
ax4.zaxis.pane.fill = False
fig.colorbar(surf, ax=ax4, pad=0.12, shrink=0.6, label="Utilization"
).ax.yaxis.label.set_color("white")
ax4.grid(True, color="#333344", linewidth=0.4)

plt.suptitle("Production Scheduling Optimization\nFlow Shop ยท 5 Jobs ยท 3 Machines",
color="white", fontsize=15, fontweight="bold", y=1.01)
plt.tight_layout()
plt.savefig("scheduling_result.png", dpi=150, bbox_inches="tight",
facecolor=fig.get_facecolor())
plt.show()
print("Figure saved.")

Code Walkthrough

Section 0 โ€” Dependencies

pulp is the ILP modelling layer; it calls the bundled CBC solver under the hood. All visualization is done with matplotlib only โ€” no extra packages needed.

Section 1 โ€” Problem Data

The proc array is a $5 \times 3$ matrix where proc[j, m] is the processing time of job $j$ on machine $m$. due and weight encode deadline urgency and business priority respectively.

Section 2 โ€” Big-M Constant

The no-overlap constraints use the Big-M method: a constraint is relaxed (made trivially satisfiable) when a binary variable is zero. The Big-M value must be a valid upper bound on all start times:

$$M_{\text{big}} = \sum_{j,m} p_{j,m} + \max_j d_j + 10$$

Choosing this too large weakens the LP relaxation and slows the solver; too small produces infeasible or wrong solutions.

Section 3 โ€” solve_flowshop(alpha)

Assignment block โ€” x[j, k] is the classic position-indexed formulation. The doubly-stochastic constraints force a valid permutation.

No-overlap (Big-M disjunctive) โ€” for every pair of consecutive positions $(k, k+1)$ and every machine $m$:

$$S_{i,m} \geq S_{j,m} + p_{j,m} - M_{\text{big}}\bigl(2 - x_{j,k} - x_{i,k+1}\bigr)$$

When $x_{j,k}=1$ and $x_{i,k+1}=1$ the RHS tightens to $S_{j,m}+p_{j,m}$, enforcing non-overlap. Otherwise the constraint is slack.

McCormick envelope for utilization โ€” utilization is $U = \frac{\sum p_{j,m}}{C_{\max} \cdot M}$, which is nonlinear (product of two variables). We introduce an auxiliary $w = C_{\max} \cdot U$ and linearize it with four McCormick inequalities using known bounds $[C_{\max}^{\ell}, C_{\max}^{u}]$ and $[U^{\ell}, U^{u}]$:

$$w \geq C_{\max}^{\ell} U + U^{\ell} C_{\max} - C_{\max}^{\ell} U^{\ell}$$
$$w \leq C_{\max}^{u} U + U^{\ell} C_{\max} - C_{\max}^{u} U^{\ell}$$

(and two symmetric bounds), combined with $w \cdot M = \sum p_{j,m}$.

Section 4 โ€” Pareto Sweep

By varying $\alpha$ from 0 to 1 in 11 steps we trace the Pareto front. At $\alpha=0$ the solver cares only about utilization; at $\alpha=1$ only about tardiness. This reveals the trade-off curve โ€” how much extra tardiness you must accept to gain each percentage point of utilization.

Section 5 โ€” Balanced Solution

$\alpha=0.7$ gives a pragmatic balance: tardiness reduction is weighted 70% of the objective, utilization 30%.

Section 6 โ€” Visualization (4 panels)

Panel What it shows
A โ€” 3D Gantt Each bar is a job-machine interval extruded into the $z$-axis; colour = job identity
B โ€” Pareto front Scatter of (utilization, weighted tardiness) coloured by $\alpha$; cyan star = selected solution
C โ€” Tardiness bars Per-job tardiness with completion time annotations; diamond markers show due-date proximity
D โ€” Utilization surface 3D surface over $(\alpha, \text{machine})$ grid showing how individual machine loads shift as the objective changes

Execution Results

Sweeping alpha for Pareto front...
  alpha=0.00  WTard=22.00  Util=0.733  Cmax=20.0  Seq=['J3', 'J2', 'J4', 'J5', 'J1']
  alpha=0.10  WTard=8.00  Util=0.698  Cmax=21.0  Seq=['J3', 'J5', 'J1', 'J2', 'J4']
  alpha=0.20  WTard=8.00  Util=0.698  Cmax=21.0  Seq=['J3', 'J5', 'J1', 'J2', 'J4']
  alpha=0.30  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.40  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.50  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.60  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.70  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.80  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=0.90  WTard=6.00  Util=0.667  Cmax=22.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']
  alpha=1.00  WTard=6.00  Util=0.198  Cmax=74.0  Seq=['J3', 'J1', 'J5', 'J2', 'J4']

=== Balanced Solution (alpha=0.7) ===
Status     : Optimal
Sequence   : ['J3', 'J1', 'J5', 'J2', 'J4']
Makespan   : 22.0
Utilization: 66.667%
Wtd Tardiness: 6.00
  J1: start=[2. 6. 9.], C=12.0, due=12, tard=0.0
  J2: start=[ 8. 12. 17.], C=19.0, due=15, tard=4.0
  J3: start=[0. 2. 5.], C=9.0, due=10, tard=0.0
  J4: start=[11. 17. 19.], C=22.0, due=20, tard=2.0
  J5: start=[ 7.  8. 12.], C=14.0, due=14, tard=0.0

Figure saved.

Insights

Why J3 tends to be tardy โ€” J3 has the tightest due date ($d=10$) but is not the shortest job; its high weight ($w=3$) makes the solver eager to schedule it early, yet upstream machines may already be occupied.

Utilization plateau โ€” beyond $\alpha \approx 0.5$, pushing further toward pure tardiness minimization yields almost no additional tardiness reduction, but utilization can drop noticeably. This plateau, visible in Panel B, is the natural operating point for most factories.

Machine M2 bottleneck โ€” the utilization surface (Panel D) shows M2 consistently at high utilization regardless of $\alpha$, identifying it as the system bottleneck. Investing in additional M2 capacity would shift the entire Pareto front favorably.

Scalability note โ€” the position-indexed ILP has $O(J^2)$ binary variables and $O(J^2 \cdot M)$ disjunctive constraints. For $J \leq 12$ CBC solves in seconds. Beyond that, metaheuristics (simulated annealing, genetic algorithms) or branch-and-price methods become necessary.

Drone Delivery Route Optimization

Solving the Multi-Destination Shortest Path Problem

Drone delivery is rapidly becoming a practical reality โ€” companies like Amazon, Zipline, and Wing are already operating commercial drone fleets. At the heart of every efficient drone delivery system lies a classic combinatorial optimization problem: given a set of delivery destinations, what is the shortest route that visits all of them and returns to the depot?

This is a variant of the Travelling Salesman Problem (TSP), and in this article weโ€™ll model a concrete drone delivery scenario, solve it with both exact (ILP) and heuristic methods, and visualize the results in 2D and 3D.


Problem Setup

Imagine a drone depot located in a city center. The drone must deliver packages to 10 customer locations scattered across the city, then return to the depot. We want to minimize total flight distance.

Let the depot be node $0$ and delivery locations be nodes $1, 2, \ldots, n$. The drone visits each node exactly once. The decision variable is:

$$x_{ij} \in {0, 1}, \quad x_{ij} = 1 \text{ if the drone flies directly from node } i \text{ to node } j$$

The objective is:

$$\min \sum_{i=0}^{n} \sum_{j=0}^{n} d_{ij} \cdot x_{ij}$$

subject to:

$$\sum_{j \neq i} x_{ij} = 1 \quad \forall i \quad \text{(leave each node exactly once)}$$

$$\sum_{i \neq j} x_{ij} = 1 \quad \forall j \quad \text{(enter each node exactly once)}$$

$$u_i - u_j + n \cdot x_{ij} \leq n - 1 \quad \forall i \neq j, ; i,j \geq 1 \quad \text{(MTZ subtour elimination)}$$

where $d_{ij}$ is the Euclidean distance between nodes $i$ and $j$, and $u_i$ are auxiliary ordering variables enforcing a single connected tour (Millerโ€“Tuckerโ€“Zemlin constraints).

The Euclidean distance between nodes is:

$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$$

For the 3D visualization we also assign a flight altitude to each leg, reflecting real drone operations where altitude varies based on urban obstacle avoidance.


Full Source Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
# ============================================================
# Drone Delivery Route Optimization (TSP with ILP + Heuristics)
# Google Colaboratory
# ============================================================

# โ”€โ”€ 0. Install & import โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
!pip install pulp --quiet

import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib.gridspec import GridSpec
from mpl_toolkits.mplot3d import Axes3D
from mpl_toolkits.mplot3d.art3d import Line3DCollection
import itertools
import time
import pulp
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary, LpInteger, value

# โ”€โ”€ 1. Problem instance โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
np.random.seed(42)

N_CUSTOMERS = 10 # number of delivery locations
N_NODES = N_CUSTOMERS + 1 # including depot (node 0)

# 2-D city coordinates (km), depot at center
coords = np.zeros((N_NODES, 2))
coords[0] = [5.0, 5.0] # depot
np.random.seed(42)
coords[1:] = np.random.uniform(0, 10, size=(N_CUSTOMERS, 2))

# Flight altitudes (meters): depot at 0, deliveries between 30-120 m
altitudes = np.zeros(N_NODES)
altitudes[0] = 0.0
altitudes[1:] = np.random.uniform(30, 120, size=N_CUSTOMERS)

# Distance matrix (Euclidean, km)
def build_distance_matrix(coords):
n = len(coords)
D = np.zeros((n, n))
for i in range(n):
for j in range(n):
D[i, j] = np.linalg.norm(coords[i] - coords[j])
return D

D = build_distance_matrix(coords)

print("=== Drone Delivery Route Optimization ===")
print(f"Depot : node 0 at {coords[0]}")
print(f"Customers : nodes 1โ€“{N_CUSTOMERS}")
print(f"Distance matrix (km) shape: {D.shape}")

# โ”€โ”€ 2. Exact ILP solver (MTZ formulation) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def solve_tsp_ilp(D):
n = len(D)
prob = LpProblem("TSP_Drone", LpMinimize)

# Binary routing variables
x = [[LpVariable(f"x_{i}_{j}", cat=LpBinary) if i != j else None
for j in range(n)] for i in range(n)]

# MTZ auxiliary variables
u = [LpVariable(f"u_{i}", lowBound=1, upBound=n-1, cat=LpInteger)
for i in range(n)]

# Objective
prob += lpSum(D[i][j] * x[i][j]
for i in range(n) for j in range(n) if i != j)

# Flow constraints
for i in range(n):
prob += lpSum(x[i][j] for j in range(n) if i != j) == 1 # leave
prob += lpSum(x[j][i] for j in range(n) if i != j) == 1 # enter

# MTZ subtour elimination (skip depot node 0)
for i in range(1, n):
for j in range(1, n):
if i != j:
prob += u[i] - u[j] + (n - 1) * x[i][j] <= n - 2

# Solve (suppress solver output)
solver = pulp.PULP_CBC_CMD(msg=0)
t0 = time.time()
prob.solve(solver)
elapsed = time.time() - t0

# Extract tour
route = [0]
visited = {0}
current = 0
for _ in range(n - 1):
for j in range(n):
if j != current and j not in visited and x[current][j] is not None:
if value(x[current][j]) is not None and value(x[current][j]) > 0.5:
route.append(j)
visited.add(j)
current = j
break
route.append(0) # return to depot

total_dist = sum(D[route[k]][route[k+1]] for k in range(len(route) - 1))
return route, total_dist, elapsed

# โ”€โ”€ 3. Nearest-Neighbour heuristic (fast baseline) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def solve_tsp_nn(D):
n = len(D)
visited = [False] * n
route = [0]
visited[0] = True
for _ in range(n - 1):
last = route[-1]
nearest = min((j for j in range(n) if not visited[j]),
key=lambda j: D[last][j])
route.append(nearest)
visited[nearest] = True
route.append(0)
total_dist = sum(D[route[k]][route[k+1]] for k in range(len(route) - 1))
return route, total_dist

# โ”€โ”€ 4. 2-opt improvement โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
def two_opt(route, D):
best = route[:]
improved = True
while improved:
improved = False
for i in range(1, len(best) - 2):
for j in range(i + 1, len(best) - 1):
new_route = best[:i] + best[i:j+1][::-1] + best[j+1:]
d_old = D[best[i-1]][best[i]] + D[best[j]][best[j+1]]
d_new = D[new_route[i-1]][new_route[i]] + D[new_route[j]][new_route[j+1]]
if d_new < d_old - 1e-10:
best = new_route
improved = True
total = sum(D[best[k]][best[k+1]] for k in range(len(best) - 1))
return best, total

# โ”€โ”€ 5. Run all solvers โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
t0 = time.time()
ilp_route, ilp_dist, ilp_time = solve_tsp_ilp(D)
print(f"\n[ILP] Route : {ilp_route}")
print(f"[ILP] Total distance : {ilp_dist:.4f} km (solved in {ilp_time:.2f}s)")

nn_route, nn_dist = solve_tsp_nn(D)
t1 = time.time()
opt_route, opt_dist = two_opt(nn_route, D)
heur_time = time.time() - t1
print(f"\n[NN] Route : {nn_route}")
print(f"[NN] Total distance : {nn_dist:.4f} km")
print(f"\n[2opt] Route : {opt_route}")
print(f"[2opt] Total distance : {opt_dist:.4f} km (solved in {heur_time:.4f}s)")
print(f"\nILP gap vs 2-opt: {abs(ilp_dist - opt_dist):.4f} km")

# โ”€โ”€ 6. Visualisation โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
plt.style.use("dark_background")
DEPOT_COLOR = "#FFD700"
CUST_COLOR = "#00CFFF"
ILP_COLOR = "#FF6B6B"
NN_COLOR = "#B0B0B0"
OPT_COLOR = "#7CFC00"
ALT_CMAP = matplotlib.colormaps["plasma"]

def route_segments(route, coords):
segs = []
for k in range(len(route) - 1):
segs.append((coords[route[k]], coords[route[k+1]]))
return segs

def draw_route_2d(ax, route, coords, color, lw, label, zorder=2):
for k in range(len(route) - 1):
a, b = coords[route[k]], coords[route[k+1]]
ax.annotate("", xy=b, xytext=a,
arrowprops=dict(arrowstyle="->", color=color,
lw=lw, mutation_scale=12),
zorder=zorder)
# invisible line for legend
ax.plot([], [], color=color, lw=lw, label=label)

fig = plt.figure(figsize=(20, 18), facecolor="#0D0D0D")
gs = GridSpec(3, 3, figure=fig,
hspace=0.38, wspace=0.32,
left=0.06, right=0.97, top=0.93, bottom=0.05)

# โ”€โ”€ Panel A: ILP optimal route (2D) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax_ilp = fig.add_subplot(gs[0, :2])
ax_ilp.set_facecolor("#0D0D0D")
draw_route_2d(ax_ilp, ilp_route, coords, ILP_COLOR, 1.8,
f"ILP optimal ({ilp_dist:.3f} km)")
ax_ilp.scatter(*coords[0], s=250, c=DEPOT_COLOR, zorder=5, marker="*",
label="Depot")
ax_ilp.scatter(*coords[1:].T, s=120, c=CUST_COLOR, zorder=5)
for i in range(N_NODES):
offset = [0.15, 0.15]
ax_ilp.text(coords[i,0]+offset[0], coords[i,1]+offset[1],
f"{'D' if i==0 else str(i)}",
color="white", fontsize=8, zorder=6)
ax_ilp.set_title("Panel A โ€” ILP Optimal Route (2D)", color="white", fontsize=12)
ax_ilp.set_xlabel("X (km)", color="white"); ax_ilp.set_ylabel("Y (km)", color="white")
ax_ilp.tick_params(colors="white"); ax_ilp.legend(fontsize=9, loc="upper left")
for sp in ax_ilp.spines.values(): sp.set_edgecolor("#444")

# โ”€โ”€ Panel B: Comparison 2D โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax_cmp = fig.add_subplot(gs[1, :2])
ax_cmp.set_facecolor("#0D0D0D")
draw_route_2d(ax_cmp, nn_route, coords, NN_COLOR, 1.2,
f"NN heuristic ({nn_dist:.3f} km)", zorder=2)
draw_route_2d(ax_cmp, opt_route, coords, OPT_COLOR, 1.8,
f"2-opt improved ({opt_dist:.3f} km)", zorder=3)
draw_route_2d(ax_cmp, ilp_route, coords, ILP_COLOR, 2.2,
f"ILP optimal ({ilp_dist:.3f} km)", zorder=4)
ax_cmp.scatter(*coords[0], s=250, c=DEPOT_COLOR, zorder=6, marker="*")
ax_cmp.scatter(*coords[1:].T, s=100, c=CUST_COLOR, zorder=6)
for i in range(N_NODES):
ax_cmp.text(coords[i,0]+0.15, coords[i,1]+0.15,
f"{'D' if i==0 else str(i)}",
color="white", fontsize=8, zorder=7)
ax_cmp.set_title("Panel B โ€” Route Comparison: NN vs 2-opt vs ILP (2D)",
color="white", fontsize=12)
ax_cmp.set_xlabel("X (km)", color="white"); ax_cmp.set_ylabel("Y (km)", color="white")
ax_cmp.tick_params(colors="white"); ax_cmp.legend(fontsize=9, loc="upper left")
for sp in ax_cmp.spines.values(): sp.set_edgecolor("#444")

# โ”€โ”€ Panel C: 3D flight path (ILP route with altitude) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax3d = fig.add_subplot(gs[0:2, 2], projection="3d")
ax3d.set_facecolor("#0D0D0D")
route_alts = [altitudes[node] for node in ilp_route]

# Colour each leg by altitude (average of endpoints)
n_legs = len(ilp_route) - 1
leg_colors = []
for k in range(n_legs):
avg_alt = (route_alts[k] + route_alts[k+1]) / 2.0
norm_alt = (avg_alt - altitudes.min()) / (altitudes.max() - altitudes.min() + 1e-9)
leg_colors.append(ALT_CMAP(norm_alt))

for k in range(n_legs):
xs = [coords[ilp_route[k],0], coords[ilp_route[k+1],0]]
ys = [coords[ilp_route[k],1], coords[ilp_route[k+1],1]]
zs = [route_alts[k], route_alts[k+1]]
ax3d.plot(xs, ys, zs, color=leg_colors[k], lw=2.0, zorder=3)
# Arrow approximation with quiver
dx = xs[1]-xs[0]; dy = ys[1]-ys[0]; dz = zs[1]-zs[0]
ax3d.quiver(xs[0], ys[0], zs[0], dx*0.85, dy*0.85, dz*0.85,
color=leg_colors[k], arrow_length_ratio=0.25,
linewidth=1.5, alpha=0.85)

# Scatter nodes
ax3d.scatter(coords[0,0], coords[0,1], altitudes[0],
s=300, c=DEPOT_COLOR, marker="*", zorder=5, label="Depot")
ax3d.scatter(coords[1:,0], coords[1:,1], altitudes[1:],
s=80, c=CUST_COLOR, zorder=5, label="Customer")

# Node labels
for i in range(N_NODES):
ax3d.text(coords[i,0], coords[i,1], altitudes[i]+4,
f"{'D' if i==0 else str(i)}",
color="white", fontsize=7, zorder=6)

# Vertical dashed lines (altitude poles)
for i in range(1, N_NODES):
ax3d.plot([coords[i,0]]*2, [coords[i,1]]*2, [0, altitudes[i]],
color="#444444", lw=0.7, linestyle="--")

ax3d.set_xlabel("X (km)", color="white", labelpad=6)
ax3d.set_ylabel("Y (km)", color="white", labelpad=6)
ax3d.set_zlabel("Altitude (m)", color="white", labelpad=6)
ax3d.tick_params(colors="white")
ax3d.set_title("Panel C โ€” 3D Flight Path\n(ILP optimal, coloured by altitude)",
color="white", fontsize=11)
ax3d.legend(fontsize=8, loc="upper left")

# โ”€โ”€ Panel D: Distance bar chart โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax_bar = fig.add_subplot(gs[2, 0])
ax_bar.set_facecolor("#0D0D0D")
methods = ["NN\nHeuristic", "2-opt\nImproved", "ILP\nOptimal"]
dists = [nn_dist, opt_dist, ilp_dist]
colors_bar = [NN_COLOR, OPT_COLOR, ILP_COLOR]
bars = ax_bar.bar(methods, dists, color=colors_bar, width=0.5, zorder=2)
for bar, d in zip(bars, dists):
ax_bar.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.05,
f"{d:.3f}", ha="center", va="bottom", color="white", fontsize=10)
ax_bar.set_title("Panel D โ€” Total Distance Comparison", color="white", fontsize=11)
ax_bar.set_ylabel("Total distance (km)", color="white")
ax_bar.tick_params(colors="white")
ax_bar.set_ylim(0, max(dists) * 1.18)
ax_bar.grid(axis="y", color="#333", zorder=0)
for sp in ax_bar.spines.values(): sp.set_edgecolor("#444")

# โ”€โ”€ Panel E: Per-leg distances (ILP) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax_leg = fig.add_subplot(gs[2, 1])
ax_leg.set_facecolor("#0D0D0D")
leg_dists = [D[ilp_route[k]][ilp_route[k+1]] for k in range(len(ilp_route)-1)]
leg_labels = [f"{ilp_route[k]}โ†’{ilp_route[k+1]}" for k in range(len(ilp_route)-1)]
bar_colors_leg = [ALT_CMAP((route_alts[k]+route_alts[k+1])/(2*120))
for k in range(len(ilp_route)-1)]
ax_leg.bar(range(len(leg_dists)), leg_dists, color=bar_colors_leg, zorder=2)
ax_leg.set_xticks(range(len(leg_dists)))
ax_leg.set_xticklabels(leg_labels, rotation=55, fontsize=7, color="white")
ax_leg.set_title("Panel E โ€” Per-leg Distance (ILP route)", color="white", fontsize=11)
ax_leg.set_ylabel("Distance (km)", color="white")
ax_leg.tick_params(colors="white")
ax_leg.grid(axis="y", color="#333", zorder=0)
for sp in ax_leg.spines.values(): sp.set_edgecolor("#444")

# โ”€โ”€ Panel F: Altitude profile along ILP route โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ax_alt = fig.add_subplot(gs[2, 2])
ax_alt.set_facecolor("#0D0D0D")
positions = np.arange(len(ilp_route))
node_labels = [f"{'D' if n==0 else str(n)}" for n in ilp_route]
alt_vals = [altitudes[n] for n in ilp_route]
# Fill under profile
ax_alt.fill_between(positions, alt_vals, alpha=0.25, color="#00CFFF")
ax_alt.plot(positions, alt_vals, color="#00CFFF", lw=2, marker="o",
markersize=5, zorder=3)
ax_alt.set_xticks(positions)
ax_alt.set_xticklabels(node_labels, fontsize=8, color="white")
ax_alt.set_title("Panel F โ€” Altitude Profile Along ILP Route", color="white", fontsize=11)
ax_alt.set_ylabel("Altitude (m)", color="white")
ax_alt.tick_params(colors="white")
ax_alt.grid(color="#333", zorder=0)
for sp in ax_alt.spines.values(): sp.set_edgecolor("#444")

fig.suptitle("Drone Delivery Route Optimization โ€” TSP with ILP & Heuristics",
color="white", fontsize=15, fontweight="bold", y=0.97)
plt.savefig("drone_tsp.png", dpi=150, facecolor=fig.get_facecolor())
plt.show()
print("Figure saved.")

Code Walkthrough

Section 0 โ€” Environment Setup

PuLP is installed quietly. NumPy, Matplotlib (including mpl_toolkits.mplot3d), and itertools are imported. All randomness is seeded at 42 for reproducibility.

Section 1 โ€” Problem Instance

We place the depot at coordinates $(5, 5)$ and scatter 10 customer nodes uniformly at random over a $10 \times 10$ km grid. Each node also receives a random flight altitude between 30 m and 120 m (the depot sits at 0 m), simulating real urban airspace constraints. The full $11 \times 11$ Euclidean distance matrix $D$ is precomputed once using np.linalg.norm.

Section 2 โ€” Exact ILP Solver (MTZ)

The TSP is cast as an Integer Linear Program:

  • Decision variables $x_{ij} \in {0,1}$ encode whether the drone travels from node $i$ to $j$.
  • Degree constraints enforce exactly one departure and one arrival per node.
  • MTZ constraints $u_i - u_j + (n-1)x_{ij} \leq n-2$ eliminate subtours by assigning an ordering integer $u_i$ to each non-depot node.

PuLPโ€™s CBC solver finds the provably optimal solution. For $n = 11$ nodes this takes under a few seconds on Colab.

Section 3 โ€” Nearest-Neighbour Heuristic

A greedy $O(n^2)$ baseline: always fly to the closest unvisited customer. Fast but suboptimal โ€” it commonly produces routes 20โ€“40% longer than optimal on random instances.

Section 4 โ€” 2-opt Improvement

Starting from the NN tour, 2-opt repeatedly reverses a segment of the route if doing so reduces total distance:

$$\Delta = d_{i-1,i} + d_{j,j+1} - d_{i-1,j} - d_{i,j+1} < 0$$

Iterations continue until no improving swap exists. This reliably closes most of the gap between NN and ILP.

Section 5 โ€” Running All Solvers

All three methods are timed independently. The ILP gap versus 2-opt is printed to quantify how close the heuristic came to optimality.


Visualization Panels

Panel A โ€” ILP Optimal Route (2D). The complete optimal tour is drawn with directional arrows over the city map. The depot (gold star) and customers (cyan dots) are labelled. This is the ground-truth solution.

Panel B โ€” Route Comparison (2D). All three routes are overlaid: grey (NN), green (2-opt), and red (ILP). The visual difference between NN and the optimized routes immediately reveals where greedy choices go wrong โ€” typically long backtracking legs that 2-opt and ILP eliminate.

Panel C โ€” 3D Flight Path. The ILP route is lifted into 3D using the per-node altitude assignments. Each flight leg is coloured by average altitude using the plasma colormap โ€” warmer colours indicate higher segments. Vertical dashed poles connect each delivery point to ground level. Quiver arrows show flight direction. This panel captures the true nature of drone operations, where altitude is a continuous variable.

Panel D โ€” Distance Bar Chart. A direct comparison of total tour length across the three methods. The difference between NN and ILP quantifies the cost of greedy decisions; the 2-opt bar shows how much of that gap is recoverable without integer programming.

Panel E โ€” Per-leg Distances (ILP). Each individual flight leg in the ILP-optimal tour is shown as a bar, coloured by the average altitude of that leg. This reveals which segments dominate total flight time and energy consumption.

Panel F โ€” Altitude Profile. The altitude of the drone at each stop in the ILP route is plotted as a filled line chart, giving a longitudinal view of the vertical profile of the entire delivery mission.


Results

=== Drone Delivery Route Optimization ===
Depot       : node 0  at [5. 5.]
Customers   : nodes 1โ€“10
Distance matrix (km) shape: (11, 11)

[ILP]  Route : [0, 10, 8, 3, 9, 4, 6, 1, 5, 2, 7, 0]
[ILP]  Total distance : 31.5408 km  (solved in 0.18s)

[NN]   Route : [0, 9, 10, 8, 3, 7, 2, 5, 1, 4, 6, 0]
[NN]   Total distance : 34.6317 km

[2opt] Route : [0, 9, 4, 6, 1, 5, 2, 7, 10, 3, 8, 0]
[2opt] Total distance : 31.8671 km  (solved in 0.0003s)

ILP gap vs 2-opt: 0.3263 km

Figure saved.

Key Takeaways

The MTZ-based ILP formulation guarantees global optimality for small-to-medium fleets (up to ~15 nodes in seconds on Colab). The 2-opt heuristic almost always matches or comes within 1โ€“2% of the ILP optimum at a fraction of the compute cost, making it the practical choice for real-time dispatch. For fleets with 50+ destinations, metaheuristics (genetic algorithms, ant colony optimization) or branch-and-cut solvers like Google OR-Tools become necessary.

The 3D altitude model is a reminder that real drone routing is not flat: energy consumption scales with $\sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2}$, and future extensions should incorporate battery capacity constraints, no-fly zones, and wind fields into the objective.