πŸ—‚οΈ Job Scheduling Problem (Task Assignment) β€” Solved with Python

What Is the Job Scheduling Problem?

The Job Scheduling (Task Assignment) problem is a classic problem in combinatorial optimization. The question is simple to state: given a set of jobs and a set of machines, how do we assign jobs to machines so that the time at which the last job finishes (the makespan) is minimized?

This is essentially asking: how do we pack tasks into workers as efficiently as possible?


Problem Formulation

Let:

$$
m = \text{number of machines (workers)}, \quad n = \text{number of jobs}
$$

$$
p_{ij} = \text{processing time of job } j \text{ on machine } i
$$

The goal is to find an assignment

$$
x_{ij} \in \{0, 1\}, \quad \text{where } x_{ij} = 1 \text{ if job } j \text{ is assigned to machine } i
$$

subject to:

$$
\sum_{i=1}^{m} x_{ij} = 1 \quad \forall j \in \{1, \ldots, n\}
\quad \text{(each job is assigned to exactly one machine)}
$$

The objective is to minimize the makespan (the time until all jobs are finished):

$$
\text{minimize} \quad C_{\max} = \max_{i} \sum_{j=1}^{n} p_{ij} \cdot x_{ij}
$$
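To make the objective concrete, here is a minimal sketch that evaluates $C_{\max}$ for one fixed assignment on a made-up 2-machine, 3-job instance (the numbers are illustrative only, not the example solved below):

```python
# Made-up toy instance: p[i][j] = processing time of job j on machine i
p = [[2, 4, 6],
     [3, 3, 5]]

# assignment[j] = machine chosen for job j
assignment = [0, 1, 1]

# Load of machine i = sum of times of the jobs placed on it
loads = [sum(p[i][j] for j, m in enumerate(assignment) if m == i)
         for i in range(len(p))]
makespan = max(loads)   # C_max = the heaviest machine's load
print(loads, makespan)  # [2, 8] 8
```

Moving job 2 to machine 0 instead gives loads [8, 3] and the same makespan of 8, while assignment [0, 0, 1] gives [6, 5] and a makespan of 6; searching over all $m^n$ such assignments is exactly the optimization problem.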


Concrete Example

We have 4 machines and 8 jobs. Each job has a different processing time on each machine (due to hardware differences, skill sets, etc.):

Job     M1   M2   M3   M4
J1       6    8    5    7
J2       4    3    7    5
J3       9    5    4    6
J4       3    7    8    4
J5       7    4    6    9
J6       5    6    3    8
J7       8    9    5    3
J8       4    5    9    6
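Note that the obvious shortcut of sending every job to its individually fastest machine already fails on this table, because it ignores congestion; a quick check using the table's numbers (machines indexed 0–3):

```python
# Processing times from the table above: proc_time[i][j] = time of job j on machine i
proc_time = [
    [6, 4, 9, 3, 7, 5, 8, 4],  # M1
    [8, 3, 5, 7, 4, 6, 9, 5],  # M2
    [5, 7, 4, 8, 6, 3, 5, 9],  # M3
    [7, 5, 6, 4, 9, 8, 3, 6],  # M4
]

# Greedy per-job rule: each job picks the machine where it runs fastest
assignment = [min(range(4), key=lambda i: proc_time[i][j]) for j in range(8)]
loads = [sum(proc_time[i][j] for j in range(8) if assignment[j] == i)
         for i in range(4)]
print(loads, max(loads))  # [7, 7, 12, 3] 12
```

Three jobs pile onto Machine 3, giving a makespan of 12, well above the optimum of 9 found below.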

We’ll solve this using two approaches:

  1. Exact solution via Integer Linear Programming (ILP) with PuLP
  2. Metaheuristic via Simulated Annealing for larger/faster instances

Full Source Code

# ============================================================
# Job Scheduling (Task Assignment) Problem
# Google Colaboratory — Full Solution
# ============================================================

# --- Install dependencies ---
import subprocess
subprocess.run(["pip", "install", "pulp", "matplotlib", "numpy", "scipy"], check=True)

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.mplot3d import Axes3D  # registers the "3d" projection on older Matplotlib
import pulp
import random
import math
import time

# ============================================================
# 1. PROBLEM DEFINITION
# ============================================================
random.seed(42)
np.random.seed(42)

N_MACHINES = 4
N_JOBS = 8

# Processing time matrix [machine][job]
proc_time = np.array([
    [6, 4, 9, 3, 7, 5, 8, 4],  # Machine 1
    [8, 3, 5, 7, 4, 6, 9, 5],  # Machine 2
    [5, 7, 4, 8, 6, 3, 5, 9],  # Machine 3
    [7, 5, 6, 4, 9, 8, 3, 6],  # Machine 4
])

machine_names = [f"Machine {i+1}" for i in range(N_MACHINES)]
job_names = [f"Job {j+1}" for j in range(N_JOBS)]

print("=" * 55)
print(" Processing Time Matrix (rows=machines, cols=jobs)")
print("=" * 55)
header = " " * 10 + " ".join(f"{j:>6}" for j in job_names)
print(header)
for i, row in enumerate(proc_time):
    vals = " ".join(f"{v:>6}" for v in row)
    print(f"{machine_names[i]:>10} {vals}")

# ============================================================
# 2. EXACT SOLUTION — Integer Linear Programming (PuLP)
# ============================================================
print("\n" + "=" * 55)
print(" [Method 1] ILP Exact Solution (PuLP)")
print("=" * 55)

prob = pulp.LpProblem("JobScheduling_Makespan", pulp.LpMinimize)

# Decision variables: x[i][j] = 1 if job j is assigned to machine i
x = [[pulp.LpVariable(f"x_{i}_{j}", cat="Binary")
      for j in range(N_JOBS)]
     for i in range(N_MACHINES)]

# Makespan variable
C_max = pulp.LpVariable("C_max", lowBound=0)

# Objective: minimize makespan
prob += C_max

# Constraint 1: each job is assigned to exactly one machine
for j in range(N_JOBS):
    prob += pulp.lpSum(x[i][j] for i in range(N_MACHINES)) == 1

# Constraint 2: makespan >= load of every machine
for i in range(N_MACHINES):
    prob += C_max >= pulp.lpSum(proc_time[i][j] * x[i][j] for j in range(N_JOBS))

# Solve
start_ilp = time.time()
prob.solve(pulp.PULP_CBC_CMD(msg=0))
time_ilp = time.time() - start_ilp

# Extract results
ilp_assignment = [-1] * N_JOBS  # ilp_assignment[j] = machine index
for j in range(N_JOBS):
    for i in range(N_MACHINES):
        if pulp.value(x[i][j]) > 0.5:
            ilp_assignment[j] = i

ilp_loads = [sum(proc_time[i][j] for j in range(N_JOBS) if ilp_assignment[j] == i)
             for i in range(N_MACHINES)]
ilp_makespan = max(ilp_loads)

print(f" Status      : {pulp.LpStatus[prob.status]}")
print(f" Makespan    : {ilp_makespan}")
print(f" Solve Time  : {time_ilp:.4f} sec")
print(f" Assignment  : {[ilp_assignment[j]+1 for j in range(N_JOBS)]} (Machine No.)")
print(f" Machine Loads: {ilp_loads}")

# ============================================================
# 3. METAHEURISTIC — Simulated Annealing (Faster for Large N)
# ============================================================
print("\n" + "=" * 55)
print(" [Method 2] Simulated Annealing")
print("=" * 55)

def calc_makespan(assignment, pt):
    loads = np.zeros(pt.shape[0])  # one load accumulator per machine
    for j, m in enumerate(assignment):
        loads[m] += pt[m][j]
    return np.max(loads), loads

def simulated_annealing(pt, n_machines, n_jobs,
                        T_init=200.0, T_min=0.01,
                        alpha=0.995, n_iter=5000):
    # Initial random assignment
    current = [random.randint(0, n_machines - 1) for _ in range(n_jobs)]
    current_ms, current_loads = calc_makespan(current, pt)
    best = current[:]
    best_ms = current_ms
    history = [current_ms]

    T = T_init
    while T > T_min:
        for _ in range(n_iter):
            # Neighbour: move a random job to a random *different* machine
            # (draw from n_machines - 1 options, then shift past the current one)
            j = random.randint(0, n_jobs - 1)
            new_m = random.randint(0, n_machines - 2)
            if new_m >= current[j]:
                new_m += 1
            neighbor = current[:]
            neighbor[j] = new_m
            neighbor_ms, _ = calc_makespan(neighbor, pt)

            delta = neighbor_ms - current_ms
            if delta < 0 or random.random() < math.exp(-delta / T):
                current = neighbor
                current_ms = neighbor_ms

            if current_ms < best_ms:
                best = current[:]
                best_ms = current_ms

        history.append(best_ms)
        T *= alpha

    return best, best_ms, history

start_sa = time.time()
sa_assignment, sa_makespan, sa_history = simulated_annealing(
    proc_time, N_MACHINES, N_JOBS)
time_sa = time.time() - start_sa

sa_loads = [sum(proc_time[i][j] for j in range(N_JOBS) if sa_assignment[j] == i)
            for i in range(N_MACHINES)]

print(f" Makespan    : {sa_makespan}")
print(f" Solve Time  : {time_sa:.4f} sec")
print(f" Assignment  : {[sa_assignment[j]+1 for j in range(N_JOBS)]} (Machine No.)")
print(f" Machine Loads: {sa_loads}")

# ============================================================
# 4. SCALABILITY TEST — ILP vs SA on larger instances
# ============================================================
print("\n" + "=" * 55)
print(" [Scalability Test] Varying number of jobs")
print("=" * 55)

job_counts = [4, 6, 8, 10, 12, 14, 16]
times_ilp = []
times_sa = []
makespans_ilp = []
makespans_sa = []

for nj in job_counts:
    pt_test = np.random.randint(1, 15, size=(N_MACHINES, nj))

    # ILP
    p2 = pulp.LpProblem("JS", pulp.LpMinimize)
    xv = [[pulp.LpVariable(f"x_{i}_{j}", cat="Binary")
           for j in range(nj)] for i in range(N_MACHINES)]
    cm = pulp.LpVariable("C_max", lowBound=0)
    p2 += cm
    for j in range(nj):
        p2 += pulp.lpSum(xv[i][j] for i in range(N_MACHINES)) == 1
    for i in range(N_MACHINES):
        p2 += cm >= pulp.lpSum(pt_test[i][j] * xv[i][j] for j in range(nj))
    t0 = time.time()
    p2.solve(pulp.PULP_CBC_CMD(msg=0))
    times_ilp.append(time.time() - t0)
    makespans_ilp.append(pulp.value(cm))

    # SA
    t0 = time.time()
    _, ms_sa, _ = simulated_annealing(pt_test, N_MACHINES, nj,
                                      T_init=150, n_iter=2000)
    times_sa.append(time.time() - t0)
    makespans_sa.append(ms_sa)

    print(f" Jobs={nj:2d} | ILP makespan={makespans_ilp[-1]:.0f}  "
          f"SA makespan={makespans_sa[-1]:.0f} | "
          f"ILP t={times_ilp[-1]:.3f}s  SA t={times_sa[-1]:.3f}s")

# ============================================================
# 5. VISUALIZATION
# ============================================================

colors_jobs = plt.cm.Set3(np.linspace(0, 1, N_JOBS))

fig = plt.figure(figsize=(20, 22))
fig.suptitle("Job Scheduling (Task Assignment) — Full Analysis",
             fontsize=16, fontweight="bold", y=0.98)

# ── Plot 1: ILP Gantt Chart ───────────────────────────────
ax1 = fig.add_subplot(3, 2, 1)
ax1.set_title("(1) ILP Optimal Assignment — Gantt Chart", fontweight="bold")
start_times = [0] * N_MACHINES
for j in range(N_JOBS):
    m = ilp_assignment[j]
    ax1.barh(m, proc_time[m][j], left=start_times[m],
             color=colors_jobs[j], edgecolor="white", linewidth=1.5, height=0.6)
    ax1.text(start_times[m] + proc_time[m][j] / 2, m,
             f"J{j+1}", ha="center", va="center", fontsize=9, fontweight="bold")
    start_times[m] += proc_time[m][j]
ax1.axvline(ilp_makespan, color="red", linestyle="--", linewidth=2, label=f"Makespan={ilp_makespan}")
ax1.set_yticks(range(N_MACHINES))
ax1.set_yticklabels(machine_names)
ax1.set_xlabel("Time")
ax1.legend()
ax1.grid(axis="x", alpha=0.3)

# ── Plot 2: SA Gantt Chart ────────────────────────────────
ax2 = fig.add_subplot(3, 2, 2)
ax2.set_title("(2) SA Assignment — Gantt Chart", fontweight="bold")
start_times2 = [0] * N_MACHINES
for j in range(N_JOBS):
    m = sa_assignment[j]
    ax2.barh(m, proc_time[m][j], left=start_times2[m],
             color=colors_jobs[j], edgecolor="white", linewidth=1.5, height=0.6)
    ax2.text(start_times2[m] + proc_time[m][j] / 2, m,
             f"J{j+1}", ha="center", va="center", fontsize=9, fontweight="bold")
    start_times2[m] += proc_time[m][j]
ax2.axvline(sa_makespan, color="red", linestyle="--", linewidth=2, label=f"Makespan={sa_makespan}")
ax2.set_yticks(range(N_MACHINES))
ax2.set_yticklabels(machine_names)
ax2.set_xlabel("Time")
ax2.legend()
ax2.grid(axis="x", alpha=0.3)

# ── Plot 3: Machine Load Comparison (Grouped Bar) ─────────
ax3 = fig.add_subplot(3, 2, 3)
ax3.set_title("(3) Machine Load Comparison: ILP vs SA", fontweight="bold")
x_pos = np.arange(N_MACHINES)
width = 0.35
bars1 = ax3.bar(x_pos - width/2, ilp_loads, width, label=f"ILP (makespan={ilp_makespan})",
                color="#4C72B0", edgecolor="white")
bars2 = ax3.bar(x_pos + width/2, sa_loads, width, label=f"SA (makespan={sa_makespan})",
                color="#DD8452", edgecolor="white")
ax3.axhline(ilp_makespan, color="#4C72B0", linestyle="--", alpha=0.7, linewidth=1.5)
ax3.axhline(sa_makespan, color="#DD8452", linestyle="--", alpha=0.7, linewidth=1.5)
for bar in bars1:
    ax3.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.3,
             str(int(bar.get_height())), ha="center", va="bottom", fontsize=9)
for bar in bars2:
    ax3.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.3,
             str(int(bar.get_height())), ha="center", va="bottom", fontsize=9)
ax3.set_xticks(x_pos)
ax3.set_xticklabels(machine_names)
ax3.set_ylabel("Total Load (time units)")
ax3.legend()
ax3.grid(axis="y", alpha=0.3)

# ── Plot 4: SA Convergence History ───────────────────────
ax4 = fig.add_subplot(3, 2, 4)
ax4.set_title("(4) Simulated Annealing Convergence", fontweight="bold")
ax4.plot(sa_history, color="#2ca02c", linewidth=1.5, alpha=0.8)
ax4.axhline(ilp_makespan, color="red", linestyle="--", linewidth=1.5, label=f"ILP optimal={ilp_makespan}")
ax4.fill_between(range(len(sa_history)), sa_history, ilp_makespan,
                 where=[h > ilp_makespan for h in sa_history],
                 alpha=0.15, color="orange", label="Gap to optimal")
ax4.set_xlabel("Temperature step")
ax4.set_ylabel("Best Makespan Found")
ax4.legend()
ax4.grid(alpha=0.3)

# ── Plot 5: Scalability — Solve Time ─────────────────────
ax5 = fig.add_subplot(3, 2, 5)
ax5.set_title("(5) Solve Time vs. Number of Jobs (Scalability)", fontweight="bold")
ax5.plot(job_counts, times_ilp, "o-", color="#4C72B0", label="ILP (PuLP)", linewidth=2, markersize=7)
ax5.plot(job_counts, times_sa, "s-", color="#DD8452", label="SA", linewidth=2, markersize=7)
ax5.set_xlabel("Number of Jobs")
ax5.set_ylabel("Solve Time (seconds)")
ax5.legend()
ax5.grid(alpha=0.3)

# ── Plot 6: 3D — Processing Time Surface ─────────────────
ax6 = fig.add_subplot(3, 2, 6, projection="3d")
ax6.set_title("(6) 3D Processing Time Surface\n(Machine × Job)", fontweight="bold")
_m = np.arange(N_MACHINES)
_j = np.arange(N_JOBS)
M_grid, J_grid = np.meshgrid(_m, _j, indexing="ij")
surf = ax6.plot_surface(M_grid, J_grid, proc_time.astype(float),
                        cmap="viridis", edgecolor="none", alpha=0.85)
# Mark ILP assignment with red scatter
for j in range(N_JOBS):
    m_opt = ilp_assignment[j]
    ax6.scatter(m_opt, j, proc_time[m_opt][j] + 0.3,
                color="red", s=60, zorder=5)
fig.colorbar(surf, ax=ax6, shrink=0.5, label="Processing Time")
ax6.set_xlabel("Machine")
ax6.set_ylabel("Job")
ax6.set_zlabel("Proc. Time")
ax6.set_xticks(range(N_MACHINES))
ax6.set_xticklabels([f"M{i+1}" for i in range(N_MACHINES)], fontsize=8)
ax6.set_yticks(range(N_JOBS))
ax6.set_yticklabels([f"J{j+1}" for j in range(N_JOBS)], fontsize=8)

plt.tight_layout(rect=[0, 0, 1, 0.97])
plt.savefig("job_scheduling_results.png", dpi=150, bbox_inches="tight")
plt.show()
print("\n Figure saved: job_scheduling_results.png")

Code Walkthrough

Section 1 — Problem Definition

The processing time matrix proc_time is a 4 × 8 NumPy array where proc_time[i][j] is how long machine i takes on job j. This asymmetry is what makes the problem interesting: if all machines were identical, a simple greedy rule (always hand the longest remaining job to the least-loaded machine) would already come within a small constant factor of optimal.
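For reference, here is a minimal sketch of that greedy rule for the identical-machine case (the function name and the five job lengths are made up for illustration; Graham's classic analysis bounds LPT at 4/3 of the optimal makespan):

```python
import heapq

def lpt_schedule(times, n_machines):
    # Longest-Processing-Time greedy for identical machines:
    # take jobs in decreasing length, always placing on the least-loaded machine.
    heap = [(0, m) for m in range(n_machines)]      # (current load, machine id)
    assignment = {}
    for j in sorted(range(len(times)), key=lambda j: -times[j]):
        load, m = heapq.heappop(heap)               # least-loaded machine
        assignment[j] = m
        heapq.heappush(heap, (load + times[j], m))
    return assignment, max(load for load, _ in heap)

# Made-up example: 5 jobs on 2 identical machines
assignment, makespan = lpt_schedule([7, 5, 4, 3, 1], 2)
print(assignment, makespan)  # makespan 10 (total work 20 split evenly)
```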

Section 2 — Exact ILP Solution (PuLP)

We model the problem exactly as the math above:

  • x[i][j] is a binary variable — 1 if job j goes to machine i, else 0
  • C_max is a continuous variable representing the makespan we’re minimizing
  • Two constraint sets enforce: (a) every job is assigned exactly once, and (b) C_max is at least as large as every machine’s total load

The solver (PULP_CBC_CMD) finds the globally optimal assignment. For small instances like ours (8 jobs × 4 machines = 32 binary variables), this runs in a fraction of a second.
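In fact, at this size the ILP answer can be cross-checked by brute force: with only $4^8 = 65{,}536$ possible assignments, exhaustive enumeration with the standard library finishes in well under a second (a sanity check, not an approach that scales):

```python
from itertools import product

# Same processing-time matrix as above
proc_time = [
    [6, 4, 9, 3, 7, 5, 8, 4],
    [8, 3, 5, 7, 4, 6, 9, 5],
    [5, 7, 4, 8, 6, 3, 5, 9],
    [7, 5, 6, 4, 9, 8, 3, 6],
]

# Enumerate every assignment (a[j] = machine of job j) and keep the best makespan
best = min(
    max(sum(proc_time[i][j] for j in range(8) if a[j] == i) for i in range(4))
    for a in product(range(4), repeat=8)
)
print(best)  # agrees with the ILP optimum of 9
```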

Section 3 — Simulated Annealing

For larger instances, ILP becomes exponentially slow. Simulated Annealing (SA) is a metaheuristic that:

  1. Starts with a random assignment
  2. At each step, moves one job to a different machine (the “neighbour”)
  3. Accepts worsening moves with probability $e^{-\Delta / T}$, where $T$ is the current “temperature”
  4. Cools down by multiplying $T$ by alpha = 0.995 each outer loop

The acceptance criterion comes from statistical mechanics:

$$
P(\text{accept worse solution}) = e^{-\Delta C_{\max} / T}
$$

This allows the algorithm to escape local optima early in the search and exploit good regions as it cools.
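Plugging the script's own temperature schedule into this formula shows the mechanism at work: a move that worsens the makespan by $\Delta = 2$ goes from almost always accepted at the initial temperature to essentially never accepted near the stopping temperature:

```python
import math

delta = 2.0  # makespan got worse by 2 time units
for T in (200.0, 1.0, 0.01):  # T_init, mid-schedule, T_min from the script
    p = math.exp(-delta / T)
    print(f"T={T:6.2f}  P(accept worse) = {p:.3g}")
```

At T = 200 the probability is about 0.99, at T = 1 about 0.135, and at T = 0.01 it is vanishingly small.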

Section 4 — Scalability Test

We run both methods on random instances with 4 to 16 jobs and measure solve time against solution quality. This shows how the two costs scale: ILP’s worst-case runtime grows exponentially with job count, while SA’s runtime is set almost entirely by its fixed iteration budget.

Section 5 — Visualizations

Six plots are generated; each is explained in the next section.


Visualization Explanations

Plot (1) & (2) — Gantt Charts
Each colored bar represents a job assigned to a machine. The horizontal length is the processing time. The red dashed line marks the makespan — the moment all work is done. An ideal schedule pushes this line as far left as possible.

Plot (3) — Machine Load Comparison
Side-by-side grouped bars show the total workload on each machine under ILP vs. SA. The dashed horizontal lines mark each method’s makespan. A well-balanced schedule has bars of roughly equal height.

Plot (4) — SA Convergence
The green line traces the best makespan found as SA cools. The orange-shaded region shows the gap to the ILP optimum. Notice that the gap narrows and eventually closes — SA reaches the same makespan as ILP on this small instance (though via a different assignment).

Plot (5) — Scalability
This reveals the core trade-off. ILP’s worst-case solve time grows exponentially with job count, while SA’s runtime is set almost entirely by its iteration budget. At the sizes tested here CBC is still very fast, but for production systems with hundreds of jobs and richer constraints, SA (or similar metaheuristics) becomes the practical choice.

Plot (6) — 3D Processing Time Surface
The surface visualizes the full proc_time matrix as a 3D landscape. The red dots mark the ILP-optimal assignment — one per job, showing which machine was chosen. Notice how the dots tend to land in valleys (short processing times), confirming the solver is exploiting the structure of the matrix.


Execution Results

=======================================================
  Processing Time Matrix (rows=machines, cols=jobs)
=======================================================
          Job 1   Job 2   Job 3   Job 4   Job 5   Job 6   Job 7   Job 8
 Machine 1       6       4       9       3       7       5       8       4
 Machine 2       8       3       5       7       4       6       9       5
 Machine 3       5       7       4       8       6       3       5       9
 Machine 4       7       5       6       4       9       8       3       6

=======================================================
  [Method 1] ILP Exact Solution (PuLP)
=======================================================
  Status      : Optimal
  Makespan    : 9
  Solve Time  : 0.1342 sec
  Assignment  : [1, 2, 3, 1, 2, 3, 4, 4]  (Machine No.)
  Machine Loads: [np.int64(9), np.int64(7), np.int64(7), np.int64(9)]

=======================================================
  [Method 2] Simulated Annealing
=======================================================
  Makespan    : 9.0
  Solve Time  : 138.5160 sec
  Assignment  : [3, 2, 4, 1, 2, 3, 4, 1]  (Machine No.)
  Machine Loads: [np.int64(7), np.int64(7), np.int64(8), np.int64(9)]

=======================================================
  [Scalability Test] Varying number of jobs
=======================================================
  Jobs= 4  | ILP makespan=7  SA makespan=7  | ILP t=0.013s  SA t=38.049s
  Jobs= 6  | ILP makespan=7  SA makespan=7  | ILP t=0.015s  SA t=42.428s
  Jobs= 8  | ILP makespan=9  SA makespan=9  | ILP t=0.014s  SA t=48.436s
  Jobs=10  | ILP makespan=10  SA makespan=10  | ILP t=0.014s  SA t=53.329s
  Jobs=12  | ILP makespan=13  SA makespan=13  | ILP t=0.020s  SA t=59.293s
  Jobs=14  | ILP makespan=17  SA makespan=17  | ILP t=0.020s  SA t=64.121s
  Jobs=16  | ILP makespan=12  SA makespan=12  | ILP t=0.023s  SA t=68.924s

Figure saved: job_scheduling_results.png

Key Takeaways

The makespan minimization problem is NP-hard in general, but for moderate sizes ILP provides a provably optimal answer in seconds. For large-scale scheduling (think cloud job orchestration, factory floors, logistics routing), metaheuristics like Simulated Annealing, Genetic Algorithms, or modern solvers like Google OR-Tools offer practical and near-optimal solutions. The 3D surface plot intuitively shows why the problem is hard — the optimal picks are spread across a rugged landscape with no single dominant row or column.