A Practical Guide with Python
What is the Warehouse Location Problem?
The Warehouse Location Problem (also called the Facility Location Problem) is a classic combinatorial optimization problem in operations research. The core question is:
Given a set of potential warehouse sites and a set of customers, where should we open warehouses to minimize total cost?
The total cost consists of two components:
- Fixed opening cost — the cost to open a warehouse at a given site
- Transportation cost — the cost to ship goods from a warehouse to each customer
Problem Formulation
Let’s define the math precisely.
Sets:
- $I$ = set of potential warehouse locations ${1, 2, \ldots, m}$
- $J$ = set of customers ${1, 2, \ldots, n}$
Parameters:
- $f_i$ = fixed cost of opening warehouse $i$
- $c_{ij}$ = transportation cost from warehouse $i$ to customer $j$
- $d_j$ = demand of customer $j$
Decision Variables:
- $y_i \in {0, 1}$ — 1 if warehouse $i$ is opened, 0 otherwise
- $x_{ij} \in {0, 1}$ — 1 if customer $j$ is assigned to warehouse $i$
Objective Function (Minimize):
$$\text{Minimize} \quad \sum_{i \in I} f_i y_i + \sum_{i \in I} \sum_{j \in J} c_{ij} d_j x_{ij}$$
Constraints:
$$\sum_{i \in I} x_{ij} = 1 \quad \forall j \in J \quad \text{(each customer assigned to exactly one warehouse)}$$
$$x_{ij} \leq y_i \quad \forall i \in I, j \in J \quad \text{(can only assign to open warehouse)}$$
$$y_i \in {0,1}, \quad x_{ij} \in {0,1}$$
Concrete Example Setup
We’ll solve a realistic instance:
- 10 potential warehouse locations scattered across a 100×100 grid
- 30 customers with varying demands
- Fixed costs differ per site (reflecting land/construction differences)
- Transportation costs proportional to Euclidean distance × demand
Python Source Code
1 | # ============================================================ |
Code Walkthrough
Section 0–1: Setup & Imports
PuLP is installed quietly via subprocess so the cell is self-contained. We import numpy, matplotlib (2D + 3D), scipy.interpolate (for the demand heatmap), and pulp (the ILP solver).
Section 2: Problem Data
1 | wh_coords = np.random.uniform(10, 90, (NUM_WAREHOUSES, 2)) |
10 warehouse candidates and 30 customers are randomly placed on a 100×100 grid. The transport cost matrix $C$ is:
$$c_{ij} = 0.5 \times | \mathbf{w}_i - \mathbf{c}_j |_2 \times d_j$$
where $\mathbf{w}_i$ is warehouse $i$’s location, $\mathbf{c}_j$ is customer $j$’s location, and $d_j$ is demand. The coefficient $0.5$ is a per-unit-distance shipping rate.
Section 3: ILP Model
We declare two sets of binary variables:
| Variable | Meaning |
|---|---|
| $y_i$ | 1 = open warehouse $i$ |
| $x_{ij}$ | 1 = customer $j$ served by warehouse $i$ |
The objective minimizes total fixed + transportation cost. Two constraint families enforce:
- Every customer is served by exactly one warehouse.
- A customer can only be assigned to an open warehouse.
Section 4: Solve
PuLP dispatches to the bundled CBC (Coin-or Branch and Cut) solver — a production-grade open-source MIP solver. msg=0 suppresses verbose output. On this 10-warehouse × 30-customer instance, the solver typically finds the global optimum in milliseconds.
Section 5: Visualization
Four panels are produced:
| Panel | What it shows |
|---|---|
| 2D Assignment Map | Lines from each customer to its assigned warehouse; open warehouses as gold stars; closed candidates as gray squares |
| Cost Breakdown Bar | Stacked bar per open warehouse: fixed cost (blue) + transport cost (red) |
| Demand Heatmap | Interpolated demand density across the grid overlaid with customer bubbles (size ∝ demand) and warehouse stars |
| 3D Cost Surface | For every point on the grid, the minimum transport cost to the nearest open warehouse — valleys reveal optimal coverage zones |
Results
=== Problem Summary === Potential warehouses : 10 Customers : 30 Fixed cost range : 595 – 1831 Demand range : 12.3 – 93.7 === Solver Result === Status : Optimal Total Cost : 18,246.5 Opened warehouses (5) : [2, 3, 4, 6, 8] Fixed cost : 5,983.4 Transport cost : 12,263.2

[Figure saved as warehouse_location_result.png]
Key Takeaways
The optimal solution typically opens 3–5 warehouses out of the 10 candidates, balancing:
- High fixed cost sites are skipped even if they are centrally located.
- Clusters of high-demand customers attract warehouses strongly because transport savings are proportional to $d_j$.
- The 3D cost surface makes it visually obvious that each open warehouse creates a cost valley — customers inside the valley are served cheaply, and ridges between warehouses indicate where assignment boundaries lie.
This exact model scales to real-world instances with hundreds of sites and thousands of customers. For very large instances, metaheuristics (Simulated Annealing, Genetic Algorithms) or Lagrangian Relaxation are common acceleration strategies.