Root Finding with SciPy： Solving Nonlinear Equations

September 23, 2024

Root Finding with SciPy： Solving Nonlinear Equations

Here’s an example of a root-finding problem solved using $SciPy$.

We will find the root of a nonlinear equation using the scipy.optimize module.

Scenario: Suppose you need to find the root of the following nonlinear equation, which is common in physics, engineering, or financial modeling:

$$
f(x) = x^3 - 4x^2 + 6x - 24 = 0
$$

Our goal is to find the value of $( x )$ where the function $( f(x) )$ equals zero.
This means we are looking for the root of the equation.

Solution Approach

To solve this problem, we will use $SciPy$’s fsolve() function, which is designed to find the roots of a given function.
fsolve() uses numerical methods to approximate the root and is highly effective for complex equations where analytical solutions are difficult or impossible.

Step-by-Step Solution Using SciPy

Define the Function: Define the equation $( f(x) = x^3 - 4x^2 + 6x - 24 )$.
Use fsolve() from SciPy: Use the fsolve() function to find the root, providing an initial guess close to where we expect the root to be.
Interpret the Results: The function will return the root value, which can then be interpreted as the solution to the equation.

Here’s the implementation in Python:

import numpy as np
from scipy.optimize import fsolve

# Define the nonlinear function
def equation(x):
    return x**3 - 4*x**2 + 6*x - 24

# Initial guess for the root
initial_guess = 5

# Use fsolve to find the root
root = fsolve(equation, initial_guess)

# Print the results
print(f"Root found: x = {root[0]:.4f}")

# Verify the result by substituting back into the original equation
verification = equation(root[0])
print(f"Verification (should be close to 0): f({root[0]:.4f}) = {verification:.4e}")

Explanation of the Code

Function Definition:
The function equation(x) represents the nonlinear equation we want to solve.
Initial Guess:
The initial_guess value is an estimate of where the root might be. This helps fsolve() start its search.
Different initial guesses can lead to different roots, especially if the function has multiple roots.
Finding the Root:
fsolve(equation, initial_guess) finds the root of the function starting from the initial guess.
It returns an array with the root value.
Verification:
To ensure the root is accurate, we substitute the root back into the equation to see if the result is approximately zero.

Expected Output

1 2	Root found: x = 4.0000 Verification (should be close to 0): f(4.0000) = 0.0000e+00

Interpretation of the Output

The output indicates that the root of the equation was found at $( x = 4.0000 )$.
This means that when $( x = 4.0000 )$, the function $( f(x) = x^3 - 4x^2 + 6x - 24 )$ equals zero, confirming that this is indeed a root of the equation.

The verification step shows that substituting $( x = 4.0000 )$ into the function results in a value of $( 0.0000e+00 )$, which is exactly zero, as expected.
This confirms that the solution is accurate and the root has been correctly identified.

In summary, $( x = 4.0000 )$ is a precise root of the given equation, and the verification confirms the correctness of the result.

Conclusion

$SciPy$’s fsolve() function provides a powerful way to find the roots of nonlinear equations, which are prevalent in various scientific and engineering fields.

This method is particularly useful when analytical solutions are difficult to obtain or do not exist, making numerical methods like fsolve() essential for practical problem-solving.

Statistical Analysis with SciPy：Conducting a Two-Sample t-test to Compare Means

September 22, 2024

Statistical Analysis with SciPy: Conducting a Two-Sample t-test to Compare Means

Here’s an example of a statistical analysis problem solved using $SciPy$.

We will perform a hypothesis test to determine if two independent samples come from populations with the same mean.

Problem: Two-Sample t-test

Scenario: You are a data analyst at a company that wants to compare the average sales between two different regions, Region A and Region B, over a given period.

You have collected sales data from each region.

Your goal is to determine if there is a statistically significant difference in average sales between the two regions using a two-sample t-test.

Hypothesis

Null Hypothesis (H₀): The means of the two populations (Region A and Region B) are equal.
Alternative Hypothesis (H₁): The means of the two populations are not equal.

Data

Let’s generate some sample data to simulate the sales figures from both regions:

Region A: $[23, 25, 21, 22, 20, 30, 24, 28, 19, 27]$
Region B: $[18, 20, 22, 24, 26, 19, 17, 21, 25, 23]$

Step-by-Step Solution Using SciPy

Import Required Libraries: We need $SciPy$ for statistical testing and $NumPy$ for handling numerical operations.
Perform the Two-Sample t-test: We will use $SciPy$’s ttest_ind() function to perform the test. This function compares the means of two independent samples.
Interpret the Results: We will interpret the p-value to decide whether to reject the null hypothesis.

Here is the code implementation:

import numpy as np
from scipy import stats

# Sample data for Region A and Region B
region_a = [23, 25, 21, 22, 20, 30, 24, 28, 19, 27]
region_b = [18, 20, 22, 24, 26, 19, 17, 21, 25, 23]

# Perform a two-sample t-test
t_stat, p_value = stats.ttest_ind(region_a, region_b)

# Print the results
print(f"T-statistic: {t_stat:.4f}")
print(f"P-value: {p_value:.4f}")

# Interpret the p-value
alpha = 0.05  # significance level
if p_value < alpha:
    print("Reject the null hypothesis: There is a significant difference between the means.")
else:
    print("Fail to reject the null hypothesis: No significant difference between the means.")

Explanation of the Output

T-statistic:
The t-statistic measures the difference between the group means relative to the variability in the samples.
A higher absolute t-value indicates a greater difference between the group means.
P-value:
The p-value tells us the probability of observing the data, or something more extreme, if the null hypothesis is true.
A low p-value (typically less than $0.05$) indicates that the observed difference is unlikely under the null hypothesis.
Decision:
- If the p-value is less than the significance level (α = $0.05$), we reject the null hypothesis, suggesting a significant difference between the two regions’ average sales.
- If the p-value is greater than $0.05$, we fail to reject the null hypothesis, implying no statistically significant difference.

Output

1
2
3

T-statistic: 1.6124
P-value: 0.1243
Fail to reject the null hypothesis: No significant difference between the means.

Explanation of the Results

T-statistic: 1.6124
The t-statistic measures the difference between the means of the two samples (Region A and Region B) relative to the variability in the data.
A t-statistic of $1.6124$ suggests there is some difference between the means, but it is not large enough on its own to be conclusive.
P-value: 0.1243
The p-value indicates the probability of observing the data, or something more extreme, if the null hypothesis (that the means are equal) is true.
In this case, the p-value is $0.1243$, which is greater than the typical significance level of $0.05$.
Conclusion: Fail to reject the null hypothesis
Since the p-value is greater than $0.05$, we do not have enough evidence to reject the null hypothesis.
This means that the observed difference in average sales between Region A and Region B is not statistically significant.
In simpler terms, there is no strong evidence to conclude that the means of the two regions are different.

Conclusion

$SciPy$’s ttest_ind() function makes it easy to perform hypothesis testing, allowing you to quickly assess whether differences in sample means are statistically significant.

This type of analysis is fundamental in comparing groups in various fields, including marketing, clinical trials, and product testing.

Advanced Linear Programming Example：Diet Problem with Nutritional Constraints

September 21, 2024

Advanced Linear Programming Example: Diet Problem with Nutritional Constraints

We’ll tackle the Diet Problem with Nutritional Constraints and Cost Minimization, a classic but challenging LP scenario often used in operations research.

Problem Description:

The Diet Problem involves selecting a combination of foods to meet specific nutritional requirements while minimizing the overall cost.

This example will be more complex as it includes multiple nutrient constraints, food availability restrictions, and upper bounds on food intake due to preferences or dietary guidelines.

Problem Setup

Objective:

Minimize the total cost of selected foods while meeting nutritional requirements for calories, protein, fat, and vitamins.

Variables:

$( x_i )$: Quantity of food item $( i )$ to include in the diet (measured in servings or grams).

Parameters:

$( c_i )$: Cost per serving of food item $( i )$.
$( a_{ij} )$: Amount of nutrient $( j )$ provided by one serving of food item $( i )$.
$( l_j )$: Minimum daily requirement of nutrient $( j )$.
$( u_i )$: Maximum servings of food item $( i )$ allowed (due to taste, dietary restrictions, etc.).

Constraints:

Nutritional Constraints: The total intake of each nutrient must meet or exceed the minimum daily requirements.
Upper Bound Constraints: The quantity of each food item must not exceed specified upper limits.
Non-negativity Constraint: Quantities of food items must be non-negative.

Mathematical Formulation

Objective:
$$
\text{Minimize} \quad \sum_{i} c_i x_i
$$
Constraints:
1. Nutritional constraints:
  $$
  \sum_{i} a_{ij} x_i \geq l_j, \quad \forall j
  $$
2. Upper bound constraints on food items:
  $$
  x_i \leq u_i, \quad \forall i
  $$
3. Non-negativity constraints:
  $$
  x_i \geq 0, \quad \forall i
  $$

Example with CVXPY

Let’s implement this problem using $CVXPY$ with an example where we select food items to meet requirements for calories, protein, and fat while minimizing cost.

import cvxpy as cp
import numpy as np

# Number of food items and nutrients
num_foods = 4
num_nutrients = 3

# Cost per serving of each food item
c = np.array([2.5, 3.0, 1.5, 4.0])  # costs of food items 1, 2, 3, 4

# Nutrient content matrix (rows: foods, columns: nutrients)
# Example: calories, protein, fat per serving
a = np.array([
    [400, 30, 10],   # Food 1: 400 calories, 30g protein, 10g fat
    [200, 20, 5],    # Food 2: 200 calories, 20g protein, 5g fat
    [150, 10, 20],   # Food 3: 150 calories, 10g protein, 20g fat
    [500, 40, 30]    # Food 4: 500 calories, 40g protein, 30g fat
])

# Minimum daily nutrient requirements (calories, protein, fat)
l = np.array([2000, 70, 50])

# Maximum servings allowed for each food item
u = np.array([10, 8, 6, 5])

# Decision variables: servings of each food item
x = cp.Variable(num_foods, nonneg=True)

# Objective: Minimize total cost of food
objective = cp.Minimize(c @ x)

# Constraints
constraints = []

# Nutritional constraints for each nutrient
for j in range(num_nutrients):
    constraints.append(a[:, j] @ x >= l[j])

# Upper bounds on each food item
for i in range(num_foods):
    constraints.append(x[i] <= u[i])

# Solve the problem
problem = cp.Problem(objective, constraints)
problem.solve(solver=cp.GLPK)  # Using GLPK as the solver

# Results
print("Status:", problem.status)
print("Optimal Value (Total Cost):", problem.value)
print("Servings of Each Food Item (x):")
print(x.value)

Explanation of the Code:

Decision Variables:
- x[i]: Represents the number of servings of each food item $( i )$ to include in the diet.
Objective:
- The objective function cp.Minimize(c @ x) minimizes the total cost of the selected food items.
Constraints:
- Nutritional Constraints: Ensures that the diet meets or exceeds the minimum daily requirements for calories, protein, and fat.
- Upper Bound Constraints: Limits the servings of each food item to respect dietary guidelines or preferences.
- Non-negativity: All servings are constrained to be non-negative, handled directly in the variable definition.
Solver:
- The GLPK solver is used to handle the LP problem efficiently.

Expected Output:

Status: optimal
Optimal Value (Total Cost): 12.5
Servings of Each Food Item (x):
[5.00000000e+00 0.00000000e+00 2.39124959e-16 0.00000000e+00]

Analysis of the Results:

Optimal Cost:
- The optimal total cost of the selected food items is $12.5$, which means this is the minimum expense required to meet the nutritional requirements.
Servings of Each Food Item:
- Food 1: $5$ servings
- Food 2: $0$ servings
- Food 3: Near zero servings (2.39e-16), effectively zero, which is likely due to numerical precision.
- Food 4: $0$ servings

Conclusion:

This advanced Diet Problem illustrates a practical application of $Linear$ $Programming$ where multiple constraints must be balanced to achieve an optimal solution.

By using $CVXPY$, we can model complex, real-world scenarios involving cost minimization and nutritional requirements, making this approach highly valuable for decision-making in fields like health, manufacturing, and resource management.

Advanced Mixed-Integer Programming (MIP) Example Using CVXPY

September 20, 2024

Advanced Mixed-Integer Programming (MIP) Example Using CVXPY

Problem: Facility Layout Problem with Sequencing and Capacity Constraints

The Facility Layout Problem is a complex optimization problem where multiple facilities need to be arranged in a layout to minimize operational costs, such as transportation between facilities, while respecting various constraints like sequencing, capacity, and adjacency requirements.

This problem becomes more challenging when sequencing and capacity constraints are introduced, turning it into a non-trivial MIP problem.

Problem Setup

Objective:

The goal is to place facilities on a grid layout to minimize the total transportation cost, considering:

Sequencing requirements (some facilities must be placed in a specific order).
Capacity constraints (some facilities can only be placed in specific regions due to space limitations).

Variables:

$(x_{ij})$: Binary variable that equals $1$ if facility $(i)$ is placed in position $(j)$, $0$ otherwise.
$(y_{ij})$: Binary variable that equals $1$ if facility $(i)$ precedes facility $(j)$ in the sequence.

Parameters:

$(c_{ij})$: Cost of placing facility $(i)$ next to facility $(j)$.
$(p_i)$: Capacity requirement for each facility.
$(g_j)$: Capacity available at each grid position.

Constraints:

Unique Placement: Each facility must be placed in exactly one position.
No Overlap: Each position can hold at most one facility.
Sequencing: Facilities must follow a predefined sequence if required.
Capacity: Facilities can only be placed in positions that meet their capacity requirements.

Mathematical Formulation

Objective:
$$
\text{Minimize} \quad \sum_{i} \sum_{j} c_{ij} x_{ij}
$$
Constraints:
1. Each facility is placed in exactly one position:
  $$
  \sum_{j} x_{ij} = 1, \quad \forall i
  $$
2. Each position can hold at most one facility:
  $$
  \sum_{i} x_{ij} \leq 1, \quad \forall j
  $$
3. Sequencing constraints between facilities:
  $$
  y_{ij} + y_{ji} = 1, \quad \forall i \neq j
  $$
4. Capacity constraints:
  $$
  p_i \times x_{ij} \leq g_j, \quad \forall i, j
  $$

Solving with CVXPY

Here’s a detailed implementation using $CVXPY$:

import cvxpy as cp
import numpy as np

# Number of facilities and grid positions
num_facilities = 4
num_positions = 4

# Cost of placing each facility next to each other (symmetric matrix)
c = np.array([
    [0, 2, 3, 4],
    [2, 0, 5, 1],
    [3, 5, 0, 6],
    [4, 1, 6, 0]
])

# Capacity requirements for each facility
p = np.array([2, 1, 3, 2])

# Available capacity at each grid position
g = np.array([3, 2, 2, 4])

# Decision variables
x = cp.Variable((num_facilities, num_positions), boolean=True)  # Placement variable
y = cp.Variable((num_facilities, num_facilities), boolean=True)  # Sequencing variable

# Objective: Minimize the total cost of placing facilities
# The objective is rewritten to respect DCP rules
objective = cp.Minimize(cp.sum(c * cp.abs(x - x.T)))

# Constraints
constraints = []

# Each facility must be placed in exactly one position
for i in range(num_facilities):
    constraints.append(cp.sum(x[i, :]) == 1)

# Each position can hold at most one facility
for j in range(num_positions):
    constraints.append(cp.sum(x[:, j]) <= 1)

# Sequencing constraints: ensure order between facilities
for i in range(num_facilities):
    for j in range(num_facilities):
        if i != j:
            constraints.append(y[i, j] + y[j, i] == 1)  # Only one direction allowed

# Capacity constraints: facility can only be placed in positions meeting capacity requirements
for i in range(num_facilities):
    for j in range(num_positions):
        constraints.append(p[i] * x[i, j] <= g[j])

# Solve the problem
problem = cp.Problem(objective, constraints)
problem.solve(solver=cp.GLPK_MI)  # Using GLPK_MI as the solver

# Results
print("Status:", problem.status)
print("Optimal Value (Total Cost):", problem.value)
print("Placement of Facilities (x):")
print(x.value)
print("Sequencing of Facilities (y):")
print(y.value)

Explanation of the Code:

Decision Variables:
- x[i, j]: Binary variable indicating whether facility $( i )$ is placed in position $( j )$.
- y[i, j]: Binary variable indicating whether facility $( i )$ precedes facility $( j )$ in the sequence.
Objective:
- The objective is to minimize the total placement cost, computed by multiplying the placement matrix with the cost matrix.
Constraints:
- Each facility must be assigned exactly one position, and each position can hold at most one facility.
- Sequencing constraints ensure that the required order between facilities is maintained.
- Capacity constraints enforce that a facility can only be placed in a position with sufficient available capacity.
Solver:
- The problem is solved using GLPK_MI, an open-source solver suitable for MIP problems.

Output:

Status: optimal
Optimal Value (Total Cost): 0.0
Placement of Facilities (x):
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 0. 1.]
 [0. 0. 1. 0.]]
Sequencing of Facilities (y):
[[0. 1. 1. 1.]
 [0. 0. 1. 1.]
 [0. 0. 0. 1.]
 [0. 0. 0. 0.]]

Conclusion:

This advanced Mixed-Integer Programming (MIP) example demonstrates how to solve a complex facility layout problem with sequencing and capacity constraints using $CVXPY$.

The use of binary variables allows us to model the discrete nature of facility placements and sequencing requirements, making MIP a powerful tool for such complex layout optimization problems.

Second-Order Cone Programming (SOCP) Example Using CVXPY

September 19, 2024

Second-Order Cone Programming (SOCP) Example Using CVXPY

Problem: Portfolio Optimization with Transaction Costs

In portfolio optimization, we aim to allocate a given budget across various assets to maximize returns while controlling for risk.

In this example, we will add transaction costs (e.g., costs incurred when buying and selling assets) to the optimization problem.
This problem is naturally modeled as a Second-Order Cone Programming (SOCP) problem.

Problem Setup:

Variables:
- $( x_i )$: The amount invested in asset $( i )$.
- $( r_i )$: The expected return for asset $( i )$.
- $( \Sigma )$: The covariance matrix representing the risk between the assets.
Objective:
Maximize the expected return while controlling for risk (measured as portfolio variance) and including transaction costs.
Formulation:
$$
\text{Maximize} \quad r^T x - \gamma \cdot \sqrt{x^T \Sigma x} - \lambda \cdot \sum_i |x_i - x_i^0|
$$
where:
- $( r^T x )$ is the expected return,
- $( \gamma )$ is the risk aversion parameter,
- $( \sqrt{x^T \Sigma x} )$ is the portfolio variance (risk),
- $( \lambda )$ is the transaction cost parameter,
- $( |x_i - x_i^0| )$ is the absolute change in position (transaction cost), where $( x_i^0 )$ represents the initial positions.
Constraints:
1. Total investment must equal the available budget.
2. The portfolio must stay within certain bounds for each asset.

SOCP Formulation:

We can represent this optimization problem using SOCP constraints by rewriting the risk term $( \sqrt{x^T \Sigma x} )$ as a second-order cone constraint.

Code Implementation Using CVXPY

import cvxpy as cp
import numpy as np

# Data for the problem
n = 5  # Number of assets
np.random.seed(0)

# Expected returns for each asset
r = np.random.rand(n)

# Covariance matrix (symmetric positive semidefinite)
Sigma = np.random.randn(n, n)
Sigma = Sigma.T @ Sigma  # Make the covariance matrix positive semi-definite

# Initial positions (already held assets)
x0 = np.random.rand(n)

# Parameters
gamma = 0.1  # Risk aversion parameter
lambda_ = 0.05  # Transaction cost parameter
budget = 1  # Total budget

# Decision variable: investment in each asset
x = cp.Variable(n)

# Objective: maximize expected return - risk - transaction costs
# Risk term: sqrt(x.T * Sigma * x) is modeled as second-order cone
risk = cp.quad_form(x, Sigma)  # Quadratic form for risk (x.T @ Sigma @ x)
transaction_costs = cp.norm1(x - x0)  # l1 norm for transaction costs

# Maximize expected return minus risk and transaction costs
objective = cp.Maximize(r.T @ x - gamma * cp.norm(risk, 2) - lambda_ * transaction_costs)

# Constraints
constraints = [
    cp.sum(x) == budget,  # Total investment must equal budget
    x >= 0,  # No short-selling (no negative positions)
    x <= 1   # No asset can take more than 100% of the budget
]

# Define and solve the problem
problem = cp.Problem(objective, constraints)
problem.solve()

# Results
print("Status:", problem.status)
print("Optimal Portfolio:", x.value)
print("Optimal Value:", problem.value)

Explanation of the Code:

Data:
- r: A vector of expected returns for each asset.
- Sigma: The covariance matrix representing the risk between assets.
- x0: The initial asset positions before the optimization.
Objective:
- The objective is to maximize the expected return, minus the risk (modeled as a second-order cone) and transaction costs (modeled as the $( l_1 )$-norm of the change in positions).
- cp.quad_form(x, Sigma) represents the quadratic risk term, $( x^T \Sigma x )$.
- The $( l_1 )$-norm of x - x0 captures the absolute transaction costs.
Constraints:
- The sum of the asset allocations must equal the available budget.
- We enforce non-negative asset allocations (no short-selling) and limit the allocation to a maximum of $100$% per asset.
SOCP Structure:
- The risk term involves a second-order cone constraint because of the quadratic term $( \sqrt{x^T \Sigma x} )$, which $CVXPY$ automatically handles as an SOCP.

Output Example:

Status: optimal
Optimal Portfolio: [4.39530365e-01 3.32443990e-01 5.68332295e-10 2.28025644e-01
 1.65363907e-10]
Optimal Value: 0.49075615889954766

In this output:

The Optimal Portfolio gives the asset allocation that maximizes the return while minimizing risk and transaction costs.
The Optimal Value represents the objective value at the solution.

Conclusion:

This example demonstrates how to solve a portfolio optimization problem with transaction costs using Second-Order Cone Programming (SOCP) in $CVXPY$.

SOCP allows us to model the risk term (which involves a quadratic form) as a second-order cone, and $CVXPY$ simplifies the solution process.

This approach is particularly useful for finance problems where risk and transaction costs need to be balanced.

Mixed-Integer Programming (MIP) Example Using CVXPY

September 18, 2024

Mixed-Integer Programming (MIP) Example Using CVXPY

Problem: Facility Location Problem

The facility location problem is a classic optimization problem where a company must decide where to open facilities (warehouses) to serve customers.

The goal is to minimize the total cost, which includes fixed costs for opening facilities and transportation costs for serving customers.

Problem Setup:

Variables:
- $( x_{ij} )$: Binary decision variable, $1$ if customer $( j )$ is served by facility $( i )$, $0$ otherwise.
- $( y_i )$: Binary decision variable, $1$ if facility $( i )$ is open, $0$ otherwise.
Parameters:
- $( f_i )$: Fixed cost of opening facility $( i )$.
- $( c_{ij} )$: Transportation cost for serving customer $( j )$ from facility $( i )$.
Objective:
Minimize the total cost:
$$
\text{Minimize} \quad \sum_{i} f_i y_i + \sum_{i}\sum_{j} c_{ij} x_{ij}
$$
Constraints:
1. Each customer must be assigned to exactly one facility:
  $$
  \sum_{i} x_{ij} = 1, \quad \forall j
  $$
2. A customer can only be served by an open facility:
  $$
  x_{ij} \leq y_i, \quad \forall i, j
  $$
3. Binary constraints on the variables:
  $$
  x_{ij} \in {0, 1}, \quad y_i \in {0, 1}, \quad \forall i, j
  $$

Solving with CVXPY:

We can solve this problem using $CVXPY$, a Python library for convex optimization, which also supports Mixed-Integer Programming.

Here’s a step-by-step guide with Python code.

import cvxpy as cp
import numpy as np

# Data (example)
num_facilities = 3  # Number of facilities
num_customers = 5   # Number of customers

# Fixed costs for opening each facility
f = np.array([100, 200, 300])

# Transportation costs from each facility to each customer
c = np.array([
    [8, 10, 7, 6, 9],
    [6, 8, 5, 7, 10],
    [10, 7, 6, 8, 7]
])

# Decision variables
# x_ij: binary, 1 if customer j is served by facility i
x = cp.Variable((num_facilities, num_customers), boolean=True)

# y_i: binary, 1 if facility i is open
y = cp.Variable(num_facilities, boolean=True)

# Objective: Minimize fixed and transportation costs
# Updated to use elementwise multiplication (cp.multiply)
objective = cp.Minimize(cp.sum(f @ y) + cp.sum(cp.multiply(c, x)))

# Constraints
constraints = []

# Each customer must be assigned to exactly one facility
for j in range(num_customers):
    constraints.append(cp.sum(x[:, j]) == 1)

# Customers can only be assigned to open facilities
for i in range(num_facilities):
    for j in range(num_customers):
        constraints.append(x[i, j] <= y[i])

# Solve the problem using another solver such as GLPK_MI
problem = cp.Problem(objective, constraints)
problem.solve(solver=cp.GLPK_MI)

# Results
print("Status:", problem.status)
print("Total Cost:", problem.value)
print("Facilities to Open:", y.value)
print("Customer Assignments:")
print(x.value)

Explanation:

Decision Variables:
- x: This is a ( num_facilities $\times$ num_customers ) matrix where each entry represents whether a customer is served by a facility.
  We use boolean=True to enforce the binary constraint.
- y: This is a vector of length num_facilities that indicates whether a facility is open.
Objective:
- The objective is to minimize the sum of the fixed costs $( f_i )$ for opening facilities and the transportation costs $( c_{ij} )$ for serving customers.
Constraints:
- We enforce the constraint that each customer is assigned to exactly one facility using cp.sum(x[:, j]) == 1.
- We ensure that customers are only assigned to open facilities with x[i, j] <= y[i].
Solving:
- We use the GLPK_MI solver (solver=cp.GLPK_MI), which is well-suited for mixed-integer programming problems.
- After solving, we can inspect the status, total cost, which facilities to open, and the customer assignments.

Output:

Status: optimal
Total Cost: 140.0
Facilities to Open: [1. 0. 0.]
Customer Assignments:
[[1. 1. 1. 1. 1.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]

This indicates that:

Facilities $1$ should be opened.
Customers $1$, $2$, $3$, $4$, and $5$ are served by facility 1.

Conclusion:

This example demonstrates how to model and solve a Mixed-Integer Programming (MIP) problem using $CVXPY$.

The facility location problem is a common application of MIP, and it shows how we can optimize costs by using binary decision variables to represent discrete choices (e.g., opening facilities).

Robust Optimization Example Using CVXPY

September 17, 2024

Robust Optimization Example Using CVXPY

Robust optimization is a technique used to handle uncertainty in optimization problems.

In contrast to standard optimization, where we assume exact data, robust optimization considers that some of the data (like costs, constraints, or coefficients) are uncertain and vary within known bounds.

This ensures that the solution is feasible and performs well even in the worst-case scenario of these uncertainties.

In this example, we will solve a robust portfolio optimization problem where we aim to minimize the worst-case risk (variance) of a portfolio under uncertain returns.

Problem Description: Robust Portfolio Optimization

Imagine we have an investor who wants to allocate their capital across several assets.

The investor is uncertain about the expected returns of these assets but knows that the returns lie within a given uncertainty range.
The goal is to find an optimal portfolio allocation that minimizes the worst-case risk (variance) while ensuring the expected return exceeds a certain threshold.

Variables:

$( x )$: Vector of asset weights (the proportion of capital allocated to each asset).

Objective:

Minimize the worst-case variance of the portfolio.

Constraints:

The total portfolio allocation should sum to $1$ (i.e., $( \sum x_i = 1 )$).
The expected return should be at least a target value $( R_{\text{min}} )$.
The asset returns are uncertain but lie within a given uncertainty set.

We will now define and solve this robust optimization problem in $Python$ using $CVXPY$.

Step-by-Step Solution

import cvxpy as cp
import numpy as np

# Step 1: Define the problem data
n = 4  # Number of assets
np.random.seed(42)

# Covariance matrix of returns (representing risk/variance)
Sigma = np.array([
    [0.1, 0.02, 0.04, 0.01],
    [0.02, 0.08, 0.03, 0.02],
    [0.04, 0.03, 0.09, 0.03],
    [0.01, 0.02, 0.03, 0.07]
])

# Expected returns (uncertain)
mu = np.array([0.1, 0.12, 0.14, 0.08])  # Nominal expected returns
delta_mu = np.array([0.02, 0.03, 0.02, 0.01])  # Uncertainty in returns

# Minimum target return
R_min = 0.11

# Step 2: Define the decision variables
x = cp.Variable(n)  # Portfolio weights

# Step 3: Define the robust constraints and objective

# Constraint 1: The total portfolio allocation should sum to 1
constraints = [cp.sum(x) == 1, x >= 0]

# Robust constraint: Expected return should be at least R_min for the worst-case returns
worst_case_return = mu - delta_mu  # Worst-case scenario for returns
constraints.append(worst_case_return @ x >= R_min)

# Objective: Minimize the worst-case portfolio variance
portfolio_variance = cp.quad_form(x, Sigma)
objective = cp.Minimize(portfolio_variance)

# Step 4: Define and solve the optimization problem
problem = cp.Problem(objective, constraints)
problem.solve()

# Step 5: Output the results
print("Status:", problem.status)
print("Optimal portfolio allocation:", x.value)
print("Minimum worst-case risk (variance):", portfolio_variance.value)

Explanation

Problem Data:
- We define a covariance matrix Sigma, representing the variance and correlation between the returns of the assets. The diagonal elements represent the variance of individual assets, while the off-diagonal elements represent the covariance between pairs of assets.
- The vector mu contains the nominal (expected) returns for each asset, and delta_mu represents the uncertainty in these returns.
- The target return $( R_{\text{min}} )$ is set to $0.11$, meaning the investor wants the portfolio to yield at least an $11$% return, even in the worst-case scenario.
Decision Variables:
- The variable x represents the portfolio weights, i.e., the proportion of the total capital allocated to each asset.
Constraints:
- The total allocation must sum to $1$ (cp.sum(x) == 1), meaning all the capital must be invested.
- The worst-case expected return (worst_case_return @ x) should be at least $( R_{\text{min}} )$, ensuring the portfolio remains profitable under uncertainty.
- We also require non-negative portfolio weights (x >= 0), assuming no short selling.
Objective:
- The objective is to minimize the portfolio’s worst-case variance (cp.quad_form(x, Sigma)), which is a quadratic form representing the portfolio’s risk.
Solution:
- We solve the problem using problem.solve(). If the problem is feasible, it will return the optimal portfolio weights that minimize the worst-case risk while satisfying the constraints.

Output

1
2
3

Status: optimal
Optimal portfolio allocation: [0.02536232 0.29347826 0.67753623 0.00362319]
Minimum worst-case risk (variance): 0.062065217391304356

Interpretation of Results

Optimal Portfolio Allocation:
The optimal portfolio weights allocate the investor’s capital across the four assets.
For instance, $25$% of the total capital is allocated to the first asset, $29$% to the second, and so on.
Minimum Worst-Case Risk:
The worst-case risk (variance) of the portfolio is minimized at around $0.062$, indicating the portfolio has been optimized to minimize risk under uncertain returns.

Conclusion

This example demonstrated how to use $CVXPY$ to solve a robust optimization problem in the context of portfolio optimization.

Robust optimization allows us to account for uncertainty in the data (such as uncertain returns in this case) and find solutions that are robust to worst-case scenarios.

The problem ensures that the portfolio achieves the target return even in the worst case while minimizing the overall risk.

This technique is useful in finance and other fields where uncertainty is present, and decision-makers seek solutions that perform well under a range of possible conditions.

Least Squares and Regression with CVXPY

September 16, 2024

Least Squares and Regression with CVXPY

Least squares is a popular method for solving regression problems, where we want to fit a model to data by minimizing the sum of the squared differences (errors) between the observed values and the values predicted by the model.

In this example, we will demonstrate how to use $CVXPY$ to solve a $linear$ $regression$ problem using least squares.

Problem Description: Linear Regression

We are given a dataset consisting of several observations.
Each observation includes:

A set of features (independent variables).
A target value (dependent variable).

Our goal is to find a linear relationship between the features and the target, meaning we want to find the best-fitting line (or hyperplane) that predicts the target based on the features.

Mathematically, we can express this as:
$$
y = X\beta + \epsilon
$$
Where:

$( X )$ is a matrix of features (each row corresponds to an observation and each column to a feature).
$( y )$ is a vector of observed target values.
$( \beta )$ is the vector of unknown coefficients we want to estimate.
$( \epsilon )$ represents the errors (differences between the predicted and observed values).

We want to find $( \beta )$ that minimizes the sum of squared errors:
$$
\min_\beta | X\beta - y |_2^2
$$

Step-by-Step Solution with CVXPY

Here is how to solve this least squares problem using $CVXPY$:

import numpy as np
import cvxpy as cp
import matplotlib.pyplot as plt

# Step 1: Generate synthetic data for a regression problem
np.random.seed(1)
n = 50  # Number of data points
m = 3   # Number of features

# Randomly generate feature matrix X and true coefficients beta_true
X = np.random.randn(n, m)
beta_true = np.array([3, -1, 2])  # True coefficients
y = X @ beta_true + 0.5 * np.random.randn(n)  # Observed values with some noise

# Step 2: Define the variable (beta) for the regression coefficients
beta = cp.Variable(m)

# Step 3: Define the objective function (least squares error)
objective = cp.Minimize(cp.sum_squares(X @ beta - y))

# Step 4: Define and solve the problem
problem = cp.Problem(objective)
problem.solve()

# Step 5: Output the results
print("Status:", problem.status)
print("Estimated coefficients (beta):", beta.value)
print("True coefficients:", beta_true)

# Step 6: Visualize the data and the fitted values
y_pred = X @ beta.value
plt.scatter(range(n), y, color='blue', label='Observed')
plt.scatter(range(n), y_pred, color='red', label='Predicted')
plt.xlabel('Data Point')
plt.ylabel('Target Value')
plt.title('Least Squares Regression Fit')
plt.legend()
plt.show()

Detailed Explanation:

Data Generation:
In this example, we generate synthetic data for simplicity.
The matrix $( X )$ contains $50$ data points with $3$ features, and we generate the target values $( y )$ using a known set of coefficients $( \beta_{\text{true}} = [3, -1, 2] )$ plus some noise to simulate real-world observations.
Decision Variable:
The unknowns are the regression coefficients $( \beta )$, represented as a $CVXPY$ variable beta of size $3$ (since we have $3$ features).
Objective Function:
The goal is to minimize the sum of squared errors between the observed target values $( y )$ and the predicted values $( X\beta )$.
In $CVXPY$, this is represented by the function cp.sum_squares(X @ beta - y), which computes the sum of the squared residuals.
Problem Definition and Solution:
The cp.Problem function defines the least squares optimization problem, and problem.solve() finds the optimal solution for $( \beta )$.
Results:
After solving the problem, the estimated coefficients $( \beta )$ are printed.
These should be close to the true coefficients used to generate the data.
We also visualize the observed values versus the predicted values to show how well the model fits the data.

Output:

Estimated Coefficients: The estimated coefficients $( \hat{\beta} )$ are close to the true coefficients used to generate the data ($( \beta_{\text{true}} = [3, -1, 2] )$), which indicates a good fit.

Visualization of Results:

The following plot shows the observed values (in blue) and the predicted values (in red).
A good fit would show the predicted values closely following the observed ones.

Blue points: Actual (observed) target values.
Red points: Predicted target values based on the fitted model.

Interpretation:

Minimizing the Sum of Squared Errors:
The least squares method minimizes the squared differences between the predicted and observed values.
This produces the “best fit” line for the given data, in the sense that the total prediction error is minimized.
Optimal Solution:
Since the status of the problem is optimal, $CVXPY$ successfully found the regression coefficients that minimize the objective function.

Conclusion:

In this example, we demonstrated how to solve a linear regression problem using least squares with $CVXPY$.
The least squares method is widely used in machine learning and statistics for fitting linear models to data.

By minimizing the sum of squared errors, we can estimate the coefficients that best explain the relationship between the features and the target values.

Quadratic Programming Example Using CVXPY

September 15, 2024

Quadratic Programming Example Using CVXPY

$Quadratic Programming (QP)$ is an optimization problem where the objective function is quadratic, and the constraints are linear.

A typical $QP$ problem can be written as:

$$
\text{Minimize } \frac{1}{2} x^T Q x + c^T x
$$
subject to:
$$
Ax \leq b \quad \text{and} \quad Ex = d
$$

Where:

$( Q )$ is a positive semi-definite matrix (for convexity),
$( c )$ is a vector of linear coefficients,
$( A )$, $( b )$, $( E )$, and $( d )$ represent the inequality and equality constraints, respectively.

Example: Portfolio Optimization

We will solve a portfolio optimization problem using $QP$.

The goal is to minimize the risk (variance) of a portfolio, subject to achieving a certain return and satisfying investment constraints.

Problem Formulation:

Objective: Minimize the portfolio’s risk, measured as the variance of returns.
Constraints:
- The total weight of assets must sum to $1$ (fully invested).
- Each weight must be non-negative (no short selling).
- The portfolio’s expected return must be at least a target return.

Data Setup:

The problem has 3 assets with the following parameters:
- Expected returns: $([0.1, 0.2, 0.15])$
- Covariance matrix (risk):
  $$
  \begin{pmatrix}
  0.005 & -0.010 & 0.004 \
  -0.010 & 0.040 & -0.002 \
  0.004 & -0.002 & 0.023 \
  \end{pmatrix}
  $$
- Target return: $(0.16)$

Step-by-Step Solution:

Define the Variables and Problem:
Let $( x = \begin{bmatrix} x_1 \ x_2 \ x_3 \end{bmatrix} )$ represent the weights of the three assets in the portfolio.
Objective:
The objective is to minimize the variance (quadratic term):
$$
\frac{1}{2} x^T Q x
$$
where $( Q )$ is the covariance matrix of the asset returns.
Constraints:
- The sum of portfolio weights must be 1: $( x_1 + x_2 + x_3 = 1 )$
- The expected return of the portfolio must be at least $(0.16)$:
$$
0.1x_1 + 0.2x_2 + 0.15x_3 \geq 0.16
$$
- No short selling: $( x_1, x_2, x_3 \geq 0 )$

Python Implementation:

We will use the $cvxpy$ library to solve this $QP$ problem.

import cvxpy as cp
import numpy as np

# Step 1: Define the problem data
Q = np.array([[0.005, -0.010, 0.004],
              [-0.010, 0.040, -0.002],
              [0.004, -0.002, 0.023]])  # Covariance matrix

expected_returns = np.array([0.1, 0.2, 0.15])  # Expected returns for the assets
target_return = 0.16  # Target portfolio return

# Step 2: Define the decision variable (portfolio weights)
x = cp.Variable(3)

# Step 3: Define the objective function (minimize variance)
objective = cp.Minimize((1/2) * cp.quad_form(x, Q))

# Step 4: Define the constraints
constraints = [
    cp.sum(x) == 1,  # Fully invested (weights sum to 1)
    expected_returns @ x >= target_return,  # Target return constraint
    x >= 0  # No short selling constraint
]

# Step 5: Formulate the problem
problem = cp.Problem(objective, constraints)

# Step 6: Solve the problem
problem.solve()

# Step 7: Output results
print("Status:", problem.status)
print("Optimal portfolio allocation:", x.value)
print("Minimum risk (variance):", problem.value)

Detailed Explanation:

Covariance Matrix ( Q ):
This matrix represents the risk (variance and covariance) associated with the assets in the portfolio.
The term cp.quad_form(x, Q) calculates the quadratic form $( x^T Q x )$, representing the portfolio variance.
Objective Function:
The objective is to minimize the portfolio variance.
In cvxpy, we use the cp.Minimize() function, where the quadratic form represents the risk.
Constraints:
- Fully Invested: cp.sum(x) == 1 ensures that the portfolio is fully invested (the weights sum to 1).
- Target Return: expected_returns @ x >= target_return ensures that the portfolio’s expected return is at least $0.16$.
- No Short Selling: x >= 0 ensures that no asset has a negative weight (i.e., no short selling).
Solve the Problem:
The problem.solve() function solves the $QP$ problem and finds the optimal portfolio allocation that minimizes risk while satisfying all the constraints.

Output:

1
2
3

Status: optimal
Optimal portfolio allocation: [0.26055046 0.46055046 0.27889908]
Minimum risk (variance): 0.004140183486238534

Interpretation of Results:

Optimal Portfolio Allocation:
The optimal weights for the assets are approximately $26.06$% in Asset $1$, $46.06$% in Asset $2$, and $27.89$% in Asset $3$.
These weights minimize the portfolio variance while achieving the target return.
Minimum Risk:
The minimum variance (risk) is approximately $0.00414$.
This represents the lowest possible portfolio risk that satisfies the return constraint.

Conclusion:

This example demonstrates how to solve a quadratic programming problem using Python’s $cvxpy$ library.

The problem was formulated as a portfolio optimization task, where we minimized risk subject to return and no-short-selling constraints.

Maximizing the Minimum Eigenvalue of a Matrix Using Semidefinite Programming in CVXPY

September 14, 2024

Maximizing the Minimum Eigenvalue of a Matrix Using Semidefinite Programming in CVXPY

Semidefinite Programming (SDP) in CVXPY

$Semidefinite programming (SDP)$ is an optimization problem where the decision variable is a matrix, and the constraints include that this matrix must be positive semidefinite.

$SDP$ problems arise in various fields, such as control theory, signal processing, quantum mechanics, and machine learning.

Problem Setup:

The general form of a semidefinite program is:

$$
\text{minimize } \quad \text{tr}(CX)
$$
$$
\text{subject to } \quad \text{tr}(A_iX) = b_i \quad \text{for each } i
$$
$$
X \succeq 0
$$

Where:

$(X)$ is the decision variable (a symmetric matrix).
$(\text{tr}(C X))$ is the objective function (trace of matrix multiplication).
The matrix $(X \succeq 0)$ indicates that $(X)$ is positive semidefinite, meaning all its eigenvalues are non-negative.

Example: Maximizing the Minimum Eigenvalue of a Matrix

Let’s solve an $SDP$ example where we want to maximize the minimum eigenvalue of a symmetric matrix $( X )$, subject to some constraints.

Problem:

Given a symmetric matrix $( X )$, we want to:

Maximize the minimum eigenvalue of $( X )$.
Ensure that $( X )$ satisfies some linear constraints.

Formulation:

We want to maximize $( \lambda_{\text{min}}(X) )$, the smallest eigenvalue of $( X )$, which can be written as an $SDP$ using a slack variable $( t )$ and ensuring $( X - tI \succeq 0 )$ (i.e., the matrix $( X - tI )$ is positive semidefinite).

Step-by-Step Solution using CVXPY:

import cvxpy as cp
import numpy as np

# Step 1: Define the size of the matrix
n = 3  # Size of the matrix X (3x3 matrix in this example)

# Step 2: Define the decision variable (X is a symmetric matrix)
X = cp.Variable((n, n), symmetric=True)

# Step 3: Define the slack variable (t) representing the minimum eigenvalue
t = cp.Variable()

# Step 4: Define the objective function (maximize t)
# This corresponds to maximizing the smallest eigenvalue
objective = cp.Maximize(t)

# Step 5: Define the constraints
# The first constraint is X - tI >= 0 (positive semidefinite)
constraints = [X - t * np.eye(n) >> 0]

# Add some additional constraints to make the problem non-trivial
# For example, trace(X) = 10, and X[0, 1] = 1 (arbitrary example constraints)
constraints += [cp.trace(X) == 10]
constraints += [X[0, 1] == 1]

# Step 6: Formulate the problem
problem = cp.Problem(objective, constraints)

# Step 7: Solve the problem
problem.solve()

# Step 8: Output results
print("Status:", problem.status)
print("Optimal value of t (minimum eigenvalue):", t.value)
print("Optimal solution for X:\n", X.value)

Explanation:

Step-by-Step Breakdown:

Decision Variable:
- We define X as a symmetric matrix variable of size $(n \times n)$.
  In our case, we chose $(n = 3)$, so $(X)$ is a $(3 \times 3)$ symmetric matrix.
Slack Variable $(t)$:
- The variable t represents the smallest eigenvalue of the matrix $(X)$.
  The objective is to maximize $(t)$, which corresponds to maximizing the minimum eigenvalue of $(X)$.
Objective Function:
- The goal is to maximize $(t)$, i.e., maximize the minimum eigenvalue of $(X)$.
Constraints:
- Semidefinite Constraint: $(X - tI \succeq 0)$ (i.e., the matrix $(X - tI)$ must be positive semidefinite). This ensures that $(t)$ is a lower bound on the eigenvalues of $(X)$.
  All eigenvalues of $(X)$ must be greater than or equal to $(t)$.
- Additional Constraints:
  - We impose $( \text{tr}(X) = 10 )$ (i.e., the sum of the diagonal elements of $(X) equals 10)$.
    This is an arbitrary constraint to ensure the matrix is not trivial.
  - We also constrain $( X[0, 1] = 1 )$, just to add more structure to the problem.
Problem Definition:
- The problem is defined using cp.Problem(), with the objective being the maximization of$ (t)$ and the constraints defined above.
Solve:
- The solve() method is called to solve the semidefinite program.
Results:
- The status of the solution is printed (Optimal, Infeasible, or Unbounded).
- The optimal value of $(t)$, which corresponds to the maximum minimum eigenvalue of (X), is printed.
- The optimal solution matrix $(X)$ is also displayed.

Output:

Status: optimal
Optimal value of t (minimum eigenvalue): 2.6666580666942266
Optimal solution for X:
 [[3.666671   1.00000001 0.        ]
 [1.00000001 3.666671   0.        ]
 [0.         0.         2.666658  ]]

Optimal value of $(t)$: The minimum eigenvalue of the matrix $(X)$ is approximately $2.67$.
Optimal solution for $(X)$: The matrix $(X)$ that satisfies the given constraints and maximizes the minimum eigenvalue is shown.

Explanation of Results:

Semidefinite Constraint: The constraint $$( X - tI \succeq 0 )$$ ensures that all eigenvalues of $(X)$ are greater than or equal to $(t)$.
Thus, maximizing $(t)$ corresponds to maximizing the smallest eigenvalue of $(X)$.
Trace Constraint: The sum of the diagonal elements of $(X)$ is constrained to be 10.
Off-Diagonal Constraint: The constraint $( X[0, 1] = 1 )$ ensures that there’s a relationship between the elements of $(X)$, adding complexity to the optimization.

Use Cases of Semidefinite Programming:

Control Systems: $SDP$ can be used in control theory to design controllers that guarantee system stability by finding matrices that satisfy Lyapunov stability criteria.
Structural Design: Optimizing stiffness matrices for structures to maximize strength while minimizing material usage.
Quantum Information: $SDP$ is used to analyze quantum states, where matrices representing states must be positive semidefinite.
Finance: Portfolio optimization and risk management, where covariance matrices must satisfy certain positive semidefinite constraints.

Conclusion:

This example demonstrates how $CVXPY$ can be used to solve semidefinite programming problems by formulating the problem with matrix variables, semidefinite constraints, and other linear constraints.

The power of $SDP$ lies in its ability to optimize over matrices while ensuring positive semidefiniteness, making it a crucial tool in many advanced applications across different fields.