# Define the data for the problem c = np.array([1, 2]) # Coefficients for the objective function A = np.array([[1, 1], [1, -1], [-1, 2]]) # Coefficients for the constraints b = np.array([2, 1, 2]) # Right-hand side values for the constraints
# Define the optimization variables x = cp.Variable(2)
# Define the objective function: minimize c^T x objective = cp.Minimize(c @ x)
# Define the constraints: Ax <= b and x >= 0 constraints = [A @ x <= b, x >= 0]
# Formulate the problem problem = cp.Problem(objective, constraints)
# Solve the problem problem.solve()
# Print the results print("Status:", problem.status) print("Optimal value of the objective function:", problem.value) print("Optimal values of the variables x:", x.value)
Explanation
Objective Function: c @ x is the dot product of the vector c with the variable vector x. We aim to minimize this value.
Constraints: A @ x <= b represents the inequality constraints, and x >= 0 ensures that the variables are non-negative.
Optimization: problem.solve() solves the optimization problem, and the optimal solution is stored in x.value.
Output
When you run this code, it will output the status of the optimization (e.g., “optimal”), the optimal value of the objective function, and the optimal values of the decision variables $( x )$.
1 2 3
Status: optimal Optimal value of the objective function: 3.698490338406144e-10 Optimal values of the variables x: [3.59660015e-10 5.09450927e-12]
This example is a basic introduction, but $CVXPY$ can handle more complex problems, including quadratic programming, mixed-integer programming, and other types of convex optimization problems.
# Step 6: Plot the forecast model.plot(forecast) plt.show()
# Step 7: Plot the forecast components model.plot_components(forecast) plt.show()
Explanation
ds: The column containing the dates.
y: The column containing the values to be forecasted.
fit: The method to train the model with your time series data.
make_future_dataframe: Prepares a dataframe to hold future predictions.
predict: Generates predictions for the given dates.
plot: Visualizes the forecast along with the observed data.
plot_components: Breaks down the forecast into its components (e.g., trend, weekly seasonality).
Result
Running this code will generate a plot of the time series data with the forecasted values and their uncertainty intervals, as well as a breakdown of the forecast components.
# Step 4: Set up the evolutionary algorithm defmain(): population = toolbox.population(n=300) NGEN = 40 CXPB = 0.7 MUTPB = 0.2
# Evaluate the entire population fitnesses = list(map(toolbox.evaluate, population)) for ind, fit inzip(population, fitnesses): ind.fitness.values = fit
# Step 5: Run the algorithm for gen inrange(NGEN): offspring = toolbox.select(population, len(population)) offspring = list(map(toolbox.clone, offspring))
for child1, child2 inzip(offspring[::2], offspring[1::2]): if random.random() < CXPB: toolbox.mate(child1, child2) del child1.fitness.values del child2.fitness.values
for mutant in offspring: if random.random() < MUTPB: toolbox.mutate(mutant) del mutant.fitness.values
invalid_ind = [ind for ind in offspring ifnot ind.fitness.valid] fitnesses = map(toolbox.evaluate, invalid_ind) for ind, fit inzip(invalid_ind, fitnesses): ind.fitness.values = fit
population[:] = offspring
fits = [ind.fitness.values[0] for ind in population]
length = len(population) mean = sum(fits) / length sum2 = sum(x*x for x in fits) std = abs(sum2 / length - mean**2)**0.5
print(f"Gen {gen}: Min {min(fits)} Max {max(fits)} Avg {mean} Std {std}")
Problem Definition: The fitness function (evalOneMax) is defined to maximize x^2.
Individual and Population: An individual is a list of integers, and the population is a list of individuals.
Evolutionary Operators:
mate: Uses two-point crossover.
mutate: Uses bit flip mutation with a small probability.
select: Uses tournament selection.
Algorithm Flow: The algorithm runs for a specified number of generations (NGEN), performing selection, crossover, mutation, and evaluation in each generation.
Output: The algorithm prints the minimum, maximum, average, and standard deviation of fitness values at each generation and finally outputs the best individual found.
Gen 0: Min 309.0 Max 787.0 Avg 562.7166666666667 Std 85.22716540060568 Gen 1: Min 358.0 Max 832.0 Avg 627.3066666666666 Std 76.52445767349342 Gen 2: Min 500.0 Max 862.0 Avg 680.6666666666666 Std 69.2917182801974 Gen 3: Min 521.0 Max 868.0 Avg 741.4666666666667 Std 60.18418027651102 Gen 4: Min 596.0 Max 924.0 Avg 781.9566666666667 Std 53.113226120136446 Gen 5: Min 624.0 Max 929.0 Avg 820.27 Std 50.70007001967588 Gen 6: Min 567.0 Max 941.0 Avg 849.4633333333334 Std 57.31010954059929 Gen 7: Min 681.0 Max 948.0 Avg 883.84 Std 42.49165878930357 Gen 8: Min 645.0 Max 971.0 Avg 904.7033333333334 Std 44.194969422119485 Gen 9: Min 744.0 Max 971.0 Avg 920.1233333333333 Std 34.01541006988307 Gen 10: Min 663.0 Max 976.0 Avg 930.0766666666667 Std 35.58638488086127 Gen 11: Min 671.0 Max 978.0 Avg 936.6533333333333 Std 38.082539597856474 Gen 12: Min 745.0 Max 981.0 Avg 947.5033333333333 Std 31.812628345080054 Gen 13: Min 854.0 Max 983.0 Avg 957.0966666666667 Std 25.950606715236244 Gen 14: Min 764.0 Max 987.0 Avg 960.02 Std 33.497755148667416 Gen 15: Min 691.0 Max 987.0 Avg 967.7833333333333 Std 32.976401494941406 Gen 16: Min 587.0 Max 988.0 Avg 966.6833333333333 Std 43.08553185880072 Gen 17: Min 784.0 Max 988.0 Avg 973.6033333333334 Std 27.84276067889405 Gen 18: Min 692.0 Max 988.0 Avg 974.3433333333334 Std 33.028151460366374 Gen 19: Min 690.0 Max 988.0 Avg 973.64 Std 39.178187809035656 Gen 20: Min 785.0 Max 988.0 Avg 980.99 Std 27.707578385703428 Gen 21: Min 789.0 Max 988.0 Avg 980.0 Std 31.1358314486693 Gen 22: Min 690.0 Max 988.0 Avg 978.8 Std 32.96391966984678 Gen 23: Min 794.0 Max 988.0 Avg 981.1233333333333 Std 26.293499619149838 Gen 24: Min 691.0 Max 988.0 Avg 981.0566666666666 Std 33.116463008937174 Gen 25: Min 788.0 Max 988.0 Avg 979.43 Std 31.13858967048373 Gen 26: Min 694.0 Max 988.0 Avg 981.4366666666666 Std 30.476209118300122 Gen 27: Min 788.0 Max 988.0 Avg 975.47 Std 38.43005464476852 Gen 28: Min 689.0 Max 988.0 Avg 977.1333333333333 Std 40.12283915953909 Gen 29: Min 789.0 Max 988.0 Avg 979.78 Std 32.672694001772946 Gen 30: Min 688.0 Max 988.0 Avg 978.4633333333334 Std 35.322164744281395 Gen 31: Min 697.0 Max 988.0 Avg 977.1433333333333 Std 35.65991384672285 Gen 32: Min 691.0 Max 988.0 Avg 977.4666666666667 Std 39.729530019313195 Gen 33: Min 689.0 Max 988.0 Avg 977.0733333333334 Std 35.092752654769896 Gen 34: Min 694.0 Max 988.0 Avg 979.7866666666666 Std 33.611919843347344 Gen 35: Min 689.0 Max 988.0 Avg 981.41 Std 29.468094045369302 Gen 36: Min 788.0 Max 988.0 Avg 979.11 Std 34.46947296763569 Gen 37: Min 694.0 Max 988.0 Avg 980.1366666666667 Std 31.0949833395823 Gen 38: Min 788.0 Max 988.0 Avg 981.07 Std 30.00364033468625 Gen 39: Min 691.0 Max 988.0 Avg 981.0266666666666 Std 32.25077707935549 Best individual is: [98, 100, 100, 97, 94, 99, 100, 100, 100, 100] (988.0,)
The output represents the results of an evolutionary algorithm running for $40$ generations.
Here’s a breakdown of what each part of the output means:
Generations (Gen 0, Gen 1, etc.):
Each “Gen” line corresponds to a generation in the evolutionary process. The algorithm starts at Gen 0 and evolves through to Gen $39$.
Min, Max, Avg, Std:
Min: The minimum fitness value in the population for that generation.
Max: The maximum fitness value in the population for that generation.
Avg: The average fitness value across the entire population for that generation.
Std: The standard deviation of fitness values in the population, indicating how spread out the fitness values are.
Trends over Generations:
Min: The minimum fitness value generally increases over time, suggesting that the worst-performing individuals are improving.
Max: The maximum fitness value also increases, indicating that the best individuals are becoming fitter.
Avg: The average fitness steadily rises, showing that the overall population is improving.
Std: The standard deviation decreases over time, meaning the population is becoming more homogenous, with individuals having similar fitness values.
Final Generation (Gen 39):
Min 691.0, Max 988.0, Avg 981.0266666666666, Std 32.25077707935549: By the final generation, the population has a high average fitness, with the best individual achieving a fitness of $988.0$.
Best Individual:
The best individual found during the evolutionary process is [98, 100, 100, 97, 94, 99, 100, 100, 100, 100] with a fitness value of 988.0. This individual likely represents an optimal or near-optimal solution to the problem.
In summary, the algorithm successfully evolves a population over $40$ generations, improving the fitness of individuals, and converging towards an optimal solution, as indicated by the high average and maximum fitness values in the later generations.
Customizing DEAP
Custom Fitness Functions: You can create custom fitness functions based on the problem at hand.
Custom Selection, Crossover, and Mutation: $DEAP$ allows you to define and use custom operators.
Multi-Objective Optimization: $DEAP$ also supports multi-objective optimization, where the fitness function can return multiple values (e.g., minimizing both time and cost).
This example demonstrates how to set up and run a basic genetic algorithm using $DEAP$.
You can extend it by adjusting the problem, operators, and parameters to suit different optimization tasks.
# Filter rows where age > 30 filtered_df = df.filter(pl.col('age') > 30) print("\nFiltered DataFrame (age > 30):") print(filtered_df)
# Add a new column 'bonus' which is 10% of the salary df = df.with_column((pl.col('salary') * 0.1).alias('bonus')) print("\nDataFrame with Bonus Column:") print(df)
# Group by 'department' and calculate the average salary and total bonus grouped_df = df.groupby('department').agg([ pl.col('salary').mean().alias('avg_salary'), pl.col('bonus').sum().alias('total_bonus') ]) print("\nGrouped by Department with Aggregations:") print(grouped_df)
# Sort the DataFrame by 'salary' in descending order sorted_df = df.sort('salary', reverse=True) print("\nSorted DataFrame by Salary (Descending):") print(sorted_df)
Explanation
Reading CSV:
pl.read_csv('data.csv') reads the CSV file into a $Polars$ DataFrame.
Displaying the DataFrame:
print(df) shows the original DataFrame.
Selecting Specific Columns:
df.select(['name', 'age', 'salary']) selects only the name, age, and salary columns.
Filtering Rows:
df.filter(pl.col('age') > 30) filters the DataFrame to include only rows where the age is greater than $30$.
Adding a New Column:
df.with_column((pl.col('salary') * 0.1).alias('bonus')) adds a new column bonus which is $10$% of the salary.
Grouping and Aggregating:
df.groupby('department').agg([pl.col('salary').mean().alias('avg_salary'), pl.col('bonus').sum().alias('total_bonus')]) groups the DataFrame by department and calculates the average salary and total bonus for each department.
Sorting:
df.sort('salary', reverse=True) sorts the DataFrame by salary in descending order.
Here are a few useful SciPy sample codes showcasing different aspects of the library, including optimization, integration, interpolation, and signal processing.
1. Optimization: Minimizing a Function
This example demonstrates how to minimize a mathematical function using scipy.optimize.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
import numpy as np from scipy.optimize import minimize
# Define a function to minimize defobjective_function(x): return x**2 + 10*np.sin(x)
# Initial guess x0 = 2.0
# Perform the minimization result = minimize(objective_function, x0, method='BFGS')
print("Minimum value of the function:", result.fun) print("Value of x that minimizes the function:", result.x)
[Output]
Minimum value of the function: 8.31558557947746
Value of x that minimizes the function: [3.83746713]
2. Integration: Calculating the Area Under a Curve
This example shows how to calculate the definite integral of a function using scipy.integrate.quad.
1 2 3 4 5 6 7 8 9 10
from scipy.integrate import quad
# Define a function to integrate defintegrand(x): return np.exp(-x**2)
# Compute the integral from 0 to infinity result, error = quad(integrand, 0, np.inf)
print("Integral result:", result)
[Output]
Integral result: 0.8862269254527579
3. Interpolation: Interpolating Data Points
This example demonstrates how to interpolate data points using scipy.interpolate.interp1d.
import numpy as np import matplotlib.pyplot as plt from scipy.signal import butter, lfilter
# Generate a noisy signal np.random.seed(0) t = np.linspace(0, 1, 500, endpoint=False) signal = np.sin(2 * np.pi * 7 * t) + 0.5 * np.random.randn(500)
# Design a low-pass filter defbutter_lowpass(cutoff, fs, order=5): nyquist = 0.5 * fs normal_cutoff = cutoff / nyquist b, a = butter(order, normal_cutoff, btype='low', analog=False) return b, a
# Apply the filter to the signal deflowpass_filter(data, cutoff, fs, order=5): b, a = butter_lowpass(cutoff, fs, order=order) y = lfilter(b, a, data) return y
# Parameters cutoff = 3.5# Desired cutoff frequency of the filter, Hz fs = 50.0# Sample rate, Hz
# Filter the signal filtered_signal = lowpass_filter(signal, cutoff, fs)
These examples provide a glimpse of the powerful tools available in SciPy for scientific computing, including optimization, integration, interpolation, and signal processing.
PuLP is a Python library used for linear programming (LP) and mixed-integer linear programming (MILP).
It allows you to define and solve optimization problems where you want to minimize or maximize a linear objective function subject to linear constraints.
Installation
First, you need to install PuLP. You can do this using pip:
1
pip install pulp
Basic Usage of PuLP
Here’s a step-by-step guide to solving a simple linear programming problem with PuLP.
Example Problem:
Suppose you are a factory manager and you want to determine the optimal number of products $A$ and $B$ to produce to maximize your profit. Each product requires a certain amount of resources (time, materials) and yields a certain profit.
Objective: Maximize profit.
Constraints:
You have $100$ units of material.
You have $80$ hours of labor.
Product $A$ requires $4$ units of material and $2$ hours of labor.
Product $B$ requires $3$ units of material and $5$ hours of labor.
# Step 1: Define the problem # Create a maximization problem problem = pulp.LpProblem("Maximize_Profit", pulp.LpMaximize)
# Step 2: Define the decision variables # Let x be the number of product A to produce, y be the number of product B to produce x = pulp.LpVariable('x', lowBound=0, cat='Continuous') y = pulp.LpVariable('y', lowBound=0, cat='Continuous')
# Step 3: Define the objective function # Objective function: Maximize 20x + 25y problem += 20*x + 25*y, "Profit"
# Step 6: Display the results print("Status:", pulp.LpStatus[problem.status]) print("Optimal number of product A to produce:", pulp.value(x)) print("Optimal number of product B to produce:", pulp.value(y)) print("Maximum Profit:", pulp.value(problem.objective))
Explanation of the Code:
Define the Problem: We create a linear programming problem with the objective to maximize profit.
Decision Variables: We define the decision variables x and y for the number of products A and B to produce. These variables are continuous and non-negative.
Objective Function: We define the objective function to maximize, which is the total profit: 20*x + 25*y.
Constraints: We add constraints based on the available resources:
Material constraint: 4*x + 3*y <= 100
Labor constraint: 2*x + 5*y <= 80
Solve the Problem: We solve the linear programming problem using PuLP’s solve method.
Display the Results: We print the optimal values of x and y, and the maximum profit.
Interpretation of Output:
Output
1 2 3 4
Status: Optimal Optimal number of product A to produce: 18.571429 Optimal number of product B to produce: 8.5714286 Maximum Profit: 585.714295
Interpretation
The optimal solution is to produce $18.5$ units of product A and $8.57$ units of product B, which yields a maximum profit of $585.71.
This is a basic example of using PuLP in Python.
The library is powerful and can handle more complex constraints, variables, and objectives, including mixed-integer programming.
Here’s a advanced and useful sample code using Altair, which demonstrates how to create a layered chart with interactivity, such as tooltips and selection, along with custom encoding:
import altair as alt from vega_datasets import data
# Load dataset source = data.cars()
# Define a brush selection brush = alt.selection(type='interval')
# Base chart with a scatter plot base = alt.Chart(source).mark_point().encode( x='Horsepower:Q', y='Miles_per_Gallon:Q', color=alt.condition(brush, 'Origin:N', alt.value('lightgray')), tooltip=['Name:N', 'Origin:N', 'Horsepower:Q', 'Miles_per_Gallon:Q'] ).properties( width=600, height=400 ).add_selection( brush )
# Layer a bar chart on top that shows the distribution of 'Origin' for selected points bars = alt.Chart(source).mark_bar().encode( x='count():Q', y='Origin:N', color='Origin:N' ).transform_filter( brush )
# Combine the scatter plot and the bar chart chart = base & bars chart
Result
Explanation
Brush Selection: An interactive selection that allows users to drag over the scatter plot to select a region.
Scatter Plot: A basic plot where points are colored by their origin, and the color changes based on the selection.
Bar Chart: Shows the distribution of car origins, updated based on the selection made in the scatter plot.
Interactivity: Tooltips provide detailed information when hovering over the points, and the chart updates dynamically based on the user’s selection.
This example combines interactivity with advanced encoding and layout, making it useful for exploratory data analysis.