Balance of Payments (BoP) Analysis

Example Problem

We aim to simulate a simplified BoP analysis to understand the interactions between the Current Account (CA), Capital Account (KA), and the Financial Account (FA).

Problem

  1. The Current Account (CA) records trade in goods and services, income, and transfers.
  2. The Capital Account (KA) reflects capital transfers and acquisition/disposal of non-produced assets.
  3. The Financial Account (FA) captures investments in financial assets.

By the BoP identity:
$$
\text{CA} + \text{KA} + \text{FA} = 0
$$

We will simulate and visualize:

  • Changes in the CA due to trade balance adjustments.
  • How these changes are offset by the KA and FA to maintain equilibrium.

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
import numpy as np
import matplotlib.pyplot as plt

# Parameters
time_steps = 20 # Number of periods
initial_CA = -2.0 # Initial Current Account balance (deficit, in $ billion)
initial_KA = 0.5 # Initial Capital Account balance (in $ billion)
initial_FA = 1.5 # Initial Financial Account balance (in $ billion)
trade_balance_changes = np.linspace(-0.5, 1.0, time_steps) # Simulated trade balance changes

# BoP components
CA = [initial_CA]
KA = [initial_KA]
FA = [initial_FA]

# Simulate BoP dynamics
for i in range(1, time_steps):
# Update Current Account (CA) due to trade balance changes
new_CA = CA[-1] + trade_balance_changes[i]
CA.append(new_CA)

# Capital and Financial Accounts adjust to maintain BoP equilibrium
new_FA = -new_CA - KA[-1] # Financial Account offsets CA and KA
FA.append(new_FA)
KA.append(KA[-1]) # Assume KA remains constant for simplicity

# Visualization
time = np.arange(time_steps)

plt.figure(figsize=(12, 6))

# Plot BoP components
plt.plot(time, CA, label="Current Account (CA)", marker='o')
plt.plot(time, KA, label="Capital Account (KA)", marker='o')
plt.plot(time, FA, label="Financial Account (FA)", marker='o')
plt.axhline(y=0, color='r', linestyle='--', label="Equilibrium Line (BoP=0)")

plt.title("Balance of Payments Dynamics")
plt.xlabel("Time Period")
plt.ylabel("Balance ($ billion)")
plt.legend()
plt.grid()
plt.tight_layout()
plt.show()

Explanation

  1. Current Account Dynamics:
    • Adjustments in trade balance affect the CA over time.
  2. BoP Equilibrium:
    • The KA and FA adjust to ensure that the BoP identity $(\text{CA} + \text{KA} + \text{FA} = 0)$ holds.
  3. Assumptions:
    • The KA remains constant for simplicity.
    • Changes in the CA are offset entirely by the FA.

Results

  • The graph shows how the Current Account evolves due to trade balance changes.
  • The Financial Account responds dynamically to maintain BoP equilibrium.
  • The Capital Account remains constant, highlighting its lesser role in this simplified example.

This example illustrates the balancing act within the BoP framework, showing the interdependence of its components.

Inflation Targeting Simulation

Example Problem: Inflation Targeting Simulation

In this example, we aim to simulate the inflation dynamics under a basic monetary policy rule, such as the $Taylor Rule$.

The $Taylor Rule$ adjusts the nominal interest rate based on deviations of inflation and output from their target levels.

We’ll model how inflation converges to the target over time under this rule.

Problem

Given:

  1. Inflation starts at a level significantly higher than the target.
  2. A central bank adjusts the interest rate to influence inflation.
  3. The inflation equation is simplified to a function of past inflation, output gap, and the interest rate.

We will simulate how inflation evolves over time with:

  • A fixed inflation target.
  • Parameters for monetary policy responsiveness.

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
import numpy as np
import matplotlib.pyplot as plt

# Parameters
inflation_target = 2.0 # Target inflation (%)
initial_inflation = 5.0 # Initial inflation (%)
output_gap = 1.0 # Initial output gap (%)
beta_inflation = 1.5 # Responsiveness to inflation deviation
beta_output = 0.5 # Responsiveness to output gap
time_steps = 20 # Number of time periods
natural_rate_of_interest = 2.0 # Natural rate of interest

# Simulation variables
inflation = [initial_inflation]
interest_rate = []

# Simulate inflation dynamics
for t in range(time_steps):
# Calculate interest rate using Taylor Rule
interest = natural_rate_of_interest + beta_inflation * (inflation[-1] - inflation_target) + beta_output * output_gap
interest_rate.append(interest)

# Update inflation (simplified dynamics: next inflation is dampened by interest rate)
next_inflation = inflation[-1] - 0.2 * (interest - natural_rate_of_interest)
inflation.append(next_inflation)

# Visualization
time = np.arange(0, time_steps + 1)

plt.figure(figsize=(12, 6))

# Plot inflation dynamics
plt.subplot(2, 1, 1)
plt.plot(time, inflation, marker='o', label="Inflation")
plt.axhline(y=inflation_target, color='r', linestyle='--', label="Inflation Target")
plt.title("Inflation Dynamics under the Taylor Rule")
plt.xlabel("Time Period")
plt.ylabel("Inflation (%)")
plt.legend()
plt.grid()

# Plot interest rate
plt.subplot(2, 1, 2)
plt.plot(time[:-1], interest_rate, marker='o', color='orange', label="Interest Rate")
plt.axhline(y=natural_rate_of_interest, color='r', linestyle='--', label="Natural Rate of Interest")
plt.title("Interest Rate Adjustment")
plt.xlabel("Time Period")
plt.ylabel("Interest Rate (%)")
plt.legend()
plt.grid()

plt.tight_layout()
plt.show()

Explanation

  1. Taylor Rule:
    • Sets the nominal interest rate based on inflation deviation and output gap.
  2. Inflation Dynamics:
    • Inflation adjusts gradually in response to interest rate changes.
  3. Visualization:
    • The top graph shows how inflation converges to the target.
    • The bottom graph displays the interest rate adjustments over time.

Results

  • Inflation converges toward the target ($2$%) over time due to the central bank’s interest rate adjustments.
  • Interest rates stabilize as inflation reaches the target.
    This illustrates the effectiveness of inflation targeting under the $Taylor Rule$.

Debt Dynamics with Stochastic Shocks

Problem Statement: Debt Dynamics with Stochastic Shocks

Objective:
Analyze the debt-to-GDP ratio of a country over time under stochastic (random) variations in interest rates $(r_t)$ and GDP growth rates $(g_t)$.

This captures the uncertainty in economic conditions and provides insights into debt sustainability in a volatile environment.


Model

The debt-to-GDP ratio evolves as:
$$
b_{t+1} = b_t \cdot \frac{1 + r_t}{1 + g_t} - p
$$

  • $b_t$: Debt-to-GDP ratio at time $t$.
  • $r_t$: Random interest rate at $t$, drawn from a normal distribution.
  • $g_t$: Random GDP growth rate at $t$, drawn from a normal distribution.
  • $p$: Primary balance as a percentage of GDP ($p > 0$ for surplus, $p < 0$ for deficit).

The randomness in $r_t$ and $g_t$ represents economic uncertainties, such as financial market shocks or unexpected GDP contractions.


Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
import numpy as np
import matplotlib.pyplot as plt

# Parameters
T = 50 # Time horizon (years)
b0 = 0.8 # Initial debt-to-GDP ratio
p = -0.02 # Primary deficit (-2% of GDP)

# Stochastic parameters
mean_r = 0.03 # Mean interest rate (3%)
std_r = 0.01 # Standard deviation of interest rate
mean_g = 0.02 # Mean GDP growth rate (2%)
std_g = 0.015 # Standard deviation of GDP growth rate

# Simulation
np.random.seed(42) # For reproducibility
r_t = np.random.normal(mean_r, std_r, T) # Random interest rates
g_t = np.random.normal(mean_g, std_g, T) # Random GDP growth rates

time = np.arange(T)
debt_to_gdp = np.zeros(T)
debt_to_gdp[0] = b0

for t in range(1, T):
debt_to_gdp[t] = debt_to_gdp[t-1] * ((1 + r_t[t]) / (1 + g_t[t])) - p

# Visualization
plt.figure(figsize=(12, 6))

# Debt-to-GDP Ratio Over Time
plt.plot(time, debt_to_gdp, label="Debt-to-GDP Ratio", color="blue")
plt.axhline(1, color="black", linestyle="--", alpha=0.7, label="100% Debt-to-GDP")
plt.fill_between(time, 0, debt_to_gdp, color="blue", alpha=0.2)
plt.title("Debt-to-GDP Dynamics with Stochastic Shocks")
plt.xlabel("Time (Years)")
plt.ylabel("Debt-to-GDP Ratio")
plt.legend()
plt.grid()
plt.tight_layout()
plt.show()

# Summary
print(f"Initial Debt-to-GDP Ratio: {b0:.2f}")
print(f"Debt-to-GDP Ratio After {T} Years: {debt_to_gdp[-1]:.2f}")
print(f"Average Interest Rate: {np.mean(r_t):.2%}")
print(f"Average GDP Growth Rate: {np.mean(g_t):.2%}")

Explanation of Code

  1. Stochastic Variables:

    • The interest rate $(r_t)$ and GDP growth rate $(g_t)$ are drawn from normal distributions with specified means and standard deviations.
    • This randomness models real-world economic uncertainties.
  2. Debt Evolution:

    • At each time step, the debt-to-GDP ratio evolves based on the stochastic parameters.
  3. Visualization:

    • The debt-to-GDP trajectory shows the impact of economic volatility.
    • The shaded region emphasizes the fluctuation over time.

Results

Initial Debt-to-GDP Ratio: 0.80
Debt-to-GDP Ratio After 50 Years: 2.31
Average Interest Rate: 2.77%
Average GDP Growth Rate: 2.03%
  1. Debt Path:

    • The debt-to-GDP ratio fluctuates due to variations in interest rates and GDP growth rates.
    • Persistent high interest rates or low growth can lead to unsustainable debt.
  2. Long-Term Trend:

    • The final debt-to-GDP ratio depends on the cumulative effect of the stochastic shocks.
  3. Policy Implications:

    • Governments should account for uncertainty when setting fiscal targets.
    • Debt sustainability requires managing both expected trends and risks.

Business Cycle Analysis with Real GDP Data

Problem Statement: Business Cycle Analysis with Real GDP Data

Objective:
Analyze the business cycle by identifying and visualizing the cyclical components of Real GDP using the $Hodrick$-$Prescott (HP) filter$.

This method separates the trend and cyclical components of GDP, allowing us to examine deviations from the long-term trend.


Steps:

  1. Data:
    Use simulated or publicly available Real GDP time series data.

  2. Hodrick-Prescott Filter:

    • Decompose GDP into its trend $(T_t)$ and cyclical $(C_t)$ components:
      $$
      GDP_t = T_t + C_t
      $$
    • The $HP filter$ minimizes the following loss function:
      $$
      \sum_t (GDP_t - T_t)^2 + \lambda \sum_t \left[(T_{t+1} - T_t) - (T_t - T_{t-1})\right]^2
      $$
      where $(\lambda)$ is a smoothing parameter (typically $1600$ for quarterly data).
  3. Visualize:
    Plot the original GDP, trend, and cyclical components to analyze economic fluctuations.


Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.filters.hp_filter import hpfilter

# Simulated GDP Data (quarterly, 50 periods)
np.random.seed(42)
time = np.arange(50)
trend = 2.5 * time + 100 # Linear trend
cyclical = 15 * np.sin(0.3 * time) # Cyclical component
noise = np.random.normal(0, 5, size=time.shape) # Random noise
gdp = trend + cyclical + noise # Simulated GDP

# Apply the HP filter
gdp_trend, gdp_cycle = hpfilter(gdp, lamb=1600)

# Plotting
plt.figure(figsize=(14, 8))

# Original and Trend
plt.subplot(2, 1, 1)
plt.plot(time, gdp, label="Original GDP", color="blue")
plt.plot(time, gdp_trend, label="Trend (HP Filter)", color="red", linestyle="--")
plt.title("Real GDP and Trend Component")
plt.xlabel("Time (Quarters)")
plt.ylabel("GDP")
plt.legend()
plt.grid()

# Cyclical Component
plt.subplot(2, 1, 2)
plt.plot(time, gdp_cycle, label="Cyclical Component", color="green")
plt.axhline(0, color="black", linestyle="--", linewidth=0.8)
plt.title("Cyclical Component of GDP (Deviations from Trend)")
plt.xlabel("Time (Quarters)")
plt.ylabel("GDP Deviations")
plt.legend()
plt.grid()

plt.tight_layout()
plt.show()

# Print summary statistics
print(f"Original GDP Mean: {np.mean(gdp):.2f}")
print(f"Trend Component Mean: {np.mean(gdp_trend):.2f}")
print(f"Cyclical Component Mean: {np.mean(gdp_cycle):.2f} (should be ~0)")

Explanation of the Code

  1. Simulated GDP Data:

    • Real GDP is modeled as a combination of a long-term trend, a cyclical fluctuation, and random noise.
  2. Hodrick-Prescott Filter:

    • The $HP filter$ separates the GDP into a smooth trend and short-term deviations (cyclical component).
    • The smoothing parameter $(\lambda)$ controls the smoothness of the trend; $1600$ is standard for quarterly data.
  3. Visualization:

    • The first plot shows the original GDP and its trend.
    • The second plot highlights the cyclical component, indicating deviations from the long-term trend.

Results

Original GDP Mean: 161.77
Trend Component Mean: 0.00
Cyclical Component Mean: 161.77 (should be ~0)
  1. Trend Component:

    • Represents the long-term economic growth path.
  2. Cyclical Component:

    • Indicates business cycle fluctuations around the trend.
    • Peaks and troughs correspond to economic expansions and contractions, respectively.
  3. Insights:

    • The cyclical component helps identify recessions and booms.
    • Policymakers and economists use this analysis to design countercyclical measures.

Price Discrimination in a Monopoly

Problem Statement: Price Discrimination in a Monopoly

Objective:
A monopolist sells its product to two distinct markets with different demand elasticities.
The monopolist can price discriminate, setting different prices for each market to maximize overall profit.

The goal is to:

  1. Simulate the optimal pricing strategy for the monopolist.
  2. Compute the quantities sold and profits in each market.
  3. Visualize the results to illustrate price discrimination.

Assumptions

  1. Demand Functions:

    • Market $1$: $( Q_1 = a_1 - b_1 \cdot P_1 )$
    • Market $2$: $( Q_2 = a_2 - b_2 \cdot P_2 )$
      $ P_1 $ and $ P_2 $ are prices set in each market.
  2. Profit Function:
    The monopolist’s total profit is:
    $$
    \pi = (P_1 - c) \cdot Q_1 + (P_2 - c) \cdot Q_2
    $$
    where $c$ is the marginal cost.

  3. Goal:
    Maximize $\pi$ by choosing optimal $P_1$ and $P_2$.


Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt

# Parameters
a1, b1 = 100, 2 # Market 1: Demand intercept and slope
a2, b2 = 120, 4 # Market 2: Demand intercept and slope
c = 20 # Marginal cost

# Demand functions
def demand1(p1):
return max(0, a1 - b1 * p1)

def demand2(p2):
return max(0, a2 - b2 * p2)

# Profit function
def total_profit(prices):
p1, p2 = prices
q1 = demand1(p1)
q2 = demand2(p2)
return -((p1 - c) * q1 + (p2 - c) * q2) # Negative for minimization

# Initial guesses and bounds
initial_prices = [50, 50]
bounds = [(c, a1 / b1), (c, a2 / b2)] # Prices must be at least marginal cost

# Solve for optimal prices
result = minimize(total_profit, initial_prices, bounds=bounds)
p1_opt, p2_opt = result.x
q1_opt = demand1(p1_opt)
q2_opt = demand2(p2_opt)
profit_opt = -(result.fun)

# Visualization
prices = np.linspace(0, 80, 500)
revenues1 = [(p - c) * demand1(p) for p in prices]
revenues2 = [(p - c) * demand2(p) for p in prices]

plt.figure(figsize=(12, 6))

# Revenue curves
plt.plot(prices, revenues1, label="Market 1 Profit", color="blue")
plt.plot(prices, revenues2, label="Market 2 Profit", color="orange")

# Optimal points
plt.scatter(p1_opt, (p1_opt - c) * q1_opt, color="blue", label=f"Opt. Price Market 1: ${p1_opt:.2f}")
plt.scatter(p2_opt, (p2_opt - c) * q2_opt, color="orange", label=f"Opt. Price Market 2: ${p2_opt:.2f}")

plt.axvline(p1_opt, color="blue", linestyle="--", alpha=0.7)
plt.axvline(p2_opt, color="orange", linestyle="--", alpha=0.7)

plt.title("Profit Maximization with Price Discrimination")
plt.xlabel("Price")
plt.ylabel("Profit")
plt.legend()
plt.grid()
plt.tight_layout()
plt.show()

# Print results
print(f"Optimal Price in Market 1: ${p1_opt:.2f}")
print(f"Optimal Price in Market 2: ${p2_opt:.2f}")
print(f"Quantity Sold in Market 1: {q1_opt:.2f}")
print(f"Quantity Sold in Market 2: {q2_opt:.2f}")
print(f"Total Profit: ${profit_opt:.2f}")

Explanation of Code

  1. Demand Functions:

    • Market $1$ has a lower price sensitivity ($b_1 < b_2$), indicating it is less elastic.
    • Market $2$ is more elastic ($b_2 > b_1$), meaning customers are more price-sensitive.
  2. Profit Maximization:

    • The monopolist maximizes profit by finding the best prices for each market using scipy.optimize.minimize.
    • The constraints ensure prices are at least equal to the marginal cost.
  3. Visualization:

    • The revenue curves for both markets show how profit changes with price.
    • The optimal prices for both markets are highlighted.

Results

ptimal Price in Market $1$: $35.00
Optimal Price in Market $2$: $25.00
Quantity Sold in Market $1$: 30.00
Quantity Sold in Market $2$: 20.00
Total Profit: $550.00
  1. Optimal Prices:

    • The monopolist sets a higher price in the less elastic market (Market $1$).
    • The price is lower in the more elastic market (Market $2$).
  2. Quantities and Profit:

    • The quantity sold is higher in Market $2$ due to its larger demand intercept.
    • The monopolist achieves maximum profit by price discrimination.
  3. Graph:

    • The revenue curves illustrate the profit-maximizing prices visually.
    • The points of tangency indicate the optimal strategy.

Optimal Taxation and the Laffer Curve

Problem Statement: Optimal Taxation and the Laffer Curve

The Laffer Curve illustrates the relationship between tax rates and government revenue.

At very low tax rates, government revenue is minimal, and at very high tax rates, revenue also decreases because of reduced economic activity.

The goal is to:

  1. Simulate the Laffer Curve based on a theoretical economy.
  2. Identify the optimal tax rate that maximizes government revenue.
  3. Visualize the results.

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
import numpy as np
import matplotlib.pyplot as plt

# Define the economy model
def government_revenue(tax_rate, max_income=1_000_000):
"""
Simulates government revenue based on a tax rate.
Revenue decreases at very high tax rates due to reduced economic activity.
"""
# Economic activity decreases as tax rate increases
economic_activity = max_income * (1 - 0.5 * tax_rate**2)
revenue = tax_rate * economic_activity
return max(0, revenue) # Ensure revenue is non-negative

# Generate tax rates
tax_rates = np.linspace(0, 1, 101) # Tax rates from 0% to 100%

# Compute government revenues for each tax rate
revenues = [government_revenue(rate) for rate in tax_rates]

# Find the optimal tax rate
optimal_index = np.argmax(revenues)
optimal_tax_rate = tax_rates[optimal_index]
optimal_revenue = revenues[optimal_index]

# Visualization
plt.figure(figsize=(12, 6))

# Laffer curve
plt.plot(tax_rates, revenues, label="Laffer Curve", color="blue", linewidth=2)
plt.axvline(optimal_tax_rate, color="red", linestyle="--", label="Optimal Tax Rate")
plt.scatter(optimal_tax_rate, optimal_revenue, color="red", zorder=5)
plt.title("Optimal Taxation and the Laffer Curve")
plt.xlabel("Tax Rate")
plt.ylabel("Government Revenue")
plt.legend()
plt.grid()

# Annotate the optimal point
plt.annotate(
f"Optimal Tax Rate = {optimal_tax_rate:.2%}\nRevenue = ${optimal_revenue:,.0f}",
(optimal_tax_rate, optimal_revenue),
textcoords="offset points",
xytext=(10, -50),
arrowprops=dict(arrowstyle="->", color="red")
)

plt.tight_layout()
plt.show()

Explanation of the Code

  1. Economic Model:
    • The government_revenue function calculates revenue as $( \text{Revenue} = \text{Tax Rate} \times \text{Economic Activity} )$.
    • Economic activity decreases quadratically with higher tax rates, representing reduced incentives for productivity at high taxation levels.
  2. Simulation:
    • Tax rates range from $0$% to $100$%.
    • Government revenue is computed for each tax rate.
  3. Optimization:
    • The optimal tax rate corresponds to the maximum government revenue, determined using np.argmax.
  4. Visualization:
    • The Laffer Curve shows the relationship between tax rates and revenue.
    • The optimal tax rate is highlighted with a vertical line and annotated.

Results

  1. Laffer Curve:
    • The curve rises as the tax rate increases initially but falls after a certain point due to reduced economic activity.
  2. Optimal Tax Rate:
    • This is the tax rate where government revenue is maximized.
    • The plot includes a clear marker and annotation for this point.

Principal Component Analysis (PCA) on High-Dimensional Data

Problem Statement: Principal Component Analysis (PCA) on High-Dimensional Data

Objective:
We have a dataset with multiple highly correlated features, which makes it difficult to interpret and use for predictive modeling.
The goal is to use Principal Component Analysis (PCA) to reduce the dimensionality of the dataset while retaining most of its variance.

The steps are:

  1. Generate synthetic high-dimensional data.
  2. Perform PCA to reduce dimensions.
  3. Visualize the explained variance and transformed dataset in $2D$ space.

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Generate synthetic high-dimensional data
np.random.seed(42)
n_samples = 500
n_features = 10

# Highly correlated features
base_feature = np.random.normal(0, 1, n_samples)
data = np.array([
base_feature + np.random.normal(0, 0.1, n_samples) for _ in range(n_features)
]).T

# Add some noise to create weaker correlations
for i in range(n_features):
data[:, i] += np.random.normal(0, 0.5, n_samples)

# Create a DataFrame
columns = [f"Feature_{i+1}" for i in range(n_features)]
df = pd.DataFrame(data, columns=columns)

# Standardize the data
scaler = StandardScaler()
data_scaled = scaler.fit_transform(df)

# Perform PCA
pca = PCA()
data_pca = pca.fit_transform(data_scaled)

# Explained variance ratio
explained_variance_ratio = pca.explained_variance_ratio_

# Reduce to 2 components for visualization
pca_2d = PCA(n_components=2)
data_2d = pca_2d.fit_transform(data_scaled)

# Visualization
plt.figure(figsize=(14, 6))

# Scree plot of explained variance
plt.subplot(1, 2, 1)
plt.plot(np.cumsum(explained_variance_ratio), marker='o', linestyle='--', color='b')
plt.title("Cumulative Explained Variance")
plt.xlabel("Number of Components")
plt.ylabel("Explained Variance Ratio")
plt.grid()

# Scatter plot of first two principal components
plt.subplot(1, 2, 2)
plt.scatter(data_2d[:, 0], data_2d[:, 1], alpha=0.6, color='green')
plt.title("2D Projection of Data (PCA)")
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.grid()

plt.tight_layout()
plt.show()

# Print the explained variance ratio
print("Explained Variance Ratio for each component:")
for i, ratio in enumerate(explained_variance_ratio):
print(f"Component {i+1}: {ratio:.4f}")

Explanation of Code

  1. Data Generation:
    • A synthetic dataset with $10$ highly correlated features is generated by adding small random noise to a base feature.
    • Additional noise ensures some weaker correlations for realism.
  2. Standardization:
    • Features are standardized to have a mean of $0$ and a standard deviation of $1$, which is necessary for PCA.
  3. PCA:
    • Principal components are computed to transform the high-dimensional dataset into a lower-dimensional space.
    • The explained variance ratio shows how much variance each principal component captures.
  4. Visualization:
    • A scree plot shows the cumulative variance explained by the components.
    • A $2D$ scatter plot visualizes the dataset in the first two principal components.

Results

Explained Variance Ratio for each component:
Component 1: 0.8043
Component 2: 0.0274
Component 3: 0.0267
Component 4: 0.0237
Component 5: 0.0220
Component 6: 0.0209
Component 7: 0.0201
Component 8: 0.0198
Component 9: 0.0178
Component 10: 0.0172
  1. Scree Plot:
    • Shows how many components are required to explain most of the variance.
    • Typically, we select enough components to explain $90$-$95$% of the variance.
  2. 2D Scatter Plot:
    • Displays the high-dimensional data projected onto the first two principal components, which often reveal underlying patterns or clusters.

Epidemic Spread Model (SIR Model)

Problem Statement: Epidemic Spread Model (SIR Model)

The SIR model is a basic mathematical model to describe the spread of infectious diseases.
It divides the population into three compartments:

  • S (Susceptible): People who can catch the disease.
  • I (Infected): People currently infected and able to spread the disease.
  • R (Recovered): People who have recovered and are now immune.

The dynamics are governed by these equations:
$$
\frac{dS}{dt} = -\beta S I
$$
$$
\frac{dI}{dt} = \beta S I - \gamma I
$$
$$
\frac{dR}{dt} = \gamma I
$$

  • $(\beta)$ is the infection rate.
  • $(\gamma)$ is the recovery rate.

Simulate the SIR model for $160$ days with a population of $1,000$, assuming an initial infection of $1$ person, $(\beta = 0.3)$, and $(\gamma = 0.1)$.

Visualize the results.


Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

# Parameters
N = 1000 # Total population
beta = 0.3 # Infection rate
gamma = 0.1 # Recovery rate
I0 = 1 # Initial infected
R0 = 0 # Initial recovered
S0 = N - I0 - R0 # Initial susceptible
days = 160 # Simulation duration

# SIR model differential equations
def sir_model(y, t, N, beta, gamma):
S, I, R = y
dSdt = -beta * S * I / N
dIdt = beta * S * I / N - gamma * I
dRdt = gamma * I
return [dSdt, dIdt, dRdt]

# Initial conditions
y0 = [S0, I0, R0]

# Time points
t = np.linspace(0, days, days)

# Solve ODEs
result = odeint(sir_model, y0, t, args=(N, beta, gamma))
S, I, R = result.T

# Visualization
plt.figure(figsize=(12, 6))

# Plot the results
plt.plot(t, S, label="Susceptible", color="blue")
plt.plot(t, I, label="Infected", color="red")
plt.plot(t, R, label="Recovered", color="green")
plt.title("SIR Epidemic Model")
plt.xlabel("Days")
plt.ylabel("Number of People")
plt.legend()
plt.grid()

plt.tight_layout()
plt.show()

Explanation of Code

  1. Model Dynamics:
    • The sir_model function defines the SIR equations.
    • The odeint function solves these differential equations numerically.
  2. Parameters:
    • $(N)$: Total population.
    • $(\beta)$: Controls how fast the disease spreads.
    • $(\gamma)$: Determines how fast infected individuals recover.
  3. Initial Conditions:
    • Start with $1$ infected person, $0$ recovered, and the rest susceptible.
  4. Time Points:
    • Simulate the system for $160$ days.
  5. Visualization:
    • Plot the population dynamics of the S, I, and R compartments over time.

Results

  1. Blue Curve (Susceptible): Shows the decreasing number of people vulnerable to infection.
  2. Red Curve (Infected): Demonstrates the rise and eventual decline of infections as the epidemic progresses.
  3. Green Curve (Recovered): Represents the growing number of immune individuals over time.

This visualization provides insight into how diseases spread and decline in a population.

Practical Example in Econometrics:Estimating the Effect of Advertising on Sales Using Multiple Regression

In this example, we analyze the impact of different advertising channels (TV, radio, and online ads) on product sales.

This example demonstrates the application of econometrics in marketing analysis.


Problem Statement

We aim to estimate the following relationship:
$$
\text{Sales} = \beta_0 + \beta_1 \cdot \text{TV} + \beta_2 \cdot \text{Radio} + \beta_3 \cdot \text{Online} + \epsilon
$$
where:

  • $ \text{Sales} $: Units sold.
  • $ \text{TV}, \text{Radio}, \text{Online} $: Advertising budgets in respective channels.
  • $ \beta_0, \beta_1, \beta_2, \beta_3 $: Coefficients to be estimated.

Python Implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns

# Generate synthetic data
np.random.seed(42)
n = 100
tv = np.random.uniform(0, 100, n) # TV advertising budget
radio = np.random.uniform(0, 50, n) # Radio advertising budget
online = np.random.uniform(0, 30, n) # Online advertising budget
error = np.random.normal(0, 10, n) # Random noise
sales = 5 + 0.3 * tv + 0.4 * radio + 0.2 * online + error # Sales formula

# Create a DataFrame
data = pd.DataFrame({
'TV': tv,
'Radio': radio,
'Online': online,
'Sales': sales
})

# Prepare data for regression
X = data[['TV', 'Radio', 'Online']]
X = sm.add_constant(X) # Add intercept
y = data['Sales']

# Fit multiple regression model
model = sm.OLS(y, X).fit()
print(model.summary())

# Visualize coefficients
coefficients = model.params[1:]
plt.figure(figsize=(8, 5))
coefficients.plot(kind='bar', color=['blue', 'green', 'orange'])
plt.title('Impact of Advertising Budgets on Sales')
plt.ylabel('Coefficient Value')
plt.xticks(rotation=0)
plt.grid(axis='y')
plt.show()

# Pairplot to examine relationships
sns.pairplot(data, x_vars=['TV', 'Radio', 'Online'], y_vars='Sales', kind='reg', height=4)
plt.suptitle("Relationships Between Advertising and Sales", y=1.02)
plt.show()

Explanation of Code

  1. Synthetic Data Generation:

    • Simulates advertising budgets and their effect on sales.
    • Adds random noise $ \epsilon $ to make the data realistic.
  2. Regression Analysis:

    • OLS (Ordinary Least Squares) is used to estimate the coefficients $( \beta_0, \beta_1, \beta_2, \beta_3 $).
  3. Visualizations:

    • Bar Plot: Displays the estimated impact (coefficients) of each advertising channel on sales.
    • Pairplot: Shows scatter plots with regression lines to visualize individual relationships.

Key Outputs

  1. Regression Summary:
    • Coefficients:
      • Quantify the effect of each advertising channel on sales.
    • R-squared:
      • Indicates how much of the variability in sales is explained by the model.
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  Sales   R-squared:                       0.600
Model:                            OLS   Adj. R-squared:                  0.588
Method:                 Least Squares   F-statistic:                     48.08
Date:                Tue, 03 Dec 2024   Prob (F-statistic):           4.65e-19
Time:                        03:01:43   Log-Likelihood:                -368.43
No. Observations:                 100   AIC:                             744.9
Df Residuals:                      96   BIC:                             755.3
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.9997      3.278      0.305      0.761      -5.507       7.507
TV             0.3416      0.033     10.267      0.000       0.276       0.408
Radio          0.4417      0.068      6.474      0.000       0.306       0.577
Online         0.3098      0.114      2.727      0.008       0.084       0.535
==============================================================================
Omnibus:                        5.375   Durbin-Watson:                   2.376
Prob(Omnibus):                  0.068   Jarque-Bera (JB):                4.964
Skew:                          -0.402   Prob(JB):                       0.0836
Kurtosis:                       3.738   Cond. No.                         205.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
  1. Bar Plot:
    • Highlights the relative importance of each advertising channel.

  1. Pairplot:
    • Provides an intuitive visualization of relationships between budgets and sales.


This example can be extended to include interaction terms (e.g., $ \text{TV} \times \text{Online} $) or to model diminishing returns on advertising investments.

Practical Example in Geometry:Finding the Intersection of Two Circles

We solve a common geometric problem: determining the intersection points of two circles.

This has applications in fields like $computer$ $graphics$, $robotics$, and $navigation$.


Problem Statement

Two circles are defined by:

  1. Circle 1: Center $(x_1, y_1)$, radius $r_1$.
  2. Circle 2: Center $(x_2, y_2)$, radius $r_2$.

The task is to find the intersection points of these circles.

If they intersect, there can be two points, one point (tangent), or no intersection.


Python Implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
import numpy as np
import matplotlib.pyplot as plt

def find_circle_intersection(x1, y1, r1, x2, y2, r2):
# Distance between circle centers
d = np.sqrt((x2 - x1)**2 + (y2 - y1)**2)

# Check for no intersection or containment
if d > r1 + r2 or d < abs(r1 - r2):
return None, None # No intersection

# Check for tangency
if d == 0 and r1 == r2:
return "Infinite", None # Infinite intersections (identical circles)

# Calculate intersection points
a = (r1**2 - r2**2 + d**2) / (2 * d)
h = np.sqrt(r1**2 - a**2)

# Midpoint between circle centers
x3 = x1 + a * (x2 - x1) / d
y3 = y1 + a * (y2 - y1) / d

# Offset for intersection points
x_intersect1 = x3 + h * (y2 - y1) / d
y_intersect1 = y3 - h * (x2 - x1) / d

x_intersect2 = x3 - h * (y2 - y1) / d
y_intersect2 = y3 + h * (x2 - x1) / d

return (x_intersect1, y_intersect1), (x_intersect2, y_intersect2)

# Example: Define two circles
x1, y1, r1 = 0, 0, 5
x2, y2, r2 = 6, 0, 5

# Find intersection points
intersection1, intersection2 = find_circle_intersection(x1, y1, r1, x2, y2, r2)

# Visualization
circle1 = plt.Circle((x1, y1), r1, color='blue', fill=False, label='Circle 1')
circle2 = plt.Circle((x2, y2), r2, color='red', fill=False, label='Circle 2')

fig, ax = plt.subplots(figsize=(8, 8))
ax.add_artist(circle1)
ax.add_artist(circle2)
plt.scatter([x1, x2], [y1, y2], color='black', label='Centers') # Centers of circles

# Plot intersection points if they exist
if intersection1 and intersection2:
plt.scatter([intersection1[0], intersection2[0]], [intersection1[1], intersection2[1]],
color='green', label='Intersection Points')
elif intersection1 == "Infinite":
plt.text(0, 0, "Infinite Intersections (Identical Circles)", color="purple")

# Adjust plot
plt.axhline(0, color='gray', linestyle='--', linewidth=0.5)
plt.axvline(0, color='gray', linestyle='--', linewidth=0.5)
plt.xlim(-10, 15)
plt.ylim(-10, 10)
plt.gca().set_aspect('equal', adjustable='box')
plt.legend()
plt.title("Intersection of Two Circles")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.grid()
plt.show()

Explanation of Code

  1. Distance Check:

    • The distance $( d )$ between the circle centers determines whether they intersect:
      • $( d > r_1 + r_2 )$: No intersection (too far apart).
      • $( d < |r_1 - r_2| )$: No intersection (one circle is contained within the other).
      • $( d = 0 )$ and $( r_1 = r_2 )$: Infinite intersections (identical circles).
  2. Intersection Calculation:

    • The formula for $( a )$ calculates the distance from the first circle’s center to the midpoint between intersection points.
    • $( h )$ is the perpendicular distance from the midpoint to the intersection points.
  3. Visualization:

    • The circles are drawn with plt.Circle.
    • Intersection points are plotted in green, and the centers are marked in black.

Key Outputs

  1. Intersection Points:

    • The two green points indicate where the circles intersect.
    • If no points exist, the code returns None.
  2. Graph:

    • The visualization shows the circles, their centers, and their intersection points clearly.

This example can be extended to solve 3D sphere intersection problems or optimized for large datasets of circles.