statsmodels in Python
statsmodels is a Python library for statistical modeling and analysis.
It provides classes and functions for many statistical models, such as linear regression, generalized linear models, time series analysis, and more.
Here are some examples using statsmodels:
1. Ordinary Least Squares (OLS) Regression
import statsmodels.api as sm
Explanation:
- X is the independent variable, and y is the dependent variable.
- sm.add_constant(X) adds a constant (intercept) to the model.
- model = sm.OLS(y, X) creates an OLS model.
- results = model.fit() fits the model to the data.
- results.summary() provides a detailed summary of the regression results.
Output:
OLS Regression Results
2. Logistic Regression
import statsmodels.api as sm
Explanation:
- y is a binary outcome (0 or 1).
- sm.Logit(y, X) creates a logistic regression model.
- results = model.fit() fits the logistic regression model to the data.
- results.summary() provides a summary of the logistic regression results.
Output:
Optimization terminated successfully.
3. Time Series Analysis using ARIMA
import statsmodels.api as sm
Explanation:
- time_series_data is the time series data to model.
- sm.tsa.ARIMA(time_series_data, order=(1, 1, 1)) creates an ARIMA model; the order tuple gives the autoregressive (AR) order, the degree of differencing, and the moving-average (MA) order, in that sequence.
- results.get_forecast(steps=10) forecasts the next 10 steps in the time series.
4. ANOVA (Analysis of Variance)
import statsmodels.api as sm
Explanation:
- df contains a categorical variable Group and a continuous variable Value.
- ols('Value ~ Group', data=df) specifies an OLS model that treats Group as a categorical variable; the model must be fitted with .fit() before it is passed to ANOVA.
- sm.stats.anova_lm(model, typ=2) performs ANOVA on the fitted model.
Output:
sum_sq df F PR(>F)
5. Autoregressive Model (AR)
import statsmodels.api as sm
Explanation:
- AutoReg(time_series_data, lags=1) creates an autoregressive model with a specified number of lags.
- The lags parameter specifies how many previous time points to use for predicting the next value.
- results.predict(start=90, end=109) generates predictions for the specified range.
These examples show how statsmodels handles regression, classification, time series forecasting, and analysis of variance with the same basic workflow: create a model, fit it, and inspect the results.







