Installing Statsmodels takes just a few commands, but the process varies slightly depending on your operating system and Python setup. The library supports Python 3.9 through 3.14, so you’ll need one of these versions installed before starting.
I recommend using pip for most installations. Conda works well if you’re managing complex scientific computing environments. Both methods handle dependencies automatically, installing NumPy, SciPy, Pandas, and Patsy alongside Statsmodels.
What you need before installing
Your system needs Python 3.9 or newer. Check your version by opening a terminal and running:
python --version
You should see something like Python 3.12.3 or similar. If your version is older than 3.9, upgrade Python first.
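The same check can be made from inside Python, which helps when several interpreters are installed and it is unclear which one a tool is using. This is a minimal sketch that assumes the 3.9 minimum stated above; `MIN_PYTHON` and `python_is_supported` are names chosen for illustration.

```python
import sys

# Minimum version taken from the requirement stated above.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the given (or running) interpreter meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the (major, minor) pair against the minimum.
    return tuple(version_info[:2]) >= tuple(minimum)


print("Running Python", ".".join(str(part) for part in sys.version_info[:3]))
print("Meets the minimum:", python_is_supported())
```

Run it with the interpreter you plan to use for Statsmodels, since each installed Python reports its own version.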
You also need pip (Python’s package installer) or conda (if you’re using Anaconda). Most Python installations include pip by default. Verify it’s installed:
pip --version
Installing Statsmodels with pip
The simplest installation method uses pip. Open your terminal and run:
pip install statsmodels
This installs the latest stable version (0.14.5 at the time of writing) along with all required dependencies. The process takes a minute or two depending on your internet connection and system speed.
On systems with both Python 2 and Python 3 installed:
Use pip3 instead so the package is installed for Python 3:
pip3 install statsmodels
Installing a specific version:
If you need a particular version for compatibility reasons:
pip install statsmodels==0.14.4
Installing with Conda
Conda users should install from the conda-forge channel, which maintains the most up-to-date builds:
conda install -c conda-forge statsmodels
Conda automatically resolves dependencies and ensures compatibility with your existing packages. I find this approach more reliable when you’re working with multiple scientific computing libraries that might have conflicting requirements.
Platform-specific considerations
Each operating system has quirks that affect installation, particularly when building from source or dealing with compiled components.
Windows Installation
Windows users rarely encounter issues with pip installation since pre-built wheels are available for all recent Python versions. The installation command is the same:
pip install statsmodels
If you’re using Anaconda on Windows:
Anaconda’s integrated environment handles everything:
conda install -c conda-forge statsmodels
Troubleshooting Windows builds:
If you need to build from source (which you usually don’t), you’ll need Microsoft Visual C++ 14.0 or greater. Get it from “Microsoft C++ Build Tools.” Most users never need this because pre-built wheels already contain the compiled extensions, so nothing is compiled on your machine.
macOS Installation
macOS users can install directly with pip:
pip install statsmodels
For Apple Silicon (M1/M2/M3) Macs:
Pre-built wheels are available for the ARM64 architecture. The standard pip command works:
pip install statsmodels
The library runs natively on Apple Silicon without Rosetta translation.
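Whether the library runs natively depends on the Python interpreter itself being an ARM64 build. The sketch below, which uses only the standard library, reports the architecture the running interpreter sees; `describe_interpreter` is an illustrative name, not part of Statsmodels.

```python
import platform


def describe_interpreter():
    """Return (system, machine) for the running Python interpreter."""
    return platform.system(), platform.machine()


system, machine = describe_interpreter()
print(f"System: {system}, architecture: {machine}")

# On macOS, 'arm64' indicates a native Apple Silicon build, while 'x86_64'
# on an Apple Silicon Mac suggests the interpreter runs under Rosetta.
if system == "Darwin":
    print("Native Apple Silicon build:", machine == "arm64")
```

If an Apple Silicon Mac reports x86_64 here, pip will select Intel wheels for that interpreter, and installing a native Python build is probably the simpler fix.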
If you need to build from source:
Install Xcode Command Line Tools first:
xcode-select --install
This provides the C compiler needed for building compiled components. Then install normally with pip.
Linux Installation
Most Linux distributions ship with Python, and many include pip as well. Install Statsmodels with:
pip install statsmodels
For Debian/Ubuntu systems:
If pip isn’t installed:
sudo apt update
sudo apt install python3-pip
pip3 install statsmodels
For Red Hat/Fedora/CentOS:
sudo dnf install python3-pip
pip3 install statsmodels
Building from source on Linux:
You’ll need gcc, which is typically already installed. Verify with:
gcc --version
If it’s missing, install the build essentials:
# Debian/Ubuntu
sudo apt install build-essential
# Red Hat/Fedora
sudo dnf groupinstall "Development Tools"
Installing in a virtual environment
I always recommend using virtual environments to avoid dependency conflicts between projects. Here’s how to set one up and install Statsmodels inside it.
Creating and activating a virtual environment:
# Create the environment
python -m venv statsmodels_env
# Activate on Windows
statsmodels_env\Scripts\activate
# Activate on macOS/Linux
source statsmodels_env/bin/activate
Your terminal prompt should change to show (statsmodels_env) indicating you’re in the virtual environment.
Install Statsmodels in the environment:
pip install statsmodels
Deactivate when you’re done:
deactivate
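If the prompt does not change, or you are unsure which environment an IDE has picked up, Python can report this directly. A minimal sketch, assuming an environment created with `python -m venv` as above; `in_virtual_environment` is an illustrative helper name.

```python
import sys


def in_virtual_environment():
    """Return True when running inside an environment created by venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

This check targets venv-style environments. Conda environments are organized differently and may report False even when active, so for those the interpreter path printed above is the more useful clue.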
Verifying your installation
After installation, confirm Statsmodels works correctly. Open a Python interpreter:
python
Then run:
import statsmodels.api as sm
print(sm.__version__)
You should see the version number (like 0.14.5) printed. If you get an error, the installation didn’t complete successfully.
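When the import itself fails, it can help to ask the packaging metadata whether anything is installed at all, without importing the library. The following sketch relies on the standard library's `importlib.metadata`; `installed_version` is a helper name introduced here for illustration.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("statsmodels")
if version is None:
    print("statsmodels is not installed in this environment")
else:
    print("statsmodels", version)
```

A version reported here alongside a failing import usually points to a broken or mismatched dependency rather than a missing package.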
Test with a simple regression:
import statsmodels.api as sm
import numpy as np
# Generate sample data
X = np.random.rand(100, 2)
y = np.random.rand(100)
# Add constant and fit model
X = sm.add_constant(X)
model = sm.OLS(y, X).fit()
print(model.summary())
If this runs without errors and produces a regression summary table, your installation is working correctly.
Understanding dependencies
Statsmodels installs several other packages automatically. Here’s what gets included and why:
NumPy (≥ 1.22.3): Provides array operations and numerical computing foundations
SciPy (≥ 1.8): Supplies scientific computing functions, optimization algorithms, and special functions
Pandas (≥ 1.4): Enables DataFrame support and data manipulation
Patsy (≥ 0.5.6): Handles formula parsing for R-style model specifications
These are minimum requirements, and they change between releases. pip reuses compatible versions that are already installed and fetches the latest compatible ones if you’re starting fresh.
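Because the minimums depend on the release, the most reliable list is the one recorded in the metadata of the copy you actually installed. The sketch below reads it with the standard library; `declared_requirements` is an illustrative name, and the exact strings printed depend on your installed version.

```python
from importlib import metadata


def declared_requirements(distribution):
    """Return the requirement strings a distribution declares, or []."""
    try:
        requirements = metadata.requires(distribution)
    except metadata.PackageNotFoundError:
        return []
    # requires() returns None when the metadata lists no requirements.
    return list(requirements or [])


for requirement in declared_requirements("statsmodels"):
    print(requirement)
```

Entries that carry an `extra == ...` marker belong to optional feature sets and are not installed by a plain `pip install statsmodels`.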
Installing optional dependencies
Some Statsmodels features require additional packages that aren’t installed by default.
For plotting and visualization:
pip install matplotlib
Matplotlib is needed for diagnostic plots, regression visualizations, and many examples in the documentation.
For regularized models:
Required if you’re using regularized regression methods like LASSO, Ridge, or Elastic Net.
For Jupyter notebook support:
pip install jupyter ipython
Useful for interactive statistical analysis and following Statsmodels tutorials.
For enhanced optimization:
Improves numerical derivatives in some advanced models.
Installing the development version
The GitHub repository usually contains bug fixes and features before they appear in stable releases. If you need cutting-edge functionality:
pip install git+https://github.com/statsmodels/statsmodels
This requires git and, because the package is built from source, a working C compiler. The installation takes longer than installing a pre-built wheel.
For development work:
Clone the repository and install in editable mode:
git clone https://github.com/statsmodels/statsmodels.git
cd statsmodels
pip install -e .
This links your local repository to your Python environment. Changes you make to the source code immediately affect imports without reinstalling.
Troubleshooting common installation issues
“ModuleNotFoundError: No module named ‘statsmodels’”
This usually means Statsmodels was installed in one Python environment while your code runs in a different one. Check which Python your IDE is using:
import sys
print(sys.executable)
Install Statsmodels using the pip associated with that Python:
/path/to/python -m pip install statsmodels
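To confirm the fix, or to diagnose the mismatch in the first place, you can ask the interpreter where it would load the package from without importing it. A minimal sketch using the standard library; `locate_module` is a hypothetical helper and expects a top-level module name.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


print("Interpreter:", sys.executable)
location = locate_module("statsmodels")
if location is None:
    print("statsmodels is not importable from this interpreter")
else:
    print("statsmodels would be imported from", location)
```

Run it from the same place your failing code runs (the IDE console, the notebook kernel, or the script runner). The reported path should sit inside the environment where you installed the package.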
“Microsoft Visual C++ 14.0 is required” (Windows)
You’re trying to build from source but lack the compiler. Install pre-built wheels instead:
pip install --only-binary :all: statsmodels
Or get the Visual C++ Build Tools if you specifically need to build from source.
“gcc: command not found” (Linux)
Your system needs a C compiler. Install build tools:
# Debian/Ubuntu
sudo apt install build-essential
# Red Hat/Fedora
sudo dnf groupinstall "Development Tools"
ImportError with NumPy or SciPy
Version conflicts between dependencies can cause import errors. Update everything:
pip install --upgrade statsmodels numpy scipy pandas
Installation hangs or times out
Try increasing the timeout or using a different mirror:
pip install --timeout 300 statsmodels
Upgrading Statsmodels
Keep your installation current to get bug fixes and new features:
pip install --upgrade statsmodels
With conda:
conda update -c conda-forge statsmodels
Check for updates regularly, especially if you encounter bugs that might already be fixed in newer versions.
Uninstalling Statsmodels
Remove the library when you no longer need it:
pip uninstall statsmodels
With conda:
conda remove statsmodels
This doesn’t affect other packages in your environment.
What comes next
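Now that Statsmodels is installed, you’re ready to start building statistical models. The library bundles dozens of datasets for practice and can also download datasets from the Rdatasets collection. The example here fetches mtcars that way, so it needs an internet connection:
import statsmodels.api as sm
# Download the mtcars dataset from the Rdatasets collection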
data = sm.datasets.get_rdataset("mtcars", "datasets").data
# Fit a simple linear model
X = sm.add_constant(data['wt'])
y = data['mpg']
model = sm.OLS(y, X).fit()
print(model.summary())
This demonstrates the basic workflow: load data, specify a model, fit it, and examine results. From here, you can explore more complex models, time series analysis, and advanced statistical techniques that Statsmodels offers.

