Time-Series Forecasting with Prophet in Python


How do I create a time-series forecast with Prophet in Python?

Prophet is an open-source Python library from Meta that enables analysts to create accurate, interpretable time-series forecasts with just a few lines of code.

Description

Prophet is an open-source library released by Meta’s Core Data Science team that brings state-of-the-art forecasting techniques to Python and R. Designed to be “analyst friendly,” Prophet hides the complexity of Bayesian curve fitting behind an intuitive API that resembles scikit-learn. In this article you’ll learn exactly how to build, evaluate, and improve a time-series forecast in Python with Prophet, along with hard-won best practices and pitfalls to avoid.

What Is Prophet?

Prophet models a time series as an additive combination of four components:

  • Trend: long-term increase or decrease
  • Seasonality: repeating patterns (daily, weekly, yearly, etc.)
  • Holidays: user-defined special events
  • Noise: unexplained variation, treated as a normally distributed error term

By separately modeling each component, Prophet can handle complex real-world signals where seasonality changes over time, holidays have variable effects, and growth saturates or shifts abruptly.

Why Use Prophet for Time-Series Forecasting?

Traditional ARIMA-style models typically require stationarity (or differencing to achieve it), careful order selection, and solid statistical knowledge. Machine-learning alternatives like LSTM networks need large datasets and extensive tuning. Prophet strikes a middle ground that suits many business forecasting problems:

  • Quick to prototype — reasonable forecasts in minutes
  • Handles messy data — missing values, trend changes, irregular holidays
  • Human-interpretable plots for each model component
  • Works on sub-daily through yearly data with minimal code changes (mainly the freq argument of make_future_dataframe)

Because most analytics teams already work in Python, Prophet integrates naturally into existing ETL, visualization, and deployment pipelines.

Installing and Setting Up Prophet

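Prophet depends on cmdstanpy under the hood. Since version 1.1, the wheels published on PyPI ship with a pre-compiled Stan model, so the recommended install method is:

pip install prophet # Prophet 1.1+ wheels bundle the compiled Stan backend

If pip cannot find a wheel for your platform and builds from source, you may hit compiler errors; in that case ensure you have a C++ toolchain (build-essential on Debian/Ubuntu, Xcode on macOS, or Visual C++ Build Tools on Windows).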

Understanding Prophet’s Additive Model

The core equation is:

y(t) = g(t) + s(t) + h(t) + ε_t

where

  • g(t) = trend function (piecewise linear or logistic)
  • s(t) = seasonality modeled with Fourier series
  • h(t) = holiday effects using indicator variables
  • ε_t = error term, assumed to be normally distributed

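Prophet infers changepoints, seasonal amplitudes, and holiday coefficients via maximum a posteriori (MAP) estimation by default; setting mcmc_samples to a positive value switches to full MCMC sampling at the cost of longer fit times. You can add domain knowledge through priors and custom regressors, but the defaults are a reasonable starting point.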
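
To make the additive structure concrete, here is a minimal standard-library sketch that builds the noise-free part of y(t) from its three components. It is not Prophet's implementation: the slope, changepoint location, Fourier coefficients, and holiday days are all made-up values chosen for illustration.

```python
import math

def trend(t, k=0.5, m=10.0, changepoint=60, delta=-0.3):
    """Piecewise-linear trend: slope k up to the changepoint, k + delta after it."""
    if t <= changepoint:
        return m + k * t
    # Continue from the value reached at the changepoint so the line stays continuous.
    return m + k * changepoint + (k + delta) * (t - changepoint)

def weekly_seasonality(t, a=2.0, b=1.0, period=7.0):
    """A single Fourier term (one sine, one cosine) with a 7-day period."""
    angle = 2.0 * math.pi * t / period
    return a * math.sin(angle) + b * math.cos(angle)

def holiday_effect(t, holiday_days=(25, 90), effect=15.0):
    """Indicator variable multiplied by a fixed coefficient."""
    return effect if t in holiday_days else 0.0

def additive_signal(t):
    """Noise-free part of y(t) = g(t) + s(t) + h(t)."""
    return trend(t) + weekly_seasonality(t) + holiday_effect(t)

# 120 days of synthetic signal, t measured in whole days.
values = [additive_signal(t) for t in range(120)]
```

Prophet works in the opposite direction: given observed values, it estimates the slope changes, Fourier coefficients, and holiday coefficients that best reproduce them.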

Step-by-Step Guide to Building a Forecast

1. Load and Inspect Data

import pandas as pd
from prophet import Prophet

# Example: daily sales data
sales = pd.read_csv("daily_sales.csv", parse_dates=["date"])
print(sales.head())

Prophet expects two columns named ds (datestamp) and y (value). Rename if necessary:

sales = sales.rename(columns={"date": "ds", "revenue": "y"})
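
Before fitting, it can help to normalize the input a little further. The helper below is a sketch, not part of Prophet: the function name and the decision to reject duplicate timestamps are assumptions made here. It relies only on pandas and reflects two behaviors of Prophet: timezone-aware timestamps are rejected, and rows with a missing y are ignored during fitting.

```python
import pandas as pd

def prepare_for_prophet(raw, date_col, value_col):
    """Return a sorted frame with the ds/y columns Prophet expects."""
    df = raw.rename(columns={date_col: "ds", value_col: "y"})[["ds", "y"]].copy()
    df["ds"] = pd.to_datetime(df["ds"])                 # ensure a datetime dtype
    if df["ds"].dt.tz is not None:
        df["ds"] = df["ds"].dt.tz_localize(None)        # drop timezone information
    df["y"] = pd.to_numeric(df["y"], errors="coerce")   # non-numeric entries become NaN
    if df["ds"].duplicated().any():
        # Policy chosen for this sketch: force the caller to aggregate first.
        raise ValueError("duplicate timestamps: aggregate before fitting")
    return df.sort_values("ds").reset_index(drop=True)

# Toy input with unsorted dates and one unparseable value.
raw = pd.DataFrame({
    "date": ["2024-01-03", "2024-01-01", "2024-01-02"],
    "revenue": ["120.5", "n/a", "99"],
})
sales = prepare_for_prophet(raw, "date", "revenue")
```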

2. Instantiate the Model

m = Prophet(
    seasonality_mode="additive",   # or "multiplicative"
    changepoint_prior_scale=0.2,   # higher => more flexible trend
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False
)

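These hyperparameters balance bias vs. variance. The example raises changepoint_prior_scale from its default of 0.05 to 0.2 for a more flexible trend; in practice, start with the defaults and adjust only when diagnostics call for it.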

3. Fit the Model

m.fit(sales)

Training usually finishes in seconds for datasets < 100k rows.

4. Create a Future DataFrame

future = m.make_future_dataframe(periods=90) # forecast 90 days ahead

The returned DataFrame contains both historical dates (for in-sample predictions) and future dates.
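
For intuition, the date logic can be approximated with plain pandas. The function below is a simplified stand-in written for this article, not Prophet's code, and its name is hypothetical.

```python
import pandas as pd

def future_dates(history_ds, periods, freq="D", include_history=True):
    """Approximate the dates that make_future_dataframe would return."""
    last = history_ds.max()
    # Generate periods + 1 stamps starting at the last observed date, then drop that first stamp.
    new = pd.Series(pd.date_range(start=last, periods=periods + 1, freq=freq)[1:])
    if include_history:
        ds = pd.concat([history_ds, new], ignore_index=True)
    else:
        ds = new
    return pd.DataFrame({"ds": ds})

history = pd.Series(pd.date_range("2024-01-01", periods=5, freq="D"))
future = future_dates(history, periods=3)
```

With the real method, non-daily data needs the freq argument as well, for example m.make_future_dataframe(periods=24, freq="h") for hourly data; the default is daily.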

5. Generate the Forecast

forecast = m.predict(future)

forecast now holds point estimates (yhat) and uncertainty intervals (yhat_lower, yhat_upper). You can merge these back into the original dataset for downstream dashboards.
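
One way to do that merge is sketched below. The two frames are toy stand-ins with invented numbers so the snippet runs without a fitted model; the is_forecast and residual columns are additions made here for dashboard convenience.

```python
import pandas as pd

# Toy stand-ins: `history` mimics the training frame, `forecast` mimics m.predict(future).
history = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=3, freq="D"),
    "y": [100.0, 110.0, 105.0],
})
forecast = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=5, freq="D"),
    "yhat": [101.0, 108.0, 106.0, 109.0, 111.0],
    "yhat_lower": [95.0, 102.0, 100.0, 101.0, 102.0],
    "yhat_upper": [107.0, 114.0, 112.0, 117.0, 120.0],
})

cols = ["ds", "yhat", "yhat_lower", "yhat_upper"]
# Left join keeps every forecast row; y is NaN for dates beyond the history.
report = forecast[cols].merge(history, on="ds", how="left")
report["is_forecast"] = report["ds"] > history["ds"].max()
report["residual"] = report["y"] - report["yhat"]   # in-sample error, NaN for future rows
```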

6. Visualize Results

fig1 = m.plot(forecast)
fig2 = m.plot_components(forecast)

The first figure overlays the forecast and its uncertainty interval on the historical data. The second breaks out the trend, each seasonal component, and holiday effects when holidays are configured, which makes it a fast way to sanity-check model behavior.

Best Practices for Accurate Forecasts

  • Aggregate to a sensible granularity. Prophet shines on hourly/daily/weekly data. If minute-level noise dominates, aggregate first.
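  • Log-transform multiplicative series. Strictly positive values that grow exponentially can be modeled additively after a log transform; exponentiate yhat and its bounds to return to the original scale. Setting seasonality_mode="multiplicative" is an alternative when only the seasonal swings scale with the level.
  • Add domain-specific holidays. For e-commerce, events like Black Friday drive huge spikes.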
  • Use cross-validation. Prophet offers cross_validation and performance_metrics helpers to choose hyperparameters empirically.
  • Explain your forecast. Stakeholders trust models they can interpret. Share component plots and holdout accuracy.
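
As an illustration of the holiday advice, the sketch below builds a holidays frame in the column layout Prophet accepts (holiday, ds, lower_window, upper_window). The three-day upper window is an assumption about how long the effect lasts, and affected_dates is a hypothetical helper that only shows which calendar days the windows cover.

```python
import pandas as pd

# Black Friday dates; extend the list to cover both the history and the forecast horizon.
black_friday = pd.DataFrame({
    "holiday": "black_friday",
    "ds": pd.to_datetime(["2022-11-25", "2023-11-24", "2024-11-29"]),
    "lower_window": 0,   # no lead-in days
    "upper_window": 3,   # assumed: effect lasts through the following Monday
})

def affected_dates(holidays):
    """Expand each event into the individual days covered by its window."""
    rows = []
    for row in holidays.itertuples(index=False):
        for offset in range(row.lower_window, row.upper_window + 1):
            rows.append({"holiday": row.holiday, "ds": row.ds + pd.Timedelta(days=offset)})
    return pd.DataFrame(rows)

covered = affected_dates(black_friday)
```

The frame would then be passed as Prophet(holidays=black_friday). For public holidays, calling m.add_country_holidays(country_name="US") before fitting avoids maintaining the dates by hand.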

Common Mistakes and How to Avoid Them

  • Ignoring changepoints. Sudden level shifts (e.g., COVID-19) break forecasts. Increase changepoint_prior_scale or supply changepoints manually.
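  • Forgetting to cap logistic growth. With growth="logistic", Prophet requires a cap column in both the training and future DataFrames and raises an error without it; floor is optional and defaults to 0. An unrealistic cap, on the other hand, silently distorts the trend.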
  • Over-fitting with too many seasonalities. Adding every Fourier term imaginable reduces bias but explodes variance. Keep it parsimonious.
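
For the logistic-growth pitfall, a small helper can keep the capacity columns consistent between the training and future frames. This is a sketch: add_capacity is a hypothetical name, and the market size of 100 is an invented figure standing in for real domain knowledge.

```python
import pandas as pd

def add_capacity(df, cap, floor=0.0):
    """Return a copy with the cap/floor columns used by logistic growth."""
    if cap <= floor:
        raise ValueError("cap must be greater than floor")
    out = df.copy()
    out["cap"] = cap
    out["floor"] = floor
    return out

history = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=4, freq="D"),
    "y": [40.0, 55.0, 63.0, 70.0],
})
future = pd.DataFrame({"ds": pd.date_range("2024-01-01", periods=6, freq="D")})

MARKET_SIZE = 100.0  # assumed saturation level; Prophet does not infer this
history = add_capacity(history, MARKET_SIZE)
future = add_capacity(future, MARKET_SIZE)
```

With both frames prepared this way, the model would be created with Prophet(growth="logistic"), fitted on history, and asked to predict on future.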

Complete Code Example

"""Prophet end-to-end example with cross-validation"""
import pandas as pd
from prophet import Prophet
from prophet.diagnostics import cross_validation, performance_metrics

# 1. Load data
sales = pd.read_csv("daily_sales.csv", parse_dates=["date"]).rename(
    columns={"date": "ds", "revenue": "y"}
)

# 2. Model
m = Prophet(yearly_seasonality=True, weekly_seasonality=True, changepoint_prior_scale=0.2)

# 3. Fit
m.fit(sales)

# 4. Forecast 6 months ahead
future = m.make_future_dataframe(periods=180)
forecast = m.predict(future)

# 5. Accuracy via rolling CV
cv = cross_validation(m, initial="730 days", period="180 days", horizon="90 days")
print(performance_metrics(cv).head())

# 6. Plots
m.plot(forecast)
m.plot_components(forecast)
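
The cross-validation step can also drive hyperparameter selection. The sketch below shows only the search loop, using the standard library; grid_search and toy_score are hypothetical names, and the toy scoring function is a placeholder so the snippet runs without data.

```python
import itertools

def grid_search(param_grid, score_fn):
    """Score every parameter combination; return (best_params, all_results)."""
    names = list(param_grid)
    results = []
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((params, score_fn(params)))
    # Lower score is better (for example, RMSE).
    best_params, _ = min(results, key=lambda item: item[1])
    return best_params, results

param_grid = {
    "changepoint_prior_scale": [0.01, 0.1, 0.5],
    "seasonality_prior_scale": [0.1, 1.0, 10.0],
}

def toy_score(params):
    # Placeholder error surface, lowest at (0.1, 1.0); replace with a real CV score.
    return (abs(params["changepoint_prior_scale"] - 0.1)
            + abs(params["seasonality_prior_scale"] - 1.0))

best, results = grid_search(param_grid, toy_score)
```

In a real run, score_fn would fit Prophet(**params) on the data, call cross_validation, and return a single number such as performance_metrics(cv, rolling_window=1)["rmse"].values[0].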

Conclusion

Prophet democratizes advanced time-series forecasting by combining a powerful Bayesian framework with a simple, scikit-learn-like interface. With minimal configuration you can deliver forecasts that are both accurate and explainable — a rare combination in data science. By following the best practices covered and avoiding common pitfalls, you’ll be well equipped to add reliable forecasting to dashboards, data pipelines, and decision-making workflows.

Why Time-Series Forecasting with Prophet in Python Is Important

Accurate forecasts drive inventory planning, capacity allocation, budgeting, and strategic initiatives. Without scalable forecasting, companies rely on gut instinct or fragile spreadsheets, leading to stock-outs, overstaffing, and missed revenue. Prophet offers an accessible yet statistically rigorous solution that analysts can adopt quickly, bridging the gap between classical econometrics and deep-learning models.

Time-Series Forecasting with Prophet in Python Example Usage


from prophet import Prophet
# df: a pandas DataFrame with a ds (datetime) column and a y (numeric) column
m = Prophet().fit(df)
future = m.make_future_dataframe(periods=30)
forecast = m.predict(future)

Frequently Asked Questions (FAQs)

What data format does Prophet require?

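A pandas DataFrame with two columns: ds (datestamp as datetime) and y (numeric value). Extra regressors (registered with add_regressor) and the cap/floor capacity values go in additional columns; holidays are supplied separately as a DataFrame through the holidays argument or via add_country_holidays.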

Can Prophet model multiple seasonalities?

Yes. By default Prophet enables yearly, weekly, and daily seasonality automatically when the history is long enough and sampled finely enough to support each one. You can also add custom cycles (e.g., quarterly) using add_seasonality.
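
A custom seasonality is represented by sine and cosine features of the chosen period. The sketch below computes those features for a single time value with the standard library; the quarter length of 91.25 days and the order of 5 are illustrative choices, not recommendations.

```python
import math

def fourier_features(t_days, period, order):
    """Return [sin_1, cos_1, ..., sin_order, cos_order] for a time value in days."""
    feats = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t_days / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats

QUARTER_DAYS = 91.25  # assumed average quarter length (365 / 4)
row = fourier_features(t_days=30.0, period=QUARTER_DAYS, order=5)
```

The equivalent request to Prophet would be m.add_seasonality(name="quarterly", period=91.25, fourier_order=5), made before calling fit. A higher fourier_order lets the seasonal curve change faster but raises the risk of over-fitting.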

How do I measure forecast accuracy?

Use Prophet’s cross_validation function to generate rolling-origin splits, then feed the output to performance_metrics for MAE, MAPE, RMSE, and coverage.
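
To see what the initial, period, and horizon arguments mean, the sketch below computes rolling-origin cutoff dates with the standard library. It is a simplified approximation for gap-free data, not Prophet's code, and the date range is invented: the last cutoff sits one horizon before the end of the data, earlier cutoffs step back by period, and every cutoff leaves at least initial days of training history.

```python
from datetime import date, timedelta

def rolling_cutoffs(first, last, initial, period, horizon):
    """Cutoff dates for rolling-origin evaluation, oldest first."""
    cutoff = last - horizon
    cutoffs = []
    while cutoff >= first + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    if not cutoffs:
        raise ValueError("not enough history for the requested initial window and horizon")
    return cutoffs[::-1]

cutoffs = rolling_cutoffs(
    first=date(2020, 1, 1),
    last=date(2023, 12, 31),
    initial=timedelta(days=730),
    period=timedelta(days=180),
    horizon=timedelta(days=90),
)
```

For each cutoff, the model is trained on everything up to that date and scored on the horizon that follows, so four years of daily data with these settings yields four evaluation folds.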

Is Prophet suitable for minute-level data?

Prophet can technically handle high-frequency data, but its assumptions work best at hourly or coarser intervals. For tick-level or sub-minute series, consider specialized models or aggregate the data first.
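
Aggregation is a one-liner with pandas resampling. The sketch below uses synthetic minute-level data and a hypothetical helper name; whether to sum or average depends on what y measures.

```python
import pandas as pd

def aggregate_for_prophet(df, rule="D", how="sum"):
    """Downsample a ds/y frame to a coarser frequency."""
    return (
        df.set_index("ds")["y"]
        .resample(rule)
        .agg(how)
        .reset_index()
    )

# Two days of synthetic minute-level data with one unit per minute.
minute_data = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=2 * 24 * 60, freq="min"),
    "y": 1.0,
})
daily = aggregate_for_prophet(minute_data)
```

Use a sum for counts or revenue and a mean for rates or prices, and pass the matching freq to make_future_dataframe when forecasting at the aggregated granularity.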
