Adding a Secondary Y-Axis in ggplot2


How do I add a secondary y-axis in ggplot2?

Using ggplot2’s sec.axis argument to add a second y-axis, defined as a transformation of the primary axis, to the same plot.


Description

Overview

A secondary y-axis allows you to display two different measures—each with its own units—on the same plot. In ggplot2, this is achieved with the sec.axis argument inside a scale function (usually scale_y_continuous()). Unlike spreadsheet software where secondary axes can be added with a click, ggplot2 requires an explicit mathematical transformation that maps one data series onto the scale of another.

Why Secondary Axes Matter

In data science and analytics, you often need to compare variables that live on different numeric scales: revenue (millions) versus conversion rate (percent), or temperature (°C) versus energy consumption (kWh). Placing both series on a single primary axis can hide important variations; using separate plots makes pattern detection harder. A properly implemented secondary y-axis keeps the viewer’s attention on a single visual while preserving each variable’s integrity.

How ggplot2 Implements Secondary Axes

The sec.axis Argument

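In ggplot2 >= 2.2.0, you add a secondary axis by appending sec.axis = sec_axis(~ transform(.), name = "Axis Label") to a scale function. Here transform(.) is a placeholder for your own conversion expression, not a function you call literally: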

scale_y_continuous(
  name = "Primary Axis Label",
  sec.axis = sec_axis(~ transform(.), name = "Secondary Axis Label")
)

  • The ~ denotes an anonymous function. The dot (.) is the input (the primary axis values) that you map to the secondary scale.
  • The transform must be one-to-one (strictly monotonic) so that it can be inverted; for example, multiplying by a nonzero constant, adding an offset, or a combination of the two.
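As a minimal sketch of such a one-to-one transform, the example below labels a Celsius axis with Fahrenheit on the right-hand side. The data frame temps, its values, and the helper c_to_f() are illustrative assumptions, not part of the original example.

```r
library(ggplot2)

# Illustrative data: hypothetical temperatures in Celsius for ten days
temps <- data.frame(
  day = 1:10,
  temp_c = c(12, 14, 15, 13, 17, 19, 21, 20, 18, 16)
)

# Celsius -> Fahrenheit is linear and strictly increasing, so it is one-to-one
c_to_f <- function(celsius) celsius * 9 / 5 + 32

p_temp <- ggplot(temps, aes(x = day, y = temp_c)) +
  geom_line() +
  scale_y_continuous(
    name = "Temperature (°C)",
    # The dot receives the primary-axis values (degrees Celsius)
    sec.axis = sec_axis(~ c_to_f(.), name = "Temperature (°F)")
  )

p_temp
```

Because both axes describe the same quantity here, no rescaling of the data is needed; only the labels differ.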

Single Coordinate System, Two Scales

ggplot2 draws every layer in a single coordinate system. That means you never plot a second geom against its own separate scale; instead you transform one variable so it fits the numeric range of the other. Then you tell ggplot2 how to reverse that transformation so the secondary axis is labeled correctly.

Key Functions

  • scale_y_continuous() or scale_x_continuous() – host the sec.axis.
  • sec_axis() – builds the secondary axis; accepts a transform formula and a label.
  • geom_line(), geom_col(), geom_point() – the geoms you overlay.

Step-by-Step Example

Suppose you have daily data for outside temperature and electricity usage. You want a line for temperature and bars for kWh on the same plot. Temperature ranges from 0 to 30 °C, and consumption from 200 to 1000 kWh.

1. Load and prep data

library(tidyverse)

set.seed(42)
df <- tibble(
  day = seq.Date(Sys.Date() - 29, Sys.Date(), by = "1 day"),
  temp_c = runif(30, 0, 30),
  kwh = runif(30, 200, 1000)
)

# Compute a multiplier so kWh fits roughly the same range as temp
mult <- max(df$temp_c) / max(df$kwh)

2. Plot

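ggplot(df, aes(x = day)) +
  # Bars: kWh rescaled onto the temperature (primary) scale
  geom_col(aes(y = kwh * mult), fill = "steelblue", alpha = 0.6) +
  # Line: temperature in its native units (linewidth replaces size in ggplot2 >= 3.4.0)
  geom_line(aes(y = temp_c), color = "red", linewidth = 1) +
  scale_y_continuous(
    name = "Temperature (°C)",
    # Undo the rescaling so the right-hand labels read in kWh
    sec.axis = sec_axis(~ . / mult, name = "Energy (kWh)")
  ) +
  theme_minimal() +
  labs(title = "Temperature vs. Energy Consumption")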

Notice how you multiply kWh by mult so it shares the primary scale, and then divide inside sec_axis() to compute the right labels.

Best Practices

Use Simple, Linear Transforms

Stick to linear relationships (multiply and/or add). Non-linear transforms (log, sqrt) make axis interpretation difficult.
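A linear transform with both a multiplier and an offset can be sketched as follows. The ranges and the helper names to_primary() and to_secondary() are hypothetical and chosen only for illustration.

```r
# Hypothetical ranges: the primary series spans 0-30, the secondary 200-1000
primary_range   <- c(0, 30)
secondary_range <- c(200, 1000)

# Slope and intercept of the line that maps secondary values onto the primary scale
slope     <- diff(primary_range) / diff(secondary_range)
intercept <- primary_range[1] - slope * secondary_range[1]

to_primary   <- function(x) x * slope + intercept    # apply to the plotted data
to_secondary <- function(y) (y - intercept) / slope  # use inside sec_axis()

to_primary(c(200, 600, 1000))
to_secondary(c(0, 15, 30))
```

In a plot, the data would be drawn with aes(y = to_primary(kwh)) and the axis labeled with sec_axis(~ to_secondary(.)). Compared with a plain multiplier, the offset lets the secondary series use the full height of the panel when its values do not start near zero.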

Provide Clear Labels and Legends

Label both axes and differentiate geoms (color, linetype) so the audience instantly knows which axis corresponds to which data series.
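One possible way to tie each axis to its series is to color the axis titles to match the geoms. The sketch below assumes a small made-up data frame d and a hypothetical scale_factor; the theme elements axis.title.y.left, axis.title.y.right, and axis.text.y.right are part of ggplot2's theme system.

```r
library(ggplot2)

# Hypothetical data; scale_factor maps rate onto the count scale
d <- data.frame(
  x = 1:5,
  count = c(10, 20, 15, 30, 25),
  rate = c(1, 2, 1.5, 2.5, 2)
)
scale_factor <- max(d$count) / max(d$rate)

p_labeled <- ggplot(d, aes(x = x)) +
  geom_col(aes(y = count), fill = "steelblue") +
  geom_line(aes(y = rate * scale_factor), colour = "darkred") +
  scale_y_continuous(
    name = "Count",
    sec.axis = sec_axis(~ . / scale_factor, name = "Rate (%)")
  ) +
  theme(
    # Match each axis title to the color of the series it describes
    axis.title.y.left  = element_text(colour = "steelblue"),
    axis.title.y.right = element_text(colour = "darkred"),
    axis.text.y.right  = element_text(colour = "darkred")
  )

p_labeled
```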

Avoid Misleading Dual Axes

Dual axes can accidentally imply correlation where none exists. Only use them when variables share a logical connection (e.g., input versus output) or when patterns need to be temporally compared. Consider a faceted plot if scales or units are totally unrelated.
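A faceted alternative might look like the sketch below, which stacks the two measures into long format and gives each its own panel and y-scale. The data are simulated and the object names (wide, long, p_facets) are illustrative.

```r
library(ggplot2)

set.seed(42)
wide <- data.frame(
  day = 1:30,
  temp_c = runif(30, 0, 30),
  kwh = runif(30, 200, 1000)
)

# Stack the two measures into long format: one row per day and measure
long <- rbind(
  data.frame(day = wide$day, measure = "Temperature (°C)", value = wide$temp_c),
  data.frame(day = wide$day, measure = "Energy (kWh)", value = wide$kwh)
)

# One panel per measure, each with its own y-axis; the x-axis stays shared
p_facets <- ggplot(long, aes(x = day, y = value)) +
  geom_line() +
  facet_wrap(~ measure, ncol = 1, scales = "free_y")

p_facets
```

Stacking the panels vertically keeps the shared x-axis aligned, so patterns over time can still be compared without implying a common y-scale.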

Document the Transformation

Expose the scale conversion in comments or the legend. If sharing code inside a collaborative SQL/BI platform like Galaxy, annotate the notebook or query to explain the transform so teammates can reproduce or audit it.

Common Mistakes

1. Forgetting to Transform the Data

Why it’s wrong: Plotting raw kWh on the primary axis without scaling will compress the temperature line into a nearly flat line.
Fix: Scale the secondary metric (e.g., kwh * mult) before plotting it.

2. Using Different Data Frames with Different x-Axes

Why it’s wrong: Misaligned x-values lead to faulty visual comparisons.
Fix: Join the series into a single data frame on the shared x variable, or make sure both data frames cover the same x-values.
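Joining the two series before plotting could be done along these lines. The data frames temp_df and usage_df and their values are hypothetical; merge() is base R's join function.

```r
# Hypothetical inputs recorded on partly different days
temp_df  <- data.frame(day = 1:5, temp_c = c(10, 12, 15, 14, 11))
usage_df <- data.frame(day = 3:7, kwh = c(400, 420, 380, 500, 450))

# Inner join: keep only the days present in both series
combined <- merge(temp_df, usage_df, by = "day")
combined
```

Only days 3 to 5 survive the join, so every bar and every line point in the resulting plot refers to the same x-value.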

3. Assuming ggplot Automatically Calculates Scale

Why it’s wrong: ggplot does not derive secondary axes automatically; you must supply the mathematical relationship.
Fix: Manually compute the multiplier or transformation function.

Real-World Use Cases

  • Finance: Plot stock price (USD) with trading volume (millions of shares).
  • Manufacturing: Overlay machine temperature (°C) with defect rate (%).
  • Marketing: Display advertising spend ($) against click-through rate (%).

Working Code Example

# Two y-axes: revenue vs. conversion rate
library(ggplot2)

sales <- data.frame(
  month = 1:12,
  revenue = c(120, 130, 140, 160, 155, 170, 190, 200, 210, 225, 230, 250),
  conversion = c(2.0, 2.1, 2.3, 2.4, 2.2, 2.5, 2.7, 2.8, 2.9, 3.0, 3.1, 3.3)
)

# Multiplier that lifts conversion (%) onto the revenue scale
coef <- max(sales$revenue) / max(sales$conversion)

ggplot(sales, aes(month)) +
  geom_col(aes(y = revenue), fill = "forestgreen", alpha = 0.5) +
  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_line(aes(y = conversion * coef), colour = "darkred", linewidth = 1.2) +
  scale_y_continuous(
    name = "Revenue (kUSD)",
    # Divide by coef to label the right-hand axis in percent
    sec.axis = sec_axis(~ . / coef, name = "Conversion Rate (%)")
  ) +
  theme_minimal() +
  labs(title = "Monthly Revenue vs. Conversion Rate")

Frequently Asked Questions

How do I decide on the transformation multiplier?

Use a simple ratio such as max(primary) / max(secondary) or another scaling factor that brings both series into a visually comparable range without distortion.

Can I add a secondary x-axis instead of y-axis?

Yes. You can use scale_x_continuous(sec.axis = ...). The same transformation principles apply.
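A secondary x-axis can be sketched the same way. The example below assumes a made-up elevation profile and shows distance in kilometres on the bottom axis and miles on the top axis; the names route, km_per_mile, and p_route are illustrative.

```r
library(ggplot2)

# Hypothetical elevation profile along a route measured in kilometres
route <- data.frame(
  distance_km = c(0, 5, 10, 15, 20),
  elevation_m = c(100, 180, 260, 220, 150)
)

km_per_mile <- 1.609344

p_route <- ggplot(route, aes(x = distance_km, y = elevation_m)) +
  geom_line() +
  scale_x_continuous(
    name = "Distance (km)",
    # The secondary x-axis is drawn on the opposite (top) side of the panel
    sec.axis = sec_axis(~ . / km_per_mile, name = "Distance (miles)")
  )

p_route
```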

Is it possible to have more than two axes in ggplot?

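Not directly. Each continuous position scale accepts a single sec.axis, so a plot can have at most one secondary y-axis (and one secondary x-axis). If you need to show more measures, create additional plots or facets.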

Why does my secondary axis show incorrect values?

Check that the transform in sec_axis() is the exact inverse of the scaling applied to the plotted data. A mismatch causes incorrect tick labels.
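One way to check this before plotting is a round-trip test: scale some sample values, apply the candidate sec_axis() function, and confirm the original values come back. The helper is_inverse() and the value of scale_factor below are hypothetical.

```r
# Hypothetical scaling applied to the data before plotting
scale_factor <- 0.03
scale_data <- function(x) x * scale_factor

# Candidate functions for sec_axis(): one correct, one mismatched
good_inverse <- function(y) y / scale_factor
bad_inverse  <- function(y) y * scale_factor  # scales again instead of undoing

# A correct inverse returns the original values after a round trip
is_inverse <- function(forward, inverse, values) {
  isTRUE(all.equal(inverse(forward(values)), values))
}

sample_values <- c(200, 500, 1000)
is_inverse(scale_data, good_inverse, sample_values)  # TRUE
is_inverse(scale_data, bad_inverse, sample_values)   # FALSE
```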

Why Adding a Secondary Y-Axis in ggplot2 Is Important

Dual-axis plots enable analysts to compare variables with different units without splitting attention across multiple visuals. In data engineering workflows, especially when generating automated reports, mastering secondary axes helps deliver clearer insights in less dashboard real estate. Understanding the required transformations prevents misleading visuals and ensures analytical integrity.

Adding a Secondary Y-Axis in ggplot2 Example Usage


ggplot(df, aes(day)) +
  geom_line(aes(y = temp_c)) +
  geom_col(aes(y = kwh * mult)) +
  scale_y_continuous(sec.axis = sec_axis(~ . / mult))

