Scatter Plot

A scatter plot is a fundamental statistical chart for showing the relationship between two numeric variables. One variable is assigned to the X-axis and the other to the Y-axis, and each observation is plotted as a point. The distribution of points reveals correlation, trends, clusters, and outliers.

Historical Background

Scatter plots became important in the nineteenth century through the work of statisticians and scientists such as John Herschel and Francis Galton. Galton’s studies of height, correlation, and regression helped establish the analytical value of point-based comparison.

Data Structure

Data              | Role
X value           | Horizontal position
Y value           | Vertical position
Observation       | One plotted point
Optional category | Color or shape
Optional size     | Additional quantitative variable
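
As a minimal sketch of the structure in the table above, the record below holds one observation per plotted point. The class name, field names, and sample values are made up for illustration; a real dataset would more likely come from a file or a data frame.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """One plotted point. Field names are illustrative assumptions."""
    x: float                        # horizontal position
    y: float                        # vertical position
    category: Optional[str] = None  # optional: mapped to color or shape
    size: Optional[float] = None    # optional: additional quantitative variable


def to_columns(observations):
    """Split a list of observations into one list per role."""
    return {
        "x": [o.x for o in observations],
        "y": [o.y for o in observations],
        "category": [o.category for o in observations],
        "size": [o.size for o in observations],
    }


# Made-up sample data, for illustration only.
points = [
    Observation(1.0, 2.1, category="a", size=10.0),
    Observation(2.0, 3.9, category="b", size=25.0),
    Observation(3.0, 6.2, category="a"),  # size left unset
]
columns = to_columns(points)
```

Most plotting libraries expect one sequence per role rather than a list of records, which is why the sketch converts rows to columns before any drawing happens.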

Purpose

The purpose of a scatter plot is to show how two variables vary together. It can reveal positive correlation (both variables rise together), negative correlation (one rises as the other falls), nonlinear structure, clusters, or unusual observations.
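
To make the idea of correlation concrete, the sketch below computes the Pearson correlation coefficient by hand for three made-up datasets. The function name and the data are assumptions for illustration. The coefficient measures only the linear part of a relationship, which is one reason to look at the plot and not just the number.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Made-up data: a rising line, a falling line, and a symmetric curve.
xs = [-2, -1, 0, 1, 2]
rising = pearson_r(xs, [2, 4, 6, 8, 10])   # strong positive correlation
falling = pearson_r(xs, [10, 8, 6, 4, 2])  # strong negative correlation
curved = pearson_r(xs, [4, 1, 0, 1, 4])    # y = x**2: clear structure, no linear trend
```

In the curved case the points follow an exact parabola, yet the coefficient comes out at zero because the rising and falling halves cancel. A scatter plot of the same data would show the pattern immediately.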

Design Notes

  • Choose axis ranges that fit the data rather than automatically forcing them to start at zero.
  • Use transparency when points overlap.
  • Add trend lines only when they support the analysis.
  • Use color for categories, not decoration.
  • Consider a scatterplot matrix when comparing more than two variables.
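
Several of the notes above can be sketched in a few lines. The helpers below are hypothetical names, not part of any library: `axis_limits` pads the data range without forcing zero, `fit_line` computes an ordinary least-squares trend line, and `draw_scatter` assumes Matplotlib is installed and applies transparency and a category-to-color mapping. The palette and the default output path are likewise assumptions.

```python
def axis_limits(values, pad=0.05):
    """Return (low, high) covering the data plus a small margin, not forced to zero."""
    lo, hi = min(values), max(values)
    margin = (hi - lo) * pad or 1.0  # fixed margin when all values are equal
    return lo - margin, hi + margin


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def draw_scatter(xs, ys, categories, path="scatter.png", show_trend=False):
    """Draw the plot with Matplotlib (third-party, imported only when called)."""
    import matplotlib.pyplot as plt

    palette = {"a": "tab:blue", "b": "tab:orange"}  # hypothetical category colors
    fig, ax = plt.subplots()
    # alpha < 1 keeps overlapping points visible; color encodes category only.
    ax.scatter(xs, ys, c=[palette[c] for c in categories], alpha=0.5)
    ax.set_xlim(*axis_limits(xs))
    ax.set_ylim(*axis_limits(ys))
    if show_trend:  # opt-in, so the line appears only when it supports the analysis
        slope, intercept = fit_line(xs, ys)
        x0, x1 = min(xs), max(xs)
        ax.plot([x0, x1], [intercept + slope * x0, intercept + slope * x1], color="black")
    fig.savefig(path)
    plt.close(fig)
```

A call such as `draw_scatter([1, 2, 3], [3, 5, 7], ["a", "b", "a"], show_trend=True)` would write the figure to `scatter.png`. Keeping the trend line behind a flag reflects the note that it should be added deliberately, not by default.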

Summary

Scatter plots are one of the most important tools in exploratory data analysis. They are simple, flexible, and effective for revealing relationships between numeric variables.

Licensed under CC BY-NC-SA 4.0
Last updated on Jun 12, 2026 09:25 +0900