# Basic Scatter Plot and Linear Fitted Line

Let's scatter some data points in xy-space. Data are scattered everywhere, but how is one specific variable related to another? To keep things simple and stick to the heading, we can use the `mtcars` dataset in R.

The dataset, taken from the 1974 *Motor Trend* US magazine, comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles of different models. I will draw a scatter plot of two of its variables and add the fitted line of a linear model.
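As a quick check of that description, the short sketch below inspects the data before any plotting. It assumes only that `mtcars` is available, which is the case in a standard R installation because it ships with the `datasets` package.

```r
# mtcars ships with base R (package datasets), so nothing needs to be loaded
dim(mtcars)                       # number of cars (rows) and variables (columns)
head(mtcars[, c("mpg", "disp")])  # first rows of the two variables used here
```

The first call should report 32 rows and 11 columns: the fuel consumption variable plus the 10 design and performance variables.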

In R, there are three popular systems for creating plots: *Base Graphics*, *Lattice Plot* and *ggplot2*. Here we will create a scatter plot of two variables, `mpg` (miles per gallon) and `disp` (displacement), along with the fitted regression line, annotated with its equation and \(R^2\) value, using all three graphics packages.

Let's first fit a linear model:

```
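# Fit a simple linear regression of mpg on disp
mdl <- lm(mpg ~ disp, data = mtcars)
sumry <- summary(mdl)
cf <- round(coef(mdl), 2)
# Build the equation label from the response and predictor names
eqn <- paste(terms(mdl)[[2]],
             paste0(cf[1], ifelse(cf[2] < 0, " ", " + "),
                    cf[2], " ", terms(mdl)[[3]]), sep = " = ")
# Build the R-squared label shown next to the equation
sumry.lbl <- paste0("R^2: ", round(sumry$r.squared, 2),
                    ", adj R^2: ", round(sumry$adj.r.squared, 2))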
```
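With this data the equation label comes out as `mpg = 29.6 -0.04 disp`, with the minus sign attached to the slope. If a label with a separated sign and a fixed number of decimals is preferred, one possible alternative is to build it with `sprintf()`. This is only a sketch; the name `eqn_alt` is introduced here for illustration and is not used in the plots.

```r
# Refit the model so that this block runs on its own
mdl <- lm(mpg ~ disp, data = mtcars)
cf <- coef(mdl)
vars <- all.vars(formula(mdl))  # c("mpg", "disp")

# eqn_alt is a hypothetical name; the sign is printed separately from the slope
eqn_alt <- sprintf("%s = %.2f %s %.2f %s",
                   vars[1], cf[1],
                   ifelse(cf[2] < 0, "-", "+"), abs(cf[2]),
                   vars[2])
print(eqn_alt)
```

To use it, pass `eqn_alt` wherever the plotting code passes `eqn`.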

## Plots

### Base Graphics

```
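with(mtcars, {
  # Scatter plot of the raw data
  plot(disp, mpg, pch = 22, bg = "gray",
       xlab = "Displacement", ylab = "Miles per Gallon",
       main = "Displacement vs Miles per Gallon")
  # Fitted regression line from the model above
  abline(mdl, col = "red", lty = 2, lwd = 2)
  # Equation and R-squared in the top-right corner;
  # "mono" is R's device-independent monospace family
  text(max(disp), max(mpg), adj = c(1, 1), family = "mono",
       labels = paste(eqn, sumry.lbl, sep = "\n"))
})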
```
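The label in the plot shows `R^2` as plain text. If a typeset superscript is wanted, base graphics can draw plotmath expressions. The following is a minimal sketch under that assumption; `r2` and `r2_lbl` are names introduced here for illustration only.

```r
# Refit the model so that this block runs on its own
mdl <- lm(mpg ~ disp, data = mtcars)
r2 <- round(summary(mdl)$r.squared, 2)

# bquote() substitutes the numeric value into a plotmath expression
r2_lbl <- bquote(R^2 == .(r2))

plot(mpg ~ disp, data = mtcars, pch = 22, bg = "gray",
     xlab = "Displacement", ylab = "Miles per Gallon")
abline(mdl, col = "red", lty = 2, lwd = 2)
# Draw the typeset label in the top-right corner of the data range
text(max(mtcars$disp), max(mtcars$mpg), adj = c(1, 1),
     labels = as.expression(r2_lbl))
```

Plotmath labels cannot contain line breaks, so the equation and the \(R^2\) value would need separate `text()` calls in this approach.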

### Lattice Plot

```
library(lattice)
# Custom panel function: points, annotation and fitted line
lm.panel <- function(x, y, ...) {
  panel.xyplot(x, y, pch = 22, fill = "gray",
               cex = 1.2, col = "black")
  # Equation and R-squared, placed to the left of the top-right corner
  panel.text(max(x), max(y), pos = 2,
             fontfamily = "mono",
             labels = paste(eqn, sumry.lbl, sep = "\n"))
  # panel.abline() accepts the fitted lm object directly
  panel.abline(mdl, col = "red", lty = 2, lwd = 2)
}
xyplot(mpg ~ disp, data = mtcars,
       panel = lm.panel,
       main = "Displacement vs Miles per Gallon",
       xlab = "Displacement", ylab = "Miles per Gallon")
```

### ggplot2

```
library(ggplot2)
# qplot() is deprecated in recent ggplot2 releases, so build the plot with ggplot()
plt <- ggplot(mtcars, aes(disp, mpg)) +
  geom_point(size = 3, shape = 22, fill = "grey") +
  labs(x = "Displacement", y = "Miles per Gallon",
       title = "Displacement vs Miles per Gallon")
# Add the fitted line (with its confidence band) and the annotation
plt + theme_bw() +
  geom_smooth(method = "lm", formula = y ~ x, color = "red", linetype = 2) +
  annotate(x = Inf, y = Inf, geom = "text",
           hjust = 1.2, vjust = 1.2,
           family = "mono",
           label = paste(eqn, sumry.lbl, sep = "\n"))
```

The fitted regression summary is:

```
Call:
lm(formula = mpg ~ disp, data = mtcars)

Residuals:
         Min           1Q       Median           3Q          Max
-4.892200650 -2.202190927 -0.963085639  1.627154680  7.230540273

Coefficients:
                  Estimate     Std. Error  t value   Pr(>|t|)
(Intercept) 29.59985475616  1.22971951531 24.07041 < 2.22e-16
disp        -0.04121511996  0.00471183331 -8.74715 9.3803e-10

Residual standard error: 3.25145449 on 30 degrees of freedom
Multiple R-squared: 0.71834334, Adjusted R-squared: 0.708954785
F-statistic: 76.51266 on 1 and 30 DF, p-value: 9.38032654e-10
```

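This means that, in the fitted model, the effect of displacement on miles per gallon is negative, with a magnitude of about 0.04. In other words, for each one-unit increase in displacement (one cubic inch in `mtcars`), a car is expected to travel about 0.04 fewer miles per gallon. The \(R^2\) of 0.72 indicates that displacement alone accounts for roughly 72% of the variation in `mpg`.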
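To make the slope more concrete, the sketch below compares the predicted `mpg` at two displacements that are 100 cubic inches apart. The values 100 and 200 and the name `new_cars` are chosen here purely for illustration.

```r
# Refit the model so that this block runs on its own
mdl <- lm(mpg ~ disp, data = mtcars)

# Hypothetical displacements (cubic inches), chosen only for illustration
new_cars <- data.frame(disp = c(100, 200))
pred <- predict(mdl, newdata = new_cars)

# Difference in predicted mpg for a 100 cubic inch increase in displacement
round(unname(diff(pred)), 2)

# 95% confidence intervals for the intercept and the slope
confint(mdl)
```

The difference is 100 times the slope, so the larger engine is predicted to travel about 4.12 fewer miles per gallon.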