Scatter plots in data science


With ggplot2, you begin a plot with the function ggplot(). ggplot() creates a coordinate system that you can add layers to. The first argument of ggplot() is the dataset to use in the graph. So ggplot(data = mpg) creates an empty graph, but it’s not very interesting so I’m not going to show it here.

You complete your graph by adding one or more layers to ggplot(). The function geom_point() adds a layer of points to your plot, which creates a scatterplot. ggplot2 comes with many geom functions that each add a different type of layer to a plot. You’ll learn a whole bunch of them throughout this chapter.

Each geom function in ggplot2 takes a mapping argument. This defines how variables in your dataset are mapped to visual properties. The mapping argument is always paired with aes(), and the x and y arguments of aes() specify which variables to map to the x and y axes. ggplot2 looks for the mapped variables in the data argument, in this case, mpg.

A scatterplot of the mpg data with displ on the x axis and hwy on the y axis shows a negative relationship between engine size (displ) and fuel efficiency (hwy). In other words, cars with big engines use more fuel. Does this confirm or refute your hypothesis about fuel efficiency and engine size?
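A plot like the one discussed here could be produced with the minimal sketch below. It assumes the ggplot2 package is installed and uses the mpg dataset that ships with it; the variable name p is only an illustrative choice, not something from the original text.

```r
library(ggplot2)  # provides ggplot(), geom_point(), aes(), and the mpg dataset

# ggplot(data = mpg) sets up the coordinate system; on its own it is an empty graph.
p <- ggplot(data = mpg) +
  # geom_point() adds a layer of points; aes() maps displ to x and hwy to y.
  geom_point(mapping = aes(x = displ, y = hwy))

# Printing the plot object renders the scatterplot.
print(p)
```

Storing the plot in a variable is optional. Running the ggplot() + geom_point() expression directly at the console draws the same scatterplot.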
