Anita Owens


How to create a correlation matrix in R

I really love correlation analysis. It's an awesome way of determining whether two numeric variables are related, and how strong that relationship might be. If you are looking at just two variables, this is where the scatterplot comes into play. If you have many variables to compare, a correlation matrix is just what you need.

I decided to create a step-by-step guide to creating a correlation matrix using the R programming language. The first step is finding a dataset to use. I'm using a dataset from an online statistics course at Penn State. The data is from a study researching whether a person's brain size, weight, and height can predict intelligence.

See this content in the original post

This plot allows us to visualize the relationships among all the variables in one image.
For example, height and weight appear to be positively correlated (4th column, 3rd row from the top).
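
The original code for this plot is not reproduced here, so the following is only a minimal sketch of how a scatterplot matrix like the one described could be drawn with base R's `pairs()`. The data frame `iq` and its column names (`PIQ`, `Brain`, `Height`, `Weight`) are assumptions, and the values are simulated stand-ins, not the Penn State study data.

```r
# Stand-in data: simulated values, NOT the real study data.
# Column names and their order are assumptions made for illustration.
set.seed(42)
n <- 40
height <- rnorm(n, mean = 68, sd = 4)
iq <- data.frame(
  PIQ    = rnorm(n, mean = 110, sd = 20),
  Brain  = rnorm(n, mean = 90, sd = 7),
  Height = height,
  # Weight is built from height so the two are positively related
  Weight = 2.5 * height + rnorm(n, mean = -20, sd = 8)
)

# Draw every pairwise scatterplot in a single grid
pairs(iq, main = "Scatterplot matrix (simulated stand-in data)")
```

With the columns in this order, the panel in the 4th column and 3rd row plots Weight on the x-axis against Height on the y-axis.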

See this content in the original post

The cor() function calculates Pearson's correlation coefficient by default and returns a matrix, which you can save as a new object in your environment.
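
As a rough sketch of this step, base R's `cor()` can be called on a data frame of numeric columns. The data below are simulated stand-ins with assumed column names, not the study data, and `cor_matrix` is just an illustrative object name.

```r
# Stand-in data: simulated values, NOT the real study data.
set.seed(42)
n <- 40
height <- rnorm(n, mean = 68, sd = 4)
iq <- data.frame(
  PIQ    = rnorm(n, mean = 110, sd = 20),
  Brain  = rnorm(n, mean = 90, sd = 7),
  Height = height,
  Weight = 2.5 * height + rnorm(n, mean = -20, sd = 8)
)

# Pearson is the default; method = "spearman" or "kendall" are also available
cor_matrix <- cor(iq, method = "pearson")

# Rounding makes the matrix easier to read
round(cor_matrix, 2)
```

Every diagonal entry is 1 because each variable is perfectly correlated with itself, and the matrix is symmetric, so the upper and lower triangles carry the same information.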

See this content in the original post

Now that looks better!

See this content in the original post

Now you have your correlation matrix with the corresponding correlation coefficients for easy visualization.
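
The plotting code from the original post is not available here, so this is one possible sketch that uses only base R graphics (`image()` and `text()`) instead of whichever package the author used. The data are simulated stand-ins with assumed column names, not the study data.

```r
# Stand-in data: simulated values, NOT the real study data.
set.seed(42)
n <- 40
height <- rnorm(n, mean = 68, sd = 4)
iq <- data.frame(
  PIQ    = rnorm(n, mean = 110, sd = 20),
  Brain  = rnorm(n, mean = 90, sd = 7),
  Height = height,
  Weight = 2.5 * height + rnorm(n, mean = -20, sd = 8)
)

m <- round(cor(iq), 2)
k <- ncol(m)

# Reverse the columns so the diagonal runs from top-left to bottom-right
z <- m[, k:1]

# Blue for negative, white for zero, red for positive correlations
pal <- colorRampPalette(c("blue", "white", "red"))(21)

image(1:k, 1:k, z, zlim = c(-1, 1), col = pal,
      axes = FALSE, xlab = "", ylab = "",
      main = "Correlation matrix (simulated stand-in data)")
axis(1, at = 1:k, labels = rownames(m))
axis(2, at = 1:k, labels = rev(colnames(m)), las = 1)

# Print each coefficient in the centre of its cell
text(row(z), col(z), labels = z)
```

Fixing `zlim` to the range -1 to 1 keeps the colour scale comparable across datasets, since white always means no correlation.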

If you want to continue the example on the Stat 501 course page to get your regression equation, residuals, and R-squared, use R's lm() function to fit the regression, similar to the example shown using Minitab.
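
A minimal sketch of such a regression with base R's `lm()` might look like the following. The formula, the object name `fit`, and the column names are assumptions, and the data are simulated stand-ins, so the numbers will not match the course page.

```r
# Stand-in data: simulated values, NOT the real study data.
set.seed(42)
n <- 40
height <- rnorm(n, mean = 68, sd = 4)
iq <- data.frame(
  PIQ    = rnorm(n, mean = 110, sd = 20),
  Brain  = rnorm(n, mean = 90, sd = 7),
  Height = height,
  Weight = 2.5 * height + rnorm(n, mean = -20, sd = 8)
)

# Regress the intelligence score on the three physical measurements
fit <- lm(PIQ ~ Brain + Height + Weight, data = iq)

coefs     <- coef(fit)               # intercept and slopes of the regression equation
resids    <- residuals(fit)          # one residual per observation
r_squared <- summary(fit)$r.squared  # proportion of variance explained

summary(fit)                         # full regression table
```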

See this content in the original post

 

A correlation matrix is a great way of visualizing numeric data if you want to find out whether your variables are correlated. Happy analyzing!
