In this article, we will discuss multiple ways to plot a correlation matrix in pandas.
Preparing DataSet
To get started quickly, let’s create a sample DataFrame to experiment with. We’ll use pandas and NumPy to fill it with random integers.
import pandas as pd
import numpy as np

# DataFrame with some random values
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 6)), columns=list('ABCDEF'))
print(df.head())
The first five rows of the created DataFrame are shown below. Because the values are random, your numbers will differ.
    A   B   C   D   E   F
0   3  38  71  80  71  68
1  80  15  45  51  29  87
2   0  72  35  37  52  49
3  67  21  28  43  53  57
4  44  67  14  47  64  30
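If you want the same numbers on every run, one option is to seed the random generator. The sketch below is a variation on the snippet above, not the article's original code: it assumes NumPy's `default_rng` generator, and the seed value 42 is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator: the seed (42) is an arbitrary, illustrative choice
rng = np.random.default_rng(42)

# Same shape and value range as above: 100 rows, 6 columns, integers in [0, 100)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 6)), columns=list("ABCDEF"))
print(df.head())
```

Running this twice produces identical DataFrames, which makes the correlation values comparable between runs.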
A correlation matrix shows the pairwise correlation coefficients between all the numeric columns (features) in a DataFrame. To compute it, we can call the corr() method on the pandas DataFrame. By default, it uses the Pearson correlation coefficient.
print(df.corr())
Output
          A         B         C         D         E         F
A  1.000000  0.121004  0.028870  0.081519  0.082788  0.007588
B  0.121004  1.000000  0.137948  0.186861  0.072054  0.042191
C  0.028870  0.137948  1.000000  0.105994  0.015434  0.010137
D  0.081519  0.186861  0.105994  1.000000  0.027067  0.105773
E  0.082788  0.072054  0.015434  0.027067  1.000000  0.003142
F  0.007588  0.042191  0.010137  0.105773  0.003142  1.000000
Here you have the correlation coefficients for all pairs of features. A raw table of numbers like this is hard to interpret at a glance, which is why visualizing the matrix helps reveal the patterns in it.
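As a small sketch of what corr() returns, the example below checks two properties that every correlation matrix has (ones on the diagonal, symmetry) and also computes a rank-based Spearman matrix through the `method` parameter. The seeded sample data and the variable names `pearson` and `spearman` are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

# Illustrative sample data (seed chosen arbitrarily)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 6)), columns=list("ABCDEF"))

pearson = df.corr()                    # default: method="pearson"
spearman = df.corr(method="spearman")  # rank-based alternative

# Each column is perfectly correlated with itself, so the diagonal is all ones
print(np.allclose(np.diag(pearson.to_numpy()), 1.0))

# corr(A, B) equals corr(B, A), so the matrix is symmetric
print(np.allclose(pearson.to_numpy(), pearson.to_numpy().T))

print(spearman.round(2))
```

Because the matrix is symmetric, every coefficient appears twice, which is worth keeping in mind when reading the plots.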
Styling the correlation matrix directly
The simplest way to visualize the correlation matrix is to color-code the matrix itself. We are going to use the style attribute to add a background gradient. In a Jupyter notebook, the styled table is rendered as the cell output.
# storing the correlation matrix
corr = df.corr()

# adding background gradient
corr.style.background_gradient(cmap='coolwarm')
Output
Adding a background gradient makes the matrix easier to read: with the coolwarm colormap, blue shades mark lower (more negative) coefficients and red shades mark higher (more positive) ones. We can try other gradients by changing the cmap parameter.
Using matplotlib plotting library
Matplotlib is the most widely used plotting library in Python. We are going to use its matshow function to plot the correlation matrix as below.
# import
import matplotlib.pyplot as plt

# set figure size
f = plt.figure(figsize=(8, 8))

# using matshow
plt.matshow(df.corr(), fignum=f.number)

# adding color scale
cb = plt.colorbar()
cb.ax.tick_params(labelsize=14)

# display the plot
plt.show()
Output
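As observed, the output is similar to the previous method, with the color bar mapping each shade to a coefficient. With Matplotlib's default colormap (viridis), darker shades show lower (more negative) correlations and brighter yellow-green shades show higher (more positive) ones.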
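The matshow plot identifies rows and columns only by position. A minimal sketch of one way to label the axes with the column names and use a fixed -1 to 1 color scale follows. The sample data, the coolwarm colormap, and the variable names are choices made for this illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative sample data (seed chosen arbitrarily)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 6)), columns=list("ABCDEF"))
corr = df.corr()

fig, ax = plt.subplots(figsize=(8, 8))

# Draw the matrix with a diverging colormap on a fixed -1..1 scale
im = ax.matshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)

# Label each row and column with the corresponding feature name
ticks = np.arange(len(corr.columns))
ax.set_xticks(ticks)
ax.set_xticklabels(corr.columns)
ax.set_yticks(ticks)
ax.set_yticklabels(corr.index)

# Color bar showing which shade maps to which coefficient
fig.colorbar(im, ax=ax)
```

Call plt.show() to display the figure, or fig.savefig() with a file name of your choice to write it to disk.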
Using Seaborn heatmaps
Another, often easier, way to plot the correlation matrix is to use a heatmap from the seaborn library. A heatmap, as the name suggests, is a graphical representation of data in which values are depicted by color. Let’s plot the correlation matrix below.
# import
import seaborn as sns

# heatmap using seaborn
sns.heatmap(df.corr(), annot=True)
Output
As observed, this gives a similar output, with the row and column labels added automatically and the coefficient values (annotations) printed in each cell.
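The heatmap call accepts several optional arguments that can make the plot easier to read. The sketch below is one possible configuration, not the only reasonable one: it fixes the color scale to -1 to 1, uses the coolwarm colormap, and rounds the annotations to two decimals. The sample data and the title text are assumptions made for this illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Illustrative sample data (seed chosen arbitrarily)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 6)), columns=list("ABCDEF"))
corr = df.corr()

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(
    corr,
    annot=True,       # print the coefficient in each cell
    fmt=".2f",        # two decimal places
    cmap="coolwarm",  # blue for negative, red for positive
    vmin=-1,
    vmax=1,           # fixed scale covering the full correlation range
    square=True,      # keep cells square
    ax=ax,
)
ax.set_title("Correlation matrix")
```

Call plt.show() to display the figure when running outside a notebook.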
Summary
In this article, we discussed three ways to plot the correlation matrix in pandas: styling the matrix directly, using matplotlib, and using a seaborn heatmap.