How do I make a correlation chart in Python?

In this short guide, I’ll show you how to create a Correlation Matrix using Pandas. I’ll also review the steps to display the matrix using Seaborn and Matplotlib.

To start, here is a template that you can apply in order to create a correlation matrix using pandas:

df.corr()

Next, I’ll show you an example with the steps to create a correlation matrix for a given dataset.

Step 1: Collect the Data

Firstly, collect the data that will be used for the correlation matrix.

For example, I collected the following data about 3 variables:

A B C
45 38 10
37 31 15
42 26 17
35 28 21
39 33 12

Step 2: Create a DataFrame using Pandas

Next, create a DataFrame in order to capture the above dataset in Python:

import pandas as pd

data = {'A': [45,37,42,35,39],
        'B': [38,31,26,28,33],
        'C': [10,15,17,21,12]
        }

df = pd.DataFrame(data,columns=['A','B','C'])
print(df)

Once you run the code, you’ll get the following DataFrame:

    A   B   C
0  45  38  10
1  37  31  15
2  42  26  17
3  35  28  21
4  39  33  12

Step 3: Create a Correlation Matrix using Pandas

Now, create a correlation matrix using this template:

df.corr()
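As a small sketch of the template in use: corr() computes the Pearson coefficient by default, and its method parameter also accepts 'spearman' and 'kendall' for rank-based correlation. The example reuses the sample data from Step 1; the variable names pearson and spearman are only illustrative.

```python
import pandas as pd

# Sample data from Step 1
df = pd.DataFrame({'A': [45, 37, 42, 35, 39],
                   'B': [38, 31, 26, 28, 33],
                   'C': [10, 15, 17, 21, 12]})

# Pearson (the default) measures linear association
pearson = df.corr()

# Spearman correlates the ranks, so it measures monotonic association
spearman = df.corr(method='spearman')

print(pearson.round(3))
print(spearman.round(3))
```

With only five rows the two methods can disagree noticeably, so it is worth checking which one suits your data before reading much into the numbers.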

This is the complete Python code that you can use to create the correlation matrix for our example:

import pandas as pd

data = {'A': [45,37,42,35,39],
        'B': [38,31,26,28,33],
        'C': [10,15,17,21,12]
        }

df = pd.DataFrame(data,columns=['A','B','C'])

corrMatrix = df.corr()
print(corrMatrix)

Run the code in Python, and you’ll get the following matrix:

          A         B         C
A  1.000000  0.518457 -0.701886
B  0.518457  1.000000 -0.860941
C -0.701886 -0.860941  1.000000
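To see where the entries of the matrix come from, here is a rough sketch that recomputes one of them by hand. It assumes the default Pearson method. The helper name pearson_r is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

# Sample data from Step 1
df = pd.DataFrame({'A': [45, 37, 42, 35, 39],
                   'B': [38, 31, 26, 28, 33],
                   'C': [10, 15, 17, 21, 12]})

def pearson_r(x, y):
    """Sum of products of deviations, divided by the square root of the
    product of the sums of squared deviations (the Pearson coefficient)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

manual = pearson_r(df['A'], df['B'])
from_pandas = df.corr().loc['A', 'B']

# The two values should agree up to floating-point error
print(manual, from_pandas)
```

The same .loc lookup works for any pair of columns, which is handy when the matrix is too large to read by eye.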

Step 4 (optional): Get a Visual Representation of the Correlation Matrix using Seaborn and Matplotlib

You can use the seaborn and matplotlib packages in order to get a visual representation of the correlation matrix.

First import the seaborn and matplotlib packages:

import seaborn as sn
import matplotlib.pyplot as plt

Then, add the following syntax at the bottom of the code:

sn.heatmap(corrMatrix, annot=True)
plt.show()

So the complete Python code would look like this:

import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt

data = {'A': [45,37,42,35,39],
        'B': [38,31,26,28,33],
        'C': [10,15,17,21,12]
        }

df = pd.DataFrame(data,columns=['A','B','C'])

corrMatrix = df.corr()
sn.heatmap(corrMatrix, annot=True)
plt.show()

Run the code, and you’ll get the following correlation matrix:

[Figure: heatmap of the correlation matrix for A, B and C, with each cell annotated with its coefficient]
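By default the heatmap takes its color range from the data, which can make weak correlations look stronger than they are. The sketch below is one possible variation rather than the only way to do it: it pins the color scale to the full -1 to 1 range, uses a diverging palette, and hides the redundant upper triangle. The function name plot_corr_heatmap is made up for this example; mask, fmt, vmin, vmax, center, cmap and square are keyword arguments of seaborn's heatmap function.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sn

def plot_corr_heatmap(df):
    """Draw an annotated lower-triangle correlation heatmap and return its Axes."""
    corr = df.corr()
    # True marks cells to hide: everything strictly above the diagonal
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    fig, ax = plt.subplots(figsize=(5, 4))
    sn.heatmap(corr, mask=mask, annot=True, fmt='.2f',
               vmin=-1, vmax=1, center=0, cmap='coolwarm',
               square=True, ax=ax)
    ax.set_title('Correlation matrix')
    return ax

# Sample data from Step 1
df = pd.DataFrame({'A': [45, 37, 42, 35, 39],
                   'B': [38, 31, 26, 28, 33],
                   'C': [10, 15, 17, 21, 12]})

ax = plot_corr_heatmap(df)
```

Call plt.show() afterwards to display the figure, as in the complete code above.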

That’s it! You may also want to review the steps to create a Confusion Matrix using Python, or the guide about creating a Covariance Matrix in Python.

Surprised to see that no one has mentioned more capable, interactive, and easier-to-use alternatives.

A) You can use plotly:

Just two lines and you get:

  1. interactivity,

  2. a smooth color scale,

  3. colors based on the whole dataframe instead of individual columns,

  4. column names & row indices on the axes,

  5. zooming in,

  6. panning,

  7. built-in one-click ability to save the chart as a PNG,

  8. auto-scaling,

  9. comparison on hovering,

  10. hover bubbles showing the values, so the heatmap still looks good and you can see values wherever you want:

import plotly.express as px
fig = px.imshow(df.corr())
fig.show()


B) You can also use Bokeh:

All the same functionality with a tad more hassle. But still worth it if you do not want to opt for plotly and still want all these things:

from bokeh.plotting import figure, show, output_notebook
from bokeh.models import (BasicTicker, ColorBar, LinearColorMapper,
                          PrintfTickFormatter)

output_notebook()  # render inline; assumes you are in a Jupyter notebook

colors = ['#d7191c', '#fdae61', '#ffffbf', '#a6d96a', '#1a9641']
TOOLS = "hover,save,pan,box_zoom,reset,wheel_zoom"

# Long format: one row per (row label, column label, correlation value)
corr = df.corr()
data = corr.stack().rename("value").reset_index()

# One color mapper shared by the cells and the color bar
mapper = LinearColorMapper(palette=colors, low=data.value.min(), high=data.value.max())

# Both axes of a correlation matrix are labeled with the column names
p = figure(x_range=list(corr.columns), y_range=list(corr.index), tools=TOOLS, toolbar_location='below',
           tooltips=[('Row, Column', '@level_0 x @level_1'), ('value', '@value')], height=500, width=500)

p.rect(x="level_1", y="level_0", width=1, height=1,
       source=data,
       fill_color={'field': 'value', 'transform': mapper},
       line_color=None)
color_bar = ColorBar(color_mapper=mapper, major_label_text_font_size="7px",
                     ticker=BasicTicker(desired_num_ticks=len(colors)),
                     formatter=PrintfTickFormatter(format="%f"),
                     label_standoff=6, border_line_color=None, location=(0, 0))
p.add_layout(color_bar, 'right')

show(p)


How do you make a correlation graph in Python?

You can plot the correlation between two columns of a pandas DataFrame with seaborn's regplot function: sns.regplot(x=df['column_1'], y=df['column_2']), where sns is the usual alias from import seaborn as sns. The result is a scatter plot of the two columns with a fitted regression line drawn through the points.
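As a quick illustration, assuming the three-column sample data from the tutorial above, the snippet can be combined with the coefficient itself so the chart states the number it is showing. The column choice (B against C) is arbitrary.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Three-column sample data from the tutorial above
df = pd.DataFrame({'A': [45, 37, 42, 35, 39],
                   'B': [38, 31, 26, 28, 33],
                   'C': [10, 15, 17, 21, 12]})

# Pearson correlation between the two columns
r = df['B'].corr(df['C'])

# Scatter plot with a fitted regression line
ax = sns.regplot(data=df, x='B', y='C')
ax.set_title(f'B vs. C (r = {r:.2f})')
```

Call plt.show() to display the chart. A downward-sloping line goes with a negative coefficient, an upward-sloping one with a positive coefficient.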

How do you plot a correlation chart?

To plot a correlation graph in Excel:
  1. Select two columns with numeric data, including the column headers.
  2. On the Insert tab, in the Charts group, click the Scatter chart icon.
  3. Right-click any data point in the chart and choose Add Trendline… from the context menu.

How do you plot a correlation on a scatter plot in Python?

The steps, following the Correlation and Scatterplots chapter of Basic Analytics in Python, are:
  1. Load the seaborn library.
  2. Specify the source data frame.
  3. Set the x axis, which is generally the name of a predictor/independent variable.
  4. Set the y axis, which is generally the name of a response/dependent variable.

How do you visualize a correlation?

The simplest way to visualize correlation is to create a scatter plot of the two variables. A typical example is a scatter plot of the heights and weights of 19 students, with one point per student.
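When there are more than two variables, one option is a grid of pairwise scatter plots. The sketch below uses scatter_matrix from pandas.plotting (it needs matplotlib installed) on the three-column sample data from the tutorial above. The figsize value is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

# Three-column sample data from the tutorial above
df = pd.DataFrame({'A': [45, 37, 42, 35, 39],
                   'B': [38, 31, 26, 28, 33],
                   'C': [10, 15, 17, 21, 12]})

# One scatter plot per pair of columns, with histograms on the diagonal
axes = scatter_matrix(df, figsize=(6, 6), diagonal='hist')
```

Call plt.show() to display the grid. Each off-diagonal panel is the scatter plot for one pair of columns, so it lines up cell for cell with the correlation matrix.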