How to get total of Pandas column?

In this article, we will discuss how to get the total (sum) of a DataFrame column in Pandas. We will also see how to store that total as a new row in the DataFrame.

Preparing DataSet

To get started quickly, let’s create a sample DataFrame to experiment with. We’ll use the pandas library with some made-up employee data.

import pandas as pd

# List of Tuples
employees = [('Shubham', 25, 5, 4),
            ('Riti', 30, 7, 7),
            ('Shanky', 23, 2, 2),
            ('Shreya', 24, 2, 0),
            ('Aadi', 33, 11, 5),
            ('Sim', 28, 4, 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'Experience', 'RelevantExperience'],
                  index=['A', 'B', 'C', 'D', 'E', 'F'])
print(df)

The contents of the created DataFrame are:

      Name  Age  Experience  RelevantExperience
A  Shubham   25           5                   4
B     Riti   30           7                   7
C   Shanky   23           2                   2
D   Shreya   24           2                   0
E     Aadi   33          11                   5
F      Sim   28           4                   4

Now, we will perform operations on this DataFrame.

Get total of a DataFrame column in Pandas

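To get the total of a pandas column, we can select the column and call its sum() method, for example df['column'].sum(). Selecting a single column returns a Series, and Series.sum() accepts the same key parameters as DataFrame.sum(), so let’s look at those first.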

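DataFrame.sum(axis=0, skipna=True, numeric_only=False, min_count=0, **kwargs)
  • axis: 0 (the default) sums down the rows and returns one total per column; 1 sums across the columns and returns one total per row
  • skipna: whether to exclude NA/null values when computing the sum (True by default)
  • numeric_only: if True, only numeric columns are included
  • min_count: the minimum number of non-NA values required to perform the operation; with fewer valid values, the result is NaN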
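To make these parameters concrete, here is a minimal sketch on a small hypothetical DataFrame. The scores frame and its values are invented purely for illustration and are not part of the employee data used in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value, used only for illustration
scores = pd.DataFrame({"a": [1.0, 2.0, np.nan], "b": [10.0, 20.0, 30.0]})

col_totals = scores.sum(axis=0)              # one total per column
row_totals = scores.sum(axis=1)              # one total per row (NaN is skipped)
strict = scores["a"].sum(skipna=False)       # NaN, because the missing value is not skipped
needs_three = scores["a"].sum(min_count=3)   # NaN, because only 2 valid values exist

print(col_totals)
print(row_totals)
print(strict, needs_three)
```

With the default skipna=True, the missing value in column a is simply ignored, so its total is 3.0.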

Let’s understand it by getting the total of the “Experience” column.

# get sum of Experience column
print(df['Experience'].sum())

Output

31

As observed, we have the total of the “Experience” column.
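The same idea extends to several columns at once. As a small sketch, selecting a list of columns returns a DataFrame, and calling sum() on it returns one total per selected column. The shortened three-row frame here is only an assumption for brevity, not the full sample data.

```python
import pandas as pd

# Shortened version of the sample data, used only for illustration
df = pd.DataFrame(
    {"Name": ["Shubham", "Riti", "Shanky"],
     "Experience": [5, 7, 2],
     "RelevantExperience": [4, 7, 2]},
    index=["A", "B", "C"],
)

# Select two columns, then sum each of them
totals = df[["Experience", "RelevantExperience"]].sum()
print(totals)
```

The result is a Series indexed by column name, so an individual total can be read with totals["Experience"].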

Store the column total in the DataFrame

Now, let’s see how to store this total as a new row in the DataFrame. Here, we are going to use the .loc property of the DataFrame: assigning to a row label that does not exist yet creates that row.

# store the total in DataFrame
df.loc["Total", "Experience"] = df['Experience'].sum()

print(df)

Output

          Name   Age  Experience  RelevantExperience
A      Shubham  25.0         5.0                 4.0
B         Riti  30.0         7.0                 7.0
C       Shanky  23.0         2.0                 2.0
D       Shreya  24.0         2.0                 0.0
E         Aadi  33.0        11.0                 5.0
F          Sim  28.0         4.0                 4.0
Total      NaN   NaN        31.0                 NaN

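As observed, a new row “Total” now holds the total of the Experience column. The remaining cells of that row are NaN, which is why the integer columns are now displayed as floats. Alternatively, we can use the “at” property instead of loc. The example below starts again from the original DataFrame, because a “Total” row that already exists would itself be included in the sum.

# store the total in DataFrame (run on the original df, without a "Total" row)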
df.at["Total", "Experience"] = df['Experience'].sum()

print(df)

Output

          Name   Age  Experience  RelevantExperience
A      Shubham  25.0         5.0                 4.0
B         Riti  30.0         7.0                 7.0
C       Shanky  23.0         2.0                 2.0
D       Shreya  24.0         2.0                 0.0
E         Aadi  33.0        11.0                 5.0
F          Sim  28.0         4.0                 4.0
Total      NaN   NaN        31.0                 NaN
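One pitfall is worth a quick sketch: once a “Total” row exists, it is part of the column, so summing the column again counts the total as well. The example below uses a single-column frame for brevity and assumes the row label is "Total"; dropping that label before summing is one simple way to avoid the double count.

```python
import pandas as pd

# Single-column version of the sample data, used only for illustration
df = pd.DataFrame({"Experience": [5, 7, 2, 2, 11, 4]},
                  index=["A", "B", "C", "D", "E", "F"])

# Store the total as a new row
df.loc["Total", "Experience"] = df["Experience"].sum()

# Summing again includes the "Total" row and doubles the result
naive = df["Experience"].sum()

# Excluding the "Total" row gives the original total back
clean = df["Experience"].drop("Total").sum()

print(naive, clean)
```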

Store the total for all columns

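Suppose that, instead of storing the total for just one column, we need to store the totals for all numeric columns. Starting again from the original DataFrame, we will use the DataFrame.sum() method on the whole DataFrame rather than on a single column, and pass numeric_only=True so that the “Name” column is skipped.

# store total for all numeric columns (run on the original df, without a "Total" row)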
df.loc['Total'] = df.sum(numeric_only=True)

print(df)

Output

          Name    Age  Experience  RelevantExperience
A      Shubham   25.0         5.0                 4.0
B         Riti   30.0         7.0                 7.0
C       Shanky   23.0         2.0                 2.0
D       Shreya   24.0         2.0                 0.0
E         Aadi   33.0        11.0                 5.0
F          Sim   28.0         4.0                 4.0
Total      NaN  163.0        31.0                22.0

As observed, we have the totals for all the numeric columns stored in a new row, while the non-numeric “Name” column is NaN in that row.
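If the original DataFrame should stay untouched, one possible alternative (not the approach used above) is to build the totals as a one-row DataFrame and combine the two with pd.concat. This is only a sketch; the names total_row and with_total are made up for this example.

```python
import pandas as pd

employees = [('Shubham', 25, 5, 4),
             ('Riti', 30, 7, 7),
             ('Shanky', 23, 2, 2),
             ('Shreya', 24, 2, 0),
             ('Aadi', 33, 11, 5),
             ('Sim', 28, 4, 4)]
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'Experience', 'RelevantExperience'],
                  index=['A', 'B', 'C', 'D', 'E', 'F'])

# Build a one-row DataFrame labelled "Total" from the numeric column sums
total_row = df.sum(numeric_only=True).to_frame(name='Total').T

# Concatenate into a new DataFrame; df itself is left unchanged
with_total = pd.concat([df, total_row])
print(with_total)
```

Because the numeric columns never receive a NaN in this version, they would normally stay integers, though that is worth checking on your pandas version. Keeping df free of the “Total” row also means later calls to sum() cannot count the total twice.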

Summary

In this article, we have discussed how to get the total of a Pandas column and how to store the totals as a new row in the DataFrame.

Thanks for reading.
