How to sum rows by specific columns in Pandas DataFrame?

In this article, we will discuss multiple ways to sum DataFrame rows for given columns in pandas.

Preparing dataset

To get started quickly, let’s create a sample DataFrame to experiment with. We’ll use the pandas library with some made-up employee data.

import pandas as pd

# List of Tuples
employees = [('Shubham', 25, 5, 4),
            ('Riti', 30, 7, 7),
            ('Shanky', 23, 2, 2),
            ('Shreya', 24, 2, 0),
            ('Aadi', 33, 11, 5),
            ('Sim', 28, 4, 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'Experience', 'RelevantExperience'],
                  index=['A', 'B', 'C', 'D', 'E', 'F'])
print(df)

The contents of the created DataFrame are:

      Name  Age  Experience  RelevantExperience
A  Shubham   25           5                   4
B     Riti   30           7                   7
C   Shanky   23           2                   2
D   Shreya   24           2                   0
E     Aadi   33          11                   5
F      Sim   28           4                   4

Row wise sum of specific columns in Pandas DataFrame using + operator

When we have just a few columns to add, we can use the + operator to add the column values directly. For example, let’s sum the columns “Experience” and “RelevantExperience” and save the result in a new column called “sum_experience”.

# row-wise sum of the columns
df['sum_experience'] = df['Experience'] + df['RelevantExperience']

print(df)

Output

      Name  Age  Experience  RelevantExperience  sum_experience
A  Shubham   25           5                   4               9
B     Riti   30           7                   7              14
C   Shanky   23           2                   2               4
D   Shreya   24           2                   0               2
E     Aadi   33          11                   5              16
F      Sim   28           4                   4               8

As observed, the row-wise sum is stored in the new column. This is the simplest approach when only a few columns need to be added.
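One caveat worth keeping in mind with the + operator is how it treats missing values. The sketch below uses a small hypothetical DataFrame (named demo, not part of the dataset above) with one NaN to show that + propagates the NaN, while Series.add() with fill_value=0 treats it as zero.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value (not the article's dataset)
demo = pd.DataFrame({'Experience': [5, 7, np.nan],
                     'RelevantExperience': [4, 7, 2]},
                    index=['A', 'B', 'C'])

# The + operator propagates NaN, so row C has no usable sum
plus_sum = demo['Experience'] + demo['RelevantExperience']

# Series.add() with fill_value=0 treats the missing value as 0
filled_sum = demo['Experience'].add(demo['RelevantExperience'], fill_value=0)

print(plus_sum)
print(filled_sum)
```

Which behavior is preferable depends on whether a missing value should count as zero in your data.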

Row wise sum of specific columns in Pandas DataFrame using sum() function

When many columns need to be summed, we can select them and call the sum() function with axis=1, which adds the values across each row. Let’s again add the columns “Experience” and “RelevantExperience”.

# row-wise sum of the columns
df['sum_experience'] = df[['Experience', 'RelevantExperience']].sum(axis=1)

print(df)

Output

      Name  Age  Experience  RelevantExperience  sum_experience
A  Shubham   25           5                   4               9
B     Riti   30           7                   7              14
C   Shanky   23           2                   2               4
D   Shreya   24           2                   0               2
E     Aadi   33          11                   5              16
F      Sim   28           4                   4               8

Note that we listed the columns manually here because there are only two. With many columns, they can instead be selected programmatically, for example by name pattern or by label range, before calling sum(axis=1).
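As a rough sketch of selecting the columns programmatically, the example below rebuilds a shortened version of the sample data and picks the columns in two ways: with filter(like=...), which keeps columns whose label contains a substring, and with a label-based slice in loc. The column name sum_by_slice is only an illustrative name, and both approaches assume the column labels and order used in this article.

```python
import pandas as pd

# Shortened copy of the sample data so the example is self-contained
employees = [('Shubham', 25, 5, 4),
             ('Riti', 30, 7, 7),
             ('Shanky', 23, 2, 2)]
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'Experience', 'RelevantExperience'],
                  index=['A', 'B', 'C'])

# Select every column whose label contains the substring 'Experience'
exp_cols = df.filter(like='Experience').columns.tolist()
df['sum_experience'] = df[exp_cols].sum(axis=1)

# Select a contiguous range of columns by label (both ends are included)
df['sum_by_slice'] = df.loc[:, 'Experience':'RelevantExperience'].sum(axis=1)

print(df)
```

The substring match is case-sensitive, so the new “sum_experience” column would not be picked up by a later filter(like='Experience') call.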

Row wise sum of specific columns in Pandas DataFrame using eval() function

Another way is to use the DataFrame’s eval() function, which evaluates a string expression over the columns. As with the + operator, each column name must be written out in the expression.

# row-wise sum of the columns
df = df.eval('sum_experience = Experience + RelevantExperience')

print(df)

Output

      Name  Age  Experience  RelevantExperience  sum_experience
A  Shubham   25           5                   4               9
B     Riti   30           7                   7              14
C   Shanky   23           2                   2               4
D   Shreya   24           2                   0               2
E     Aadi   33          11                   5              16
F      Sim   28           4                   4               8

The output is the same as with the previous methods.
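To check that the three approaches agree, here is a minimal sketch that computes the sum each way on the two experience columns and compares the results. The variable names via_plus, via_sum and via_eval are illustrative only.

```python
import pandas as pd

# The two experience columns from the sample data
df = pd.DataFrame({'Experience': [5, 7, 2, 2, 11, 4],
                   'RelevantExperience': [4, 7, 2, 0, 5, 4]},
                  index=['A', 'B', 'C', 'D', 'E', 'F'])

# Compute the row-wise sum with each of the three methods
via_plus = df['Experience'] + df['RelevantExperience']
via_sum = df[['Experience', 'RelevantExperience']].sum(axis=1)
via_eval = df.eval('Experience + RelevantExperience')

# Compare the results element by element
all_equal = bool((via_plus == via_sum).all() and (via_plus == via_eval).all())
print(all_equal)
```

When the expression passed to eval() contains no assignment, it returns the computed Series rather than a new DataFrame, which makes it convenient for this kind of comparison.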

Summary

In this article, we discussed three ways to sum DataFrame rows for given columns in pandas: the + operator, the sum() function, and the eval() function.
