How to Merge Two Text Columns in a Pandas DataFrame?

In this article, we will discuss how to merge two text-based columns of a DataFrame in Pandas.

Overview of Pandas DataFrame

A Pandas DataFrame is a labelled, two-dimensional, size-mutable data structure with rows and columns. Arithmetic operations on DataFrames align on both row and column labels. A Pandas DataFrame has three main components,

  1. Data
  2. Rows
  3. Columns

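A DataFrame column holds text whenever its values are Python strings, so no special dtype is required for columns built from strings. If a column holds other types, we can convert it to text with astype(str), or pass dtype=str when creating the DataFrame. We can combine two text columns of a DataFrame into one column using different techniques. Let’s discuss them one by one,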
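As a small illustration of text columns, the sketch below builds a DataFrame with one text column and one numeric column, then converts the numeric column to text with astype(str). The column names ‘City’ and ‘Zip’ and their values are made up for this example. The dtype label pandas reports for text columns can differ between versions, so the sketch inspects the Python type of the values instead.

```python
import pandas as pd

# Hypothetical sample data: one text column and one numeric column
df = pd.DataFrame({"City": ["Pune", "Surat"], "Zip": [411001, 395003]})

# Columns built from Python strings already hold text
print(df["City"].map(type).unique())

# astype(str) converts every value of a column to text
df["Zip"] = df["Zip"].astype(str)
print(df["Zip"].tolist())
```

After the conversion, ‘Zip’ can be combined with other text columns using any of the techniques below.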

Using + operator to combine two DataFrame Columns

The arithmetic operator ‘+’ is used to concatenate two strings in Python. We can also use the ‘+’ operator to combine the values of two string-type DataFrame columns. The pandas script below combines the two columns ‘Name’ and ‘Surname’ and assigns the combined value to a third column, ‘FullName’.

import pandas as pd

# initialize list with two columns
data = [['Reema', 'Thakker'],
        ['Rekha', 'chande'],
        ['Jaya', 'baru']]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Surname'])

# Combine two columns 'Name' and 'Surname'
df["FullName"] = df['Name'].astype(str) + "-" + df['Surname'].astype(str)

# Print DataFrame.
print(df)

Output

    Name  Surname       FullName
0  Reema  Thakker  Reema-Thakker
1  Rekha   chande   Rekha-chande
2   Jaya     baru      Jaya-baru

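In the above script, we first created a nested list in which each inner list holds a name and a surname. In the second part, we created a DataFrame from the list with the columns ‘Name’ and ‘Surname’. The ‘+’ operator is then used to join the values of the two columns with a ‘-’ separator, and the combined value is assigned to the third column, ‘FullName’. The astype(str) calls make sure both columns are treated as text before they are concatenated.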
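The astype(str) conversion matters mainly when one of the columns is not already text. The following sketch assumes a hypothetical numeric ‘Age’ column (not part of the original example) and combines it with ‘Name’ using the ‘+’ operator.

```python
import pandas as pd

# Hypothetical data: 'Age' is numeric, so it must be converted to text first
df = pd.DataFrame({"Name": ["Reema", "Rekha", "Jaya"],
                   "Age": [31, 28, 45]})

# Convert the numeric column to strings, then concatenate with '+'
df["Label"] = df["Name"] + "-" + df["Age"].astype(str)

print(df)
```

Only the non-text column needs the conversion here, because ‘Name’ already contains strings.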

Combine two Columns using apply() method

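In Pandas, the apply() method is used to apply a function along an axis of a DataFrame. With axis=1 the function is called once per row, so we can pass the string join() method to apply() to join the values of two columns. Note that join() only works on strings, so both columns must contain text.

A Pandas script to join the two columns ‘Name’ and ‘Surname’ into one column, ‘FullName’, is as follows,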

import pandas as pd

# initialize list with two columns
data = [['Reema', 'Thakker'],
        ['Rekha', 'chande'],
        ['Jaya', 'baru']]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Surname'])

# combining two columns with apply method
df["FullName"] = df[["Name", "Surname"]].apply("-".join, axis=1)

# print dataframe.
print(df)

Output

    Name  Surname       FullName
0  Reema  Thakker  Reema-Thakker
1  Rekha   chande   Rekha-chande
2   Jaya     baru      Jaya-baru

In the above script, we first created a nested list of names and surnames, and then created a pandas DataFrame from it with the columns ‘Name’ and ‘Surname’. To combine both columns, we selected them and called apply() with "-".join as the function and axis=1, so that the values of each row are joined with ‘-’.
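Because join() accepts any number of strings, the same apply() call can combine more than two columns. The sketch below assumes an extra, hypothetical ‘Title’ column and uses a space as the separator.

```python
import pandas as pd

# Hypothetical data with a third text column, 'Title'
data = [['Ms', 'Reema', 'Thakker'],
        ['Ms', 'Rekha', 'chande'],
        ['Dr', 'Jaya', 'baru']]
df = pd.DataFrame(data, columns=['Title', 'Name', 'Surname'])

# List the columns to join, in the order they should appear
cols = ['Title', 'Name', 'Surname']

# Join the values of each row with a space
df["FullName"] = df[cols].apply(" ".join, axis=1)

print(df)
```

Changing the order of the names in cols changes the order of the parts in the combined value.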

Using DataFrame.agg() to combine two columns of text

The Pandas DataFrame.agg() function is used to apply a function, or a list of functions, along one of the axes of the DataFrame. A pandas script to join the two columns ‘Name’ and ‘Surname’ into a column ‘FullName’ using the DataFrame.agg() function is as follows,

import pandas as pd

# initialize list with two columns
data = [['Reema', 'Thakker'],
        ['Rekha', 'chande'],
        ['Jaya', 'baru']]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Surname'])

# combining two columns with agg() method
df["FullName"] = df[['Name', 'Surname']].agg('-'.join, axis=1)

# print dataframe
print(df)

Output

    Name  Surname       FullName
0  Reema  Thakker  Reema-Thakker
1  Rekha   chande   Rekha-chande
2   Jaya     baru      Jaya-baru

In the above script, we first created a nested list of names and surnames, and then created a pandas DataFrame from it with the columns ‘Name’ and ‘Surname’. To combine both columns, we called DataFrame.agg() with '-'.join as the function and axis=1, so the function is applied to each row.
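For a row-wise join like this one, agg() and apply() are expected to give the same values. The sketch below is a quick check on the sample data, not a statement about every use of the two methods.

```python
import pandas as pd

data = [['Reema', 'Thakker'],
        ['Rekha', 'chande'],
        ['Jaya', 'baru']]
df = pd.DataFrame(data, columns=['Name', 'Surname'])

# Row-wise join with agg() and with apply()
via_agg = df[['Name', 'Surname']].agg('-'.join, axis=1)
via_apply = df[['Name', 'Surname']].apply('-'.join, axis=1)

# Compare the combined values produced by both calls
print(via_agg.tolist() == via_apply.tolist())
```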

Combine two columns of text Using Series.str.cat()

In Pandas, the Series.str.cat() function is used to concatenate the strings in a Series with the strings of another Series. A pandas script to join the two columns ‘Name’ and ‘Surname’ into a column ‘FullName’ using the Series.str.cat() function is as follows,

import pandas as pd

# initialize list with two columns
data = [['Reema', 'Thakker'],
        ['Rekha', 'chande'],
        ['Jaya', 'baru']]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Surname'])

# combining two columns with cat() method
df["FullName"] = df["Name"].str.cat(df["Surname"], sep="-")

# print dataframe
print(df)

Output

    Name  Surname       FullName
0  Reema  Thakker  Reema-Thakker
1  Rekha   chande   Rekha-chande
2   Jaya     baru      Jaya-baru

In the above script, we created a nested list of names and surnames, and then created a DataFrame from it with the columns ‘Name’ and ‘Surname’. To combine the two columns, we called Series.str.cat() on the ‘Name’ column with two arguments. The first is the column to append, df["Surname"], and the second, sep, is the separator placed between the two values.
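Series.str.cat() also takes an na_rep argument that controls how missing values are handled. The sketch below uses sample data with one missing surname (an assumption made for this example) and compares the result with and without na_rep.

```python
import pandas as pd

# Sample data with a missing surname in the second row
df = pd.DataFrame({"Name": ["Reema", "Rekha", "Jaya"],
                   "Surname": ["Thakker", None, "baru"]})

# Without na_rep, a missing value in either column gives a missing result
plain = df["Name"].str.cat(df["Surname"], sep="-")

# With na_rep, missing values are replaced by the given string before joining
filled = df["Name"].str.cat(df["Surname"], sep="-", na_rep="")

print(plain)
print(filled)
```

Whether an empty string or a placeholder such as "unknown" is the better replacement depends on how the combined column is used later.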

Combine two columns of text Using DataFrame.apply() and lambda

A pandas script to join the two columns ‘Name’ and ‘Surname’ using the DataFrame.apply() function with a lambda function is as follows,

import pandas as pd

# initialize list with two columns
data = [['Reema', 'Thakker'],
        ['Rekha', 'chande'],
        ['Jaya', 'baru']]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Surname'])

# combining two columns with apply() method
df["FullName"] = df[["Name", "Surname"]].apply(lambda x: "-".join(x), axis=1)

# print dataframe
print(df)

Output

    Name  Surname       FullName
0  Reema  Thakker  Reema-Thakker
1  Rekha   chande   Rekha-chande
2   Jaya     baru      Jaya-baru

In the above script, we used a lambda function with the apply() function. A lambda function is a small anonymous function whose body is a single expression. We passed the lambda to apply() with axis=1, so it receives each row and joins its values with ‘-’.
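A lambda is most useful when the combined value needs more than a plain join. The sketch below builds a "SURNAME, Name" style value per row; the ‘Display’ column name and this format are assumptions chosen for illustration.

```python
import pandas as pd

data = [['Reema', 'Thakker'],
        ['Rekha', 'chande'],
        ['Jaya', 'baru']]
df = pd.DataFrame(data, columns=['Name', 'Surname'])

# Each row is passed to the lambda, which formats the two values
df["Display"] = df.apply(
    lambda row: f"{row['Surname'].upper()}, {row['Name']}", axis=1
)

print(df)
```

For a simple separator-based join, the shorter forms shown earlier are usually enough.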

Combine two columns of text Using map() function

A pandas script to combine two column values using map() function is as follows,

import pandas as pd

# initialize list with two columns
data = [['Reema', 'Thakker'],
        ['Rekha', 'chande'],
        ['Jaya', 'baru']]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Surname'])

# combining two columns with map() method
df["FullName"] = df["Name"].map(str) + "-" + df["Surname"]

# print dataframe
print(df)

Output

    Name  Surname       FullName
0  Reema  Thakker  Reema-Thakker
1  Rekha   chande   Rekha-chande
2   Jaya     baru      Jaya-baru

In the above script, we used the map() function to convert the values of one column to strings before combining it with another column. First we created a nested list, and then we created a DataFrame from it with two columns. Then map(str) was applied to the ‘Name’ column, and the result was concatenated with ‘-’ and the ‘Surname’ column using the ‘+’ operator.
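map() accepts any function that takes a single value, not only str. As a sketch, the example below uses str.title to capitalise each surname before combining the columns; normalising the capitalisation is an extra step added here for illustration.

```python
import pandas as pd

data = [['Reema', 'Thakker'],
        ['Rekha', 'chande'],
        ['Jaya', 'baru']]
df = pd.DataFrame(data, columns=['Name', 'Surname'])

# map() applies str.title to every surname, e.g. 'chande' -> 'Chande'
surname_title = df["Surname"].map(str.title)

# Combine the name with the capitalised surname
df["FullName"] = df["Name"].map(str) + "-" + surname_title

print(df)
```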

Summary

In this article, we learned how to combine two columns of text in a Pandas DataFrame. We briefly discussed what a DataFrame is and how text columns are created, and then looked at several methods to combine two text columns into one column. Each method was explained with an example and its output.
