We often ignore warnings when running code, but the real question is whether a given warning can safely be ignored. In this article, we discuss one such warning, “SettingWithCopyWarning”, which can affect our work regularly if we are not careful.
Introduction
To get started quickly, let’s create a sample DataFrame to experiment with. We’ll use the pandas library with some sample employee data.
import pandas as pd
import numpy as np

# List of Tuples
employees = [('Shubham', 'Data Scientist', 'Tech', 5),
             ('Riti', 'Data Engineer', 'Tech', 7),
             ('Shanky', 'Program Manager', 'Tech', 2),
             ('Shreya', 'Graphic Designer', 'Design', 2),
             ('Aadi', 'Backend Developer', 'Tech', 11),
             ('Sim', 'Data Engineer', 'Tech', 4)]

# Create a DataFrame object from list of tuples
df = pd.DataFrame(employees,
                  columns=['Name', 'Designation', 'Team', 'Experience'],
                  index=[0, 1, 2, 3, 4, 5])
print(df)
The contents of the created DataFrame are:
      Name        Designation    Team  Experience
0  Shubham     Data Scientist    Tech           5
1     Riti      Data Engineer    Tech           7
2   Shanky    Program Manager    Tech           2
3   Shreya   Graphic Designer  Design           2
4     Aadi  Backend Developer    Tech          11
5      Sim      Data Engineer    Tech           4
Understanding the SettingWithCopyWarning in Pandas – Case 1
Before solving these warnings, let’s first understand their root cause. As an example, say we need to change the Team of all the “Program Managers” to “PMO”. Let’s try to change it using the code below.
# change Team value to PMO for program managers
df[df['Designation'] == "Program Manager"]['Team'] = 'PMO'
Output
<ipython-input-2-768a2accd2a0>:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[df['Designation'] == "Program Manager"]['Team'] = 'PMO'
As we can see, the code ran without any errors, but it did raise a “SettingWithCopyWarning”. Since there are no errors, it is tempting to simply move on. But wait, let’s check the data to see whether the change actually took effect.
# check data
print(df[df['Designation'] == "Program Manager"])
Output
     Name      Designation  Team  Experience
2  Shanky  Program Manager  Tech           2
The value was not updated in the DataFrame, so clearly something went wrong, which is what the “SettingWithCopyWarning” was trying to tell us. Let’s read it more carefully. It says: “A value is trying to be set on a copy of a slice from a DataFrame.”
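This is exactly the problem: we are using a chained statement, i.e., first a get operation (to filter rows) and then a set operation (to set the Team value to PMO). The get operation returns a new object, here a copy of the filtered rows, so the set operation modifies that temporary copy rather than the original DataFrame, and the change is lost. pandas cannot reliably apply such chained statements to the original data, which is why it provides the .loc and .iloc indexers to perform the selection and the assignment in a single step.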
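The two steps hidden inside the chained statement can be written out separately to see where the value ends up. The following is a minimal sketch, assuming a trimmed two-row version of the sample data; the names `mask`, `filtered`, and `caught` are illustrative and not part of the original example. The warning is captured only so that it can be printed, and its exact class name may differ between pandas versions.

```python
import warnings

import pandas as pd

# Trimmed, illustrative version of the sample data
df = pd.DataFrame({
    'Name': ['Shubham', 'Shanky'],
    'Designation': ['Data Scientist', 'Program Manager'],
    'Team': ['Tech', 'Tech'],
})

mask = df['Designation'] == 'Program Manager'

# Step 1 (the "get"): boolean filtering builds a new DataFrame object.
filtered = df[mask]

# Step 2 (the "set"): the assignment lands on that new object, not on df.
# Any warning pandas emits here is recorded instead of being shown.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    filtered['Team'] = 'PMO'

print([w.category.__name__ for w in caught])  # depends on the pandas version
print(filtered['Team'].tolist())  # ['PMO']
print(df['Team'].tolist())        # ['Tech', 'Tech']
```

In the chained one-liner the intermediate object is never given a name, so the updated copy is simply thrown away.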
Solution for SettingWithCopyWarning in Case 1
Now that we understand the warning, let’s look at the recommended way to execute such statements. Let’s perform the same update using the .loc indexer, as below.
# update the value using loc statement
df.loc[df['Designation'] == "Program Manager", "Team"] = 'PMO'
print(df)
Output
      Name        Designation    Team  Experience
0  Shubham     Data Scientist    Tech           5
1     Riti      Data Engineer    Tech           7
2   Shanky    Program Manager     PMO           2
3   Shreya   Graphic Designer  Design           2
4     Aadi  Backend Developer    Tech          11
5      Sim      Data Engineer    Tech           4
Voila! We didn’t get any warning this time and our DataFrame is also updated with the “PMO” values for “Program Manager”.
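The .iloc indexer mentioned earlier follows the same single-step pattern, but it works with integer positions instead of labels. The following is a rough sketch, assuming a trimmed three-row version of the sample data; the helper names `row_positions` and `col_position` are illustrative.

```python
import numpy as np
import pandas as pd

# Trimmed, illustrative version of the sample data
df = pd.DataFrame({
    'Name': ['Shubham', 'Shanky', 'Shreya'],
    'Designation': ['Data Scientist', 'Program Manager', 'Graphic Designer'],
    'Team': ['Tech', 'Tech', 'Design'],
})

# .iloc needs integer positions, so translate the row filter and the
# column label into positions first.
mask = (df['Designation'] == 'Program Manager').to_numpy()
row_positions = np.flatnonzero(mask)        # positions of the matching rows
col_position = df.columns.get_loc('Team')   # position of the 'Team' column

# Selection and assignment happen in one step, directly on df.
df.iloc[row_positions, col_position] = 'PMO'

print(df['Team'].tolist())  # ['Tech', 'PMO', 'Design']
```

For a label-based condition like this one, .loc is usually the more readable choice; .iloc is mainly useful when the positions are already known.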
Understanding the SettingWithCopyWarning – Case 2
Let’s look at another scenario where “SettingWithCopyWarning” could cause trouble. Suppose that this time we save the filtered DataFrame (the Tech team) in a new variable and then try to update the Team of the “Program Manager” to PMO.
# save the filtered dataframe
tech_df = df[df['Team'] == "Tech"]

# updating the Team for program managers
tech_df.loc[tech_df['Designation'] == "Program Manager", "Team"] = 'PMO'
Output
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  isetter(loc, value)
We are using the .loc indexer here, yet we still get a “SettingWithCopyWarning”. What could be going wrong? First, let’s look at the DataFrame to see whether the values were updated.
# check data
print(tech_df)
Output
      Name        Designation  Team  Experience
0  Shubham     Data Scientist  Tech           5
1     Riti      Data Engineer  Tech           7
2   Shanky    Program Manager   PMO           2
4     Aadi  Backend Developer  Tech          11
As observed, the DataFrame was updated with the “PMO” value. So, should we continue and ignore the warning? Probably not!
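Here, the problem is not with the .loc statement; rather, it is with the first line, where we store the filtered DataFrame in a new object. pandas remembers that “tech_df” was derived from “df”, and an indexing operation can return either a View (meaning any operation on the new DataFrame can impact the original DataFrame) or a Copy (meaning any operation on the new DataFrame will not impact the original DataFrame). Filtering with a boolean condition, as we did here, returns a Copy, so “df” is not changed by the update. However, pandas cannot tell whether we meant to modify “df” or only “tech_df”, so it warns us.
With other kinds of selection that return a View (in pandas versions that emit this warning), updating the values in “tech_df” might lead to updating the original DataFrame (“df”) as well, which is unintended.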
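A quick way to find out whether the original DataFrame was touched in a particular case is to compare it against a snapshot taken before the update. The sketch below assumes a trimmed three-row version of the sample data; `before` and `caught` are illustrative names. The warning is captured rather than displayed so that the output stays readable.

```python
import warnings

import pandas as pd

# Trimmed, illustrative version of the sample data
df = pd.DataFrame({
    'Name': ['Shubham', 'Shanky', 'Shreya'],
    'Designation': ['Data Scientist', 'Program Manager', 'Graphic Designer'],
    'Team': ['Tech', 'Tech', 'Design'],
})
before = df.copy()  # independent snapshot of the original data

# Filtered DataFrame saved without .copy(), as in Case 2
tech_df = df[df['Team'] == 'Tech']

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    tech_df.loc[tech_df['Designation'] == 'Program Manager', 'Team'] = 'PMO'

print(tech_df['Team'].tolist())  # ['Tech', 'PMO']
print(df.equals(before))         # True: df was not modified by this update
```

With a boolean filter like this one, the original stays intact. The check is still worth keeping in mind, because the warning itself does not say which of the two DataFrames ended up being changed.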
Solution for SettingWithCopyWarning in Case 2
Let’s discuss the recommended way to handle this situation: whenever a filtered DataFrame is saved into a different object and will be modified later, call the .copy() method on it, which makes it explicit that the new object is independent of the original. Let’s try it below.
# save the filtered dataframe using copy
tech_df = df[df['Team'] == "Tech"].copy()

# updating the Team for program managers
tech_df.loc[tech_df['Designation'] == "Program Manager", "Team"] = 'PMO'
print(tech_df)
Output
      Name        Designation  Team  Experience
0  Shubham     Data Scientist  Tech           5
1     Riti      Data Engineer  Tech           7
2   Shanky    Program Manager   PMO           2
4     Aadi  Backend Developer  Tech          11
Here you go: this does not produce any warnings, and our new DataFrame is updated with the required value.
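The .copy() approach fits when the filtered DataFrame is meant to be an independent object. If the intention is instead to change the original DataFrame, one option is to skip the intermediate object and combine both conditions in a single .loc statement. This is a minimal sketch under that assumption, using a trimmed three-row version of the sample data; `is_tech` and `is_pm` are illustrative names.

```python
import pandas as pd

# Trimmed, illustrative version of the sample data
df = pd.DataFrame({
    'Name': ['Shubham', 'Shanky', 'Shreya'],
    'Designation': ['Data Scientist', 'Program Manager', 'Graphic Designer'],
    'Team': ['Tech', 'Tech', 'Design'],
})

# Build each condition as a boolean Series, then combine them with &.
is_tech = df['Team'] == 'Tech'
is_pm = df['Designation'] == 'Program Manager'

# One selection-and-assignment step, applied directly to the original df.
df.loc[is_tech & is_pm, 'Team'] = 'PMO'

print(df['Team'].tolist())  # ['Tech', 'PMO', 'Design']
```

Because no intermediate DataFrame is created, there is no ambiguity about which object receives the new value.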
Summary
In this article, we have discussed multiple scenarios of “SettingWithCopyWarning” and how to resolve them. Thanks.