In this article we will discuss how to convert data type of a dataframe column from string to datetime. The data can be in custom string formats or embedded in big text. Also, how to handle error while converting the data type.

Python’s Pandas module provides a function to convert a given argument to datetime i.e.

Important parameters:

  • arg : Element to be converted to datetime type like int, float, string, datetime, list, tuple, 1-d array or Series.
  • errors : Way to handle error. It can be : {‘ignore’, ‘raise’, ‘coerce’}, default value is ‘raise’
    • ‘raise’: In case of invalid parsing raise an exception
    • ‘coerce’: In case of invalid parsing set as NaT
    • ‘ignore’: In case of invalid parsing return the input
  • format : string, default None
    • Like we pass in strftime to parse date & time string in format eg “%d/%m/%Y” etc.

Returns:

It Converts the given value to date time format and return value depends on the input, for example,

  • If a series of string is passed then it will return a series of datetime64 type.
  • If a scalar entity is passed then it returns a datetime64 object.

As this function can covert the data type of a series from string to datetime. Let’s see how to use this to convert data type of a column from string to datetime.

Convert the Data type of a column from string to datetime64

Suppose we have a dataframe in which column ‘DOB’ contains the dates in string format ‘DD/MM/YYYY’ i.e.

Contents of the dataframe empDfObj is as follows,

To check the data types of columns use attribute Dataframe.dtypes i.e.

Output:

Data type of column ‘DOB’ is string, basically it contains the date of births as string but in DD/MM/YYYY format. Now to convert the data type of column ‘DOB’ to datetime64 we will use pandas.to_datetime() i.e.

Contents of the updated dataframe are,

Lets check the data types of columns in updated dataframe,

Output:

Now the data type of column ‘DOB’ is datetime64.

pd.to_datetime() converts the date time strings in ISO8601 format to datetime64 type. Strings type that it can automatically handles are,

‘DD-MM-YYYY HH:MM AM/PM’
‘YYYY-MM-DDTHH:MM:SS’
‘YYYY-MM-DDT HH:MM:SS.ssssss’
etc.
Let’s see an example where we have dates as string type but in different ISO8601 format i.e.

Contents of the dataframe are,

Let’s check the data types of columns i.e.

Output:

Datatype of column ‘DOB’ is string but it is in different string format. Let’s convert the data type of column ‘DOB’ to datetime64 i.e.

Lets check the data types of columns in updated dataframe,

Output:

Now the data type of column ‘DOB’ is datetime64.

Convert the Data type of a column from custom format string to datetime64

There might be cases when our dataframe have columns which contains date & time column in custom formats like,
DDMMYYYY
DD–MM–YY

To convert data type of column from these custom strings formats to datetime, we need to pass the format argument in pd.to_datetime(). Let’s see an example where we have column in dataframe which contains dates in custom string format i.e. DDMMYYYY,

Contents of the dataframe are,

Let’s check the data types of columns i.e.

Output:

Datatype of column ‘DOB‘ is string. Let’s convert the data type of column ‘DOB’ to datetime64 i.e.

Lets check the data types of columns in updated dataframe,

Output:

Now the data type of column ‘DOB’ is datetime64. Here we passed the date time string format in format argument of pd.to_datetime().

Convert the Data type of a column from string to datetime by extracting date & time strings from big string

There might be scenarios when our column in dataframe contains some text and we need to fetch date & time from those texts like,

  • date of birth is 07091985
  • 11101998 is DOB

To fetch datetime from this big text we need to pass exact argument in pd.to_dataframe(), if passed as False it will try to match the format anywhere in string. Let’s understand by an example,
Suppose we have a dataframe with a column DOB, that contains text in which date time is at different place like,

Contents of the dataframe are,

Let’s check the data types of columns i.e.

Output:

Datatype of column ‘DOB’ is string. Let’s convert the contents of column DOB by removing extra text and making the data type of column ‘DOB’ to datetime64 i.e.

Lets check the data types of columns in updated dataframe,

Output:

Now the data type of column ‘DOB’ is datetime64 and extra text is also removed.

Another Example : Extract date & time from big string in a column and add new columns of datetime64 format

Suppose we have a column in dataframe that contains big text and inside the text it contains date and time at different places i.e.

Contents of the dataframe are,

Datatype of column ‘DOB’ is string / object. Let’s add new columns in dataframe that contains date and time from this big text i.e.

Output

For DOB_time column we provided time only, therefore it picked the default date i.e. 1900-01-01. Where as DOB_date contains the date only. But the data type of both DOB_time & DOB_date is datetime64.

Handle error while Converting the Data type of a column from string to datetime

To handle errors while converting data type of a column we can pass the errors arguments to customize the behavior i.e.

  • ‘raise’: In case of invalid parsing raise an exception
  • ‘coerce’: In case of invalid parsing set as NaT
  • ‘ignore’: In case of invalid parsing return the input

Let’s see an example of ignoring errors while converting the type of column from string to datetime

Output

Complete example is as follows,

Output:

If you didn't find what you were looking, then do suggest us in the comments below. We will be more than happy to add that.

Do Subscribe with us for more Articles / Tutorials like this,