Select Rows by Timestamp in Pandas

This tutorial will discuss about different ways to select DataFrame rows by timestamp in Pandas.

Table Of Contents

Preparing DataSet

First we will create a DataFrame with some hard coded values.

import pandas as pd

data = {'A': [11, 12, 13, 14, 15, 16, 17],
        'B': [21, 22, 23, 24, 25, 26, 27],
        'Last_Visited_Timestamp': ['2023-01-14 02:00:00',
                                   '2023-01-14 03:00:00',
                                   '2023-01-14 03:30:00',
                                   '2023-01-14 04:15:00',
                                   '2023-01-14 06:30:11',
                                   '2023-01-14 09:11:21',
                                   '2023-01-14 11:35:31']}

index=['D1', 'D2', 'D3', 'D4', 'D5', 'D6', 'D7']

# Create DataFrame from dictionary
df = pd.DataFrame.from_dict(data)

# Set list index as Index of Dataframe
df.set_index(pd.Index(index), inplace=True)

print (df)

Output

     A   B Last_Visited_Timestamp
D1  11  21    2023-01-14 02:00:00
D2  12  22    2023-01-14 03:00:00
D3  13  23    2023-01-14 03:30:00
D4  14  24    2023-01-14 04:15:00
D5  15  25    2023-01-14 06:30:11
D6  16  26    2023-01-14 09:11:21
D7  17  27    2023-01-14 11:35:31

Suppose we have a DataFrame where one column contains some Timestamp values. We want to select some specific rows from this DataFrame based on the Timestamps in any given range.

For example or sample DataFrame is like this,

     A   B Last_Visited_Timestamp
D1  11  21    2023-01-14 02:00:00
D2  12  22    2023-01-14 03:00:00
D3  13  23    2023-01-14 03:30:00
D4  14  24    2023-01-14 04:15:00
D5  15  25    2023-01-14 06:30:11
D6  16  26    2023-01-14 09:11:21
D7  17  27    2023-01-14 11:35:31

No we want to select only those rows from this DataFrame, where the values in column Last_Visited_Timestamp is between a given start and end timestamps i.e. between ‘2023-01-14 03:15:00’ and ‘2023-01-14 06:43:00’, like this,

D3  13  23    2023-01-14 03:30:00
D4  14  24    2023-01-14 04:15:00
D5  15  25    2023-01-14 06:30:11

We can select the timestamp column and then we will apply multiple conditions on it,

  • First condition is that the timestamp value should be greater than the start timestamp
  • Second condition is that the timestamp value should be less than the end timestamp.

It will give us a boolean series where each true value represent that the particular value in column satisfies the given condition. Then we will pass this boolean series into the loc[] attribute and it will give us the rows for which the given column has values in the given timestamp range.

In the below example, we are going to select only those rows from the DataFrame where the Last_Visited_Timestamp column contains the values in the timestamp range i.e. from ‘2023-01-14 03:15:00’ to ‘2023-01-14 06:43:00’.

# Starting Timestamp
startingTimestamp = '2023-01-14 03:15:00'
# Ending Timestamp
endingTimestamp = '2023-01-14 06:43:00'

# Select rows where column 'Last_Visited_Timestamp’ has values in the i.e.
# from '2023-01-14 3:15:00' till '2023-01-14 06:43:00'
subDf = df.loc[ (df[ 'Last_Visited_Timestamp'] > startingTimestamp) & (df['Last_Visited_Timestamp'] <= endingTimestamp) ]

print(subDf)

Output

     A   B Last_Visited_Timestamp
D3  13  23    2023-01-14 03:30:00
D4  14  24    2023-01-14 04:15:00
D5  15  25    2023-01-14 06:30:11

Select Rows by Timestamp Range using query() method

In the previous example we used the loc[] attribute to select the rows by timestamp range. We can do that same thing using the query() method of DataFrame. We need to pass a query inside this query() method of the DataFrame.

Like in the below example, we are going to select rows where column Last_Visited_Timestamp has values in the given timestamp range that is from ‘2023-01-14 03:15:00’ to ‘2023-01-14 06:43:00’.

# Starting Timestamp
startingTimestamp = '2023-01-14 03:15:00'

# Ending Timestamp
endingTimestamp = '2023-01-14 06:43:00'

# Select rows where column 'Last_Visited_Timestamp’ has values in the i.e.
# from '2023-01-14 03:15:00' till '2023-01-14 06:43:00'
subDf = df.query('Last_Visited_Timestamp >= @startingTimestamp and Last_Visited_Timestamp <= @endingTimestamp' )

print (subDf)

Output

     A   B Last_Visited_Timestamp
D3  13  23    2023-01-14 03:30:00
D4  14  24    2023-01-14 04:15:00
D5  15  25    2023-01-14 06:30:11

Select Rows by Timestamp Range using Series.between() method

We can use the Series.between() method to select rows in a DataFrame that contains the values in a column in a timestamp Range.

We will pass the start timestamp and end timestamp as arguments in between() method and it will return a boolean array where true represents that that particular value in the selected column is in the given timestamp range.

Then we will pass that boolean series to the loc[] attribute and it will select only those rows from the DataFrame for which the given column has a value in the given timestamp range.

Like in the below example, we are going to select rows where column Last_Visited_Timestamp has values in the given timestamp range that is from ‘2023-01-14 03:15:00’ to ‘2023-01-14 06:43:00’.

# Starting Timestamp
startingTimestamp = '2023-01-14 03:15:00'

# Ending Timestamp
endingTimestamp = '2023-01-14 06:43:00'

# Select rows where column 'Last_Visited_Timestamp’ has values in the given Timestamp Range
# from '2023-01-14 03:15:00' till '2023-01-14 06:43:00'

subDf = df.loc[df['Last_Visited_Timestamp'].between(startingTimestamp, endingTimestamp) ]

print(subDf)

Output

     A   B Last_Visited_Timestamp
D3  13  23    2023-01-14 03:30:00
D4  14  24    2023-01-14 04:15:00
D5  15  25    2023-01-14 06:30:11

Summary

We learned how to select DataFrame Rows within a Timestamp Range in Pandas. Thanks.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top