Python: Get last N lines of a text file, like tail command

In this article, we will discuss a memory-efficient solution to read the last line or the last N lines from a text or CSV file in Python. Then we will also see how to read only the last line, or check if the last line in the file matches a given line.

We have created a function to read the last N lines from a text file:

import os


def get_last_n_lines(file_name, N):
    # Create an empty list to keep the track of last N lines
    list_of_lines = []
    # Open file for reading in binary mode
    with open(file_name, 'rb') as read_obj:
        # Move the cursor to the end of the file
        read_obj.seek(0, os.SEEK_END)
        # Create a buffer to keep the last read line
        buffer = bytearray()
        # Get the current position of pointer i.e eof
        pointer_location = read_obj.tell()
        # Loop till pointer reaches the top of the file
        while pointer_location >= 0:
            # Move the file pointer to the location pointed by pointer_location
            read_obj.seek(pointer_location)
            # Shift pointer location by -1
            pointer_location = pointer_location -1
            # read that byte / character
            new_byte = read_obj.read(1)
            # If the read byte is new line character then it means one line is read
            if new_byte == b'\n':
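                # Save the line in list of lines. The bytes were collected back to front,
                # so reverse them before decoding (keeps multi-byte characters intact)
                list_of_lines.append(buffer[::-1].decode())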
                # If the size of list reaches N, then return the reversed list
                if len(list_of_lines) == N:
                    return list(reversed(list_of_lines))
                # Reinitialize the byte array to save next line
                buffer = bytearray()
            else:
                # If last read character is not eol then add it in buffer
                buffer.extend(new_byte)

        # The start of the file was reached; any bytes left in the buffer form the first line.
        if len(buffer) > 0:
            list_of_lines.append(buffer[::-1].decode())

    # return the reversed list
    return list(reversed(list_of_lines))

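This function accepts 2 arguments, i.e. a file path as a string and an integer N (the number of lines to be read from the end of the file). It returns a list of the last N lines of the file. Note that every ‘\n’ is treated as the end of a line, so if the file itself ends with a newline, the last element of the returned list is an empty string. The examples below assume that ‘sample.txt’ does not end with a newline.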

How does this function work?

First of all, it creates an empty list to store the last N lines of the file. Then it opens the given file for reading in binary mode and reads one byte at a time from the end of the file towards the start, i.e. in the reverse direction. As soon as it encounters a newline character ‘\n’, it knows that one complete line has been read. Because that line was collected back to front, it reverses it to restore the original order and adds it to the list. It then continues reading in the reverse direction until either the top of the file is reached or the list contains N lines.

Internally, it uses two methods of the file object:

  • file_object.tell(): It gives the pointer’s current position in the file, i.e. the number of bytes from the beginning of the file.
  • file_object.seek(offset, reference_point): It moves the pointer to the position reference_point + offset. If reference_point is omitted, the offset is counted from the beginning of the file.
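
As a small illustration of these two methods, the sketch below writes a few bytes to a temporary file and moves the pointer around with seek() and tell(). The temporary file and its contents are made up for this example and are not part of the article's sample data.

```python
import os
import tempfile

# Create a small throwaway file; its name and contents are only for illustration
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'Hello\nWorld')
    demo_path = tmp.name

with open(demo_path, 'rb') as file_object:
    # Jump to the end of the file: offset 0 relative to os.SEEK_END
    file_object.seek(0, os.SEEK_END)
    # At the end of the file, tell() equals the file size in bytes
    end_position = file_object.tell()

    # Jump to an absolute position: one byte before the end
    file_object.seek(end_position - 1)
    last_byte = file_object.read(1)

    # Reading one byte moved the pointer forward by one
    position_after_read = file_object.tell()

os.remove(demo_path)

print(end_position)         # 11
print(last_byte)            # b'd'
print(position_after_read)  # 11
```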

Let’s use the function created above to fetch the last N lines from a text file.

Suppose we have a text file ‘sample.txt’ and its contents are:

Hello this is a sample file
It contains sample text
Dummy Line A
Dummy Line B
Dummy Line C
This is the end of file

Now we will fetch the last N lines from this file.

Get last 3 lines of a text file as a list in Python

# Get last three lines from file 'sample.txt'
last_lines = get_last_n_lines("sample.txt", 3)

print('Last 3 lines of File:')
# Iterate over the list of last 3 lines and print one by one
for line in last_lines:
    print(line)

Output:

Last 3 lines of File:
Dummy Line B
Dummy Line C
This is the end of file

It returned the last 3 lines from file ‘sample.txt’ as a list of strings and then we iterated over the list to print the last 3 lines of the file.

Let’s look at another example,

Get last 5 lines of a text file or CSV file

# get last five lines from the file
last_lines = get_last_n_lines("sample.txt", 5)

print('Last 5 lines of File:')
# Iterate over the list of last 5 lines and print one by one
for line in last_lines:
    print(line)

Output:

Last 5 lines of File:
It contains sample text
Dummy Line A
Dummy Line B
Dummy Line C
This is the end of file

Efficiency of solution:

This is a memory-efficient solution because we read the file from the end only, and at most N lines are held in memory at a time.

So, even if we have a large file with a size in GBs and we want to read the last 10 lines, this solution will give results efficiently. It starts from the end and reads backwards only as far as the last 10 lines, so the amount of work depends on the length of those lines and not on how large the file is.
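
One caveat is that the function above calls seek() and read(1) once for every byte. A common variation is to read the file backwards in larger blocks instead. The sketch below is one possible way to do that and is not the function used in this article; the name get_last_n_lines_chunked and the chunk_size parameter are assumptions made for illustration. It also differs in behavior: it uses splitlines(), so a newline at the very end of the file does not produce an empty last entry, and ‘\r\n’ line endings are handled as well.

```python
import os


def get_last_n_lines_chunked(file_name, N, chunk_size=4096):
    """Return the last N lines of a file, reading backwards in blocks.

    Hypothetical variant for illustration; chunk_size is an assumed parameter.
    """
    if N <= 0:
        return []
    with open(file_name, 'rb') as read_obj:
        # Start at the end of the file
        read_obj.seek(0, os.SEEK_END)
        position = read_obj.tell()
        data = b''
        # Keep prepending blocks until the data holds more than N newlines
        # (so the first of the N lines is complete) or the start is reached
        while position > 0 and data.count(b'\n') <= N:
            step = min(chunk_size, position)
            position -= step
            read_obj.seek(position)
            data = read_obj.read(step) + data
    # Split into lines first and decode only the complete last N lines,
    # so a multi-byte character cut by a block boundary is never decoded
    return [line.decode() for line in data.splitlines()[-N:]]
```

For the ‘sample.txt’ file shown earlier, get_last_n_lines_chunked("sample.txt", 3) returns the same three lines as get_last_n_lines("sample.txt", 3).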

Read the last line of a text file or CSV file or log file

We can use the same function to read the last line of a file. We need to pass 1 as the argument N to get_last_n_lines(), and it will return a list containing the last line. For example,

# get last line of the file
last_lines = get_last_n_lines("sample.txt", 1)

print('Last Line of File:')
print(last_lines[0])

Output:

Last Line of File:
This is the end of file

This is how we read and printed the last line of a file. It is also an efficient solution for large files, because we start reading from the end and move backwards.

Check if the last line in the file matches the given line

Let’s check if the last line in file ‘sample.txt’ is exactly ‘This is the end of file’,

# get last line of the file
last_lines = get_last_n_lines("sample.txt", 1)

# Match the returned last line of file with the given string
if last_lines[0] == 'This is the end of file':
    print('Last Line matched')

Output:

Last Line matched

It proves that the last line of the file matches the given string.

Check if the last line in the file contains given substring

Let’s check if the last line of file ‘sample.txt’ contains the string ‘is’

sub_string_to_match = 'is'

# Check if the last line of file contains the given sub-string or not
if sub_string_to_match in get_last_n_lines("sample.txt", 1)[0]:
    print('Positive: Last Line contains the given sub string')
else:
    print('Negative: Last Line does not contain the given sub string')

Output:

Positive: Last Line contains the given sub string

It proves that the last line of the file includes the given substring.
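
Both checks above index the returned list with [0], which raises an IndexError when the file is empty and the list has no elements. A slightly more defensive version could be sketched as below. The helper names last_non_empty_line, last_line_equals and last_line_contains are hypothetical and not part of the original function; they work on any list of strings, such as the one returned by get_last_n_lines().

```python
def last_non_empty_line(lines):
    """Return the last non-blank string in lines, or None if there is none."""
    for line in reversed(lines):
        if line.strip():
            return line
    return None


def last_line_equals(lines, expected):
    """True if the last non-blank line is exactly the expected string."""
    return last_non_empty_line(lines) == expected


def last_line_contains(lines, sub_string):
    """True if the last non-blank line contains the given sub-string."""
    last_line = last_non_empty_line(lines)
    return last_line is not None and sub_string in last_line


# Stand-in lists for illustration, shaped like get_last_n_lines() results
print(last_line_equals(['Dummy Line C', 'This is the end of file'],
                       'This is the end of file'))
print(last_line_contains([], 'is'))
```

With these helpers, an empty list simply gives a negative result instead of an exception.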

The complete example is as follows,

import os


def get_last_n_lines(file_name, N):
    # Create an empty list to keep the track of last N lines
    list_of_lines = []
    # Open file for reading in binary mode
    with open(file_name, 'rb') as read_obj:
        # Move the cursor to the end of the file
        read_obj.seek(0, os.SEEK_END)
        # Create a buffer to keep the last read line
        buffer = bytearray()
        # Get the current position of pointer i.e eof
        pointer_location = read_obj.tell()
        # Loop till pointer reaches the top of the file
        while pointer_location >= 0:
            # Move the file pointer to the location pointed by pointer_location
            read_obj.seek(pointer_location)
            # Shift pointer location by -1
            pointer_location = pointer_location -1
            # read that byte / character
            new_byte = read_obj.read(1)
            # If the read byte is new line character then it means one line is read
            if new_byte == b'\n':
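                # Save the line in list of lines. The bytes were collected back to front,
                # so reverse them before decoding (keeps multi-byte characters intact)
                list_of_lines.append(buffer[::-1].decode())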
                # If the size of list reaches N, then return the reversed list
                if len(list_of_lines) == N:
                    return list(reversed(list_of_lines))
                # Reinitialize the byte array to save next line
                buffer = bytearray()
            else:
                # If last read character is not eol then add it in buffer
                buffer.extend(new_byte)

        # The start of the file was reached; any bytes left in the buffer form the first line.
        if len(buffer) > 0:
            list_of_lines.append(buffer[::-1].decode())

    # return the reversed list
    return list(reversed(list_of_lines))



def main():
    print("*** Get Last N lines of a text file or csv file ***")

    print('** Get last 3 lines of text file or csv file **')

    # Get last three lines from file 'sample.txt'
    last_lines = get_last_n_lines("sample.txt", 3)

    print('Last 3 lines of File:')
    # Iterate over the list of last 3 lines and print one by one
    for line in last_lines:
        print(line)

    print('** Get last 5 lines of text file or csv file **')

    # get last five lines from the file
    last_lines = get_last_n_lines("sample.txt", 5)

    print('Last 5 lines of File:')
    # Iterate over the list of last 5 lines and print one by one
    for line in last_lines:
        print(line)

    print('*** Get last line of text file or csv file or log file***')

    # get last line of the file
    last_lines = get_last_n_lines("sample.txt", 1)

    print('Last Line of File:')
    print(last_lines[0])

    print('*** Check if last line in file matches the given line ***')

    # get last line of the file
    last_lines = get_last_n_lines("sample.txt", 1)

    # Match the returned last line of file with the given string
    if last_lines[0] == 'This is the end of file':
        print('Last Line matched')

    print('**** Check if last line in file contains given sub-string ****')

    sub_string_to_match = 'is'

    # Check if the last line of file contains the given sub-string or not
    if sub_string_to_match in get_last_n_lines("sample.txt", 1)[0]:
        print('Positive: Last Line contains the given sub string')
    else:
        print('Negative: Last Line does not contain the given sub string')

if __name__ == '__main__':
    main()

 

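Output:

*** Get Last N lines of a text file or csv file ***
** Get last 3 lines of text file or csv file **
Last 3 lines of File:
Dummy Line B
Dummy Line C
This is the end of file
** Get last 5 lines of text file or csv file **
Last 5 lines of File:
It contains sample text
Dummy Line A
Dummy Line B
Dummy Line C
This is the end of file
*** Get last line of text file or csv file or log file***
Last Line of File:
This is the end of file
*** Check if last line in file matches the given line ***
Last Line matched
**** Check if last line in file contains given sub-string ****
Positive: Last Line contains the given sub string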
