Python: Get list of files in directory sorted by size

In this article, we will discuss different ways to get a list of all files in a directory / folder sorted by size.

Get list of files in directory sorted by size using glob()

In Python, the glob module provides a function glob() to find files in a directory based on a matching pattern. Similar to the Unix path expansion rules, it supports shell-style wildcards (*, ? and character ranges such as [a-z]); these are not regular expressions. With such a pattern we can match some or all files in a directory. We will use this to get a list of all files in a directory, sorted by size (in bytes). The steps are as follows,

  1. Get a list of all files in a directory using glob()
  2. Sort the list of files based on the size of files using sorted() function.
    • For this, use os.stat(file_path).st_size to fetch the file size from the stat object of the file. Then wrap that in a lambda function and pass it as the key argument to the sorted() function.

A complete example to get a list of all files in a directory sorted by size (in bytes) is as follows,

import glob
import os

dir_name = 'C:/Program Files/Java/jdk-15.0.1/include/'

# Get a list of files (file paths) in the given directory 
list_of_files = filter( os.path.isfile,
                        glob.glob(dir_name + '*') )

# Sort list of files in directory by size 
list_of_files = sorted( list_of_files,
                        key =  lambda x: os.stat(x).st_size)

# Iterate over sorted list of files in directory and 
# print them one by one along with size
for elem in list_of_files:
    file_size  = os.stat(elem).st_size 
    print(file_size, ' -->', elem)   

Output:

3774  --> C:/Program Files/Java/jdk-15.0.1/include\jvmticmlr.h
7154  --> C:/Program Files/Java/jdk-15.0.1/include\jdwpTransport.h
11461  --> C:/Program Files/Java/jdk-15.0.1/include\jawt.h
21158  --> C:/Program Files/Java/jdk-15.0.1/include\classfile_constants.h
74681  --> C:/Program Files/Java/jdk-15.0.1/include\jni.h
83360  --> C:/Program Files/Java/jdk-15.0.1/include\jvmti.h

In the above solution we created a list of files in a folder, sorted by size (in bytes). First, we created a list of the file paths in the given directory using glob.glob(), keeping only regular files with os.path.isfile(). Then we passed this list to the sorted() function along with the key argument lambda x: os.stat(x).st_size. The key argument is not a comparator: sorted() calls it once for each item and orders the items by the values it returns. Therefore, the list of file paths was sorted by file size, smallest first.
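
To make the role of the key argument concrete, here is a minimal sketch that sorts names by size without touching the file system. The sizes dictionary, its file names and the size_of() helper are made up for illustration; they stand in for the os.stat() call used above.

```python
# Hypothetical file sizes in bytes, used here instead of real files
sizes = {"a.txt": 300, "b.txt": 20, "c.txt": 1500}

def size_of(name):
    # Plays the role of lambda x: os.stat(x).st_size
    return sizes[name]

# sorted() calls size_of() once per name and orders by the returned values
by_size = sorted(sizes, key=size_of)

# reverse=True gives the largest item first
largest_first = sorted(sizes, key=size_of, reverse=True)

print(by_size)
print(largest_first)
```

Passing reverse=True in the same way to the sorted() call in the example above would list the largest file first.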

Important Point:

The os.stat(file_path) function returns an os.stat_result object that contains the file statistics. Its st_size attribute holds the size of the file in bytes.
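
As a small illustration, the sketch below writes a temporary file of known length and reads its size back. It also shows os.path.getsize(), a standard-library shortcut that reports the same value; this article's examples do not use it, so treat it as an optional alternative. The temporary file and the variable names are assumptions made for the demo.

```python
import os
import tempfile

# Create a small temporary file so the example does not depend on a real path
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")   # 5 bytes
    tmp_path = tmp.name

# Size in bytes taken from the stat object
stat_size = os.stat(tmp_path).st_size

# Same value through the os.path.getsize() shortcut
getsize_size = os.path.getsize(tmp_path)

print(stat_size, getsize_size)

# Clean up the temporary file
os.remove(tmp_path)
```

Because os.path.getsize takes a single path argument, it could be passed directly as key=os.path.getsize when the list holds full paths.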

The list created above contains the complete path of each file. What if we want only the file names, sorted by size?

Get list of files in directory sorted by size using os.listdir()

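In Python, the os module provides a function listdir(dir_path), which returns a list of the names of all entries (files and sub-directories) in the given directory path. We first filter out everything that is not a file. Because listdir() returns bare names rather than paths, each name must be joined with the directory path before calling os.stat(). We can then sort the list of file names by size, using lambda x: os.stat(os.path.join(dir_name, x)).st_size as the key argument in the sorted() function.

A complete example to get a list of files in a directory sorted by size is as follows,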

import os

dir_name = 'C:/Program Files/Java/jdk-15.0.1/include/'

# Get only the files (not sub-directories) in the given directory
list_of_files = filter( lambda x: os.path.isfile(os.path.join(dir_name, x)),
                        os.listdir(dir_name) )

# Sort list of file names by size 
list_of_files = sorted( list_of_files,
                        key =  lambda x: os.stat(os.path.join(dir_name, x)).st_size)

# Iterate over sorted list of file names and 
# print them one by one along with size
for file_name in list_of_files:
    file_path = os.path.join(dir_name, file_name)
    file_size  = os.stat(file_path).st_size 
    print(file_size, ' -->', file_name)   

Output:

3774  --> jvmticmlr.h
7154  --> jdwpTransport.h
11461  --> jawt.h
21158  --> classfile_constants.h
74681  --> jni.h
83360  --> jvmti.h

In this solution we created a list of file names in a folder, sorted by file size. The sorted() function calls the key argument once for each item in the given list and orders the items by the returned values. Therefore, by passing lambda x: os.stat(os.path.join(dir_name, x)).st_size as the key argument, we made it sort the file names by size.
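
The same result can also be sketched with os.scandir(), which yields one entry object per item with its name, file check and stat information, so no path joining is needed. This is an alternative to the listdir() approach above, not the method used in this article. The function name file_names_by_size and the demo file names are hypothetical.

```python
import os
import tempfile

def file_names_by_size(dir_path):
    """Return (size_in_bytes, file_name) pairs, smallest file first."""
    with os.scandir(dir_path) as entries:
        sized = [(entry.stat().st_size, entry.name)
                 for entry in entries
                 if entry.is_file()]      # skip sub-directories
    # Tuples sort by size first, then by name when sizes are equal
    return sorted(sized)

# Demo on a throwaway directory with made-up file names
with tempfile.TemporaryDirectory() as demo_dir:
    for name, n_bytes in [("big.bin", 30), ("small.bin", 1), ("mid.bin", 5)]:
        with open(os.path.join(demo_dir, name), "wb") as fh:
            fh.write(b"x" * n_bytes)
    os.mkdir(os.path.join(demo_dir, "subdir"))   # ignored by the function
    result = file_names_by_size(demo_dir)

for size, name in result:
    print(size, "-->", name)
```

Returning the size together with the name avoids calling os.stat() a second time when printing.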

Python: Get list of files in directory and sub-directories sorted by size

In both the previous examples we created a list of files in a directory sorted by size. But they covered the files in the given directory only, not those in nested directories. So, if you want to get a list of files in a directory and its sub-directories sorted by size, then check out this example,

import glob
import os

dir_name = 'C:/Program Files/Java/jdk-15.0.1/include'

# Get a list of files (file paths) in the given directory and its sub-directories
list_of_files = filter( os.path.isfile,
                        glob.glob(dir_name + '/**/*', recursive=True) )

# Sort list of files in directory by size 
list_of_files = sorted( list_of_files,
                        key =  lambda x: os.stat(x).st_size)

# Iterate over sorted list of files in directory and 
# print them one by one along with size
for elem in list_of_files:
    file_size  = os.stat(elem).st_size 
    print(file_size, ' -->', elem)   

Output:

583  --> C:/Program Files/Java/jdk-15.0.1/include\win32\jni_md.h
898  --> C:/Program Files/Java/jdk-15.0.1/include\win32\jawt_md.h
3774  --> C:/Program Files/Java/jdk-15.0.1/include\jvmticmlr.h
4521  --> C:/Program Files/Java/jdk-15.0.1/include\win32\bridge\AccessBridgeCallbacks.h
7154  --> C:/Program Files/Java/jdk-15.0.1/include\jdwpTransport.h
11461  --> C:/Program Files/Java/jdk-15.0.1/include\jawt.h
21158  --> C:/Program Files/Java/jdk-15.0.1/include\classfile_constants.h
35096  --> C:/Program Files/Java/jdk-15.0.1/include\win32\bridge\AccessBridgeCalls.h
74681  --> C:/Program Files/Java/jdk-15.0.1/include\jni.h
76585  --> C:/Program Files/Java/jdk-15.0.1/include\win32\bridge\AccessBridgePackages.h
83360  --> C:/Program Files/Java/jdk-15.0.1/include\jvmti.h

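We used the glob() function with the pattern '/**/*' and the recursive=True argument. With recursive=True, the '**' part matches the given directory itself and sub-directories at any depth, so glob() returned every path under the given directory, and os.path.isfile() then filtered out the directories. Finally, using lambda x: os.stat(x).st_size as the key argument in the sorted() function, we created a list of files sorted by size (in bytes).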
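
One caveat: by default, glob patterns such as '*' do not match names that begin with a dot, so hidden files are left out of the result above. If those are needed, a possible alternative is to walk the tree with os.walk(), as in the minimal sketch below. The function name walk_files_by_size and the demo files are hypothetical, and the sketch assumes every listed file can be read with os.path.getsize() (a broken symbolic link, for example, would raise an error).

```python
import os
import tempfile

def walk_files_by_size(root):
    """Return paths of all files under root, smallest first."""
    paths = []
    for current_dir, _subdirs, file_names in os.walk(root):
        for file_name in file_names:
            paths.append(os.path.join(current_dir, file_name))
    # os.path.getsize(path) returns the same value as os.stat(path).st_size
    return sorted(paths, key=os.path.getsize)

# Demo on a throwaway directory tree with made-up file names
with tempfile.TemporaryDirectory() as demo_root:
    os.mkdir(os.path.join(demo_root, "sub"))
    demo_files = [("top.txt", 10),
                  (os.path.join("sub", "inner.txt"), 3),
                  (os.path.join("sub", ".hidden"), 7)]
    for rel_path, n_bytes in demo_files:
        with open(os.path.join(demo_root, rel_path), "wb") as fh:
            fh.write(b"x" * n_bytes)
    found = [os.path.relpath(p, demo_root)
             for p in walk_files_by_size(demo_root)]

for rel_path in found:
    print(rel_path)
```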

Summary:

We learned about different ways to get a list of files in a folder, sorted by size (in bytes).
