Python: Get list of files in directory with size

In this article, we will discuss different ways to get a list of all files in a directory or folder, along with their sizes, in Python.

Get list of all files in directory with size using glob()

In Python, the glob module provides a function glob() to find files or directories in a given directory based on a matching pattern. Similar to Unix path expansion rules, we can use wildcards such as *, ? and [ ] character ranges (these are shell-style patterns, not regular expressions) to match a few or all of the files in a directory using the glob() function. We will use this to get a list of all files in a directory along with their sizes. The steps are as follows,

  1. Get a list of all files and directories in a given directory using the glob() function.
  2. Using the filter() function and os.path.isfile(), select only the files from the list.
  3. For each file in the list, fetch its size and create a list of tuples, i.e. (file path, size) pairs.

Complete example to get list of files in directory with size is as follows,

import glob
import os

dir_name = 'C:/Program Files/Java/jdk-15.0.1/include/'

# Get a list of files (file paths) in the given directory 
list_of_files = filter( os.path.isfile,
                        glob.glob(dir_name + '*') )

# Get list of files with size
files_with_size = [ (file_path, os.stat(file_path).st_size) 
                    for file_path in list_of_files ]

# Iterate over list of tuples i.e. file_paths with size
# and print them one by one
for file_path, file_size in files_with_size:
    print(file_size, ' -->', file_path)  

Output:

21158  --> C:/Program Files/Java/jdk-15.0.1/include\classfile_constants.h
11461  --> C:/Program Files/Java/jdk-15.0.1/include\jawt.h
7154  --> C:/Program Files/Java/jdk-15.0.1/include\jdwpTransport.h
74681  --> C:/Program Files/Java/jdk-15.0.1/include\jni.h
83360  --> C:/Program Files/Java/jdk-15.0.1/include\jvmti.h
3774  --> C:/Program Files/Java/jdk-15.0.1/include\jvmticmlr.h

The os.stat(file_path) function returns an object that contains the file statistics. The st_size attribute of this stat object gives the size of the file in bytes.
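To make the st_size attribute concrete, here is a minimal sketch that writes a throwaway file with a known number of bytes and reads its size back. The temporary directory and the name sample.txt are assumptions made purely for illustration; os.path.getsize() is shown alongside as a standard-library shortcut that reports the same value.

```python
import os
import tempfile

# Work in a temporary directory so the sketch does not touch real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    # 'sample.txt' is a hypothetical file name used only for this demo.
    sample_path = os.path.join(tmp_dir, 'sample.txt')
    with open(sample_path, 'wb') as f:
        f.write(b'hello')  # exactly 5 bytes

    # os.stat() returns a stat result; st_size is the size in bytes.
    size_from_stat = os.stat(sample_path).st_size

    # os.path.getsize() reports the same size in bytes.
    size_from_getsize = os.path.getsize(sample_path)

    print(size_from_stat)     # 5
    print(size_from_getsize)  # 5
```

Either call works if only the size is needed; os.stat() is the one to use when other attributes of the stat object are also of interest.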

In the above solution, we created a list of files in a folder, then fetched the size of each file in bytes using the os.stat() function, and created a list of tuples, i.e. (file path, size) pairs. Note that this list contains the complete file paths, not just the file names, along with the size in bytes.

Get list of files names in directory with size using os.listdir()

In Python, the os module provides a function listdir(dir_path), which returns a list of file and directory names in the given directory path. Because listdir() returns bare names rather than paths, we join each name with the directory path before passing it to os.path.isfile(), and use the filter() function to select only the files from the list. Then we can iterate over this list of file names, fetch the size of each file, and create a list of tuples, i.e. (file name, size) pairs.

Complete example to get list of file names in directory with size is as follows,

import os

dir_name = 'C:/Program Files/Java/jdk-15.0.1/include/'

# Get list of all files only in the given directory
list_of_files = filter( lambda x: os.path.isfile(os.path.join(dir_name, x)),
                        os.listdir(dir_name) )

# Create a list of files in directory along with the size
files_with_size = [ (file_name, os.stat(os.path.join(dir_name, file_name)).st_size) 
                    for file_name in list_of_files  ]

# Iterate over list of files along with size 
# and print them one by one.
for file_name, size in files_with_size:
    print(size, ' -->', file_name) 

Output:

21158  --> classfile_constants.h
11461  --> jawt.h
7154  --> jdwpTransport.h
74681  --> jni.h
83360  --> jvmti.h
3774  --> jvmticmlr.h

In this solution we created a list of file names in a folder along with the size in bytes.
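The same listdir() approach can be wrapped in a reusable function. The sketch below is one possible way to do that; the function name file_names_with_size and the temporary demo directory are hypothetical and not part of the original example.

```python
import os
import tempfile

def file_names_with_size(dir_path):
    """Return (file_name, size_in_bytes) tuples for the regular files
    directly inside dir_path. Sub-directories are skipped."""
    result = []
    for name in os.listdir(dir_path):
        # listdir() yields bare names, so build the full path first.
        full_path = os.path.join(dir_path, name)
        if os.path.isfile(full_path):
            result.append((name, os.stat(full_path).st_size))
    return result

# Demo on a temporary directory holding one 3-byte file and one sub-directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, 'a.txt'), 'wb') as f:
        f.write(b'abc')
    os.mkdir(os.path.join(tmp_dir, 'sub'))

    demo_result = sorted(file_names_with_size(tmp_dir))
    print(demo_result)  # [('a.txt', 3)]
```

The sub-directory does not appear in the result because os.path.isfile() returns False for it.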

Python: Get list of files in directory and sub-directories with size

In both the previous examples, we created a list of files in a directory along with their sizes. But those examples covered only the files directly inside the given directory, not the files in nested directories. So, if you want to get a list of files in a directory and its sub-directories along with their sizes, then check out this example,

import glob
import os

dir_name = 'C:/Program Files/Java/jdk-15.0.1/include'

# Get a list of files (file paths) in the given directory 
list_of_files = filter( os.path.isfile,
                        glob.glob(dir_name + '/**/*', recursive=True) )

# Get list of files with size
files_with_size = [ (file_path, os.stat(file_path).st_size) 
                    for file_path in list_of_files ]

# Iterate over list of tuples i.e. file_paths with size
# and print them one by one
for file_path, file_size in files_with_size:
    print(file_size, ' -->', file_path)   

Output:

21158  --> C:/Program Files/Java/jdk-15.0.1/include\classfile_constants.h
11461  --> C:/Program Files/Java/jdk-15.0.1/include\jawt.h
7154  --> C:/Program Files/Java/jdk-15.0.1/include\jdwpTransport.h
74681  --> C:/Program Files/Java/jdk-15.0.1/include\jni.h
83360  --> C:/Program Files/Java/jdk-15.0.1/include\jvmti.h
3774  --> C:/Program Files/Java/jdk-15.0.1/include\jvmticmlr.h
898  --> C:/Program Files/Java/jdk-15.0.1/include\win32\jawt_md.h
583  --> C:/Program Files/Java/jdk-15.0.1/include\win32\jni_md.h
4521  --> C:/Program Files/Java/jdk-15.0.1/include\win32\bridge\AccessBridgeCallbacks.h
35096  --> C:/Program Files/Java/jdk-15.0.1/include\win32\bridge\AccessBridgeCalls.h
76585  --> C:/Program Files/Java/jdk-15.0.1/include\win32\bridge\AccessBridgePackages.h

We used the glob() function with the pattern '/**/*' and the recursive argument set to True. Since '**' matches zero or more directory levels when recursive=True, this gave a list of all files in the given directory and in all of its sub-directories. Then, using the st_size attribute of the object returned by os.stat(file_path), we fetched the size of each file and created a list of file paths along with their sizes.
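To see how the recursive pattern behaves without depending on a particular JDK install, the following sketch builds a small directory tree in a temporary folder and runs the same kind of glob over it. The file names top.h, mid.h and deep.h are made up for this demo, and the paths are converted to relative, forward-slash form only to keep the printed result platform independent.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Build a small tree: tmp_dir/win32/bridge (hypothetical layout).
    os.makedirs(os.path.join(tmp_dir, 'win32', 'bridge'))
    demo_files = [
        ('top.h', b'12'),                                       # 2 bytes
        (os.path.join('win32', 'mid.h'), b'1234'),              # 4 bytes
        (os.path.join('win32', 'bridge', 'deep.h'), b'123456'), # 6 bytes
    ]
    for rel_path, payload in demo_files:
        with open(os.path.join(tmp_dir, rel_path), 'wb') as f:
            f.write(payload)

    # With recursive=True, '**' matches zero or more directory levels,
    # so files at the top level and at every nested level are found.
    pattern = os.path.join(tmp_dir, '**', '*')
    found = {
        os.path.relpath(path, tmp_dir).replace(os.sep, '/'): os.stat(path).st_size
        for path in glob.glob(pattern, recursive=True)
        if os.path.isfile(path)  # drop the directories matched by '*'
    }
    print(found)
```

The resulting dictionary maps each relative path to its size: top.h to 2, win32/mid.h to 4 and win32/bridge/deep.h to 6. The order of the entries is not guaranteed.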

Summary:

We learned about different ways to get a list of files in a folder with the size.
