This article shows how to create a zip archive in Python from a list of selected files, from a whole directory, or from the files in a directory that pass a filter.
Python’s zipfile module provides the ZipFile class for reading and writing zip archives. Let’s use it to create a zip archive file.
First, import the class from the module:
from zipfile import ZipFile
Create a zip archive from multiple files in Python
The steps are:
- Create a ZipFile object by passing the new file name and the mode ‘w’ (write mode). This creates a new zip file, replacing any existing file with that name, and opens it within the ZipFile object.
- Call the write() function on the ZipFile object once for each file to add.
- Call close() on the ZipFile object to close the zip file.
# create a ZipFile object
zipObj = ZipFile('sample.zip', 'w')
# Add multiple files to the zip
zipObj.write('sample_file.csv')
zipObj.write('test_1.log')
zipObj.write('test_2.log')
# close the Zip File
zipObj.close()
It will create a zip file ‘sample.zip’ with given files inside it.
We can do the same thing with a “with” statement. It closes the zip file automatically when the with block exits, even if an exception is raised inside it:
# Create a ZipFile Object
with ZipFile('sample2.zip', 'w') as zipObj2:
    # Add multiple files to the zip
    zipObj2.write('sample_file.csv')
    zipObj2.write('test_1.log')
    zipObj2.write('test_2.log')
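One detail the examples above leave implicit: ZipFile stores entries uncompressed by default (ZIP_STORED). The sketch below is a minimal illustration, not part of the original code, of passing compression=ZIP_DEFLATED to get a smaller archive. The temporary directory, the file contents, and the names stored.zip and deflated.zip are assumptions made for the demo.

```python
import os
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED

# Work in a temporary directory so the sketch does not touch real files.
work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, 'test_1.log')  # hypothetical sample file
with open(source_path, 'w') as f:
    f.write('the same log line repeated\n' * 1000)

stored_zip = os.path.join(work_dir, 'stored.zip')
deflated_zip = os.path.join(work_dir, 'deflated.zip')

# Default behaviour: the entry is stored without compression (ZIP_STORED).
with ZipFile(stored_zip, 'w') as zipObj:
    zipObj.write(source_path, 'test_1.log')

# Same content, compressed with the DEFLATE method.
with ZipFile(deflated_zip, 'w', compression=ZIP_DEFLATED) as zipObj:
    zipObj.write(source_path, 'test_1.log')

stored_size = os.path.getsize(stored_zip)
deflated_size = os.path.getsize(deflated_zip)
print(stored_size, deflated_size)
```

For highly repetitive data such as this log file, the deflated archive should come out much smaller than the stored one; for files that are already compressed the difference is usually small.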
Create a zip archive of a directory
To zip all the contents of a directory into a zip archive, we need to iterate over all the files in the directory and its subdirectories, then add each entry to the zip file using ZipFile.write():
from zipfile import ZipFile
import os

# Name of the directory to be zipped
dirName = 'sampleDir'

# create a ZipFile object
with ZipFile('sampleDir.zip', 'w') as zipObj:
    # Iterate over all the files in directory
    for folderName, subfolders, filenames in os.walk(dirName):
        for filename in filenames:
            # create complete filepath of file in directory
            filePath = os.path.join(folderName, filename)
            # Add file to zip; dirName is a relative path, so each entry
            # keeps its 'sampleDir/...' location inside the archive
            zipObj.write(filePath)
This zips all the contents of the directory into a single zip file, ‘sampleDir.zip’. Its contents will be:
sampleDir/sample_file.csv      2018-11-30 21:44:46   2829
sampleDir/logs/test_1.log      2018-11-30 21:44:36   3386
sampleDir/logs/test_2.log      2018-11-30 21:44:56   3552
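A listing like the one above can be produced by opening the archive in read mode. The following is a small sketch rather than the code used for the article: it builds a throwaway archive in memory with writestr(), so the entry names and contents are made up for illustration, and then reads the entries back with namelist() and infolist().

```python
import io
from zipfile import ZipFile

# Build a small archive in memory; entry names and contents are illustrative.
buffer = io.BytesIO()
with ZipFile(buffer, 'w') as zipObj:
    zipObj.writestr('sampleDir/sample_file.csv', 'id,value\n1,10\n')
    zipObj.writestr('sampleDir/logs/test_1.log', 'started\n')

# Re-open the archive for reading and list what it holds.
with ZipFile(buffer, 'r') as zipObj:
    names = zipObj.namelist()
    for info in zipObj.infolist():
        # date_time is a (year, month, day, hour, minute, second) tuple
        print(info.filename, info.date_time, info.file_size)
```

ZipFile.printdir() prints a similar table of names, modification times, and sizes directly to standard output.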
Zip selected files from a directory based on filter or wildcards
To zip selected files from a directory, we need to check a condition on each file name during iteration before adding the file to the zip file.
Let’s create a function that iterates over a directory and filters its contents with a given callback. Only files that pass the filter are added to the zip:
from zipfile import ZipFile
import os
from os.path import basename

# Zip the files from given directory that matches the filter
def zipFilesInDir(dirName, zipFileName, filter):
    # create a ZipFile object
    with ZipFile(zipFileName, 'w') as zipObj:
        # Iterate over all the files in directory
        for folderName, subfolders, filenames in os.walk(dirName):
            for filename in filenames:
                if filter(filename):
                    # create complete filepath of file in directory
                    filePath = os.path.join(folderName, filename)
                    # Add file to zip
                    zipObj.write(filePath, basename(filePath))
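The second argument to write() is the arcname, the name the entry gets inside the archive. As a rough sketch of what that choice means, the snippet below adds the same file twice: once under its base name, as the function above does, and once under a path relative to a parent folder. The temporary layout and the name arcname_demo.zip are assumptions for the demo.

```python
import os
import tempfile
from os.path import basename
from zipfile import ZipFile

# Hypothetical layout: <tmp>/sampleDir/logs/test_1.log
root = tempfile.mkdtemp()
log_dir = os.path.join(root, 'sampleDir', 'logs')
os.makedirs(log_dir)
file_path = os.path.join(log_dir, 'test_1.log')
with open(file_path, 'w') as f:
    f.write('started\n')

zip_path = os.path.join(root, 'arcname_demo.zip')
with ZipFile(zip_path, 'w') as zipObj:
    # 1. File name only: the folder structure is dropped.
    zipObj.write(file_path, basename(file_path))
    # 2. Path relative to the folder containing sampleDir: structure is kept.
    zipObj.write(file_path, os.path.relpath(file_path, root))

with ZipFile(zip_path) as zipObj:
    names = zipObj.namelist()
print(names)
```

Using the base name keeps absolute paths out of the archive, but two files with the same name in different subfolders then end up as entries with the same name. A relative arcname avoids that at the cost of keeping the folder structure.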
Let’s zip only the csv files from a directory by passing a lambda function as the filter argument. Testing the extension with endswith() avoids matching names that merely contain ‘csv’ somewhere else:
zipFilesInDir('sampleDir', 'sampleDir2.zip', lambda name: name.endswith('.csv'))
It will create a zip archive ‘sampleDir2.zip’ with all csv files from the given directory.
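For wildcard-style filters, the standard fnmatch module can stand in for a hand-written lambda. The sketch below is one possible variant, not the article's function: zip_matching, the temporary directory, and the sample file names are all hypothetical. A pattern such as *.csv matches on the extension only, whereas a plain substring test would also accept a file like csv_notes.txt.

```python
import os
import tempfile
from fnmatch import fnmatch
from zipfile import ZipFile

def zip_matching(dir_name, zip_file_name, pattern):
    """Hypothetical helper: zip files under dir_name whose names match pattern."""
    with ZipFile(zip_file_name, 'w') as zipObj:
        for folder_name, subfolders, filenames in os.walk(dir_name):
            for filename in filenames:
                if fnmatch(filename, pattern):
                    file_path = os.path.join(folder_name, filename)
                    # Store each entry under its path relative to dir_name.
                    zipObj.write(file_path, os.path.relpath(file_path, dir_name))

# Build a throwaway directory tree to run the helper against.
root = tempfile.mkdtemp()
sample_dir = os.path.join(root, 'sampleDir')
os.makedirs(os.path.join(sample_dir, 'logs'))
for rel in ('sample_file.csv', 'csv_notes.txt', os.path.join('logs', 'test_1.log')):
    with open(os.path.join(sample_dir, rel), 'w') as f:
        f.write('data\n')

# The archive is written outside sample_dir so it is not picked up by os.walk().
zip_path = os.path.join(root, 'only_csv.zip')
zip_matching(sample_dir, zip_path, '*.csv')

with ZipFile(zip_path) as zipObj:
    matched = zipObj.namelist()
print(matched)
```

The same callback-based zipFilesInDir() can also take such a test directly, for example by passing a lambda that calls fnmatch(name, '*.csv').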
The complete example is as follows:
from zipfile import ZipFile
import os
from os.path import basename

# Zip the files from given directory that matches the filter
def zipFilesInDir(dirName, zipFileName, filter):
    # create a ZipFile object
    with ZipFile(zipFileName, 'w') as zipObj:
        # Iterate over all the files in directory
        for folderName, subfolders, filenames in os.walk(dirName):
            for filename in filenames:
                if filter(filename):
                    # create complete filepath of file in directory
                    filePath = os.path.join(folderName, filename)
                    # Add file to zip
                    zipObj.write(filePath, basename(filePath))

def main():
    print('*** Create a zip file from multiple files ***')
    # create a ZipFile object
    zipObj = ZipFile('sample.zip', 'w')
    # Add multiple files to the zip
    zipObj.write('sample_file.csv')
    zipObj.write('test_1.log')
    zipObj.write('test_2.log')
    # close the Zip File
    zipObj.close()

    print('*** Create a zip file from multiple files using with ***')
    # Create a ZipFile Object
    with ZipFile('sample2.zip', 'w') as zipObj2:
        # Add multiple files to the zip
        zipObj2.write('sample_file.csv')
        zipObj2.write('test_1.log')
        zipObj2.write('test_2.log')

    print('*** Create a zip archive of a directory ***')
    # Name of the Directory to be zipped
    dirName = 'sampleDir'
    # create a ZipFile object
    with ZipFile('sampleDir.zip', 'w') as zipObj:
        # Iterate over all the files in directory
        for folderName, subfolders, filenames in os.walk(dirName):
            for filename in filenames:
                # create complete filepath of file in directory
                filePath = os.path.join(folderName, filename)
                # Add file to zip, keeping its relative path
                zipObj.write(filePath)

    print('*** Create a zip archive of only csv files from a directory ***')
    zipFilesInDir('sampleDir', 'sampleDir2.zip', lambda name: name.endswith('.csv'))

if __name__ == '__main__':
    main()
I was having trouble finding simplified examples showing the Open, Write, Close process. Thank you for creating such a concise explanation!
# Zip the files from given directory that matches the filter
from zipfile import ZipFile
import os
def zipFilesInDir(dirName, zipFileName, filter):
    # create a ZipFile object
    with ZipFile(zipFileName, 'w') as zipObj:
        for folderName, subfolders, filenames in os.walk("C:\\Users\\SainiV01\\Documents\\copy"):
            for filename in filenames:
                # if filter(filename):
                # create complete filepath of file in directory
                filePath = os.path.join(folderName, filename)
                # Add file to zip
                zipObj.write(filePath)
zipFilesInDir("C:\\Users\\SainiV01\\Documents\\copy", 'sampleDir.zip', lambda name: 'csv' in name)
This code is not working as expected. There are 2 files in this directory, and I want to put those files in a zip file named sampleDir.zip.
Could someone please help me see what I did wrong here?
Hi Vinay,
In the function zipFilesInDir(), when adding a file to the zip with the write() function, we also need to pass the arcname, i.e.
zipObj.write(filePath, basename(filePath))
We were adding each file to the zip under its complete path name, and that was causing the issue. In the examples above I used only files in the local directory, so I didn’t encounter this issue earlier.
I have updated the code above. It should work fine now.
Thanks,
Varun