How do I get a list of files in a directory and its subfolders in Python?

A fairly simple solution is to run a couple of subprocess calls that export the file listing to CSV format:

import subprocess

# Global variables for directory being mapped

location = '.' # Enter the path here.
pattern = '*.py' # Use this if you want to only return certain filetypes
rootDir = location.rpartition('/')[-1]
outputFile = rootDir + '_directory_contents.csv'

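# Find the requested data and export to CSV, specifying a pattern if needed.
# The pattern is quoted so the shell does not expand it; %T+ is the modification time (%A+ would be the access time).
find_cmd = 'find ' + location + ' -name "' + pattern + '" -fprintf ' + outputFile + ' "%Y%M,%n,%u,%g,%s,%T+,%P\n"'
subprocess.call(find_cmd, shell=True)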

That command produces comma-separated values that can easily be analyzed in Excel.


The resulting CSV file doesn't have a header row, but you can use a second command to add one.

# Add headers to the CSV
headers_cmd = 'sed -i.bak 1i"Permissions,Links,Owner,Group,Size,ModifiedTime,FilePath" ' + outputFile
subprocess.call(headers_cmd, shell=True)
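
The -fprintf action is a GNU find extension, so the commands above will not run everywhere. As a rough alternative, a similar CSV can be written from Python alone. This is only a minimal sketch and not part of the original answer: the function name write_listing_csv and the reduced column set (size, modification time, relative path) are assumptions made for illustration.

```python
import csv
import fnmatch
import os
import time


def write_listing_csv(location, output_file, pattern="*"):
    """Walk `location` and write one CSV row per matching file (hypothetical helper)."""
    with open(output_file, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Write the header row first, so no second pass over the file is needed.
        writer.writerow(["Size", "ModifiedTime", "FilePath"])
        for dirpath, _dirnames, filenames in os.walk(location):
            for name in fnmatch.filter(filenames, pattern):
                full_path = os.path.join(dirpath, name)
                # lstat() describes a symlink itself instead of following it.
                info = os.lstat(full_path)
                modified = time.strftime("%Y-%m-%d %H:%M:%S",
                                         time.localtime(info.st_mtime))
                # Store the path relative to the starting directory.
                writer.writerow([info.st_size, modified,
                                 os.path.relpath(full_path, location)])
```

Calling write_listing_csv('.', 'listing.csv', '*.py') would list only Python files. If the output file sits inside the directory being walked and matches the pattern, it appears in its own listing.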

Depending on how much data you get back, you can massage it further using Pandas. Here are some things I found useful, especially if you're dealing with many levels of directories to look through.

Add these to your imports:

import numpy as np
import pandas as pd

Then add this to your code:

# Create DataFrame from the csv file created above.
df = pd.read_csv(outputFile)
# Format columns
# Get the filename and file extension from the filepath
df['FileName'] = df['FilePath'].str.rsplit("/", n=1).str[-1]
df['FileExt'] = df['FileName'].str.rsplit('.', n=1).str[1]

# Get the full path to the files. If the path doesn't include a "/" it's the root directory
df['FullPath'] = df["FilePath"].str.rsplit("/", n=1).str[0]
df['FullPath'] = np.where(df['FullPath'].str.contains("/"), df['FullPath'], rootDir)

# Split the path into columns for the parent directory and its children
df['ParentDir'] = df['FullPath'].str.split("/", n=1).str[0]
df['SubDirs'] = df['FullPath'].str.split("/", n=1).str[1]
# Account for NaN returns, indicates the path is the root directory
df['SubDirs'] = df['SubDirs'].fillna('')

# Determine if the item is a directory or file.
df['Type'] = np.where(df['Permissions'].str.startswith('d'), 'Dir', 'File')

# Split the time stamp into date and time columns
df[['ModifiedDate', 'Time']] = df.ModifiedTime.str.rsplit('+', n=1, expand=True)
df['Time'] = df['Time'].str.split('.').str[0]

# Show only files, output includes paths so you don't necessarily need to display the individual directories.
df = df[df['Type'].str.contains('File')]

# Set columns to show and their order.
# 'FileExt' is the extension column created earlier; no 'DocType' column exists.
df = df[['FileName', 'ParentDir', 'SubDirs', 'FullPath', 'FileExt', 'ModifiedDate', 'Time', 'Size']]

# Convert the file sizes from bytes to something more readable.
# convert_bytes() is the helper shown at the end of this answer; define it before this line runs.
df = df.assign(Size=df['Size'].apply(convert_bytes))

# Send the data to an Excel workbook with sheets by parent directory
with pd.ExcelWriter("scripts_directory_contents.xlsx") as writer:
    for directory, data in df.groupby('ParentDir'):
        data.to_excel(writer, sheet_name=directory, index=False)

# To convert sizes to be more human readable
def convert_bytes(size):
    for x in ['b', 'K', 'M', 'G', 'T']:
        if size < 1024:
            return "%3.1f %s" % (size, x)
        size /= 1024

    return size
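
The file name, extension and directory columns derived with the pandas string methods above can be sanity-checked on a single path using only the standard library. This is a small sketch, assuming POSIX-style relative paths like the ones find prints with %P. The helper name split_path is made up for illustration, and it returns an empty string for files in the root directory instead of substituting rootDir.

```python
from pathlib import PurePosixPath


def split_path(file_path):
    """Return (file name, extension, top-level directory, remaining sub-directories)."""
    path = PurePosixPath(file_path)
    parents = path.parent.parts          # empty tuple for a file in the root directory
    parent_dir = parents[0] if parents else ""
    sub_dirs = "/".join(parents[1:])
    # suffix keeps the leading dot, so strip it to match the FileExt column.
    return path.name, path.suffix.lstrip("."), parent_dir, sub_dirs


print(split_path("tools/net/scan.py"))
print(split_path("setup.py"))
```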

In this article we will discuss different methods to generate a list of all files in a directory tree.

Creating a list of files in a directory and its subdirectories using os.listdir()

Python’s os module provides a function to get the list of files and folders in a directory, i.e.

os.listdir(path='.')

It returns a list of all the files and sub directories in the given path.
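
A quick way to see that os.listdir() does not descend into subdirectories is to run it on a small throwaway tree. The sketch below builds one with tempfile; the file and directory names are invented for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "docs"))
    open(os.path.join(root, "notes.txt"), "w").close()
    open(os.path.join(root, "docs", "guide.txt"), "w").close()

    # Only the immediate children are returned, as bare names in arbitrary
    # order; guide.txt inside docs/ is not part of the result.
    entries = sorted(os.listdir(root))

print(entries)
```

The returned names are relative to the directory, which is why the recursive approach has to join each one with its parent path before testing it.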

We need to call this function recursively for subdirectories to build a complete list of the files in a given directory tree, i.e.

import os

# For the given path, get the list of all files in the directory tree
def getListOfFiles(dirName):
    # create a list of file and sub directories
    # names in the given directory
    listOfFile = os.listdir(dirName)
    allFiles = list()
    # Iterate over all the entries
    for entry in listOfFile:
        # Create full path
        fullPath = os.path.join(dirName, entry)
        # If entry is a directory then get the list of files in this directory
        if os.path.isdir(fullPath):
            allFiles = allFiles + getListOfFiles(fullPath)
        else:
            # Otherwise it is a file, so add it to the result
            allFiles.append(fullPath)
    return allFiles

Call the above function to create a list of files in a directory tree i.e.

dirName = '/home/varun/Downloads'

# Get the list of all files in directory tree at given path
listOfFiles = getListOfFiles(dirName)
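
getListOfFiles() calls itself once per directory level. The same traversal can also be written with an explicit stack of pending directories. The following is a sketch of that alternative, not the article's method: it uses os.scandir(), the name iter_files is hypothetical, and unlike os.path.isdir() it deliberately does not follow symbolic links to directories.

```python
import os
import tempfile


def iter_files(dir_name):
    """Yield the full path of every file below dir_name, without recursion."""
    pending = [dir_name]                 # directories still to be scanned
    while pending:
        current = pending.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                else:
                    # Regular files, and symlinks of any kind, are yielded as-is.
                    yield entry.path


# Demonstration on a throwaway tree (names invented for the example).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for rel in ("top.txt", os.path.join("a", "mid.txt"),
                os.path.join("a", "b", "deep.txt")):
        open(os.path.join(root, rel), "w").close()
    found = sorted(os.path.relpath(p, root) for p in iter_files(root))

print(found)
```

Because iter_files is a generator, paths are produced as they are discovered, so a caller can stop early without scanning the whole tree.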

Creating a list of files in a directory and its subdirectories using os.walk()

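Python’s os module provides a function to iterate over a directory tree, i.e.

os.walk(top, topdown=True, onerror=None, followlinks=False)

It iterates over the directory tree at the given path and, for each directory or subdirectory, yields a tuple containing the directory path, the list of subdirectory names and the list of file names:

(dirpath, dirnames, filenames)

Iterate over the directory tree and generate a list of all the files at the given path: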

# Get the list of all files in directory tree at given path
listOfFiles = list()
for (dirpath, dirnames, filenames) in os.walk(dirName):
    listOfFiles += [os.path.join(dirpath, file) for file in filenames]
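
To make the shape of those tuples concrete, the sketch below walks a small temporary tree and records what os.walk() yields for each directory. The directory and file names are invented for the demonstration, and paths are stored relative to the root only to keep the output short.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "src", "pkg"))
    open(os.path.join(root, "README.txt"), "w").close()
    open(os.path.join(root, "src", "pkg", "mod.py"), "w").close()

    visited = []
    for dirpath, dirnames, filenames in os.walk(root):
        # One tuple per directory: where we are, its subdirectories, its files.
        visited.append((os.path.relpath(dirpath, root),
                        sorted(dirnames), sorted(filenames)))

for row in visited:
    print(row)
```

Each file name appears in exactly one tuple, next to the directory that contains it, which is why joining dirpath with every entry of filenames yields the complete list.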

The complete example is as follows:

import os

