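How to List Files in a Directory in Python

Summary: in this tutorial, you'll learn how to list the files in a directory using Python's os.walk() function.

Sometimes you need to list all the files in a directory so that you can process them. For example, you may want to find every image in a directory and resize it. To list all files in a directory, you can use the os.walk() function.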

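The os.walk() function generates the file names in a directory by walking the directory tree either top-down or bottom-up. For each directory in the tree, os.walk() yields a tuple of three fields: (dirpath, dirnames, filenames).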

Note that os.walk() examines the entire directory tree. You can therefore use it to get every file in the root directory and in all of its subdirectories.
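To make the three-field tuple concrete, here is a minimal sketch that builds a small throwaway tree with tempfile and records what os.walk() yields in both traversal orders. The helper name walk_summary and the file names are made up for illustration; only os.walk() and its topdown parameter come from the standard library.

```python
import os
import tempfile


def walk_summary(root, topdown=True):
    """Collect (relative dirpath, dirnames, filenames) for each directory visited."""
    summary = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=topdown):
        rel = os.path.relpath(dirpath, root)  # '.' stands for the root itself
        summary.append((rel, sorted(dirnames), sorted(filenames)))
    return summary


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: root/index.html and root/blog/post.html
    os.mkdir(os.path.join(root, 'blog'))
    for name in ('index.html', os.path.join('blog', 'post.html')):
        with open(os.path.join(root, name), 'w') as f:
            f.write('')

    top_down = walk_summary(root)                   # parent directory first
    bottom_up = walk_summary(root, topdown=False)   # subdirectories first

print(top_down)
print(bottom_up)
```

With topdown=True (the default) the root is reported before blog; with topdown=False the order is reversed, but each tuple carries the same contents.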

Python list files example

Suppose you have a folder D:\web that contains the following directories and files:

D:\web
├── assets
|  ├── css
|  |  └── style.css
|  └── js
|     └── app.js
├── blog
|  ├── read-file.html
|  └── write-file.html
├── about.html
├── contact.html
└── index.html

The following example shows how to use the os.walk() function to list all HTML files in the D:\web directory:

import os


path = 'D:\\web'

html_files = []

for dirpath, dirnames, filenames in os.walk(path):
    for filename in filenames:
        if filename.endswith('.html'):
            html_files.append(os.path.join(dirpath, filename))

for html_file in html_files:
    print(html_file)

Output:

D:\web\about.html
D:\web\contact.html
D:\web\index.html
D:\web\blog\read-file.html
D:\web\blog\write-file.html

How it works:

First, initialize a list to store the paths of the HTML files:

html_files = []

Second, call the os.walk() function to examine the directories under the D:\web folder:

for dirpath, dirnames, filenames in os.walk(path):

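On each iteration, dirpath holds the path of the current directory and filenames holds the names of the files in that directory.

Third, iterate over filenames and add each file whose extension is .html to the html_files list: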

# ...
for filename in filenames:
    if filename.endswith('.html'):
        html_files.append(os.path.join(dirpath, filename))

Note that os.path.join() returns the full path of the file by joining dirpath with filename.
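As a small illustration of the joining step: os.path is the ntpath module on Windows and the posixpath module on POSIX systems, so calling those two modules directly shows what os.path.join() would return on each platform. The POSIX path below is a made-up example, not part of the tutorial's directory tree.

```python
import ntpath      # the implementation behind os.path on Windows
import posixpath   # the implementation behind os.path on Linux and macOS

# Windows-style join: components are separated by a backslash
windows_path = ntpath.join(r'D:\web\blog', 'read-file.html')

# POSIX-style join (hypothetical path): components are separated by a forward slash
posix_path = posixpath.join('/var/www/blog', 'read-file.html')

print(windows_path)
print(posix_path)
```

In normal code you would simply call os.path.join() and let Python choose the separator for the platform it runs on.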

Finally, print the file paths stored in the html_files list:

for html_file in html_files:
    print(html_file)

Defining a reusable list files function

Using the os.walk() function, we can define a reusable list_files() function as follows:

import os


def list_files(path, extensions=None):
    """ List all files in a directory specified by path
    Args:
        path - the root directory path
        extensions - an iterable of file extensions to include, pass None to get all files.
    Returns:
        A list of file paths that match extensions
    """
    # str.endswith() accepts a tuple, so each file is checked once and added at most once
    suffixes = None if extensions is None else tuple(extensions)
    filepaths = []
    for root, _, files in os.walk(path):
        for file in files:
            if suffixes is None or file.endswith(suffixes):
                filepaths.append(os.path.join(root, file))

    return filepaths


if __name__ == '__main__':
    filepaths = list_files(r'D:\web', ('.html', '.css'))
    for filepath in filepaths:
        print(filepath)

Output:

D:\web\about.html
D:\web\contact.html
D:\web\index.html
D:\web\assets\css\style.css
D:\web\blog\read-file.html
D:\web\blog\write-file.html

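Making the list files function more efficient

The list_files() function works fine when the number of files is small. When the number of files is large, however, building and returning one big list of paths is not memory efficient.

To solve this problem, you can use a generator that yields the files one at a time instead of returning a list.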

import os


def list_files(path, extensions=None):
    """ List all files in a directory specified by path
    Args:
        path - the root directory path
        extensions - an iterable of file extensions to include, pass None to get all files.
    Yields:
        The path of each file that matches extensions
    """
    # str.endswith() accepts a tuple, so each file is checked once and yielded at most once
    suffixes = None if extensions is None else tuple(extensions)
    for root, _, files in os.walk(path):
        for file in files:
            if suffixes is None or file.endswith(suffixes):
                yield os.path.join(root, file)


if __name__ == '__main__':
    filepaths = list_files(r'D:\web', ('.html', '.css'))
    for filepath in filepaths:
        print(filepath)
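One practical consequence of the generator version is that a caller can stop early without ever holding the full list of paths. The sketch below assumes a condensed variant of the generator and a throwaway directory of made-up files; itertools.islice() takes only the first two results.

```python
import itertools
import os
import tempfile


def list_files(path, extensions=None):
    """Yield file paths under path, optionally filtered by extension."""
    suffixes = None if extensions is None else tuple(extensions)
    for root, _, files in os.walk(path):
        for file in files:
            if suffixes is None or file.endswith(suffixes):
                yield os.path.join(root, file)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical sample files, created only for this demonstration
    for i in range(5):
        with open(os.path.join(root, f'page{i}.html'), 'w') as f:
            f.write('')

    files = list_files(root, ('.html',))
    # islice pulls just the first two paths from the generator
    first_two = list(itertools.islice(files, 2))
    # the generator resumes where it stopped; count what is left
    remaining = sum(1 for _ in files)

print(len(first_two), remaining)
```

Because a generator can be consumed only once, call list_files() again if you need to iterate over the same files a second time.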

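Summary

Use the os.walk() function to list the files in a directory recursively.
Wrap os.walk() in a reusable function to list the files in a directory, optionally filtered by extension.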
