Before looking at os.walk, let's run help(os.walk) in ipython3. It prints the following:
Help on function walk in module os:
walk(top, topdown=True, onerror=None, followlinks=False)
Directory tree generator.
For each directory in the directory tree rooted at top (including top
itself, but excluding '.' and '..'), yields a 3-tuple
    dirpath, dirnames, filenames
dirpath is a string, the path to the directory. dirnames is a list of
the names of the subdirectories in dirpath (excluding '.' and '..').
filenames is a list of the names of the non-directory files in dirpath.
Note that the names in the lists are just names, with no path components.
To get a full path (which begins with top) to a file or directory in
dirpath, do os.path.join(dirpath, name).
If optional arg 'topdown' is true or not specified, the triple for a
directory is generated before the triples for any of its subdirectories
(directories are generated top down). If topdown is false, the triple
for a directory is generated after the triples for all of its
subdirectories (directories are generated bottom up).
The function signature is:
os.walk(top, topdown=True, onerror=None, followlinks=False)
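os.walk returns a generator. Each iteration yields a tuple with three elements:
dirpath: the path of the directory currently being visited, a string;
dirnames: the names of the subdirectories in the current directory, a list;
filenames: the names of the non-directory files in the current directory, a list.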
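As the help text notes, the names in dirnames and filenames carry no path components. A minimal sketch of turning them into full paths with os.path.join follows; the helper name list_full_paths and the throwaway temporary tree are assumptions made only for illustration.

```python
import os
import tempfile


def list_full_paths(top):
    """Return the full path of every file under top (hypothetical helper)."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            # filenames holds bare names, so join each one with dirpath.
            paths.append(os.path.join(dirpath, name))
    return paths


# Build a small throwaway tree so the sketch can run anywhere.
with tempfile.TemporaryDirectory() as demo_top:
    os.makedirs(os.path.join(demo_top, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(demo_top, rel), "w") as f:
            f.write("x")
    # Show the paths relative to demo_top, sorted for a stable order.
    demo_paths = sorted(
        os.path.relpath(p, demo_top) for p in list_full_paths(demo_top)
    )
    print(demo_paths)
```

Every returned path begins with the top argument, which is why the sketch strips it with os.path.relpath before printing.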
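Parameters:
top: the directory to traverse
topdown: if True (the default), a directory is yielded before its subdirectories; if False, it is yielded after all of its subdirectories
onerror: a callback invoked with the OSError instance when listing a directory fails; by default such errors are ignored
followlinks: if true, os.walk also descends into directories that symbolic links point to (off by default)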
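To make the topdown and onerror parameters concrete, here is a small sketch. The helper names walk_order and remember_error, and the temporary directory layout, are assumptions for illustration and not part of the os module.

```python
import os
import tempfile


def walk_order(top, topdown):
    """Return directory paths relative to top, in the order os.walk yields them."""
    return [
        os.path.relpath(dirpath, top)
        for dirpath, _dirnames, _filenames in os.walk(top, topdown=topdown)
    ]


errors = []


def remember_error(err):
    # os.walk passes the OSError instance raised while listing a directory.
    errors.append(err)


with tempfile.TemporaryDirectory() as demo_top:
    os.makedirs(os.path.join(demo_top, "a", "b"))
    top_down = walk_order(demo_top, topdown=True)    # parents first
    bottom_up = walk_order(demo_top, topdown=False)  # children first
    print(top_down)
    print(bottom_up)

    # Walking a directory that does not exist yields nothing;
    # the error is reported through the callback instead of being raised.
    missing = os.path.join(demo_top, "missing")
    walked = list(os.walk(missing, onerror=remember_error))
    print(walked, len(errors))
```

Without an onerror callback the same failed listing would be skipped silently, which is easy to mistake for an empty directory.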
Create a walk folder containing subfolders and files, and check its layout with tree in a terminal:
walk
├── test1
│   ├── test1_1
│   │   ├── test1_1_1
│   │   └── test1_1_a.py
│   ├── test1_2
│   │   └── test1_2_1
│   ├── test1_3
│   ├── test1_a.py
│   └── test1_b.py
├── test2
│   ├── test2_1
│   └── test2_a.py
├── test3
└── test.py
We iterate over the result of os.walk() with a for loop:
import os

# Walk the tree top-down and print each (root, dirs, files) triple.
for root, dirs, files in os.walk("/home/python/walk", topdown=True):
    print("root:%s" % root)
    print("dirs:%s" % dirs)
    print("files:%s" % files)
    print("-------------------------------")
The output is:
root:/home/python/walk
dirs:['test2', 'test1', 'test3']
files:['test.py']
-------------------------------
root:/home/python/walk/test2
dirs:['test2_1']
files:['test2_a.py']
-------------------------------
root:/home/python/walk/test2/test2_1
dirs:[]
files:[]
-------------------------------
root:/home/python/walk/test1
dirs:['test1_2', 'test1_3', 'test1_1']
files:['test1_a.py', 'test1_b.py']
-------------------------------
root:/home/python/walk/test1/test1_2
dirs:['test1_2_1']
files:[]
-------------------------------
root:/home/python/walk/test1/test1_2/test1_2_1
dirs:[]
files:[]
-------------------------------
root:/home/python/walk/test1/test1_3
dirs:[]
files:[]
-------------------------------
root:/home/python/walk/test1/test1_1
dirs:['test1_1_1']
files:['test1_1_a.py']
-------------------------------
root:/home/python/walk/test1/test1_1/test1_1_1
dirs:[]
files:[]
-------------------------------
root:/home/python/walk/test3
dirs:[]
files:[]
-------------------------------
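The dirs lists in the output above are not alphabetical because os.walk returns entries in whatever order the operating system lists them. When topdown is true, the os.walk documentation allows the caller to modify dirnames in place to control which subdirectories are visited and in what order. A minimal sketch of a sorted traversal under that assumption follows; sorted_walk is a hypothetical wrapper name.

```python
import os
import tempfile


def sorted_walk(top):
    """Yield os.walk triples with directories visited in sorted order."""
    for root, dirs, files in os.walk(top, topdown=True):
        dirs.sort()   # in place, so os.walk descends in this order
        files.sort()
        yield root, dirs, files


with tempfile.TemporaryDirectory() as demo_top:
    # Create the directories in a non-alphabetical order on purpose.
    for name in ("test2", "test1", "test3"):
        os.makedirs(os.path.join(demo_top, name))
    visited = [
        os.path.relpath(root, demo_top)
        for root, dirs, files in sorted_walk(demo_top)
    ]
    print(visited)
```

The sort has to be in place (dirs.sort(), not dirs = sorted(dirs)), since os.walk keeps using the same list object, and it has no effect on traversal order when topdown is False.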
*Note: if the path you pass in is a relative path, the returned dirpath (root here) will be relative as well.
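The note above can be checked with a short sketch. The temporary directory and the walk/test1 layout below are stand-ins for illustration; the working directory is changed only for the duration of the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "walk", "test1"))
    previous_cwd = os.getcwd()
    os.chdir(base)
    try:
        # A relative top gives relative roots.
        relative_roots = [root for root, dirs, files in os.walk("walk")]
        # An absolute top gives absolute roots.
        absolute_roots = [
            root for root, dirs, files in os.walk(os.path.abspath("walk"))
        ]
    finally:
        # Restore the working directory before the temporary tree is removed.
        os.chdir(previous_cwd)

print(relative_roots)
print(all(os.path.isabs(root) for root in absolute_roots))
```

If absolute paths are needed regardless of how the caller spelled the argument, passing os.path.abspath(top) to os.walk is one simple option.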
Using os.walk() to compute the total size of the files under a folder:
import os
from os.path import join, getsize

def getdirsize(top):
    # Sum the sizes, in bytes, of all files under top, including subdirectories.
    size = 0
    for root, dirs, files in os.walk(top):
        size += sum(getsize(join(root, name)) for name in files)
    return size
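One caveat with the function above: os.path.getsize raises OSError when a file disappears during the walk or when a name is a broken symbolic link. A hedged variant that skips such entries might look like the sketch below; get_dir_size_safe is a hypothetical name, and silently skipping unreadable files is a design assumption that may not suit every use case.

```python
import os
import tempfile


def get_dir_size_safe(top):
    """Return the total size in bytes of the files under top, skipping unreadable ones."""
    total = 0
    for root, dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                # Broken symlink or file removed mid-walk: ignore it.
                continue
    return total


# Demo on a throwaway tree holding 10 + 5 bytes of data.
with tempfile.TemporaryDirectory() as demo_top:
    os.makedirs(os.path.join(demo_top, "sub"))
    with open(os.path.join(demo_top, "a.bin"), "wb") as f:
        f.write(b"x" * 10)
    with open(os.path.join(demo_top, "sub", "b.bin"), "wb") as f:
        f.write(b"y" * 5)
    demo_size = get_dir_size_safe(demo_top)
    print(demo_size)  # 15
```

The result counts file contents only; it does not include the space taken by the directory entries themselves, so it can differ from what tools such as du report.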