Scott's Blog

学则不固, 知则不惑

0%

Python 浅拷贝与深拷贝

这篇文章以 list,dict,class 为例,带你了解 Python 中的浅拷贝与深拷贝。

变量与指针

在理解下面的例子之前,我们需要对 Python 中的变量有一个基本的理解。

对变量赋值,在其他语言中,我们可能理解为开辟一个区域(理解为容器、或者是一个篮子),再将值放进去。

1
2
// C 语言:在内存中开辟一个区域,将值放进去
int x 4;

在 Python 中,写法是一样的,但是最好的理解是将一个指针指向了这个区域:

1
2
3
# Python:定义了一个指针 x,指向了内存中
# 这个包含 4 的容器的地方
x = 4

因为 Python 变量相当于一个指针,指向这个内存区域,所以无需提前声明好变量是什么类型,指针中间也可以指向别的值,这也是人们说的 Python 的动态类型。

所以你可以这么做:

1
2
3
x = 1         # x is an integer
x = 'hello' # now x is a string
x = [1, 2, 3] # now x is a list

但是动态类型也有缺点,那就是如果两个变量名指向同一个数据,那么一个变量所做的修改,另一个变量所指向的值也会发生变化。

下面来看具体的例子:

以 list 为例

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
# ---------------- 浅拷贝 ------------------ #
# 构造一个列表
print("--------- New List XS, YS")
xs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ys = list(xs) # Make a shallow copy
print("xs:" , xs)
print("ys:", ys)

# 对 xs 添加元素: 不会影响 ys
print("--------- Append Element To XS")
xs.append(['I am new'])
print("xs:" , xs)
print("ys:", ys)

# 对 xs 修改元素:两者都会影响
print("--------- Modified To XS")
xs[1][2] = 'x'
print("xs:" , xs)
print("ys:", ys)

# 对 yx 修改元素:两者会影响
print("--------- Modified To YS")
ys[0][0] = 'y'
print("xs:" , xs)
print("ys:", ys)

# -------------------------- Deep Copy --------------- #
import copy
zs = copy.deepcopy(xs)
print("--------- Deep Copy ZS from XS")
print("xs:", xs)
print("zs:", zs)

# 对 zs 修改元素,不会影响 xz
print("--------- Modified To ZS")
zs[2][0] = 'z'
print("xs:" , xs)
print("zs:", zs)

# 对 xs 修改元素,也不会影响 xs
print("--------- Modified To XS")
xs[2][1] = 'z'
print("xs:", xs)
print("zs:", zs)

上面代码的输出:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
--------- New List XS, YS
xs: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ys: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

--------- Append Element To XS
xs: [[1, 2, 3], [4, 5, 6], [7, 8, 9], ['I am new']]
ys: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
--------- Modified To XS
xs: [[1, 2, 3], [4, 5, 'x'], [7, 8, 9], ['I am new']]
ys: [[1, 2, 3], [4, 5, 'x'], [7, 8, 9]]
--------- Modified To YS
xs: [['y', 2, 3], [4, 5, 'x'], [7, 8, 9], ['I am new']]
ys: [['y', 2, 3], [4, 5, 'x'], [7, 8, 9]]

--------- Deep Copy ZS from XS
xs: [['y', 2, 3], [4, 5, 'x'], [7, 8, 9], ['I am new']]
zs: [['y', 2, 3], [4, 5, 'x'], [7, 8, 9], ['I am new']]
--------- Modified To ZS
xs: [['y', 2, 3], [4, 5, 'x'], [7, 8, 9], ['I am new']]
zs: [['y', 2, 3], [4, 5, 'x'], ['z', 8, 9], ['I am new']]
--------- Modified To XS
xs: [['y', 2, 3], [4, 5, 'x'], [7, 'z', 9], ['I am new']]
zs: [['y', 2, 3], [4, 5, 'x'], ['z', 8, 9], ['I am new']]

以 dict 为例

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
# 新建一个 dict,用来演示深拷贝与浅拷贝
d = {
'name1': {"scott zhang": 1},
'name2': {"the weeknd": 2}
}
print("A new dict:", d)

# 查看 d 中 name1 的 id
d_id = id(d['name1'])
print(d_id)

# 浅拷贝,更详细的说是
# 将 d['name1'] 所在内存区域的地址绑定给 d_copy1
# 摘自 realpython:
# (Assignment statements in Python do not create copies of objects
# they only bind names to an object.)
# 浅拷贝内部值只是指针, 所以对 d_copy1 内部 name1 字典内的修改,会将 d 中的值也修改
d_copy1 = d.copy()

# 演示
# 对 copy 过来的 dict 修改,再打印原来的数据
d_copy1['name1']['scott zhang'] = 0
print(d)


# 为什么?
# 因为其实内部指向的是同一个 dict
print(
id(d_copy1['name1']) == id(d['name1'])
)

# 为什么要指向同一个 dict?
# 因为浅拷贝指向的是指针,深拷贝会将数据重新赋值,非常耗费系统资源。

上面代码的输出:

1
2
3
4
A new dict: {'name1': {'scott zhang': 1}, 'name2': {'the weeknd': 2}}
1901449775104
{'name1': {'scott zhang': 0}, 'name2': {'the weeknd': 2}}
True

以 class 为例

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
import copy

# A Point
class Point:
def __init__(self, x, y):
self.x = x
self.y = y

def __repr__(self):
return f'Point({self.x!r}, {self.y!r})'


# A Rectagle
class Rectangle:
def __init__(self, topleft, bottomright):
self.topleft = topleft
self.bottomright = bottomright

def __repr__(self):
return (f'Rectangle({self.topleft!r}, '
f'{self.bottomright!r})')


if __name__ == '__main__':
# Make a point and it copy obj
print("---- Point")
point = Point(0, 0)
point_copy = copy.copy(point)
print("point:", point)
print("point copy:", point_copy)

# point is point_copy?
# no, because class Point using int in x and y
# int is immutable type in python
# so there is no difference between deep/shallow copy for Point class
print("point is point copy?", point is point_copy)

# modified point
print("---- Modified Point")
point.x = 1
print("point:", point)
print("point copy:", point_copy)

# 如果是 Rectangle 内部的 Pointer 呢?
print("---- Rectangle")

# new rectangle and a copy from it
rect = Rectangle(topleft=Point(0, 1), bottomright=Point(1, 0))
rect_copy = copy.copy(rect)
print("rect:", rect)
print("rect copy:", rect_copy)

# rect is rect_copy?
print("rect_copy is rect?", rect_copy is rect)
print("---- Modified rect")
rect.topleft.x = 999
print("rect:", rect)
print("rect copy:", rect_copy)

# using deep copy
print("---- New Deep Copy Rect")
rect_copy_deep = copy.deepcopy(rect)
print("rect", rect)
print("rect copy deep:", rect_copy_deep)

# modified a deep copy obj
print("---- Modified Deep Copy Rect")
rect.bottomright.x = 666
print("rect", rect)
print("rect copy deep:", rect_copy_deep)

上面代码的输出:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
---- Point
point: Point(0, 0)
point copy: Point(0, 0)
point is point copy? False
---- Modified Point
point: Point(1, 0)
point copy: Point(0, 0)

---- Rectangle
rect: Rectangle(Point(0, 1), Point(1, 0))
rect copy: Rectangle(Point(0, 1), Point(1, 0))
rect_copy is rect? False
---- Modified rect
rect: Rectangle(Point(999, 1), Point(1, 0))
rect copy: Rectangle(Point(999, 1), Point(1, 0))
---- New Deep Copy Rect
rect Rectangle(Point(999, 1), Point(1, 0))
rect copy deep: Rectangle(Point(999, 1), Point(1, 0))
---- Modified Deep Copy Rect
rect Rectangle(Point(999, 1), Point(666, 0))
rect copy deep: Rectangle(Point(999, 1), Point(1, 0))

总结

  • 浅拷贝(shallow copy)并不会克隆对象中的子对象,因此原对象和对象中,被拷贝的内部对象不是独立两份,而是同一份。
  • 深拷贝(Deep copy)会递归的拷贝所有子对象,这保证了内部对象的独立性,但是这样速度很慢。
  • 你可以使用 copy 模块拷贝任意对象(包括你自定义的类),你还可以通过实现 __copy()____deepcopy()__ 来自定义拷贝的过程。