* Python: Copy and Deepcopy
*
* Date: 30.12.2023
*\
In the *last article I talked about mutability and the notion of variables in Python. It turned out that it can be problematic to pass variables, that are references to mutable objects, to other program logic and that under certain circumstances you can resolve these problems by copying the mutable object. Let's have a deeper look at mutable objects, the nature of variables in Python and the copy functions, that the Python standard library provides.
Mutable and immutable objects in Python can have other mutable or immutable objects as elements. Consider for example the two-dimensional list
which is obviously a mutable object containing mutable objects. Or another interesting object, we will going to cover, the tuple
which is an immutable object, that contains another immutable object and a mutable one. As you *know that "a" is mutable, it might not surprise you that you can change the contents of the list as you wish
>>> a[1] = (1,2,3)
>>> a
[2, (1, 2, 3)]
and the fact that the elements of "b" cannot be changed will also fulfill your expectations of the behavior of an immutable object:
Traceback (most recent call last):
File "
TypeError: 'tuple' object does not support item assignment
>>> b[1] = (4,5,6)
Traceback (most recent call last):
File "
TypeError: 'tuple' object does not support item assignment
We can therefore conclude that the elements of the tuple are constant references (that is references, whose target objects cannot be changed), while the elements of the list are variable references (which can be assigned new or different objects at any time). Consequently, the situation after the original definitions can be visualized as in Figure 1.

Figure 1 shows how the involved mutable and immutable objects are referenced in our two-dimensional variables "a" and "b". Here, "a" is a reference containing variable references (being a list) to two separate list objects and "b" contains constant references (being a tuple) to a tuple and a list. The variable references are indicated as solid lines, and the constant references as dashed ones.
Note that the fact that a reference is constant does not mean, that the referenced object is immutable: It is still possible to say
>>> b
((1, 2, 3), [4, 0, 6])
which would change the lower list object containing [4, 5, 6] in Figure 1 to [4, 0, 6], even though the list object itself can not be exchanged with another object in the tuple referenced by "b".
In order to use "copy", first you have to import the function from the "copy" module, which we will do as
Suppose now, you wish to use the built-in "copy" function to create a copy "c" of the variable "a" as defined in Listing 1. You would do this via
>>> c
[[1, 2, 3], [4, 5, 6]]
As in the *last article, you can use the operator "is" to check the relationships between the elements of "a" and "c". You will find out that the lists "a" and "c" are distinct
False
and that their elements are equivalent:
[[1, 2, 3], [4, 5, 6]]
>>> c
[[1, 2, 3], [4, 5, 6]]
However, when checking the elements, there is a surprise lurking. Indeed, the elements of the lists are the same objects!
True
>>> a[1] is c[1]
True
The situation you created can be summarized as in Figure 2. While the variables "a" and "c" reference separate lists, these lists also have elements, that would need to be copied. These elements are not copied, because the function "copy" does not work recursively; instead, the "copy" function creates what is called a "shallow copy": It inserts references to the original objects, that make up "a", wherever possible.

This can be confirmed by changing one of the elements of "a" and observing how the corresponding element of "c" behaves:
>>> a
[[1, 0, 3], [4, 5, 6]]
>>> c
[[1, 0, 3], [4, 5, 6]]
In Listing 4, the object referenced by the element 0 of the list "a" is changed, which you can do because that element is a list itself and thus mutable. After checking that the element 0 of "a" really changed from [1, 2, 3] to [1, 0, 3] we retrieve the value for "c" and find out that it indeed changed together with "a". This behavior might make sense as a memory and processing power saving mechanism in certain situations, but can be quite dangerous in others. Note that you would get the same effect by changing the referenced object through "c" instead of "a"!
Further note, that the situation would be similar, if the objects referenced by the elements of "a" were immutable (e.g. tuples). In this case, however, the immutability of the objects would prevent the effect shown in Listing 4, as it would be impossible to change them, but they would have to be replaced with a new object and a new reference altogether.
In case you want to avoid this effect while programming an application and you are sure have memory and processing power to spare you can consider to use "deepcopy" instead of "copy".
The function "deepcopy" is another function, that is available in the "copy" module and can be imported exactly as shown in Listing 3.
It works similar to "copy", but differs in the important behavioral detail, that it creates true copies of all the layers of the target object recursively instead of keeping references to the original. In our example this means that "deepcopy" starts at the list referenced by "a" and makes a copy, that the new variable "c" will reference. While "copy" would insert references to the objects of "a" at this point, the "deepcopy" function continues with the second layer of the hierarchy. It copies the lists referenced by the elements of "a" and inserts references to the newly created and identical objects as elements of "c". This process continues with the objects referenced by the most recently copied objects until there is nothing left to copy. Similar to a shallow copy, you can create a deep copy "d" of "a" as defined in Listing 1 via
>>> d
[[1, 2, 3], [4, 5, 6]]
We can now repeat the experiment from Listing 4 and will find out that
>>> a
[[1, 0, 3], [4, 5, 6]]
>>> d
[[1, 2, 3], [4, 5, 6]]
which means that after using "deepcopy", "d" is untouched by changes to "a", which stands in contrast to Listing 4. The situation resulting from the usage of "deepcopy" to create "d" from "a" is depicted in Figure 3.

Figure 3 again indicates that the two lists are now entirely separate, as "deepcopy" copies the whole hierarchy of "a".
We have seen that when copying an object that contains other objects, "copy" inserts references to the original objects wherever possible, while "deepcopy" copies all the referenced objects recursively. When considering the recursive nature of the copying process, that is implemented in "deepcopy" you can imagine that "deepcopy" takes up more memory and uses more processing resources than "copy". This can become a problem, if the hierarchy to be copied is relatively deep, but on the other hand, it can make the program execution significantly safer, if this is not the case. As a consequence, you have to choose well which of the two functions to use in your specific use case and consider the related tradeoff.