I encountered a kind of variable scoping that I did not expect while working on this pull request for the Poetry core packaging module. Poetry still supports Python 2.7 and several tests failed on Python 2.7 while I was working on it. However, all tests were passing on Python 3.5 and above.
The tests failed because of an `AttributeError`:
    for file in include.elements:  # type: List[Path]
        # omitted for brevity
        if file.is_dir():
            if self.format in formats:
                for f in file.glob("**/*"):  # type: List[Path]
                    rel_path = f.relative_to(self._path)
                    if (
                        rel_path not in set([f.path for f in to_add])
    >                   and not f.is_dir()
                        and not self.is_excluded(rel_path)
                    ):

    AttributeError: BuildIncludeFile instance has no attribute 'is_dir'
The inner for-loop traverses the directory's `Path` objects with an ill-named variable called `f`. In this for-loop, we check that the file is not already in the list `to_add`, that `f` is not a directory, and that it is not excluded (using its relative path). `to_add` is a list of `BuildIncludeFile` objects, each of which has a `path` attribute.
I used a list comprehension to transform `to_add` into a `set` [1] and check whether the traversed file is already in `to_add`. In this list comprehension, `[f.path for f in to_add]`, I also used `f` as a variable name. This was my unfortunate mistake.
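The `f  # type: BuildIncludeFile` in the list comprehension shadows the outer for-loop's `f  # type: Path` variable. In Python 2.7 the comprehension variable lives in the same scope as the loop variable, so it actually overwrites it, which caused the second condition in the if-statement, `if ... f.is_dir()`, to throw the `AttributeError`.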
`f` continues to shadow the outer for-loop's variable even after the list comprehension has finished iterating: it is now the last `BuildIncludeFile` instance in `to_add` instead of the traversed file `Path` from the outer for-loop. I expected a `Path` instance for `f`, but it was a `BuildIncludeFile`, which does not have an `is_dir()` method, so the call threw this `AttributeError`.
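What Python 2 effectively did can be sketched in a form that also runs on Python 3, by writing the comprehension out as an explicit loop: a plain for-loop variable shares the enclosing scope in both versions. `FakeIncludeFile` is a hypothetical stand-in for `BuildIncludeFile`, not the real Poetry class, and the file names are made up for illustration.

```python
from pathlib import Path


class FakeIncludeFile:
    """Hypothetical stand-in for BuildIncludeFile: it only carries a path."""

    def __init__(self, path):
        self.path = Path(path)


to_add = [FakeIncludeFile("pkg/a.py"), FakeIncludeFile("pkg/b.py")]

# The "outer" loop variable: a Path, which does have is_dir().
f = Path("pkg/c.py")

# Python 2 evaluated `[f.path for f in to_add]` in the enclosing scope,
# which behaves roughly like this explicit loop on any Python version.
seen = []
for f in to_add:
    seen.append(f.path)

# f is no longer the Path; it is the last element of to_add.
rebound_type = type(f).__name__
has_is_dir = hasattr(f, "is_dir")
print(rebound_type, has_is_dir)
```

After the loop, calling `f.is_dir()` would raise an `AttributeError`, just like the failing test.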
A much simpler example of this phenomenon in Python 2 is the following [2]:

    >>> [x for x in range(10)]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> print(x)  # Python 3 will throw a `NameError` on the other hand.
    9
The fix for this variable scoping issue was easy. Renaming the variable in the outer scope from `f` to something more descriptive like `current_file` prevented any kind of unexpected scoping behaviour.
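A minimal sketch of the renamed loop, assuming Python 3 and its `pathlib`, might look like the following. It is not the actual Poetry code: `FakeIncludeFile` and `find_new_files` are hypothetical names, the exclusion check is left out, and building the set once before the loop is my own simplification.

```python
from pathlib import Path


class FakeIncludeFile:
    """Hypothetical stand-in for BuildIncludeFile: it only carries a path."""

    def __init__(self, path):
        self.path = Path(path)


def find_new_files(root, to_add):
    """Return relative paths of files under root that are not yet in to_add."""
    root = Path(root)
    # Distinct names for the comprehension variable and the loop variable,
    # so neither can be mistaken for (or overwrite) the other.
    known_paths = {include_file.path for include_file in to_add}
    new_files = []
    for current_file in sorted(root.glob("**/*")):
        rel_path = current_file.relative_to(root)
        if rel_path not in known_paths and not current_file.is_dir():
            new_files.append(rel_path)
    return new_files
```

With descriptive names, the code reads the same way on Python 2.7 and Python 3, regardless of how each version scopes the comprehension variable.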
Python 3 gives the list comprehension its own scope, so its variable only temporarily shadows the outer one. According to Guido, in Python 2 the list comprehension's variables "leaked" into the outer scope as "an intentional compromise to make list comprehensions blindingly fast". Guido called this Python's dirty little secret.
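The difference can be checked with a small sketch, assuming it is run on Python 3. The two function names are made up for illustration: the first uses a list comprehension and the second an ordinary for-loop, which still rebinds the name in the enclosing scope.

```python
def comprehension_keeps_outer_name():
    f = "outer"
    # On Python 3 the inner f lives in the comprehension's own scope.
    _ = [f for f in range(3)]
    return f


def for_loop_rebinds_name():
    f = "outer"
    # A plain for-loop shares the function's scope, so f is rebound.
    for f in range(3):
        pass
    return f


print(comprehension_keeps_outer_name())  # "outer" on Python 3
print(for_loop_rebinds_name())           # 2
```

On Python 2.7 the first function would return `2` as well, which is exactly the leak described above.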
It’s often said in the halls of colleges and between ping pong tables in startups that “there are only two hard things in Computer Science: cache invalidation and naming things” [5]. I never thought that the second of these two hard things, naming, would creep up on me as a variable in a list comprehension. 🤦‍♀️
[1] I built the `set` from a list comprehension because I only needed the `BuildIncludeFile.path` and not the entire object. Had I not accessed just the paths of the `BuildIncludeFile` objects, checking for membership of the file in a simple `set(to_add)` would not have worked, since they are of different types.
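The type mismatch can be illustrated with a small sketch. `FakeIncludeFile` is a hypothetical stand-in that relies on Python's default identity-based equality and hashing; the real `BuildIncludeFile` may define its own `__eq__` and `__hash__`, in which case the result could differ.

```python
from pathlib import Path


class FakeIncludeFile:
    """Hypothetical stand-in for BuildIncludeFile with default equality."""

    def __init__(self, path):
        self.path = Path(path)


to_add = [FakeIncludeFile("pkg/a.py")]
rel_path = Path("pkg/a.py")

# A Path never compares equal to a FakeIncludeFile, so this lookup fails.
in_object_set = rel_path in set(to_add)

# Comparing Path with Path works once the paths are extracted.
in_path_set = rel_path in set([item.path for item in to_add])

print(in_object_set, in_path_set)
```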