Turning Python scripts into Python programs

TLDR: Start your Python scripts by writing if __name__ == '__main__':, because it will encourage you to think about how the script is organized.

Python is a powerful tool for engineers. Duck typing and a massive standard library cut development time and significantly ease the learning curve. The low barrier to entry is both Python's most important strength and its most noticeable weakness; a lot of Python code is written without consideration for organization and maintainability. This results in a lot of useful and frequently-run code being distributed across dozens of disorganized and unmaintainable single-file scripts.

The problem lies in the tendency to write scripts, rather than programs. The distinction between the two isn't well-defined, and I'm not about to provide my own hard and fast definitions here. For the purposes of this article, let's say that a script is code whose logic is not grouped into classes and functions, whose entry point is implicitly at the start of the file, and whose execution is understood as being strictly top-down. This is the result of code being written by engineers who understand the problem domain better than anyone else, but who don't consider the long-term (or often short-term) problems of unmaintainability.

I don't want to bash scripts (ha). Like anything else, there are times when a script is the right tool for the job. The problem arises when a script is used often and its scope gradually increases, without the script's structure ever changing. Inevitably this leads to namespace pollution and towers of nested conditionals, making each change harder than the last. Even worse, scope increases as a function of dependence on the script; almost by definition, the scripts you rely on the most are the ones that are hardest to change. Before too long, the only fix is to scrap it and start again.

The robust solution is to prevent anyone from writing code until they've received 5 ECTS credits in system development. No obvious flaws in that plan (and I'm sure that nobody with a semester's CS education has ever written spaghetti code). The much less expensive 80% solution is to encourage better practice in code organization right from the outset. And with Python, there's a single line of code that nudges the programmer in the right direction.

So, what is this magic line of code? Behold:

if __name__ == '__main__':

For those who haven't seen it before, this is the Python idiom for an explicit entry point. Python sets __name__ to '__main__' when a file is run directly and to the module's name when the file is imported, so the conditional is true only when the file itself is run. If you are unlucky enough to have used Java, then you can think of it as the Python equivalent of public static void main(String[] args). While code is still interpreted top-down, the idiom explicitly states "this is where the real work is done".
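As a minimal sketch of the idiom in use (the greet function and the file name greet.py are made up for illustration, not taken from any real project):

```python
# greet.py (hypothetical file name, for illustration only)

def greet(name):
    """Return a greeting; safe to import and reuse from other modules."""
    return f"Hello, {name}!"


if __name__ == '__main__':
    # Runs only when the file is executed directly (python greet.py),
    # not when another file does `import greet`.
    print(greet("world"))
```

Running the file directly prints the greeting, while importing it from another file only makes greet() available and prints nothing.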

By providing a clear entry point for the script, the programmer is made to think about what all the code above it does. It's an obvious place for a main() function to be called, and thus an opportunity to write one. Code is moved into the new function, and already the namespace is being cleared up. Once one function has been created, it's easier to start creating more. Distinct blocks of logic are being organized; the "parse csv" logic is no longer aligned with the "sum columns" logic, but is instead under the parse_csv(filename) function. Making small changes no longer requires digging through the script line by line; you identify the relevant function and make the change there. The structure of the code is no longer a left-right top-down monolith, but instead a single productive function with the tools necessary for its execution. And now that the structure has fundamentally changed, other programming paradigms become available. Implementing classes and separating things into discrete modules suddenly becomes natural.
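Here is one way that reorganization might look. This is only a sketch: parse_csv(filename) comes from the paragraph above, while sum_columns, the command-line handling, and the assumption that the CSV contains nothing but numbers are all invented for the example.

```python
import csv
import sys


def parse_csv(filename):
    """Read a CSV file of numbers into a list of rows of floats."""
    with open(filename, newline="") as f:
        # Skip blank lines; assumes every remaining cell is numeric.
        return [[float(cell) for cell in row] for row in csv.reader(f) if row]


def sum_columns(rows):
    """Return the sum of each column across all rows."""
    return [sum(column) for column in zip(*rows)]


def main():
    # Expect exactly one argument: the path to a CSV file of numbers.
    if len(sys.argv) != 2:
        print(f"usage: {sys.argv[0]} FILE.csv")
        return
    print(sum_columns(parse_csv(sys.argv[1])))


if __name__ == '__main__':
    main()
```

Nothing here lives at module level except imports and definitions, so another file can import parse_csv or sum_columns without triggering any work.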

I'm not going to claim that if __name__ == '__main__' is a panacea for bad code (that, of course, would be Nothing). But it is the best bang for your buck that a line of Python can give you. Almost any good Python program will include it, and it's a clear place to start any project. If nothing else, it will force you to ask yourself, "What should I put here?"

The Benevolent Dictator for Life himself wrote an article titled Python main() functions, which covers how to handle command-line arguments and error codes.

Written 2020-08-17