It’s common to create small scripts which we want to combine into a larger script. We don’t want to copy and paste the code. We want to leave the working code in one file and use it in multiple places. Often we want to combine elements from multiple files to create more sophisticated scripts.
The problem we have is that when we import a script it actually starts running. This is generally not what we expect when we import a script so that we can reuse it. How can we import the functions (or classes) from a file without having the script start doing something?
Let’s say that we have a handy implementation of the haversine distance function called haversine()
, and it’s in a file named script.py
.
Initially, the file might look like this:
import csv
import pathlib
from math import radians, sin, cos, sqrt, asin
from functools import partial
MI= 3959
NM= 3440
KM= 6373
def haversine( lat_1: float, lon_1: float,
lat_2: float, lon_2: float, *, R: float ) -> float:
... and more ...
nm_haversine = partial(haversine, R=NM)
source_path = pathlib.Path("waypoints.csv")
with source_path.open() as source_file:
reader= csv.DictReader(source_file)
start = next(reader)
for point in reader:
d = nm_haversine(
float(start['lat']), float(start['lon']),
float(point['lat']), float(point['lon'])
)
print(start, point, d)
start= point
We’ve omitted the body of the haversine()
function, showing only … and more…, since the full source is available on github. We’ve focused on the context in which the function is in a Python script that also opens a file, wapypoints.csv
, and does some processing on that file.
How can we import this module without it printing a display of distances between waypoints in our waypoints.csv
file?
Python scripts can be simple to write. Indeed, it’s often too simple to create a working script. Here’s how we transform a simple script into a reusable library:
1) Identify the statements that do the work of the script: we’ll distinguish between definition and action. Statements such as import
, def
, and class
are clearly definitional—they support the work but they don’t do the work. Almost all other statements take action. In our example, we have four assignment statements that are more definition than action. The distinction is entirely one of intent. All statements, by definition, take an action. These actions, though, are more like the action of the def statement than they are like the action of the with
statement later in the script. Here are the generally definitional statements:
MI= 3959
NM= 3440
KM= 6373
def haversine( lat_1: float, lon_1: float,
lat_2: float, lon_2: float, *, R: float ) -> float:
... and more ...
nm_haversine = partial(haversine, R=NM)
The rest of the statements clearly take an action toward producing the printed results.
2) Wrap the actions into a function:
def analyze():
source_path = pathlib.Path("waypoints.csv")
with source_path.open() as source_file:
reader= csv.DictReader(source_file)
start = next(reader)
for point in reader:
d = nm_haversine(
float(start['lat']), float(start['lon']),
float(point['lat']), float(point['lon'])
)
print(start, point, d)
start= point
3) Where possible, extract literals and make them into parameters. This is often a simple movement of the literal to a parameter with a default value. From this:
def analyze():
source_path = pathlib.Path("waypoints.csv")
To this:
def analyze(source_name="waypoints.csv"):
source_path = pathlib.Path(source_name)
4) Include the following as the only high-level action statements in the script file:
if __name__ == "__main__":
analyze()
We’ve packaged the action of the script as a function. The top-level action script is now wrapped in an if statement so that it isn’t executed during import.
The most important rule for Python is that an import
of a module is essentially the same as running the module as a script. The statements in the file are executed in order from top to bottom.
When we import a file, we’re generally interested in executing the def
and class
statements. We might be interested in some assignment statements.
When Python runs a script, it sets a number of built-in special variables. One of these is __name__
. This variable has two different values, depending on the context in which the file is being executed:
The top-level script, executed from the command line: In this case, the value of the built-in special name of __name__
is set to __main__
.
A file being executed because of an import statement: In this case, the value of __name__
is the name of the module being created.
The standard name of __main__
may seem a little odd at first. Why not use the filename in all cases? This special name is assigned because a Python script can be read from one of many sources. It can be a file. Python can also be read from the stdin
pipeline, or it can be provided on the Python command line using the -c
option.
When a file is being imported, however, the value of __name__
is set to the name of the module. It will not be __main__
. In our example, the value __name__
during import processing will be script
.
We can now build useful work around a reusable library. We might make several files that look like this:
File trip_1.py
:
from script import analyze
analyze('trip_1.csv')
Or perhaps something even more complex:
File all_trips.py
:
from script import analyze
for trip in 'trip_1.csv', 'trip_2.csv':
analyze(trip)
The goal is to decompose a practical solution into two collections of features:
To get to this goal, we’ll often start with a script that conflates both sets of features. This kind of script can be viewed as a spike solution. Our spike solution should evolve towards a more refined solution as soon as we’re sure that it works.
A spike or piton is a piece of removable mountain-climbing gear that doesn’t get us any higher on the route, but it enables us to climb safely.
Help us improve this content by editing this page on GitHub