Ch. 11 Modules (Python3)
Built-In Module
A built-in module is a module that comes pre-installed with Python; examples of built-in modules include sys, time, and math.
File being executed as both a script and an import module
A file can better support being executed as both a script and an importable module by utilizing the __name__ special name. __name__ is a global string variable automatically added to every module that contains the name of the module; for example: my_funcs.__name__ would have the value "my_funcs", and google_search.__name__ would have the value "google_search". (Note that __name__ has two underscores before name, and two underscores after). However, the value of __name__ for the executing script is always set to "__main__", to differentiate the script from imported modules. The following comparison can be performed: See image.
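The comparison described above can be sketched as follows. The function names double and main are hypothetical and used only for illustration; the file is assumed to be saved as my_funcs.py.

```python
# Hypothetical contents of a file such as my_funcs.py.
def double(value):
    """Return twice the given value."""
    return 2 * value


def main():
    # Script-only behavior: should run only when the file is executed directly.
    print("double(21) =", double(21))


if __name__ == "__main__":
    # True for the executing script. When the file is imported instead,
    # __name__ holds the module's own name and main() is not called.
    main()
```

Importing the file from another script makes double() available without triggering the print statement.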
Module
A module is a file containing Python code that can be imported and used by scripts, other modules, or the interactive interpreter.
Executing modules as scripts
An import statement executes the code contained within the imported module. Thus, any statements in the global scope of a module, like printing or getting user input, will be executed when that module is imported. Execution of those statements may be an unintended side effect of the import. Commonly a programmer wants to treat a Python file as both a script executed by the interpreter, and as an importable module. When used as an importable module, the file should not produce side effects when imported. See image.
Importing a Module (2/3)
Evaluating an import statement initiates the following process to load the module: 1. A check is conducted to determine whether the module has already been imported. If already imported, then the loaded module is used. 2. If not already imported, a new module object is created and inserted in sys.modules. 3. The code in the module is executed in the new module object's namespace.
Python Standard Library (cntd.)
Examples of standard library module usage are provided below. See image.
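As a minimal sketch of standard library usage, the program below uses the math and datetime modules. The specific values and the date are arbitrary and chosen only for illustration.

```python
import datetime
import math

# math: square root and the constant pi
hypotenuse = math.sqrt(3 ** 2 + 4 ** 2)
circle_area = math.pi * 2 ** 2

# datetime: date arithmetic with an arbitrary example date
start = datetime.date(2024, 1, 31)
next_day = start + datetime.timedelta(days=1)

print(hypotenuse)
print(circle_area)
print(next_day.isoformat())
```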
Finding Modules (2/ )
For simple programs, a module might simply be placed in the same directory as the script that imports it. Larger projects might contain tens or hundreds of modules, or use third-party modules located in different directories. In such cases, a programmer might set the environment variable PYTHONPATH in the operating system. An operating system environment variable is much like a variable in a Python script, except that an environment variable is stored by the computer's operating system and can be read by programs running on the computer. In Windows, a user can set the value of PYTHONPATH permanently through the control panel, or temporarily in a single instance of a command terminal (cmd.exe) using the set command: set PYTHONPATH=c:\dir1;c:\other\directory.
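A running program can inspect PYTHONPATH itself. The sketch below assumes nothing about whether the variable is set; if it is not set, the resulting list is simply empty.

```python
import os

# Read PYTHONPATH from the operating system's environment (empty if not set).
raw_value = os.environ.get("PYTHONPATH", "")

# os.pathsep is the separator between directories:
# ";" on Windows and ":" on Linux and macOS.
extra_dirs = [entry for entry in raw_value.split(os.pathsep) if entry]

print("PYTHONPATH entries:", extra_dirs)
```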
Place import statements at top of file
Good practice is to place import statements at the top of a file. There are few useful instances of placing import statements in any location other than the top. The benefit of placing import statements at the top is that a reader of the program can quickly identify the modules required for the program to run. A module being required by another program is often called a dependency.
Finding Modules (1/ )
Importing a module begins a search to find the corresponding file on the computer's file system. The interpreter first checks for a matching built-in module. If no matching built-in module is found, then the interpreter searches the list of directories contained by sys.path, located in the sys module. Note / Rule: A programmer must be careful to not give a name to a module that is already used by a built-in module. In such cases, the interpreter would load the built-in module because built-in names are checked first.
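Both parts of the search can be inspected from a program. The sketch below uses sys.builtin_module_names, which lists the modules compiled into the interpreter, and sys.path; the exact contents of both depend on the installation.

```python
import sys

# Names of modules compiled into this interpreter (checked first on import).
is_builtin = "sys" in sys.builtin_module_names
print("sys is built in:", is_builtin)

# Directories searched, in order, when no built-in module matches.
for directory in sys.path:
    print(directory)
```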
Reloading modules doesn't affect attributes imported using "from"
Importing attributes directly using "from", and then reloading the corresponding module, will not update the imported attributes. See image. Reloading modules is typically useful in long-running programs, when restarting and initializing the entire program may be an expensive operation. A common scenario is a web server that is communicating with multiple clients on the internet. Instead of restarting the server and disconnecting all of the clients, a single module can be reloaded dynamically as the server runs.
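The behavior of names imported using "from" can be demonstrated with a throwaway module. In the sketch below, the module name demo_settings and its timeout variable are hypothetical; the module is written to a temporary directory only so that the example is self-contained.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep cached bytecode out of this demo

# Create a temporary module file and make its directory importable.
module_dir = tempfile.mkdtemp()
module_file = pathlib.Path(module_dir) / "demo_settings.py"
module_file.write_text("timeout = 5\n")
sys.path.insert(0, module_dir)

import demo_settings
from demo_settings import timeout  # copies the current value into this namespace

# Change the module's source, then reload the module.
module_file.write_text("timeout = 600\n")
importlib.invalidate_caches()
importlib.reload(demo_settings)

print(demo_settings.timeout)  # value from the reloaded module
print(timeout)                # name imported with "from" is unchanged
```

To pick up the new value, the "from" import statement would have to be executed again after the reload.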
Packages
Instead of importing a single module at a time, an entire directory of modules can be imported all at once. A package is a directory that, when imported, gives access to all of the modules stored in the directory. Large projects are often organized using packages to group related modules. See image.
Python Standard Library
Python includes by default a collection of modules that can be imported into new programs. The Python standard library includes various utilities and tools for performing common program behaviors. For example, the math module provides mathematical functions, the datetime module provides date and calendar capabilities, the random module can produce random numbers, the sqlite3 module can be used to work with SQLite databases, and so on. Before starting any new project, good practice is to research what is available in the standard library, or on the internet, to help complete the task. See image.
A few standard Library Modules
See image.
Using the random module
See image.
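A minimal sketch of the random module follows. The seed value and the list of colors are arbitrary choices for illustration; seeding makes repeated runs produce the same sequence.

```python
import random

random.seed(42)  # fixed seed so that repeated runs give the same sequence

die_roll = random.randint(1, 6)                 # integer from 1 to 6, inclusive
fraction = random.random()                      # float in the range [0.0, 1.0)
color = random.choice(["red", "green", "blue"]) # one element of the list

print(die_roll, fraction, color)
```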
Importing Packages (cntd.)
See image. Using packages helps to avoid module name collisions. For example, consider if another package called 3DGraphics also contained a module called canvas.py. Though both modules share a name, they are differentiated by the package that contains them, i.e., ASCIIArt.canvas is different from 3DGraphics.canvas. A programmer should take care when using the from technique of importing. A common error is to overwrite an imported module with another package's identically named module.
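The overwriting error can be reproduced with two standard library names that happen to collide. The sketch below uses os.path and sys.path rather than the packages mentioned above, purely because both are always available.

```python
from os import path    # path refers to the os.path module
os_path = path         # keep a reference for comparison

from sys import path   # rebinds the name path to the sys.path list

print(type(os_path).__name__)
print(type(path).__name__)

# Importing the containing modules instead keeps the two names distinct.
import os
import sys

print(os.path is os_path)
print(sys.path is path)
```

After the second "from" import, the name path no longer refers to os.path, which is the kind of silent replacement described above.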
Reloading modules and reload() function
Sometimes a Python program imports a module, but then the source code of the imported module needs to be changed. Since modules are executed only once when imported, changing the module's source does not immediately affect the running Python instance. Instead of restarting the entire Python program, the reload() function can be used to reload and re-execute the changed module. In Python 3, the reload function is located in the importlib standard library module (the older imp module is deprecated and was removed in Python 3.12). See image.
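A minimal sketch of reloading follows, using importlib.reload(), which is where reload() lives in Python 3. The module name demo_greeting and its greet function are hypothetical, and the module is created in a temporary directory so the example is self-contained.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep cached bytecode out of this demo

# Write the first version of the module and import it.
module_dir = tempfile.mkdtemp()
module_file = pathlib.Path(module_dir) / "demo_greeting.py"
module_file.write_text("def greet():\n    return 'hello'\n")
sys.path.insert(0, module_dir)

import demo_greeting
first = demo_greeting.greet()

# Edit the source while the program is still running, then reload.
module_file.write_text("def greet():\n    return 'hello again'\n")
importlib.invalidate_caches()
importlib.reload(demo_greeting)
second = demo_greeting.greet()

print(first, "->", second)
```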
File being executed as both a script and an import module (cntd.)
The contents of the branch typically include a user interface to functions or class definitions within the file. A user can execute the file as a script and interact with the user interface, or another script can import the file just to use the definitions. The google_search.py file is modified below to fix the unintentional search. See image. The google_search.py file has been modified to compare __name__ to "__main__". When the file is executed as a script, a single search request is made and the results are displayed. Executing domain_freq.py imports google_search, which now does not perform the initial search because __name__ is equal to "google_search".
Script
The interactive Python interpreter provides the most basic way to execute Python code. However, all of the defined variables, functions, classes, etc., are lost when a programmer closes the interpreter. Thus, a programmer will typically write Python code in a file, and then pass that file as input to the interpreter. Such a file is called a script.
hashlib module and secure hash
The program below imports names from the hashlib module, a Python standard library module that contains a number of algorithms for creating a secure hash of a text message. A secure hash is a fixed-size value computed from a message, designed so that finding a different message with the same hash is computationally infeasible. A sender of an email might create and send a secure hash along with the contents of the message. The email's recipient creates their own secure hash from the message contents and compares it to the received hash to detect whether any part of the message was changed. See image.
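The sender and recipient scenario can be sketched as below. The choice of SHA-256 is an assumption made for this example (hashlib offers several algorithms), and the helper name secure_hash and the message text are hypothetical.

```python
import hashlib


def secure_hash(message):
    """Return the SHA-256 hash of a text message as a hexadecimal string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# Sender: hash the message and send both the message and the hash.
original = "Meet at noon."
sent_hash = secure_hash(original)

# Recipient: recompute the hash from the received text and compare.
received_intact = "Meet at noon."
received_altered = "Meet at nine."

intact_matches = secure_hash(received_intact) == sent_hash
altered_matches = secure_hash(received_altered) == sent_hash

print(intact_matches)
print(altered_matches)
```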
sys.path contains:
The sys.path variable initially contains the following directories: 1. The directory of the executing script. 2. A list of directories specified by the environment variable PYTHONPATH. 3. The directory where Python is installed (on Windows, a directory such as C:\Python27 for Python 2.7; the exact location depends on the Python version and installer).
Module rules
A module's filename should end with ".py"; otherwise, the interpreter will not be able to import the module. The module_name item should match the filename of the module, but without the .py extension. For example, if a programmer wants to import a module whose filename is HTTPServer.py, the import statement import HTTPServer would be used. Note that import statements are case-sensitive; thus, import ABC is distinct from import aBc. The interpreter must also be able to find the module to import. The simplest solution is to keep modules in the same directory as the executing script; however, the interpreter can also search the computer's file system for the modules. Later material covers these search mechanisms.
Normal import statement vs. "from" import statement
A normal import statement, such as import HTTPServer, adds the new module into the global namespace, after which a programmer can access names through attribute reference operations (e.g., HTTPServer.address). In contrast, using "from" adds only the specified names. A statement such as from HTTPServer import address copies only the address variable from HTTPServer into the importing module's namespace. The following animation illustrates. See image.
Import specific names from a module
A programmer can specify names to import from a module by using the from keyword in an import statement: See image.
Modules help with Scripts
A programmer may find themselves writing the same function over and over again in multiple scripts, or creating very long and difficult-to-maintain scripts. A solution is to use a module, which is a file containing Python code that can be imported and used by scripts, other modules, or the interactive interpreter.
Importing a Module (1/3)
To import a module means to execute the code contained by the module, and make the definitions within that module available for use by the importing program.
Importing Packages
To import a package, a programmer writes an import statement and gives the name of the directory where the package is located. To indicate that a directory is a regular package, the directory includes a file called __init__.py (since Python 3.3, a directory without __init__.py can still be imported as a namespace package). The __init__.py file often is empty, but may contain code to initialize the package when imported. The interpreter searches for the package in the directories listed in sys.path. See image.
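A package layout can be sketched by building one in a temporary directory. The package name demo_art and its canvas module with a blank function are hypothetical and exist only for this example.

```python
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep cached bytecode out of this demo

# Build the layout: <root>/demo_art/__init__.py and <root>/demo_art/canvas.py
root = pathlib.Path(tempfile.mkdtemp())
package_dir = root / "demo_art"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")  # empty file marks a regular package
(package_dir / "canvas.py").write_text(
    "def blank(width):\n    return ' ' * width\n"
)

# The directory containing the package must be on sys.path.
sys.path.insert(0, str(root))

import demo_art.canvas

line = demo_art.canvas.blank(4)
print(repr(line))
```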
Normal import statement vs. "from" import statement (cntd.)
Using "from" changes how an imported name is used in a program. See image.
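The difference in usage can be sketched with the math module, chosen here only because it is always available.

```python
import math            # adds the name math to this namespace
from math import sqrt  # adds only the name sqrt to this namespace

via_module = math.sqrt(16)  # normal import: access through the module name
direct = sqrt(16)           # "from" import: use the name directly

print(via_module, direct)
```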
Importing a Module (3/3)
When importing a module, the interpreter first checks to see if that module has already been imported. A dictionary of the loaded modules is stored in sys.modules (available from the sys standard library module). If the module has already been loaded, then the existing module object is used. If the module is not found in sys.modules, then a new module object is created and added to sys.modules, and the statements within the module's code are executed. A module object is simply a namespace that contains definitions from the module. Definitions in the module's code, e.g., variable assignments and function definitions, are placed in the module's namespace. The module's name is then added to the importing script or module's namespace, so that the importer can access the definitions.
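The role of sys.modules can be observed directly. The sketch below uses the json standard library module as an arbitrary example; any importable module would behave the same way.

```python
import sys

import json  # first import: module object is created (if needed) and cached

cached = sys.modules["json"]  # the loaded module object, stored under its name

import json as json_again     # second import: the cached object is reused

same_object = json_again is cached
print("json" in sys.modules)
print(same_object)
```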