Building reusable libraries - packages and namespaces


The previous lesson showed how the source command can be used to separate a program into multiple files, each responsible for a different area of functionality. This is a simple and useful technique for achieving modularity. However, there are a number of drawbacks to using the source command directly. Tcl provides a more powerful mechanism for handling reusable units of code called packages. A package is simply a bundle of files implementing some functionality, along with a name that identifies the package, and a version number that allows multiple versions of the same package to be present. A package can be a collection of Tcl scripts, or a binary library, or a combination of both. Binary libraries are not discussed in this tutorial.

Using packages

The package command lets you load a package, compare package versions, and register your own packages with an interpreter. A package is loaded by using the package require command and providing the package name and, optionally, a version number. The first time a script requires a package, Tcl builds up a database of available packages and versions. It does this by searching for package index files in all of the directories listed in the tcl_pkgPath and auto_path global variables, as well as any subdirectories of those directories. Each package provides a file called pkgIndex.tcl that tells Tcl the names and versions of any packages in that directory, and how to load them if they are needed.

It is good style to start every script you create with a set of package require statements to load any packages required. This serves two purposes: it ensures that any missing requirements are identified as soon as possible, and it clearly documents the dependencies that your code has. Tcl and Tk are both made available as packages, and it is a good idea to explicitly require them in your scripts even if they are already loaded, as this makes your scripts more portable and documents their version requirements.
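As a minimal sketch of this style, the script below declares its requirements first and traps a failed requirement with catch. The package name no_such_package_xyz is hypothetical and is assumed not to be installed; the 8.5- form of the version requirement is used here only so that the sketch also runs on later major versions of Tcl.

```tcl
# Declare dependencies up front. package require returns the version
# that was actually loaded. The trailing dash in "8.5-" means
# "8.5 or any later version, including later major versions".
set tclVersion [package require Tcl 8.5-]
puts "Running on Tcl $tclVersion"

# A missing package raises an error, which can be trapped with catch.
# "no_such_package_xyz" is a hypothetical name used for illustration.
if {[catch {package require no_such_package_xyz} message]} {
    puts "Missing dependency: $message"
}
```

Failing early like this means a missing dependency is reported before the script has done any real work.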

Creating a package

There are three steps involved in creating a package:

1. Adding a package provide statement to your script.
2. Creating a pkgIndex.tcl file.
3. Installing the package where it can be found by Tcl.

The first step is to add a package provide statement to your script. It is good style to place this statement at the top of your script. The package provide command tells Tcl the name of your package and the version being provided.

The next step is to create a pkgIndex.tcl file. This file tells Tcl how to load your package. In essence, the index file is simply a Tcl file which is loaded into the interpreter when Tcl searches for packages. It should use the package ifneeded command to register a script which will load the package when it is required. The pkgIndex.tcl file is evaluated globally in the interpreter when Tcl first searches for any package. For this reason it is very bad style for an index script to do anything other than tell Tcl how to load a package; index scripts should not define procs, require packages, or perform any other action which may affect the state of the interpreter.
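A hand-written index file typically contains a single package ifneeded line. The sketch below assumes a package named tutstack, version 1.0, implemented in a file called tutstack.tcl (the file name is an assumption made for illustration). Tcl normally sets the variable dir to the directory containing pkgIndex.tcl before evaluating it; the sketch sets dir itself only so that it can be run as a standalone script.

```tcl
# In a real pkgIndex.tcl this line is omitted: Tcl sets "dir" for you.
set dir [file normalize .]

# Register a load script for tutstack 1.0. Nothing is sourced yet;
# the script runs only when something calls "package require tutstack".
package ifneeded tutstack 1.0 [list source [file join $dir tutstack.tcl]]

# With no script argument, package ifneeded returns the registered script.
puts [package ifneeded tutstack 1.0]
```

Building the load script with list keeps the path intact as a single argument even if the directory name contains spaces.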

The simplest way to create a pkgIndex.tcl script is to use the pkg_mkIndex command. The pkg_mkIndex command scans files which match a given pattern in a directory looking for package provide commands. From this information it generates an appropriate pkgIndex.tcl file in the directory.

Once a package index has been created, the next step is to move the package to somewhere that Tcl can find it. The tcl_pkgPath and auto_path global variables contain a list of directories that Tcl searches for packages. The package index and all the files that implement the package should be installed into a subdirectory of one of these directories. Alternatively, the auto_path variable can be extended at run-time to tell Tcl of new places to look for packages.
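The three steps can be sketched end to end in one script. Everything named here is hypothetical: the package hello, its file hello.tcl, and the scratch directory tutdemo_lib created under the current working directory (which is assumed to be writable). The directory is deleted again at the end.

```tcl
# Scratch layout: tutdemo_lib/hello/hello.tcl (hypothetical names).
set base   [file join [pwd] tutdemo_lib]
set pkgDir [file join $base hello]
file mkdir $pkgDir

# Step 1: a script that contains a package provide statement.
set fh [open [file join $pkgDir hello.tcl] w]
puts $fh {package provide hello 1.0}
puts $fh {namespace eval ::hello { proc greet {} { return "hello, world" } }}
close $fh

# Step 2: generate pkgIndex.tcl in the package directory.
pkg_mkIndex $pkgDir *.tcl

# Step 3: make the parent directory searchable, then load the package.
# Tcl also searches the immediate subdirectories of each auto_path entry.
lappend auto_path $base
puts [package require hello]
puts [hello::greet]

# Clean up the scratch directory.
file delete -force $base
```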

package require ?-exact? name ?version?: Loads the package identified by name. If the -exact switch is given along with a version number, then only that exact package version will be accepted. If a version number is given without the -exact switch, then any version equal to or greater than that version (but with the same major version number) will be accepted. If no version is specified, then any version will be loaded. If a matching package can be found, then it is loaded and the command returns the actual version number; otherwise it generates an error.
package provide name ?version?: If a version is given this command tells Tcl that this version of the package indicated by name is loaded. If a different version of the same package has already been loaded then an error is generated. If the version argument is omitted, then the command returns the version number that is currently loaded, or the empty string if the package has not been loaded.
pkg_mkIndex ?-direct? ?-lazy? ?-load pkgPat? ?-verbose? dir ?pattern pattern ...?: Creates a pkgIndex.tcl file for a package or set of packages. The command works by loading the files matching the patterns in the directory dir and seeing what new packages and commands appear. The command is able to handle both Tcl script files and binary libraries (not discussed here).
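The version rules above can be explored without installing anything, using the package vcompare and package vsatisfies subcommands (these are part of the package command but are not listed above). The package name demo is hypothetical.

```tcl
# Versions are compared number by number, not as strings or decimals.
puts [package vcompare 1.2 1.10]    ;# -1: 1.2 is older than 1.10

# vsatisfies applies the same rule as package require without -exact:
# equal or greater, but with the same major version number.
puts [package vsatisfies 1.4 1.2]   ;# 1
puts [package vsatisfies 2.0 1.2]   ;# 0: different major version

# package provide with a version registers it; without one it queries.
package provide demo 1.3
puts [package provide demo]         ;# 1.3
```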

Namespaces

One problem that can occur when using packages, and particularly when using code written by others, is that of name collision. This happens when two pieces of code try to define a procedure or variable with the same name. In Tcl, when this occurs, the old procedure or variable is simply overwritten. This is sometimes a useful feature, but more often it is the cause of bugs if the two definitions are not compatible. To solve this problem, Tcl provides a namespace command to allow commands and variables to be partitioned into separate areas, called namespaces. Each namespace can contain commands and variables which are local to that namespace and cannot be overwritten by commands or variables in other namespaces.

When a command in a namespace is invoked, it can see all the other commands and variables in its namespace, as well as those in the global namespace. Namespaces can also contain other namespaces. This allows a hierarchy of namespaces to be created in a similar way to a file system hierarchy, or the Tk widget hierarchy. Each namespace itself has a name which is visible in its parent namespace.

Items in a namespace can be accessed by creating a path to the item. This is done by joining the names of the items with ::. For instance, to access the variable bar in the namespace foo, you could use the path foo::bar. This kind of path is called a relative path because Tcl will try to follow the path relative to the current namespace. If that fails, and the path represents a command, then Tcl will also look relative to the global namespace. You can make a path fully-qualified by describing its exact position in the hierarchy from the global namespace, which is named ::. For instance, if our foo namespace was a child of the global namespace, then the fully-qualified name of bar would be ::foo::bar. It is usually a good idea to use fully-qualified names when referring to any item outside of the current namespace to avoid surprises.
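The difference between relative and fully-qualified paths can be sketched as follows. The namespace names foo and other, the variable bar, and the proc show are illustrative only.

```tcl
namespace eval ::foo {
    variable bar "value in foo"
    proc show {} {
        variable bar
        return $bar
    }
}

# Fully-qualified paths work the same from any namespace.
puts $::foo::bar
puts [::foo::show]

# A relative path is followed from the current namespace, which at
# this point in the script is the global namespace.
puts $foo::bar

namespace eval ::other {
    # There is no ::other::foo::show, so for this command Tcl falls
    # back to looking relative to the global namespace.
    puts [foo::show]
}
```

If ::other later gained a child namespace called foo with its own show command, the relative call would silently switch to it, which is one reason to prefer fully-qualified names for anything outside the current namespace.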

A namespace can export some or all of the command names it contains. These commands can then be imported into another namespace. This in effect creates a local command in the new namespace which when invoked calls the original command in the original namespace. This is a useful technique for creating short-cuts to frequently used commands from other namespaces. In general, a namespace should be careful about exporting commands with the same name as any built-in Tcl command or with a commonly used name.
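A small sketch of exporting and importing, using hypothetical namespaces mathutil and app. Only the exported command becomes available under its short name in the importing namespace.

```tcl
namespace eval ::mathutil {
    # Only double is exported; helper stays private to this namespace.
    namespace export double

    proc double {x} { return [expr {$x * 2}] }
    proc helper {x} { return $x }
}

namespace eval ::app {
    # Import every exported command from ::mathutil.
    namespace import ::mathutil::*

    puts [double 21]   ;# 42
}

# helper was not exported, so no ::app::helper command was created.
puts [llength [info commands ::app::helper]]   ;# 0
```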

Some of the most important commands to use when dealing with namespaces are:

namespace eval path script: This command evaluates the script in the namespace specified by path. If the namespace doesn't exist then it is created. The namespace becomes the current namespace while the script is executing, and any unqualified names will be resolved relative to that namespace. Returns the result of the last command in script.
namespace delete ?namespace namespace ...?: Deletes each namespace specified, along with all variables, commands and child namespaces it contains.
namespace current: Returns the fully qualified path of the current namespace.
namespace export ?-clear? ?pattern pattern ...?: Adds any commands matching one of the patterns to the list of commands exported by the current namespace. If the -clear switch is given then the export list is cleared before adding any new commands. If no arguments are given, returns the currently exported command names. Each pattern is a glob-style pattern such as *, [a-z]*, or *foo*.
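namespace import ?-force? ?pattern pattern ...?: Imports all exported commands matching any of the patterns into the current namespace. Each pattern is a glob-style pattern such as foo::*, or foo::bar. If an imported command would conflict with an existing command, an error is generated unless the -force switch is given, in which case the existing command is replaced.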
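The commands listed above can be tried out on a throwaway namespace. The name scratch is hypothetical, and namespace exists (not listed above) is used only to confirm the deletion.

```tcl
# namespace eval creates ::scratch because it does not exist yet.
namespace eval ::scratch {
    namespace export hello
    proc hello {} { return "hello from [namespace current]" }
}

# Inside namespace eval, the current namespace is ::scratch.
puts [namespace eval ::scratch { namespace current }]   ;# ::scratch
puts [::scratch::hello]                                 ;# hello from ::scratch

# With no arguments, namespace export reports the export list.
puts [namespace eval ::scratch { namespace export }]    ;# hello

# Deleting the namespace removes its commands and variables too.
namespace delete ::scratch
puts [namespace exists ::scratch]                       ;# 0
```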

Using namespace with packages

William Duquette has an excellent guide to using namespaces and packages at http://www.wjduquette.com/tcl/namespaces.html. In general, a package should provide a namespace as a child of the global namespace and put all of its commands and variables inside that namespace. A package shouldn't put commands or variables into the global namespace by default. It is also good style to give your package and the namespace it provides the same name, to avoid confusion.

Example

This example creates a package which provides a stack data structure.

# Register the package
package provide tutstack 1.0
package require Tcl      8.5

# Create the namespace
namespace eval ::tutstack {
    # Export commands
    namespace export create destroy push pop peek empty

    # Set up state
    variable stack
    variable id 0
}

# Create a new stack
proc ::tutstack::create {} {
    variable stack
    variable id

    set token "stack[incr id]"
    set stack($token) [list]
    return $token
}

# Destroy a stack
proc ::tutstack::destroy {token} {
    variable stack

    unset stack($token)
}

# Push an element onto a stack
proc ::tutstack::push {token elem} {
    variable stack

    lappend stack($token) $elem
}

# Check if stack is empty
proc ::tutstack::empty {token} {
    variable stack

    set num [llength $stack($token)]
    return [expr {$num == 0}]
}

# See what is on top of the stack without removing it
proc ::tutstack::peek {token} {
    variable stack

    if {[empty $token]} {
        error "stack empty"
    }

    return [lindex $stack($token) end]
}

# Remove an element from the top of the stack
proc ::tutstack::pop {token} {
    variable stack

    set ret [peek $token]
    set stack($token) [lrange $stack($token) 0 end-1]
    return $ret
}

And some code which uses it:

package require tutstack 1.0

set stack [tutstack::create]
foreach num {1 2 3 4 5} { tutstack::push $stack $num }

while { ![tutstack::empty $stack] } {
    puts "[tutstack::pop $stack]"
}

tutstack::destroy $stack

Ensembles

A common way of structuring related commands is to group them together into a single command with sub-commands. This type of command is called an ensemble command, and there are many examples in the Tcl standard library. For instance, the string command is an ensemble whose sub-commands are length, index, match etc. Tcl 8.5 introduced a handy way of converting a namespace into an ensemble with the namespace ensemble command. This command is very flexible, with many options to specify exactly how sub-commands are mapped to commands within the namespace. The most basic usage is very straightforward, however, and simply creates an ensemble command with the same name as the namespace and with all exported procedures registered as sub-commands. To illustrate this, we will convert our stack data structure into an ensemble:

package require tutstack 1.0
package require Tcl      8.5

namespace eval ::tutstack {
    # Create the ensemble command
    namespace ensemble create
}

# Now we can use our stack through the ensemble command
set stack [tutstack create]
foreach num {1 2 3 4 5} { tutstack push $stack $num }

while { ![tutstack empty $stack] } {
    puts "[tutstack pop $stack]"
}

tutstack destroy $stack

As well as providing a nicer syntax for accessing functionality in a namespace, ensemble commands also help to clearly distinguish the public interface of a package from the private implementation details, as only exported commands will be registered as sub-commands and the ensemble will enforce this distinction. Readers who are familiar with object-oriented programming (OOP) will realise that the namespace and ensemble mechanisms provide many of the same encapsulation advantages. Indeed, many OO extensions for Tcl build on top of the powerful namespace mechanism.
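The separation between public and private commands can be sketched with a hypothetical counter namespace (Tcl 8.5 or later is assumed). Only the exported proc becomes a sub-command of the ensemble; the other remains reachable only by its qualified name.

```tcl
namespace eval ::counter {
    variable n 0

    # bump is the public interface; reset is an implementation detail.
    namespace export bump
    proc bump {}  { variable n; return [incr n] }
    proc reset {} { variable n; set n 0 }

    namespace ensemble create
}

puts [counter bump]                    ;# 1

# reset was not exported, so the ensemble rejects it as a sub-command.
puts [catch {counter reset} message]   ;# 1
puts $message

# The proc still exists and can be called through its qualified name.
::counter::reset
puts [counter bump]                    ;# 1 again, after the reset
```

The ensemble therefore restricts what is reachable through the counter command itself; it does not hide the underlying procs from code that uses their fully-qualified names.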
