Online Book Reader

Home Category

Beautiful Code [233]

By Root 5082 0
write a for loop to process all the elements of a one-dimensional array, or two nested for loops to process all the elements of a two-dimensional array. Indeed, if you know ahead of time how many dimensions the array consists of, you can use the right number of for loops to directly loop over all of the elements of the array. But how do you write a general forloop that will process all of the elements of an N-dimensional array when N can be an arbitrary integer?

There are two basic solutions to this problem. One solution is to use recursion by thinking about the problem in terms of a recursive case and a base case. Thus, if copy_ND(a, b, N) is a function that copies an N-dimensional array pointed to by b to another N-dimensional array pointed to by a, a simple recursive implementation might look like:

if (N==0)

copy memory from b to a

return

set up ptr_to_a and ptr_to_b

for n=0 to size of first dimension of a and b

copy_ND(ptr_to_a, ptr_to_b, N-1)

add stride_a[0] to ptr_to_a and stride_b[0] to ptr_b

Notice the use of the single for loop and the check for the base case that stops the recursion when N reaches 0.

It is not always easy to think about how to write every algorithm as a recursive algorithm, even though the code just shown can often be used as a starting model. Recursion also requires the use of a function call at each iteration of the loop. So it can be all too easy for recursion to create slow code, unless some base-case optimization is performed (such as stopping when N==1 and doing the memory copy in a for loop locally).

Most languages will not do that kind of optimization automatically, so an elegant-looking recursive solution might end up looking much more contrived by the time optimizations are added.

In addition, many algorithms require the storage of intermediate information that will be used during later recursions. For example, what if the maximum or minimum value in the array must be tracked? Typically, such values become part of the recursive call structure and are passed along as arguments in the recursive call. In the end, each algorithm that uses recursion must be written in a slightly different way. Thus, it is hard to provide the programmer with additional simplifying tools for recursive solutions.

Instead of using recursion, NumPy uses iteration to accomplish most of its N-dimensional algorithms. Every recursive solution can be written using an iterative solution. Iterators are an abstraction that simplifies thinking about these algorithms. Therefore, using iterators, N-dimensional routines can be developed that run quickly, while the code can still be read and understood using a single looping structure.

An iterator is an abstract concept that encapsulates the idea of walking through all of the elements of an array with just one loop. In Python itself, iterators are objects that can be used as the predicate of any for loop. For example:

for x in iterobj:

process(x)

will run the function process on all of the elements of iterobj. The most important requirement of iterobj is that it has some way to get its "next" element. Thus, the concept of an iterator refocuses the problem of looping over all the elements of a data structure to one of finding the next element.

In order to understand how iterators are implemented and used in NumPy, it is crucial to have at least some conception of how NumPy views the memory in an N-dimensional array. The next section should clarify this point.

Multidimensional Iterators in NumPy > Memory Models for an N-Dimensional Array

19.2. Memory Models for an N-Dimensional Array

The simplest model for an N-dimensional array in computer memory can be used whenever all of the elements of the array are sitting next to each other in a contiguous segment. Under such circumstances, getting to the next element of the array is as simple as adding a fixed constant to a pointer to the memory location of the current data pointer. As a result, an iterator for contiguous memory arrays requires just adding a fixed constant to the current

Return Main Page Previous Page Next Page

®Online Book Reader