Online Book Reader

Home Category

Beautiful Code [239]

By Root 5365 0
this check should be made only once outside of the loop, and then a single approach to finding the next value should be used inside the loop. However, this kind of interface is a bit messy to implement in C. It would require two different macros similar to ITER_NEXT and two different while loops. As a result, nothing to this effect was implemented in NumPy at the time of the writing of this chapter. People wishing to get the small speed gain available for contiguous cases are assumed to be knowledgeable enough to write the simple loop themselves (bypassing the iterator entirely).

Multidimensional Iterators in NumPy > Iterator Use

19.6. Iterator Use

A good abstraction proves its worth when it makes coding simpler under diverse conditions, or when it ends up being useful in ways that were originally unintended. Both of these affirmations of value are definitely true of the NumPy iterator object. With only slight modifications, the original simple NumPy iterator has become a workhorse for implementing other NumPy features, such as iteration over all but one dimension and iteration over multiple arrays at the same time. In addition, when we had to quickly add

some enhancements to the code for generating random numbers and for broadcast-based copying, the existence of the iterator and its extensions made implementation much easier.

19.6.1. Iteration Over All But One Dimension

A common motif in NumPy is to gain speed by concentrating optimizations on the loop over a single dimension where simple striding can be assumed. Then an iteration strategy that iterates over all but the last dimension is used. This was the approach introduced by NumPy's predecessor, Numeric, to implement the math functionality.

In NumPy, a slight modification to the NumPy iterator makes it possible to use this basic strategy in any code. The modified iterator is returned from the constructor as follows:

it = PyArray_IterAllButAxis(array, &dim).

The PyArray_IterAllButAxis function takes a NumPy array and the address of an integer representing the dimension to remove from the iteration. The integer is passed by reference (the & operator) because if the dimension is specified as -1, the function determines which dimension to remove from iteration and places the number of that dimension in the argument. When the input dimension is -1, the routine chooses the dimension with the smallest nonzero stride.

Another choice for the dimension to remove might have been the dimension with the largest number of elements. That choice would minimize the number of outer loop iterations and reserve the most elements for the presumably fast inner loop. The problem with that choice is that getting information in and out of memory is often the slowest part of an algorithm on general-purpose processors.

As a result, the choice made by NumPy is to make sure that the inner loop is proceeding with data that is as close together as possible. Such data is more likely to be accessed more quickly during the speed-critical inner loop.

The iterator is modified by:

Dividing the iteration size by the length of the dimension being removed.

Setting the number of elements in the selected dimension to 1 (so the array storing one less than the total number is set to 0): dims_m1[i]=0.

Setting the backstrides entry for that dimension to 0 so that the continual rewrapping of the counter in the given dimension back to 0 will never alter the data pointer.

Resetting the contiguous flag to 0 because processing will not be contiguous in memory (each iteration has to skip an entire dimension of the array).

The altered iterator is returned by the function. It can now be used everywhere an iterator was previously used. Each time through the loop, the iterator will point to the first element of the selected dimension of the array.

19.6.2. Multiple Iterations

Another common task in NumPy is to iterate over several arrays in concert. For example, the implementation of array addition requires iterating over both arrays using a connected iterator so that the output array

Return Main Page Previous Page Next Page

®Online Book Reader