Array API

The test of a first-rate intelligence is the ability to hold two
opposed ideas in the mind at the same time, and still retain the
ability to function.
F. Scott Fitzgerald
For a successful technology, reality must take precedence over public
relations, for Nature cannot be fooled.
Richard P. Feynman

Array structure and data access

These macros all access the :ctype:`PyArrayObject` structure members. The input argument, arr, can be any :ctype:`PyObject *` that is directly interpretable as a :ctype:`PyArrayObject *` (any instance of the :cdata:`PyArray_Type` and its sub-types).

Data access

These functions and macros provide easy access to elements of the ndarray from C. These work for all arrays. You may need to take care when accessing the data in the array, however, if it is not in machine byte-order, misaligned, or not writeable. In other words, be sure to respect the state of the flags unless you know what you are doing, or have previously guaranteed an array that is writeable, aligned, and in machine byte-order using :cfunc:`PyArray_FromAny`. If you wish to handle all types of arrays, the copyswap function for each type is useful for handling misbehaved arrays. Some platforms (e.g. Solaris) do not like misaligned data and will crash if you de-reference a misaligned pointer. Other platforms (e.g. x86 Linux) will just work more slowly with misaligned data.
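As a rough illustration of why byte-wise copying is the safe way to touch misaligned or byte-swapped data, here is a minimal sketch in plain C. It does not use NumPy's copyswap function; `demo_copyswap` and `demo_read_u32` are hypothetical names invented for this example, and the sketch only mimics the general idea of copying an element while optionally reversing its byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of NumPy): copy `size` bytes from src to
 * dst, reversing the byte order when `swap` is nonzero. Only byte-wise
 * access is used, so neither pointer needs to be aligned. */
void demo_copyswap(void *dst, const void *src, size_t size, int swap)
{
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dst;
    size_t i;

    if (!swap) {
        memcpy(d, s, size);
        return;
    }
    for (i = 0; i < size; i++) {
        d[i] = s[size - 1 - i];
    }
}

/* Hypothetical helper: read a 32-bit unsigned integer stored at an
 * arbitrary (possibly odd) address without de-referencing a misaligned
 * pointer. */
uint32_t demo_read_u32(const void *src, int swap)
{
    uint32_t value;
    demo_copyswap(&value, src, sizeof(value), swap);
    return value;
}
```

Casting the address to a `uint32_t *` and de-referencing it directly would be the pattern that can crash on strict-alignment platforms; copying into a properly aligned local variable first avoids that.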

Creating arrays

From scratch

Warning

If data is passed to :cfunc:`PyArray_NewFromDescr` or :cfunc:`PyArray_New`, this memory must not be deallocated until the new array is deleted. If this data came from another Python object, this can be accomplished using :cfunc:`Py_INCREF` on that object and setting the base member of the new array to point to that object. If strides are passed in they must be consistent with the dimensions, the itemsize, and the data of the array.
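To make "consistent with the dimensions and the itemsize" concrete, the following sketch computes the byte strides of a contiguous array in both C and Fortran order. It is plain C with no NumPy dependency: `ptrdiff_t` stands in for NumPy's index type, and the function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: byte strides for a C-order (row-major) array.
 * The last axis varies fastest, so its stride equals the itemsize. */
void demo_c_strides(int ndim, const ptrdiff_t *dims,
                    ptrdiff_t itemsize, ptrdiff_t *strides)
{
    ptrdiff_t step = itemsize;
    int i;

    for (i = ndim - 1; i >= 0; i--) {
        strides[i] = step;
        step *= dims[i];
    }
}

/* Hypothetical helper: byte strides for a Fortran-order (column-major)
 * array. The first axis varies fastest. */
void demo_f_strides(int ndim, const ptrdiff_t *dims,
                    ptrdiff_t itemsize, ptrdiff_t *strides)
{
    ptrdiff_t step = itemsize;
    int i;

    for (i = 0; i < ndim; i++) {
        strides[i] = step;
        step *= dims[i];
    }
}
```

For a 2 x 3 x 4 array of 8-byte items this gives strides of (96, 32, 8) in C order and (8, 16, 48) in Fortran order.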

From other objects

Dealing with types

General check of Python Type

Data-type checking

For the typenum macros, the argument is an integer representing an enumerated array data type. For the array type checking macros the argument must be a :ctype:`PyObject *` that can be directly interpreted as a :ctype:`PyArrayObject *`.

Converting data types

New data types

Special functions for NPY_OBJECT

Array flags

The flags attribute of the PyArrayObject structure contains important information about the memory used by the array (pointed to by the data member). This flag information must be kept accurate, or strange results and even segfaults may result.

There are 6 (binary) flags that describe the memory area used by the data buffer. These constants are defined in arrayobject.h and determine the bit-position of the flag. Python exposes a nice attribute-based interface as well as a dictionary-like interface for getting (and, if appropriate, setting) these flags.

Memory areas of all kinds can be pointed to by an ndarray, necessitating these flags. If you get an arbitrary PyArrayObject in C-code, you need to be aware of the flags that are set. If you need to guarantee a certain kind of array (like :cdata:`NPY_ARRAY_C_CONTIGUOUS` and :cdata:`NPY_ARRAY_BEHAVED`), then pass these requirements into the PyArray_FromAny function.
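Because each flag occupies its own bit, testing for a combination of requirements is a single mask comparison. The sketch below shows the pattern in plain C. The `DEMO_*` constants and their bit positions are made up for illustration; real code should use the constants from arrayobject.h and the flag-checking macros rather than these stand-ins.

```c
/* Hypothetical bit positions, for illustration only. */
enum {
    DEMO_C_CONTIGUOUS = 1 << 0,
    DEMO_ALIGNED      = 1 << 1,
    DEMO_WRITEABLE    = 1 << 2
};

/* A combined requirement: aligned and writeable. */
#define DEMO_BEHAVED (DEMO_ALIGNED | DEMO_WRITEABLE)

/* True only if every bit in `required` is set in `flags`. */
int demo_has_flags(int flags, int required)
{
    return (flags & required) == required;
}

/* Return `flags` with the bits in `mask` switched on. */
int demo_set_flags(int flags, int mask)
{
    return flags | mask;
}

/* Return `flags` with the bits in `mask` switched off. */
int demo_clear_flags(int flags, int mask)
{
    return flags & ~mask;
}
```

Note that the test compares against `required` rather than checking for a nonzero result, so a combined mask only passes when all of its bits are present.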

Basic Array Flags

An ndarray can have a data segment that is not a simple contiguous chunk of well-behaved memory you can manipulate. It may not be aligned with word boundaries (very important on some platforms). It might have its data in a different byte-order than the machine recognizes. It might not be writeable. It might be in Fortran-contiguous order. The array flags are used to indicate what can be said about data associated with an array.

In versions 1.6 and earlier of NumPy, the following flags did not have the _ARRAY_ macro namespace in them. That form of the constant names is deprecated in 1.7.

Note

Arrays can be both C-style and Fortran-style contiguous simultaneously. This is clear for 1-dimensional arrays, but can also be true for higher dimensional arrays.

Even for contiguous arrays, the stride for a given dimension arr.strides[dim] may be arbitrary if arr.shape[dim] == 1 or the array has no elements. It therefore does not generally hold that self.strides[-1] == self.itemsize for C-style contiguous arrays, or that self.strides[0] == self.itemsize for Fortran-style contiguous arrays. The correct way to access the itemsize of an array from the C API is PyArray_ITEMSIZE(arr).
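The following plain-C sketch shows one way a contiguity test can ignore the strides of length-1 dimensions and treat empty arrays as contiguous, in line with the note above. It is not NumPy's implementation; the function names are hypothetical and `ptrdiff_t` stands in for NumPy's index type.

```c
#include <stddef.h>

/* Hypothetical helper: 1 if the layout is C-style contiguous, else 0.
 * Strides of dimensions with length 1 are never inspected. */
int demo_is_c_contiguous(int ndim, const ptrdiff_t *dims,
                         const ptrdiff_t *strides, ptrdiff_t itemsize)
{
    ptrdiff_t expected = itemsize;
    int i;

    /* An array with no elements is trivially contiguous. */
    for (i = 0; i < ndim; i++) {
        if (dims[i] == 0) {
            return 1;
        }
    }
    for (i = ndim - 1; i >= 0; i--) {
        if (dims[i] != 1) {
            if (strides[i] != expected) {
                return 0;
            }
            expected *= dims[i];
        }
    }
    return 1;
}

/* Hypothetical helper: the same test for Fortran-style contiguity,
 * walking the axes from first to last. */
int demo_is_f_contiguous(int ndim, const ptrdiff_t *dims,
                         const ptrdiff_t *strides, ptrdiff_t itemsize)
{
    ptrdiff_t expected = itemsize;
    int i;

    for (i = 0; i < ndim; i++) {
        if (dims[i] == 0) {
            return 1;
        }
    }
    for (i = 0; i < ndim; i++) {
        if (dims[i] != 1) {
            if (strides[i] != expected) {
                return 0;
            }
            expected *= dims[i];
        }
    }
    return 1;
}
```

With shape (3, 1), itemsize 8, and strides (8, 999), both functions report contiguity even though the last stride is not the itemsize, which is exactly the situation where reading the itemsize from the strides would go wrong.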

:cfunc:`PyArray_UpdateFlags` (obj, flags) will update the obj->flags for flags which can be any of :cdata:`NPY_ARRAY_C_CONTIGUOUS`, :cdata:`NPY_ARRAY_F_CONTIGUOUS`, :cdata:`NPY_ARRAY_ALIGNED`, or :cdata:`NPY_ARRAY_WRITEABLE`.

Combinations of array flags

Flag-like constants

These constants are used in :cfunc:`PyArray_FromAny` (and its macro forms) to specify desired properties of the new array.

Flag checking

For all of these macros arr must be an instance of a (subclass of) :cdata:`PyArray_Type`, but no checking is done.

Warning

It is important to keep the flags updated (using :cfunc:`PyArray_UpdateFlags` can help) whenever a manipulation with an array is performed that might cause them to change. Later calculations in NumPy that rely on the state of these flags do not repeat the calculation to update them.

Array method alternative API

Conversion

Shape Manipulation

Warning

matrix objects are always 2-dimensional. Therefore, :cfunc:`PyArray_Squeeze` has no effect on arrays of matrix sub-class.

Item selection and manipulation

Calculation

Tip

Pass in :cdata:`NPY_MAXDIMS` for axis in order to achieve the same effect that is obtained by passing in axis = None in Python (treating the array as a 1-d array).

Note

The rtype argument specifies the data-type the reduction should take place over. This is important if the data-type of the array is not “large” enough to handle the output. By default, all integer data-types are made at least as large as :cdata:`NPY_LONG` for the “add” and “multiply” ufuncs (which form the basis for mean, sum, cumsum, prod, and cumprod functions).
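The effect of the accumulator type can be seen without NumPy at all. The plain-C sketch below sums the same 8-bit values twice, once in an 8-bit accumulator and once in a `long`. The function names are hypothetical, and the unsigned 8-bit type was chosen only because its wrap-around is well defined in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: accumulate in the same narrow type as the data.
 * The running total wraps around modulo 256. */
uint8_t demo_sum_narrow(const uint8_t *data, size_t n)
{
    uint8_t acc = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        acc = (uint8_t)(acc + data[i]);
    }
    return acc;
}

/* Hypothetical helper: accumulate in a wider type, analogous to
 * requesting a larger data-type for the reduction. */
long demo_sum_wide(const uint8_t *data, size_t n)
{
    long acc = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        acc += data[i];
    }
    return acc;
}
```

For the inputs 200, 100, and 50, the narrow sum wraps to 94 while the wide sum is the expected 350.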

Functions

Array Functions

Note

The simulation of a C-style array is not complete for 2-d and 3-d arrays. For example, the simulated arrays of pointers cannot be passed to subroutines expecting specific, statically-defined 2-d and 3-d arrays. To pass to functions requiring that kind of input, you must statically define the required array and copy the data.
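The difference between an array of row pointers and a statically-defined 2-d array can be sketched in plain C as follows. `DEMO_N`, `demo_trace`, and `demo_copy_to_static` are hypothetical names; the row-pointer argument stands in for the kind of pointer table that a simulated C-style array provides.

```c
#define DEMO_N 3

/* A routine that insists on a true, statically-sized 2-d array. */
double demo_trace(double m[DEMO_N][DEMO_N])
{
    double sum = 0.0;
    int i;

    for (i = 0; i < DEMO_N; i++) {
        sum += m[i][i];
    }
    return sum;
}

/* Hypothetical helper: copy from an array of row pointers into a
 * statically-defined array. A double ** cannot be passed directly where
 * a double[DEMO_N][DEMO_N] is expected, because the two have different
 * memory layouts. */
void demo_copy_to_static(double **rows, double out[DEMO_N][DEMO_N])
{
    int i, j;

    for (i = 0; i < DEMO_N; i++) {
        for (j = 0; j < DEMO_N; j++) {
            out[i][j] = rows[i][j];
        }
    }
}
```

Indexing with `rows[i][j]` looks identical for both representations, which is why the mismatch only shows up at the function-call boundary.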

Other functions

Auxiliary Data With Object Semantics

New in version 1.7.0.

When working with more complex dtypes which are composed of other dtypes, such as the struct dtype, creating inner loops that manipulate the dtypes requires carrying along additional data. NumPy supports this idea through a struct :ctype:`NpyAuxData`, mandating a few conventions so that it is possible to do this.

Defining an :ctype:`NpyAuxData` is similar to defining a class in C++, but the object semantics have to be tracked manually since the API is in C. Here’s an example for a function which doubles up an element using an element copier function as a primitive:

typedef struct {
    NpyAuxData base;
    ElementCopier_Func *func;
    NpyAuxData *funcdata;
} eldoubler_aux_data;

void free_element_doubler_aux_data(NpyAuxData *data)
{
    eldoubler_aux_data *d = (eldoubler_aux_data *)data;
    /* Free the memory owned by this auxdata */
    NPY_AUXDATA_FREE(d->funcdata);
    PyArray_free(d);
}

NpyAuxData *clone_element_doubler_aux_data(NpyAuxData *data)
{
    eldoubler_aux_data *ret = PyArray_malloc(sizeof(eldoubler_aux_data));
    if (ret == NULL) {
        return NULL;
    }

    /* Raw copy of all data */
    memcpy(ret, data, sizeof(eldoubler_aux_data));

    /* Fix up the owned auxdata so we have our own copy */
    ret->funcdata = NPY_AUXDATA_CLONE(ret->funcdata);
    if (ret->funcdata == NULL) {
        PyArray_free(ret);
        return NULL;
    }

    return (NpyAuxData *)ret;
}

NpyAuxData *create_element_doubler_aux_data(
                            ElementCopier_Func *func,
                            NpyAuxData *funcdata)
{
    eldoubler_aux_data *ret = PyArray_malloc(sizeof(eldoubler_aux_data));
    if (ret == NULL) {
        PyErr_NoMemory();
        return NULL;
    }
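    /* Zero the allocated struct itself; ret is already a pointer to it */
    memset(ret, 0, sizeof(eldoubler_aux_data));
    /* base is an embedded struct, not a pointer, so use '.' to reach its members */
    ret->base.free = &free_element_doubler_aux_data;
    ret->base.clone = &clone_element_doubler_aux_data;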
    ret->func = func;
    ret->funcdata = funcdata;

    return (NpyAuxData *)ret;
}

Array Iterators

As of NumPy 1.6, these array iterators are superseded by the new array iterator, :ctype:`NpyIter`.

An array iterator is a simple way to access the elements of an N-dimensional array quickly and efficiently. Section 2 provides more description and examples of this useful approach to looping over an array.
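To show the kind of bookkeeping an array iterator takes off your hands, the sketch below walks every element of a strided N-dimensional block of doubles in plain C, advancing a coordinate counter like an odometer. It is not the NumPy iterator and uses none of its API; `DEMO_MAXDIMS` and `demo_strided_sum` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAXDIMS 8

/* Hypothetical helper: sum all doubles in a strided N-d view.
 * `strides` are in bytes. Returns 0.0 if ndim exceeds DEMO_MAXDIMS. */
double demo_strided_sum(const char *data, int ndim,
                        const ptrdiff_t *dims, const ptrdiff_t *strides)
{
    ptrdiff_t coords[DEMO_MAXDIMS] = {0};
    ptrdiff_t total = 1;
    ptrdiff_t k;
    const char *ptr = data;
    double sum = 0.0;
    int i;

    if (ndim > DEMO_MAXDIMS) {
        return 0.0;
    }
    for (i = 0; i < ndim; i++) {
        total *= dims[i];
    }

    for (k = 0; k < total; k++) {
        double value;

        /* memcpy keeps the read safe even if ptr is misaligned */
        memcpy(&value, ptr, sizeof(value));
        sum += value;

        /* Advance like an odometer: the last axis moves fastest. */
        for (i = ndim - 1; i >= 0; i--) {
            if (coords[i] < dims[i] - 1) {
                coords[i]++;
                ptr += strides[i];
                break;
            }
            /* This axis is exhausted: rewind it and carry to the next. */
            ptr -= coords[i] * strides[i];
            coords[i] = 0;
        }
    }
    return sum;
}
```

The same function handles contiguous data, sliced views, and transposed views, since all three differ only in the dims and strides passed in.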

Broadcasting (multi-iterators)

Neighborhood iterator

New in version 1.4.0.

Neighborhood iterators are subclasses of the iterator object, and can be used to iterate over a neighborhood of a point. For example, you may want to iterate over every voxel of a 3d image, and for every such voxel, iterate over a hypercube. Neighborhood iterators automatically handle boundaries, which makes this kind of code much easier to write than manual boundary handling, at the cost of a slight overhead.

Array Scalars

Data-type descriptors

Warning

Data-type objects must be reference counted, so be aware of the action on the data-type reference of different C-API calls. The standard rule is that when a data-type object is returned it is a new reference. Functions that take :ctype:`PyArray_Descr *` objects and return arrays steal references to their data-type inputs unless otherwise noted. Therefore, you must own a reference to any data-type object used as input to such a function.

Conversion Utilities

For use with :cfunc:`PyArg_ParseTuple`

All of these functions can be used in :cfunc:`PyArg_ParseTuple` (...) with the “O&” format specifier to automatically convert any Python object to the required C-object. All of these functions return :cdata:`NPY_SUCCEED` if successful and :cdata:`NPY_FAIL` if not. The first argument to all of these functions is a Python object. The second argument is the address of the C-type to convert the Python object to.

Warning

Be sure to understand what steps you should take to manage the memory when using these conversion functions. These functions can require freeing memory, and/or altering the reference counts of specific objects based on your use.

Other conversions

Miscellaneous

Importing the API

In order to make use of the C-API from another extension module, the import_array() command must be used. If the extension module is self-contained in a single .c file, then that is all that needs to be done. If, however, the extension module involves multiple files where the C-API is needed, then some additional steps must be taken.

Checking the API Version

Because Python extensions are not used in the same way as usual libraries on most platforms, some errors cannot be automatically detected at build time or even at runtime. For example, if you build an extension using a function available only for numpy >= 1.3.0, and you import the extension later with numpy 1.2, you will not get an import error (but almost certainly a segmentation fault when calling the function). That’s why several functions are provided to check for numpy versions. The macros :cdata:`NPY_VERSION` and :cdata:`NPY_FEATURE_VERSION` correspond to the numpy version used to build the extension, whereas the versions returned by the functions PyArray_GetNDArrayCVersion and PyArray_GetNDArrayCFeatureVersion correspond to the numpy version in use at runtime.

The rules for ABI and API compatibilities can be summarized as follows:

ABI incompatibility is automatically detected in every version of numpy. API incompatibility detection was added in numpy 1.4.0. If you want to support many different numpy versions with one extension binary, you have to build your extension with the lowest NPY_FEATURE_VERSION possible.
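The comparison between build-time and runtime versions can be sketched as a small plain-C function. This is an assumption-laden illustration: `demo_check_versions` is a hypothetical name, and the rule it encodes (the ABI versions must match exactly, and the runtime feature version must be at least the one built against) is one reading of the compatibility rules rather than code taken from NumPy. In a real extension the "built" arguments would come from :cdata:`NPY_VERSION` and :cdata:`NPY_FEATURE_VERSION`, and the "runtime" arguments from PyArray_GetNDArrayCVersion and PyArray_GetNDArrayCFeatureVersion.

```c
/* Hypothetical helper: compare build-time and runtime version numbers.
 * Returns  0 if the extension can be used as built,
 *         -1 if the ABI versions differ (the extension needs a rebuild),
 *         -2 if the runtime offers fewer features than were built against. */
int demo_check_versions(unsigned int built_abi, unsigned int runtime_abi,
                        unsigned int built_feature,
                        unsigned int runtime_feature)
{
    if (built_abi != runtime_abi) {
        return -1;
    }
    if (runtime_feature < built_feature) {
        return -2;
    }
    return 0;
}
```

Building against the lowest feature version you can tolerate keeps the second check passing on as many runtime versions as possible.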

Internal Flexibility

Memory management

Threading support

These macros are only meaningful if :cdata:`NPY_ALLOW_THREADS` evaluates to true during compilation of the extension module. Otherwise, these macros are equivalent to whitespace. Python uses a single Global Interpreter Lock (GIL) for each Python process so that only a single thread may execute at a time (even on multi-CPU machines). When calling out to a compiled function that may take time to compute (and does not have side-effects for other threads like updated global variables), the GIL should be released so that other Python threads can run while the time-consuming calculations are performed. This can be accomplished using two groups of macros. Typically, if one macro in a group is used in a code block, all of them must be used in the same code block. Currently, :cdata:`NPY_ALLOW_THREADS` is defined as the Python-defined :cdata:`WITH_THREADS` constant unless the environment variable :cdata:`NPY_NOSMP` is set, in which case :cdata:`NPY_ALLOW_THREADS` is defined to be 0.

Group 1

This group is used to call code that may take some time but does not use any Python C-API calls. Thus, the GIL should be released during its calculation.

Group 2

This group is used to re-acquire the Python GIL after it has been released. For example, suppose the GIL has been released (using the previous calls), and then some path in the code (perhaps in a different subroutine) requires use of the Python C-API, then these macros are useful to acquire the GIL. These macros accomplish essentially a reverse of the previous three (acquire the LOCK saving what state it had) and then re-release it with the saved state.

Tip

Never use semicolons after the threading support macros.

Priority

Default buffers

Other constants

Miscellaneous Macros

Enumerated Types