In this article, we will discuss what a structured NumPy array is, how to create one, and how to sort it by one or more of its fields.

## What is a Structured Numpy Array?

A structured NumPy array is an array of structures (similar to a C struct). NumPy arrays are homogeneous, i.e. they can contain data of only one type. So, instead of creating a NumPy array of int or float, we can create a NumPy array in which every element is a structure of the same type.

Let’s understand by an example,

Suppose we want to create a NumPy array whose elements have the following structure:

    struct {
        char name[10];
        float marks;
        int gradeLevel;
    };

It means each element in the NumPy array should be a structure of the above type. Such arrays are called structured NumPy arrays.

Let’s see how to create that,

## Creating a Structured Numpy Array

First of all, import the numpy module:

import numpy as np

Now, to create a structured NumPy array, we can pass a list of tuples containing the structure elements, i.e.

[('Sam', 33.3, 3), ('Mike', 44.4, 5), ('Aadi', 66.6, 6), ('Riti', 88.8, 7)]

But as the elements of a NumPy array are homogeneous, how will the size and type of the structure be decided?

For that, we need to pass the type of the structure, i.e. its schema, in the dtype parameter. Let’s create a dtype for the above structure:

    # Creating the type of a structure
    dtype = [('Name', (np.str_, 10)), ('Marks', np.float64), ('GradeLevel', np.int32)]

Let’s create a numpy array based on this dtype i.e.

    # Creating a Structured Numpy array
    structuredArr = np.array([('Sam', 33.3, 3), ('Mike', 44.4, 5), ('Aadi', 66.6, 6), ('Riti', 88.8, 7)],
                             dtype=dtype)

It will create a structured numpy array and its contents will be,

[('Sam', 33.3, 3) ('Mike', 44.4, 5) ('Aadi', 66.6, 6) ('Riti', 88.8, 7)]
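Once the array exists, both whole records and individual fields can be read back from it. The following is a minimal sketch that rebuilds the same array so that it runs on its own; the variable names `first`, `names`, and `marks` are chosen here only for illustration.

```python
import numpy as np

dtype = [('Name', (np.str_, 10)), ('Marks', np.float64), ('GradeLevel', np.int32)]
structuredArr = np.array([('Sam', 33.3, 3), ('Mike', 44.4, 5),
                          ('Aadi', 66.6, 6), ('Riti', 88.8, 7)], dtype=dtype)

# An integer index returns one record, i.e. a single structure
first = structuredArr[0]
print(first['Name'])    # Sam

# A field name returns a regular array holding that field for every record
names = structuredArr['Name']
marks = structuredArr['Marks']
print(names)            # ['Sam' 'Mike' 'Aadi' 'Riti']
print(marks)            # [33.3 44.4 66.6 88.8]
```

Indexing by field name is what makes a structured array behave like a small table with named columns.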

Let’s check the data type of the array created above:

print(structuredArr.dtype)

Output:

[('Name', '<U10'), ('Marks', '<f8'), ('GradeLevel', '<i4')]

This is the structure type: a Unicode string of up to 10 characters (`<U10`), a 64-bit float (`<f8`) and a 32-bit integer (`<i4`).
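Because the string field has a fixed size, every record occupies the same number of bytes, and longer names are cut to fit. The sketch below illustrates this with the same dtype; the record `('Christopher', 91.5, 8)` is a made-up example and not part of the article's data.

```python
import numpy as np

dtype = np.dtype([('Name', (np.str_, 10)), ('Marks', np.float64), ('GradeLevel', np.int32)])

# Field names, in declaration order
print(dtype.names)      # ('Name', 'Marks', 'GradeLevel')

# Bytes per record: 40 (10 characters x 4 bytes) + 8 (float64) + 4 (int32)
print(dtype.itemsize)   # 52

# Hypothetical record whose name has 11 characters, one more than the field holds
arr = np.array([('Christopher', 91.5, 8)], dtype=dtype)
print(arr['Name'][0])   # Christophe
```

The truncation is silent, so the string length in the dtype should be chosen with the longest expected value in mind.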

## How to Sort a Structured Numpy Array?

Suppose we have a very big structured NumPy array and we want to sort it based on specific fields of the structure. For this, both **numpy.sort()** and **numpy.ndarray.sort()** provide an **order** parameter, which accepts a single field name or a list of field names. The array is then sorted by the given field(s). Note that numpy.sort() returns a sorted copy, whereas numpy.ndarray.sort() sorts the array in place.
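The two functions differ in what they do to the original array. As a small sketch (the names `sortedCopy`, `inPlace`, and `result` are illustrative only):

```python
import numpy as np

dtype = [('Name', (np.str_, 10)), ('Marks', np.float64), ('GradeLevel', np.int32)]
structuredArr = np.array([('Sam', 33.3, 3), ('Mike', 44.4, 5),
                          ('Aadi', 66.6, 6), ('Riti', 88.8, 7)], dtype=dtype)

# np.sort() returns a sorted copy; the original array keeps its order
sortedCopy = np.sort(structuredArr, order='Name')
print(structuredArr['Name'])   # ['Sam' 'Mike' 'Aadi' 'Riti']
print(sortedCopy['Name'])      # ['Aadi' 'Mike' 'Riti' 'Sam']

# ndarray.sort() reorders the array itself and returns None
inPlace = structuredArr.copy()
result = inPlace.sort(order='Name')
print(result)                  # None
print(inPlace['Name'])         # ['Aadi' 'Mike' 'Riti' 'Sam']
```

The rest of this article uses np.sort(), so `structuredArr` itself is never modified.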

Let’s see how to do that,

#### Sort the Structured Numpy array by field ‘Name’ of the structure

    # Sort the Structured Numpy array by field 'Name' of the structure
    modArr = np.sort(structuredArr, order='Name')
    print('Sorted Array : ')
    print(modArr)

Output:

    Sorted Array : 
    [('Aadi', 66.6, 6) ('Mike', 44.4, 5) ('Riti', 88.8, 7) ('Sam', 33.3, 3)]

It sorted all the elements of this structured NumPy array alphabetically by the first field of the structure, i.e. ‘Name’.

#### Sort the Structured Numpy array by field ‘Marks’ of the structure

    # Sort the Structured Numpy array by field 'Marks' of the structure
    modArr = np.sort(structuredArr, order='Marks')
    print('Sorted Array : ')
    print(modArr)

Output:

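    Sorted Array : 
    [('Sam', 33.3, 3) ('Mike', 44.4, 5) ('Aadi', 66.6, 6) ('Riti', 88.8, 7)]

It sorted all the elements of this structured NumPy array in ascending order of the second field of the structure, i.e. ‘Marks’. The original array happened to be in ascending order of marks already, so the output matches the original order.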
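np.sort() always sorts in ascending order. One possible way to get the highest marks first, sketched here as an assumption rather than as part of the original example, is to take the sorting indices from np.argsort() (which also accepts `order`) and reverse them:

```python
import numpy as np

dtype = [('Name', (np.str_, 10)), ('Marks', np.float64), ('GradeLevel', np.int32)]
structuredArr = np.array([('Sam', 33.3, 3), ('Mike', 44.4, 5),
                          ('Aadi', 66.6, 6), ('Riti', 88.8, 7)], dtype=dtype)

# Indices that would sort the array by 'Marks' in ascending order
ascendingIdx = np.argsort(structuredArr, order='Marks')

# Reversing the indices puts the highest marks first
descending = structuredArr[ascendingIdx[::-1]]
print(descending['Name'])    # ['Riti' 'Aadi' 'Mike' 'Sam']
print(descending['Marks'])   # [88.8 66.6 44.4 33.3]
```

Reversing the indices also reverses the relative order of records with equal marks, which may or may not matter for a given data set.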

#### Sort the Structured Numpy array by ‘Name’ & ‘GradeLevel’ fields of the structure

    # Sort by Name & GradeLevel
    modArr = np.sort(structuredArr, order=['Name', 'GradeLevel'])
    print('Sorted Array : ')
    print(modArr)

Output:

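    Sorted Array : 
    [('Aadi', 66.6, 6) ('Mike', 44.4, 5) ('Riti', 88.8, 7) ('Sam', 33.3, 3)]

It sorted all the elements of this structured NumPy array by multiple fields of the structure: first by ‘Name’, and then by ‘GradeLevel’ for records with the same name. As every name in this array is unique, the second field is never needed to break a tie, and the result is the same as sorting by ‘Name’ alone.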
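The effect of the second field only shows up when the first field contains duplicates. The sketch below uses a made-up array, `tiedArr`, in which the name ‘Sam’ appears twice; this data is hypothetical and only serves to illustrate the tie-breaking.

```python
import numpy as np

dtype = [('Name', (np.str_, 10)), ('Marks', np.float64), ('GradeLevel', np.int32)]

# Hypothetical data: two records share the name 'Sam'
tiedArr = np.array([('Sam', 33.3, 7), ('Mike', 44.4, 5),
                    ('Sam', 55.5, 3), ('Aadi', 66.6, 6)], dtype=dtype)

# Sort by 'Name' first; 'GradeLevel' decides the order of the two 'Sam' records
byNameThenGrade = np.sort(tiedArr, order=['Name', 'GradeLevel'])
print(byNameThenGrade['Name'])         # ['Aadi' 'Mike' 'Sam' 'Sam']
print(byNameThenGrade['GradeLevel'])   # [6 5 3 7]

# Swapping the field order changes which field takes priority
byGradeThenName = np.sort(tiedArr, order=['GradeLevel', 'Name'])
print(byGradeThenName['Name'])         # ['Sam' 'Mike' 'Aadi' 'Sam']
```

The order of the field names in the list therefore matters: the first field is the primary key, and later fields are used only to break ties.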

Structured NumPy arrays are useful when you want to load a big CSV file into a single NumPy array and perform operations on it.
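As a rough sketch of that use case, np.genfromtxt() can read comma-separated text directly into a structured array when it is given the same dtype. The CSV content below is an in-memory stand-in for a real file (an assumption made so that the example runs on its own), and the names `csvText` and `loaded` are illustrative.

```python
import io
import numpy as np

dtype = [('Name', (np.str_, 10)), ('Marks', np.float64), ('GradeLevel', np.int32)]

# Stand-in for a CSV file on disk; a file path could be passed instead
csvText = "Sam,33.3,3\nMike,44.4,5\nAadi,66.6,6\nRiti,88.8,7\n"

# Each CSV row becomes one record of the structured array
loaded = np.genfromtxt(io.StringIO(csvText), delimiter=',', dtype=dtype, encoding='utf-8')
print(loaded.shape)

# The loaded array can be sorted by field just like one built with np.array()
print(np.sort(loaded, order='Name'))
```

With a real file, the `io.StringIO(csvText)` argument would be replaced by the file's path.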

**The complete example is as follows:**

    import numpy as np


    def main():
        print('*** Creating a Structured Numpy Array ***')

        # Creating the type of a structure
        dtype = [('Name', (np.str_, 10)), ('Marks', np.float64), ('GradeLevel', np.int32)]

        # Creating a Structured Numpy array
        structuredArr = np.array([('Sam', 33.3, 3), ('Mike', 44.4, 5), ('Aadi', 66.6, 6), ('Riti', 88.8, 7)],
                                 dtype=dtype)

        print('Contents of the Structured Numpy Array : ')
        print(structuredArr)

        print('Data type of the Structured Numpy Array : ')
        print(structuredArr.dtype)

        print('*** Sorting a Structured Numpy Array by <Name> field ***')

        # Sort the Structured Numpy array by field 'Name' of the structure
        modArr = np.sort(structuredArr, order='Name')
        print('Sorted Array : ')
        print(modArr)

        print('*** Sorting a Structured Numpy Array by <Marks> field ***')

        # Sort the Structured Numpy array by field 'Marks' of the structure
        modArr = np.sort(structuredArr, order='Marks')
        print('Sorted Array : ')
        print(modArr)

        print('*** Sorting a Structured Numpy Array by <Name> & <GradeLevel> fields ***')

        # Sort by Name & GradeLevel
        modArr = np.sort(structuredArr, order=['Name', 'GradeLevel'])
        print('Sorted Array : ')
        print(modArr)


    if __name__ == '__main__':
        main()

**Output:**

    *** Creating a Structured Numpy Array ***
    Contents of the Structured Numpy Array : 
    [('Sam', 33.3, 3) ('Mike', 44.4, 5) ('Aadi', 66.6, 6) ('Riti', 88.8, 7)]
    Data type of the Structured Numpy Array : 
    [('Name', '<U10'), ('Marks', '<f8'), ('GradeLevel', '<i4')]
    *** Sorting a Structured Numpy Array by <Name> field ***
    Sorted Array : 
    [('Aadi', 66.6, 6) ('Mike', 44.4, 5) ('Riti', 88.8, 7) ('Sam', 33.3, 3)]
    *** Sorting a Structured Numpy Array by <Marks> field ***
    Sorted Array : 
    [('Sam', 33.3, 3) ('Mike', 44.4, 5) ('Aadi', 66.6, 6) ('Riti', 88.8, 7)]
    *** Sorting a Structured Numpy Array by <Name> & <GradeLevel> fields ***
    Sorted Array : 
    [('Aadi', 66.6, 6) ('Mike', 44.4, 5) ('Riti', 88.8, 7) ('Sam', 33.3, 3)]