Last modified: August 03, 2024

This article is written in: 🇺🇸

Joining and Splitting Arrays

In NumPy, manipulating the structure of arrays is a common operation. Whether combining multiple arrays into one or splitting a single array into several parts, NumPy provides a set of intuitive functions to achieve these tasks efficiently. This guide will cover various methods to join and split arrays, providing detailed explanations and practical examples.

Stacking Arrays

Stacking is the technique of joining arrays along a new axis. The function for this is np.stack(), and the axis on which you want to stack the arrays can be specified using the axis parameter.

Example: Stacking Along a New Axis

import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Vertical stacking (default axis=0)
c = np.stack((a, b))
print("Vertically stacked:\n", c)

# Horizontal stacking (axis=1)
d = np.stack((a, b), axis=1)
print("\nHorizontally stacked:\n", d)

Expected output:

Vertically stacked:
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]

Horizontally stacked:
[[[1 2]
  [5 6]]

 [[3 4]
  [7 8]]]

Concatenating Arrays

Concatenation merges arrays along an existing axis. Use np.concatenate() for this.

Example: Concatenation Along Existing Axes

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Vertical concatenation (default axis=0)
c = np.concatenate((a, b))
print("Vertically concatenated:\n", c)

# Horizontal concatenation (axis=1)
d = np.concatenate((a, b), axis=1)
print("\nHorizontally concatenated:\n", d)

Expected output:

Vertically concatenated:
[[1 2]
 [3 4]
 [5 6]
 [7 8]]

Horizontally concatenated:
[[1 2 5 6]
 [3 4 7 8]]

Appending to Arrays

You can append values to an array using np.append(). It adds values at the end along the specified axis.

Example: Appending Values

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Append b to a
c = np.append(a, b)
print("Appended array:\n", c)

Expected output:

Appended array:
[1 2 3 4 5 6]

Appending Arrays involves combining the elements of the second array to the end of the first array, resulting in the creation of a new combined array. This operation effectively extends the first array by adding the elements from the second array in sequence.

Splitting Arrays

Splitting breaks down an array into smaller chunks. np.split() is the function used to divide arrays.

Regular and Custom Splits

a = np.array([1, 2, 3, 4, 5, 6])

# Split into three equal parts
b = np.split(a, 3)
print("Regular split:\n", b)

# Split at the 2nd and 4th indices
c = np.split(a, [2, 4])
print("\nCustom split:\n", c)

Expected output:

Regular split:
[array([1, 2]), array([3, 4]), array([5, 6])]

Custom split:
[array([1, 2]), array([3, 4]), array([5, 6])]

Advanced Joining and Splitting Techniques

Example: HStack and VStack

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Horizontal stack
h_stack = np.hstack((a, b))
print("Horizontal stack:\n", h_stack)

# Vertical stack
v_stack = np.vstack((a, b))
print("\nVertical stack:\n", v_stack)

Expected output:

Horizontal stack:
[1 2 3 4 5 6]

Vertical stack:
[[1 2 3]
 [4 5 6]]

Example: DStack

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Depth stack
d_stack = np.dstack((a, b))
print("Depth stack:\n", d_stack)

Expected output:

Depth stack:
[[[1 5]
  [2 6]]

 [[3 7]
  [4 8]]]

Practical Applications and Considerations

Understanding how to join and split arrays is crucial for various data manipulation tasks, including:

  1. Data Preprocessing involves combining and splitting datasets to prepare them for training and testing machine learning models, ensuring that the data is properly formatted and ready for analysis.
  2. Data Augmentation creates variations of existing data, which helps in training robust machine learning models by providing diverse examples for learning.
  3. Feature Engineering includes the process of combining different feature sets into a single array, facilitating comprehensive analysis and enhancing the model's ability to learn relevant patterns.
  4. Batch Processing refers to splitting large datasets into smaller, manageable batches, which allows for efficient processing and analysis, especially when dealing with limited computational resources.

Summary Table

Operation Method/Function Description Example Code Example Output
Stack stack() Joins arrays along a new axis. np.stack((a, b), axis=0) [[[1 2] [3 4]] [[5 6] [7 8]]]
Horizontal Stack hstack() Combines arrays horizontally along columns. np.hstack((a, b)) [1 2 3 4 5 6]
Vertical Stack vstack() Combines arrays vertically along rows. np.vstack((a, b)) [[1 2 3] [4 5 6]]
Depth Stack dstack() Stacks arrays along the third axis (depth). np.dstack((a, b)) [[[1 5] [2 6]] [[3 7] [4 8]]]
Concatenate concatenate() Merges arrays along an existing axis. np.concatenate((a, b), axis=0) [[1 2] [3 4] [5 6] [7 8]]
Append append() Appends values to the end of an array. np.append(a, b) [1 2 3 4 5 6]
Split split() Splits an array into multiple sub-arrays. np.split(a, [2, 4]) [array([1, 2]), array([3, 4]), array([5, 6])]
Horizontal Split hsplit() Splits an array into multiple sub-arrays horizontally (column-wise). np.hsplit(a, 2) [array([[1], [3]]), array([[2], [4]])]
Vertical Split vsplit() Splits an array into multiple sub-arrays vertically (row-wise). np.vsplit(a, 2) [array([[1, 2]]), array([[3, 4]])]

Table of Contents

    Joining and Splitting Arrays
    1. Stacking Arrays
      1. Example: Stacking Along a New Axis
    2. Concatenating Arrays
      1. Example: Concatenation Along Existing Axes
    3. Appending to Arrays
      1. Example: Appending Values
    4. Splitting Arrays
      1. Regular and Custom Splits
    5. Advanced Joining and Splitting Techniques
      1. Example: HStack and VStack
      2. Example: DStack
    6. Practical Applications and Considerations
    7. Summary Table