NumPy ndarray Basics for Deep Learning Beginners: array, dtype, shape, reshape, astype

1. Why Start with `ndarray`?

When you dive into deep learning, you’ll repeatedly see operations like these:

Inspecting the shape of an input tensor
Using reshape to prepare batches
Converting data to float32 for GPU computation

All of these operations are built on the foundation of NumPy’s ndarray.
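As a rough illustration, the three operations above might look like this in NumPy. The array `batch` and its sizes are made up for the example and do not come from any real dataset.

```python
import numpy as np

# Hypothetical batch: 4 samples with 6 integer features each
batch = np.arange(24).reshape(4, 6)

# 1) Inspect the shape
print(batch.shape)        # (4, 6)

# 2) Reshape: regroup each sample's 6 features as a 2x3 grid
grids = batch.reshape(4, 2, 3)
print(grids.shape)        # (4, 2, 3)

# 3) Convert to float32
batch_f32 = batch.astype(np.float32)
print(batch_f32.dtype)    # float32
```

Each of these is covered step by step in the sections that follow.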

PyTorch’s Tensor closely mirrors NumPy’s ndarray.
Inputs, weights, and outputs in deep learning models are all multidimensional arrays (tensors).

Learning ndarray therefore teaches you most of the core syntax of tensor operations.

2. What Is an `ndarray`?

ndarray stands for N‑dimensional array: a grid of values, all of the same type, arranged along N axes.

1‑D: vector
2‑D: matrix
3‑D and above: tensor (image batches, time series, video, etc.)

A quick example:

import numpy as np

x = np.array([1, 2, 3])            # 1‑D (vector)
M = np.array([[1, 2], [3, 4]])     # 2‑D (matrix)

print(type(x))          # <class 'numpy.ndarray'>
print(x.ndim, x.shape)  # number of dimensions, shape
print(M.ndim, M.shape)

ndim: how many dimensions
shape: the size of each dimension
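The same attributes work for 3‑D and higher arrays. Here is a small sketch with a made-up "batch of grayscale images"; the name `images` and its sizes are only illustrative.

```python
import numpy as np

# Hypothetical batch of 2 grayscale images, each 3 pixels high and 4 wide
images = np.zeros((2, 3, 4))

print(images.ndim)    # 3
print(images.shape)   # (2, 3, 4)
print(images.size)    # 24 elements in total (2 * 3 * 4)
```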

3. How Similar Is a PyTorch `Tensor`?

A PyTorch tensor is ultimately just a multidimensional array.

import numpy as np
import torch

x_np = np.array([[1, 2], [3, 4]])   # NumPy ndarray
x_torch = torch.tensor([[1, 2], [3, 4]])  # PyTorch Tensor

print(type(x_np))      # numpy.ndarray
print(type(x_torch))   # torch.Tensor

print(x_np.shape)      # (2, 2)
print(x_torch.shape)   # torch.Size([2, 2])

Commonalities:

Both are multidimensional numeric arrays.
Concepts like shape, reshape, and dtype are almost identical.
Operations (+, *, @, etc.) behave similarly.
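To make the operator point concrete, here is a small NumPy-only sketch (the matrices `A` and `B` are arbitrary). PyTorch tensors accept the same three operators.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[10, 20], [30, 40]])

print(A + B)   # element-wise sum
print(A * B)   # element-wise product, NOT matrix multiplication
print(A @ B)   # matrix product
```

A frequent beginner mistake is to expect `*` to perform matrix multiplication. In both libraries, that is the job of `@`.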

Differences that matter in deep learning:

NumPy runs on the CPU and has no automatic differentiation.
PyTorch tensors can use the GPU and support autograd.

Typical workflow:

Concept practice / data manipulation → NumPy
Actual model training → PyTorch

The more comfortable you are with ndarray, the more natural PyTorch tensor operations feel.

4. `np.array`: The Basic Way to Create an `ndarray`

The most fundamental constructor for an ndarray is np.array.

4.1 From Python Lists to `ndarray`

import numpy as np

# 1‑D array (vector)
x = np.array([1, 2, 3])
print(x)
print(x.ndim)   # 1
print(x.shape)  # (3,)

# 2‑D array (matrix)
M = np.array([[1, 2, 3],
              [4, 5, 6]])
print(M)
print(M.ndim)   # 2
print(M.shape)  # (2, 3)

A Python list (or list of lists) becomes an ndarray when passed to np.array.
The familiar batch_size x feature_dim matrix in deep learning is just this structure.
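As a sketch of that idea, the nested list below stands in for three samples with four features each. The numbers are invented for illustration.

```python
import numpy as np

# Three hypothetical samples, each with 4 features
samples = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.2, 3.4, 5.4, 2.3],
]

batch = np.array(samples)
batch_size, feature_dim = batch.shape   # unpack the shape tuple
print(batch_size, feature_dim)          # 3 4
print(batch[0])                         # the first sample (one row)
```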

4.2 Quickly Creating Initialized Arrays

For experiments or training examples, you often need arrays filled with zeros or random values.

zeros = np.zeros((2, 3))       # 2x3 matrix, all zeros
ones = np.ones((2, 3))         # 2x3 matrix, all ones
randn = np.random.randn(2, 3)  # samples from the standard normal distribution

print(zeros.shape)  # (2, 3)

The same pattern applies in PyTorch:

import torch

zeros_t = torch.zeros((2, 3))
ones_t = torch.ones((2, 3))
randn_t = torch.randn((2, 3))
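Back in NumPy, random arrays change on every run unless you fix a seed. One way to get repeatable experiments is a seeded generator from `np.random.default_rng`. This is a sketch, and the seed value 0 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same values
rng_a = np.random.default_rng(seed=0)
rng_b = np.random.default_rng(seed=0)

a = rng_a.standard_normal((2, 3))
b = rng_b.standard_normal((2, 3))

print(a.shape)               # (2, 3)
print(np.array_equal(a, b))  # True
```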

5. `dtype`: Understanding the Data Type

dtype represents the data type of the numbers stored in the array.

Common values:

int32, int64: integers
float32, float64: floating‑point numbers

Let’s check them:

x = np.array([1, 2, 3])
print(x.dtype)  # usually int64 or int32

y = np.array([1.0, 2.0, 3.0])
print(y.dtype)  # usually float64
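An array holds a single dtype, so NumPy picks one type that fits every element. A short sketch of what happens with a mixed list:

```python
import numpy as np

# One float is enough to make the whole array floating-point
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64
```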

5.1 Specifying `dtype` When Creating an Array

x = np.array([1, 2, 3], dtype=np.float32)
print(x.dtype)  # float32

NumPy defaults to float64 for floating‑point data, whereas PyTorch’s default floating‑point type is torch.float32. Deep learning favors float32 because it uses half the memory of float64 and runs fast on GPUs.
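The memory difference is easy to check with the `itemsize` and `nbytes` attributes. The array size here is arbitrary.

```python
import numpy as np

a64 = np.zeros((1000, 1000), dtype=np.float64)
a32 = np.zeros((1000, 1000), dtype=np.float32)

# itemsize: bytes per element, nbytes: bytes for the whole array
print(a64.itemsize, a64.nbytes)  # 8 8000000
print(a32.itemsize, a32.nbytes)  # 4 4000000
```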

6. `shape`: Reading the Data’s “Form”

shape is a tuple that describes the size of each dimension.

import numpy as np

x = np.array([1, 2, 3])
print(x.shape)  # (3,)

M = np.array([[1, 2, 3],
              [4, 5, 6]])
print(M.shape)  # (2, 3)

Typical shapes in deep learning:

A single feature vector: (feature_dim,) → e.g., (3,)
A batch of data: (batch_size, feature_dim) → e.g., (32, 3)
An image batch (PyTorch default): (batch_size, channels, height, width) → e.g., (32, 3, 224, 224)

Getting comfortable with these shapes in NumPy makes it easier to debug shape errors in PyTorch.
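Here is a sketch of both habits: unpacking a shape into named sizes, and reading a shape error. The arrays are small stand-ins with invented sizes, and `W` is deliberately given the wrong shape.

```python
import numpy as np

# Small stand-in for an image batch: (batch_size, channels, height, width)
images = np.zeros((4, 3, 8, 8), dtype=np.float32)
batch_size, channels, height, width = images.shape
print(batch_size, channels, height, width)   # 4 3 8 8

# A typical shape error: inner dimensions of a matrix product do not match
x = np.zeros((32, 3))   # (batch_size, feature_dim)
W = np.zeros((4, 2))    # expects 4 input features, but x has only 3
try:
    x @ W
    matmul_failed = False
except ValueError as err:
    matmul_failed = True
    print("shape error:", err)
```

When you hit an error like this, printing the `shape` of each operand is usually the fastest way to find the mismatch.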

7. `reshape`: Changing the Form

reshape changes the shape of an array while keeping the total number of elements the same.

import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])
print(x.shape)  # (6,)

M = x.reshape(2, 3)
print(M)
print(M.shape)  # (2, 3)

Key point:

The total number of elements before and after reshape must match.
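A quick way to see this rule is to try a reshape that cannot work. In this sketch, 6 elements fit into (3, 2) but not into (4, 2).

```python
import numpy as np

x = np.arange(6)               # 6 elements

print(x.reshape(3, 2).shape)   # (3, 2), since 3 * 2 == 6

try:
    x.reshape(4, 2)            # 4 * 2 == 8, which does not match 6
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape error:", err)
```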

7.1 Using `-1` for Automatic Inference

In batching or image processing, -1 is handy. It tells NumPy to infer that dimension from the total number of elements. Only one dimension may be -1.

x = np.array([[1, 2, 3],
              [4, 5, 6]])  # shape: (2, 3)

# Flatten to 1‑D
flat = x.reshape(-1)        # shape: (6,)
print(flat)

# Reshape back to 2 rows, columns inferred
M = flat.reshape(2, -1)     # shape: (2, 3)
print(M)

PyTorch behaves similarly:

import torch

x_t = torch.tensor([[1, 2, 3],
                    [4, 5, 6]])  # (2, 3)

flat_t = x_t.reshape(-1)        # (6,)
M_t = flat_t.reshape(2, -1)     # (2, 3)

Once you’re comfortable with reshape, you can:

Flatten feature maps in CNNs
Arrange RNN/LSTM inputs as (batch, seq_len, feature)
Split or merge batch dimensions (reordering axes is a job for transpose, not reshape)
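The sketch below shows how these uses might look in NumPy. All array names and sizes are hypothetical. The last lines show a common pitfall: reshape keeps the element order, so it is not a substitute for transpose.

```python
import numpy as np

# Hypothetical CNN feature maps: (batch, channels, height, width)
feature_maps = np.zeros((8, 16, 4, 4), dtype=np.float32)
flat = feature_maps.reshape(8, -1)     # keep the batch axis, flatten the rest
print(flat.shape)                      # (8, 256)

# Hypothetical flat sequence data: 8 sequences of 10 steps, 5 features per step
steps = np.zeros((80, 5), dtype=np.float32)
sequences = steps.reshape(8, 10, 5)    # (batch, seq_len, feature)
print(sequences.shape)                 # (8, 10, 5)

# reshape keeps element order; swapping axes needs transpose instead
m = np.arange(6).reshape(2, 3)
same = np.array_equal(m.reshape(3, 2), m.T)
print(same)                            # False
```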

8. `astype`: Changing the Data Type

astype converts an array’s data type.

import numpy as np

x = np.array([1, 2, 3])      # integer
print(x.dtype)               # int32 or int64

x_float = x.astype(np.float32)
print(x_float)
print(x_float.dtype)         # float32
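Two details are worth knowing. astype returns a new array and leaves the original untouched, and converting floats to integers drops the fractional part instead of rounding. A small sketch:

```python
import numpy as np

x = np.array([1, 2, 3])
x_float = x.astype(np.float32)
print(x.dtype == x_float.dtype)   # False: x keeps its integer type

# Going from float to int truncates toward zero (no rounding)
y = np.array([1.7, -1.7, 2.9])
y_int = y.astype(np.int32)
print(y_int.tolist())             # [1, -1, 2]
```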

Common deep‑learning scenarios:

Convert integer labels to floats for losses that expect float targets (for example, binary cross‑entropy or regression losses).
Standardize data from float64 to float32.
Ensure type compatibility before passing to PyTorch.

Example:

import torch
import numpy as np

x = np.array([1, 2, 3], dtype=np.int32)
x = x.astype(np.float32)              # convert to float32
x_torch = torch.from_numpy(x)         # convert to tensor
print(x_torch.dtype)                  # torch.float32

Mismatched types are a common source of PyTorch errors, for example when float64 (Double) data reaches a layer whose weights are float32 (Float).
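One way to avoid such mismatches is to normalize the dtype on the NumPy side before the data ever reaches PyTorch. The helper `ensure_float32` below is a hypothetical convenience function written for this post, not part of NumPy or PyTorch.

```python
import numpy as np

def ensure_float32(arr):
    """Return arr as float32, converting only when needed (hypothetical helper)."""
    return arr if arr.dtype == np.float32 else arr.astype(np.float32)

data = np.random.default_rng(0).standard_normal((100, 10))  # float64 by default
print(data.dtype)                   # float64

data32 = ensure_float32(data)
print(data32.dtype)                 # float32

# Already-float32 input is returned as-is, without an extra copy
print(ensure_float32(data32) is data32)   # True
```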

9. Summary: The `ndarray` Basics Covered Today

What we’ve covered:

What is an ndarray? – The core data structure for all deep‑learning data.
Relationship to PyTorch Tensor – Conceptually the same, but the PyTorch Tensor adds GPU and autograd support.
np.array – Create arrays from Python lists.
dtype – Specify numeric types (int, float, 32/64‑bit).
shape – Read the size of each dimension of your data.
reshape – Reshape arrays while keeping the element count constant.
astype – Convert between numeric types.

Mastering these five concepts (array, dtype, shape, reshape, astype) equips you to:

Handle tensor shape errors confidently.
Bridge the gap between research papers and code.
Follow PyTorch tutorials with ease.
